Machine learning approach to estimate species composition of unidentified sea turtles that were recorded on the Japanese longline observer program
Also published in 2019 as IOTC-2019-WPEB15-42.
Turtles not identified to species are a major source of uncertainty when evaluating the impact of bycatch on sea turtle populations, so we estimated the species composition of unidentified sea turtles from operational circumstances using a machine learning approach. We used bycatch data from the Japanese scientific observer program, comprising 10,490 operations and catch records of 141 loggerheads, 75 olive ridleys, and 152 unidentified turtles. A random forest, a machine learning method, was used to estimate the probability of each species identity (loggerhead or olive ridley). The training dataset consisted of species-identified sea turtle bycatch records, including set date, location, sea surface temperature, and catch numbers of target and non-target species such as tunas, billfishes, other teleost fishes, sharks, and sea turtles. The fitted model yielded species-identity probabilities for the unidentified turtles. When a turtle was treated as identified if its probability exceeded 0.7, 111 of the 152 unidentified turtles were assigned to a species (16 loggerheads and 95 olive ridleys), and 41 could not be identified. We conclude that the random forest approach will help improve species estimation.
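
The 0.7 cutoff can be illustrated with a minimal Python sketch. It assumes the threshold is applied to the larger of the two class probabilities, which the abstract does not state explicitly. The function name `assign_species` and the example probabilities are hypothetical, made up for illustration, and are not values from the study.

```python
from collections import Counter

THRESHOLD = 0.7  # cutoff stated in the abstract


def assign_species(p_loggerhead, threshold=THRESHOLD):
    """Map a two-class probability to a species label or 'unidentified'.

    Assumption: the two class probabilities sum to 1, and a turtle counts as
    identified only when one of them is strictly larger than the threshold.
    """
    p_olive_ridley = 1.0 - p_loggerhead
    if p_loggerhead > threshold:
        return "loggerhead"
    if p_olive_ridley > threshold:
        return "olive ridley"
    return "unidentified"


# Hypothetical loggerhead probabilities for six unidentified turtles
# (illustrative only, not data from the observer program).
example_probs = [0.92, 0.15, 0.55, 0.71, 0.40, 0.05]

labels = [assign_species(p) for p in example_probs]
counts = Counter(labels)
print(labels)
print(dict(counts))
```

In practice the probabilities would come from the fitted classifier rather than a hand-written list, and a stricter or looser cutoff trades the number of turtles left unidentified against confidence in each assignment.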