A deep learning approach to photo–identification demonstrates high performance on two dozen cetacean species
Researchers can investigate many aspects of animal ecology through noninvasive photo–identification. Photo–identification is becoming more efficient as matching individuals between photos is increasingly automated. However, the convolutional neural network models that have facilitated this change need many training images to generalize well. As a result, they have often been developed only for individual species with enough images to meet this requirement. These single-species methods might underperform, as they ignore potential similarities in identifying characteristics and the photo–identification process among species. In this paper, we introduce a multi-species photo–identification model based on a state-of-the-art method in human facial recognition, the ArcFace classification head. Our model uses two such heads to jointly classify species and identities, allowing species to share information and parameters within the network. As a demonstration, we trained this model with 50,796 images from 39 catalogues of 24 cetacean species, evaluating its predictive performance on 21,192 test images from the same catalogues. We further evaluated its predictive performance with two external catalogues entirely composed of identities that the model did not see during training. The model achieved a mean average precision (MAP) of 0.869 on the test set. Of the 39 catalogues, 10, representing seven species, achieved a MAP score over 0.95. For some species, there was notable variation in performance among catalogues, largely explained by variation in photo quality. Finally, the model appeared to generalize well, with the two external catalogues scoring similarly to their species' counterparts in the larger test set. From our cetacean application, we provide a list of recommendations for potential users of this model, focusing on those with cetacean photo–identification catalogues.
For example, users with high-quality images of animals identified by dorsal nicks and notches should expect near-optimal performance. Users can expect decreasing performance for catalogues with higher proportions of indistinct individuals or poor-quality photos. The model is currently freely available as code in a GitHub repository and, via Happywhale.com, as a graphical user interface with additional functionality for collaborative data management.
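To make the two-head idea concrete, the following is a minimal sketch of how a single shared embedding could feed two ArcFace-style heads, one for species and one for identity. It is not the implementation from the repository. The function names, the margin and scale values, and the weighted sum used to combine the two losses are all assumptions made for illustration. The sketch also uses the plain ArcFace logit, s·cos(θ + m), without the safeguards that production implementations typically add for angles near π.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def arcface_logits(embedding, class_weights, target, margin=0.5, scale=30.0):
    """Scaled cosine logits with an additive angular margin on the target class.

    margin and scale are illustrative defaults, not values from the paper.
    """
    e = l2_normalize(embedding)
    logits = []
    for k, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos = max(-1.0, min(1.0, cos))  # guard acos against rounding error
        if k == target:
            # Penalize the true class by widening its angle to the embedding.
            cos = math.cos(math.acos(cos) + margin)
        logits.append(scale * cos)
    return logits


def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for one example."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def joint_loss(embedding, species_weights, identity_weights,
               species, identity, species_share=0.5):
    """Combine the species and identity losses computed from one embedding.

    The weighted sum (species_share) is a hypothetical choice for this sketch.
    """
    loss_species = cross_entropy(
        arcface_logits(embedding, species_weights, species), species)
    loss_identity = cross_entropy(
        arcface_logits(embedding, identity_weights, identity), identity)
    return species_share * loss_species + (1.0 - species_share) * loss_identity


# Toy example: a 2-D embedding, 2 species, 3 individuals.
embedding = [0.9, 0.1]
species_weights = [[1.0, 0.0], [0.0, 1.0]]
identity_weights = [[1.0, 0.2], [0.5, 0.5], [0.0, 1.0]]
loss = joint_loss(embedding, species_weights, identity_weights,
                  species=0, identity=0)
```

Because both heads read the same embedding, gradients from the species loss and the identity loss would both update the shared backbone, which is one way the species could share information and parameters as described above.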