Primate Face Identification in the Wild
Pith reviewed 2026-05-25 10:03 UTC · model grok-4.3
The pith
Primate face identification improves by augmenting cross-entropy loss with a pairwise loss on image pairs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The PFID loss augments the standard cross-entropy loss with a pairwise loss to learn more discriminative and generalizable features, which makes it suitable for related identification tasks such as open-set recognition, closed-set identification, and verification. State-of-the-art accuracy is reported for face recognition of rhesus macaques and chimpanzees under four protocols: classification, verification, closed-set identification, and open-set recognition.
What carries the argument
The PFID loss, which adds a pairwise term on positive and negative image pairs to the usual cross-entropy objective.
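To make the construction concrete, here is a minimal NumPy sketch of a cross-entropy objective augmented with a pairwise term. The abstract does not give the exact form of the pairwise loss, so the contrastive-style term below is a stand-in, and the names `pairwise_term`, `combined_loss`, `margin`, and `lam` are hypothetical rather than taken from the paper.

```python
import numpy as np

def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of identity logits."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def pairwise_term(emb_a, emb_b, same_identity, margin=1.0):
    """Contrastive-style pairwise term (an assumption, not the paper's formula).

    Positive pairs (same_identity == 1) are pulled together; negative pairs
    (same_identity == 0) are pushed apart until they are `margin` away.
    """
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    positive = same_identity * dist ** 2
    negative = (1 - same_identity) * np.maximum(0.0, margin - dist) ** 2
    return (positive + negative).mean()

def combined_loss(logits, labels, emb_a, emb_b, same_identity, lam=0.5):
    """Cross-entropy plus a weighted pairwise term; `lam` is a hypothetical weight."""
    return cross_entropy(logits, labels) + lam * pairwise_term(emb_a, emb_b, same_identity)
```

Setting `lam=0` recovers plain cross-entropy, which is the baseline an ablation of the pairwise term would compare against.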
If this is right
- State-of-the-art accuracy is achieved on rhesus macaques and chimpanzees for four identification protocols.
- The learned features support open-set recognition, closed-set identification, and verification tasks.
- The method directly targets the challenges of limited data and nuisance factors in wild images.
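The three embedding-based protocols named above can be sketched as simple decisions over feature vectors. This is an illustrative outline only: cosine similarity, the `threshold` value, and the `unknown` label are assumptions, and the paper's actual evaluation procedure is not described in the abstract.

```python
import numpy as np

def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def verify(emb_a, emb_b, threshold=0.5):
    """Verification: decide, row by row, whether two faces show the same individual."""
    sims = np.sum(l2_normalize(emb_a) * l2_normalize(emb_b), axis=1)
    return sims >= threshold

def identify_closed_set(probes, gallery, gallery_ids):
    """Closed-set identification: every probe is assigned its nearest gallery identity."""
    sims = l2_normalize(probes) @ l2_normalize(gallery).T
    return gallery_ids[sims.argmax(axis=1)]

def identify_open_set(probes, gallery, gallery_ids, threshold=0.5, unknown=-1):
    """Open-set recognition: as closed-set, but weak matches are rejected as unknown."""
    sims = l2_normalize(probes) @ l2_normalize(gallery).T
    best = sims.argmax(axis=1)
    accepted = sims.max(axis=1) >= threshold
    return np.where(accepted, gallery_ids[best], unknown)
```

The only difference between the closed-set and open-set functions is the rejection step, which is why a feature space with well-separated identities matters more in the open-set case.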
Where Pith is reading between the lines
- The same loss construction could be tested on other animal species that require individual tracking from camera-trap images.
- Integration with existing camera networks might allow continuous, automated population estimates without repeated field visits.
- The pairwise term's effectiveness likely depends on how representative the chosen image pairs are of real environmental variation.
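The last point is easier to see with a concrete sampler. The sketch below draws positive and negative pairs uniformly at random from identity labels; the function name `sample_pairs`, the alternating positive/negative scheme, and the `seed` parameter are hypothetical choices for illustration, not the paper's procedure.

```python
import random
from collections import defaultdict

def sample_pairs(identity_labels, num_pairs, seed=0):
    """Return (index_a, index_b, same_identity) triples, alternating positive and negative."""
    rng = random.Random(seed)
    by_identity = defaultdict(list)
    for index, identity in enumerate(identity_labels):
        by_identity[identity].append(index)

    identities = list(by_identity)
    # Only identities with at least two images can form a positive pair.
    multi_image = [i for i in identities if len(by_identity[i]) >= 2]
    if not multi_image or len(identities) < 2:
        raise ValueError("need two identities, one of them with two or more images")

    pairs = []
    for k in range(num_pairs):
        if k % 2 == 0:
            # Positive pair: two distinct images of the same individual.
            identity = rng.choice(multi_image)
            a, b = rng.sample(by_identity[identity], 2)
            pairs.append((a, b, 1))
        else:
            # Negative pair: one image from each of two different individuals.
            id_a, id_b = rng.sample(identities, 2)
            pairs.append((rng.choice(by_identity[id_a]), rng.choice(by_identity[id_b]), 0))
    return pairs
```

Uniform sampling like this gives no guarantee that a positive pair spans different poses or lighting conditions. If an individual's images all come from one short sequence, its positive pairs will be near-duplicates, which is the dependence the bullet above points at.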
Load-bearing premise
Training on positive and negative image pairs will produce features robust to pose, lighting, and occlusions even when training data is limited and environments are uncontrolled.
What would settle it
If a model trained only with standard cross-entropy loss matches or exceeds the PFID model's accuracy on the same primate test sets, where those sets contain large pose and lighting variation, the claimed benefit of the pairwise term would be refuted.
Original abstract
Ecological imbalance owing to rapid urbanization and deforestation has adversely affected the population of several wild animals. This loss of habitat has skewed the population of several non-human primate species like chimpanzees and macaques and has constrained them to co-exist in close proximity to human settlements, often leading to human-wildlife conflicts while competing for resources. For effective wildlife conservation and conflict management, regular monitoring of population and of conflicted regions is necessary. However, existing approaches like field visits for data collection and manual analysis by experts are resource intensive, tedious and time consuming, thus necessitating an automated, non-invasive, more efficient alternative like image-based facial recognition. The challenge in individual identification arises from unrelated factors like pose, lighting variations and occlusions in uncontrolled environments, and is further exacerbated by limited training data. Inspired by human perception, we propose to learn representations that are robust to such nuisance factors and capture the notion of similarity over the individual identity sub-manifolds. The proposed approach, Primate Face Identification (PFID), achieves this by training the network to distinguish between positive and negative pairs of images. The PFID loss augments the standard cross-entropy loss with a pairwise loss to learn more discriminative and generalizable features, thus making it appropriate for other related identification tasks like open-set recognition, closed-set identification and verification. We report state-of-the-art accuracy on facial recognition of two primate species, rhesus macaques and chimpanzees, under the four protocols of classification, verification, closed-set identification and open-set recognition.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes the Primate Face Identification (PFID) method, which augments standard cross-entropy loss with a pairwise loss term to learn discriminative and generalizable features for individual primate face recognition in uncontrolled environments. It targets challenges of pose, lighting, and occlusions with limited data and claims state-of-the-art results on rhesus macaques and chimpanzees across four protocols: classification, verification, closed-set identification, and open-set recognition.
Significance. If the empirical results and robustness claims hold after proper validation, the work could support automated, non-invasive tools for primate population monitoring and wildlife conservation. The loss combination itself follows established supervised metric-learning patterns, so novelty would rest on the primate-specific application and any demonstrated gains on the four protocols.
major comments (2)
- [Abstract] The assertion of 'state-of-the-art accuracy' on four protocols supplies no numerical results, baseline comparisons, dataset statistics (images per identity, total identities), or ablation isolating the pairwise term, rendering the central empirical claim unverifiable.
- [Abstract] The claim that the pairwise loss produces representations robust to pose, lighting, and occlusions requires that positive pairs explicitly span those intra-identity variations; the text provides neither pair-sampling details nor per-identity image counts to establish this condition.
minor comments (1)
- [Abstract] The phrase 'inspired by human perception' is stated without elaboration on the concrete mapping to the loss or architecture.
Simulated Author's Rebuttal
We thank the referee for the detailed feedback. The comments highlight opportunities to strengthen the abstract's clarity and verifiability. We address each point below and will incorporate revisions to include additional quantitative details and methodological clarifications where the manuscript body already contains supporting information.
- Referee: [Abstract] The assertion of 'state-of-the-art accuracy' on four protocols supplies no numerical results, baseline comparisons, dataset statistics (images per identity, total identities), or ablation isolating the pairwise term, rendering the central empirical claim unverifiable.
  Authors: We agree that the abstract would be strengthened by including key numerical results, baseline comparisons, and dataset statistics to allow immediate verification of the central claims. The full manuscript reports these in the Experiments section (including per-protocol accuracies, comparisons to standard cross-entropy baselines, total identities, average images per identity, and ablations isolating the pairwise term). In revision we will condense the most salient figures and statistics into the abstract while staying within its length limit.
  Revision: yes
- Referee: [Abstract] The claim that the pairwise loss produces representations robust to pose, lighting, and occlusions requires that positive pairs explicitly span those intra-identity variations; the text provides neither pair-sampling details nor per-identity image counts to establish this condition.
  Authors: The manuscript states that the datasets were collected in uncontrolled environments and that positive pairs are formed from images of the same individual. To make the robustness argument explicit in the abstract, we will add concise statements on pair construction (random sampling of same-identity images that naturally include pose, lighting, and occlusion variation) and report the per-identity image counts already tabulated in the dataset description section. This does not require new experiments, only clearer exposition.
  Revision: yes
Circularity Check
No circularity; standard supervised loss on held-out evaluation
The paper presents PFID as an augmentation of cross-entropy by a pairwise term to encourage discriminative features, then reports empirical accuracies on four protocols using held-out splits. No equation reduces a claimed prediction to a fitted parameter by construction, no uniqueness theorem is imported from self-citation, and no ansatz is smuggled via prior work. The central claim that the loss yields robustness is an empirical assertion, not a definitional identity, so the derivation chain remains independent of its inputs.
Axiom & Free-Parameter Ledger
free parameters (1)
- pairwise loss weight
axioms (1)
- domain assumption: Deep networks trained with pairwise similarity objectives learn features invariant to common imaging nuisances when data are limited.