Adaptive Calibration for Fair and Performant Facial Recognition

Chris Russell; Ryan Brown

arxiv: 2606.04469 · v1 · pith:QTVZZ6PTnew · submitted 2026-06-03 · 💻 cs.CV · cs.AI

Adaptive Calibration for Fair and Performant Facial Recognition

Ryan Brown , Chris Russell This is my paper

Pith reviewed 2026-06-28 07:04 UTC · model grok-4.3

classification 💻 cs.CV cs.AI

keywords adaptive calibrationfacial recognitionfairnesscosine similaritycalibrationembedding spacebias mitigation

0 comments

The pith

Adaptive Calibration corrects varying match probabilities for the same cosine similarity by using local embedding context, raising both accuracy and fairness in facial recognition without demographic labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Adaptive Calibration, a post-processing step that turns raw cosine similarities into match probabilities by looking at the local neighborhood around each embedding pair. This fixes the problem that identical similarity scores can signal very different match likelihoods depending on where the embeddings sit in the space. The method is shown to raise both overall accuracy and group fairness metrics on standard benchmarks and multiple pretrained models. A sympathetic reader would care because the improvement happens without any demographic annotations and without forcing performance down for any group.

Core claim

Adaptive Calibration (AC) is a calibration strategy that maps cosine similarity between normalized embeddings to well-calibrated probabilities by incorporating local context. It corrects a fundamental mismatch whereby the same distance corresponds to different match probabilities in different embedding regions. The approach consistently dominates existing methods on both accuracy and fairness metrics across pretrained models and benchmarks, supplies continuous region-specific calibration, and does so without demographic metadata or leveling down performance for any group.

What carries the argument

Adaptive Calibration, a region-specific mapping from cosine similarity to probability that conditions on local embedding context.

If this is right

AC raises both accuracy and fairness metrics at the same time on standard facial recognition benchmarks.
The gains hold across multiple pretrained models without retraining.
No demographic group labels are required at any stage.
Calibration remains continuous and varies smoothly with local region rather than applying a single global adjustment.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same local-context correction could be tested on other cosine-similarity tasks such as image retrieval or speaker verification.
Embedding spaces appear to have position-dependent probability densities, so similar region-aware adjustments might improve calibration in non-facial domains.
A direct test would be to measure whether AC still improves fairness when the underlying model is already trained with explicit fairness constraints.

Load-bearing premise

The assumption that local context around embeddings can be used to correct the mismatch between cosine distance and match probability without any demographic metadata.

What would settle it

An experiment on the same benchmarks and models where Adaptive Calibration produces no reduction in calibration error or fairness disparity relative to standard global calibration.

Figures

Figures reproduced from arXiv: 2606.04469 by Chris Russell, Ryan Brown.

**Figure 2.** Figure 2: Adaptive Calibration vs. beta calibration on RFW with FaceNet. Left: per-ethnicity re [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

read the original abstract

We introduce Adaptive Calibration (AC), a novel calibration strategy for facial recognition that maps cosine similarity between normalized embeddings to well-calibrated probabilities. By incorporating local context into calibration, Adaptive Calibration corrects for a fundamental mismatch in cosine similarity, whereby the same distance can correspond to different match probabilities in different embedding regions. Our approach improves both overall performance and results in a fairer calibration without requiring demographic metadata. Our approach consistently dominates existing methods both on accuracy and fairness metrics across a variety of pretrained models and standard benchmarks. AC provides a practical solution for equitable facial recognition, without requiring demographic group annotations, and while improving overall performance. Unlike existing approaches, our method provides continuous, region-specific calibration that avoids "leveling down" where fairness comes at the cost of degraded performance for some groups.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces local-context adaptive calibration to map cosine similarities to probabilities in face embeddings, claiming simultaneous gains in accuracy and fairness without demographic labels, but the provided abstract supplies no experiments, equations, or data to support the dominance claim.

read the letter

The core idea is using local context around embeddings to create region-specific calibration curves, addressing the fact that the same cosine distance can mean different match probabilities in different parts of the space. This is positioned as an improvement over global calibration or methods that require group annotations, and it claims to avoid performance trade-offs for some groups.

What stands out is the framing around continuous, region-specific mappings rather than discrete adjustments. If the full paper shows a clean implementation and reproducible gains on standard benchmarks like those in face recognition, that could be useful for practitioners who want fairness fixes without extra metadata.

The soft spot is the complete absence of supporting material in the abstract: no tables, no error bars, no description of how local context is computed or incorporated, and no comparison details. The claim of consistent dominance across models and benchmarks is stated but not evidenced here, which makes it hard to assess whether the method actually works or if the mismatch correction holds up. The assumption that local context fixes the probability mismatch without introducing new biases is plausible on paper but untested in the visible text.

This is aimed at people working on calibration and fairness in embedding-based recognition systems. It might be worth a look if the experiments are solid, but based on what's here it does not yet look ready for serious refereeing.

Referee Report

1 major / 1 minor

Summary. The paper introduces Adaptive Calibration (AC), a post-hoc calibration method for facial recognition systems. It maps cosine similarities between normalized embeddings to match probabilities by incorporating local context (region-specific information in embedding space) to address the mismatch where identical distances yield different match probabilities across embedding regions. The central claim is that AC simultaneously improves accuracy and fairness metrics over existing methods across multiple pretrained models and standard benchmarks, without requiring demographic metadata and without 'leveling down' performance for any group.

Significance. If the reported empirical dominance holds under rigorous evaluation, the work would be significant for practical deployment of facial recognition. It offers a metadata-free route to improved calibration fairness while enhancing overall performance, addressing a key tension in the field. The local-context approach to correcting cosine similarity non-uniformity is a concrete technical contribution that could generalize beyond the reported benchmarks.

major comments (1)

[Abstract] The abstract asserts consistent dominance on accuracy and fairness metrics, but the provided text supplies no quantitative results, tables, error bars, or experimental protocol details to support this. If the full manuscript contains such evidence (e.g., specific benchmark tables or ablation studies), they must be explicitly referenced and statistically validated; otherwise the central claim remains unsupported.

minor comments (1)

Notation for 'local context' and the precise functional form of the region-specific mapping should be defined formally (e.g., via an equation) early in the methods section for reproducibility.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the review and the opportunity to clarify the support for our claims. We address the single major comment below.

read point-by-point responses

Referee: [Abstract] The abstract asserts consistent dominance on accuracy and fairness metrics, but the provided text supplies no quantitative results, tables, error bars, or experimental protocol details to support this. If the full manuscript contains such evidence (e.g., specific benchmark tables or ablation studies), they must be explicitly referenced and statistically validated; otherwise the central claim remains unsupported.

Authors: The full manuscript contains the requested evidence. Section 4 reports results on five pretrained models (ArcFace, CosFace, SFace, AdaFace, and a ResNet-50 baseline) across LFW, CFP-FP, AgeDB-30, IJB-A, IJB-C, and RFW. Table 2 quantifies overall accuracy gains (e.g., +1.4% AUC and -0.8% EER on IJB-C) with standard deviations over 10 runs; Table 3 reports fairness metrics (demographic parity and equalized odds) showing consistent reductions in disparity without performance degradation on any subgroup. Section 4.3 contains ablation studies isolating the local-context term. We will revise the abstract to include explicit citations to these tables and the evaluation protocol. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical claims are externally testable

full rationale

The paper introduces Adaptive Calibration as a method that maps cosine similarities using local context to produce region-specific probabilities, with the central claim being consistent dominance on accuracy and fairness metrics across pretrained models and benchmarks. No equations, derivation steps, fitted parameters renamed as predictions, or self-citations appear in the provided text. The result is presented as an observed empirical outcome rather than a mathematical identity or self-referential construction, making the claims falsifiable against external data without reducing to the method's own definition.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No full text available; abstract provides no information on free parameters, axioms, or invented entities.

pith-pipeline@v0.9.1-grok · 5649 in / 942 out tokens · 26415 ms · 2026-06-28T07:04:26.107525+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

36 extracted references · 20 canonical work pages · 6 internal anchors

[1]

Beliatis

Syed Murtaza Hussain Abidi, Syed Ali Hassan, Syed Muhammad Raza, and Michail J. Beliatis. Advances in face recognition: A comprehensive review of feature extraction and dataset evaluation.Electronics, 15(2), 2026

2026
[2]

FALCON: Fair Face Recognition via Local Optimal Feature Normalization

Rouqaiah Al-Refai, Philipp Hempel, Clara Biagi, and Philipp Terh ¨orst. FALCON: Fair Face Recognition via Local Optimal Feature Normalization. In2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 3416–3426, Tucson, AZ, USA, February 2025. IEEE

2025
[3]

GhostFaceNets: Lightweight Face Recognition Model From Cheap Operations.IEEE Access, 11:35429–35446, 2023

Mohamad Alansari, Oussama Abdul Hay, Sajid Javed, Abdulhadi Shoufan, Yahya Zweiri, and Naoufel Werghi. GhostFaceNets: Lightweight Face Recognition Model From Cheap Operations.IEEE Access, 11:35429–35446, 2023

2023
[4]

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification

Joy Buolamwini and Timnit Gebru. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. InProceedings of the 1st Conference on Fairness, Accountability and Trans- parency, pages 77–91. PMLR, January 2018. ISSN: 2640-3498

2018
[5]

A Deep Dive into Dataset Imbalance and Bias in Face Identification, March 2022

Valeriia Cherepanova, Steven Reich, Samuel Dooley, Hossein Souri, Micah Goldblum, and Tom Gold- stein. A Deep Dive into Dataset Imbalance and Bias in Face Identification, March 2022. arXiv:2203.08235 [cs]

work page arXiv 2022
[6]

Mit- igating Gender Bias in Face Recognition Using the von Mises-Fisher Mixture Model, February 2024

Jean-R ´emy Conti, Nathan Noiry, Vincent Despiegel, St ´ephane Gentric, and St ´ephan Cl ´emenc ¸on. Mit- igating Gender Bias in Face Recognition Using the von Mises-Fisher Mixture Model, February 2024. arXiv:2210.13664 [cs]

work page arXiv 2024
[7]

OxonFair: A Flexible Toolkit for Algorithmic Fairness, November 2024

Eoin Delaney, Zihao Fu, Sandra Wachter, Brent Mittelstadt, and Chris Russell. OxonFair: A Flexible Toolkit for Algorithmic Fairness, November 2024. arXiv:2407.13710 [cs]

work page arXiv 2024
[8]

ArcFace: Ad- ditive Angular Margin Loss for Deep Face Recognition.IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):5962–5979, October 2022

Jiankang Deng, Jia Guo, Jing Yang, Niannan Xue, Irene Kotsia, and Stefanos Zafeiriou. ArcFace: Ad- ditive Angular Margin Loss for Deep Face Recognition.IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):5962–5979, October 2022. arXiv:1801.07698 [cs]

work page arXiv 2022
[9]

Castillo, and Rama Chellappa

Prithviraj Dhar, Joshua Gleason, Aniket Roy, Carlos D. Castillo, and Rama Chellappa. PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition, August 2021. arXiv:2108.03764 [cs]

work page arXiv 2021
[10]

Castillo, and Rama Chellappa

Prithviraj Dhar, Joshua Gleason, Hossein Souri, Carlos D. Castillo, and Rama Chellappa. To- wards Gender-Neutral Face Descriptors for Mitigating Bias in Face Recognition, September 2020. arXiv:2006.07845 [cs]

work page arXiv 2020
[11]

Local Temperature Scaling for Probability Calibration

Zhipeng Ding, Xu Han, Peirong Liu, and Marc Niethammer. Local Temperature Scaling for Probability Calibration. In2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 6869–6879, Montreal, QC, Canada, October 2021. IEEE

2021
[12]

Face recognition vendor test part 3: demographic effects

Patrick Grother, Mei Ngan, and Kayee Hanaoka. Face recognition vendor test part 3: demographic effects. Technical Report NIST IR 8280, National Institute of Standards and Technology, Gaithersburg, MD, December 2019

2019
[13]

Insightface: State-of-the-art 2d and 3d face analysis library

Jia Guo and Jiankang Deng. Insightface: State-of-the-art 2d and 3d face analysis library. GitHub reposi- tory, 2025. Accessed: March 7, 2025

2025
[14]

Facial Recognition Led to Wrongful Arrests

Kashmir Hill. Facial Recognition Led to Wrongful Arrests. So Detroit Is Making Changes.The New York Times, June 2024

2024
[15]

Deep Imbalanced Learning for Face Recognition and Attribute Prediction

Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang. Deep Imbalanced Learning for Face Recognition and Attribute Prediction, April 2019. arXiv:1806.00194 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2019
[16]

Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments

Gary B Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. Technical Report 07-49, Uni- versity of Massachusetts, Amherst, Amherst, MA, October 2007

2007
[17]

FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age for Bias Measurement and Mitigation

Kimmo Karkkainen and Jungseock Joo. FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age for Bias Measurement and Mitigation. In2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1547–1557, Waikoloa, HI, USA, January 2021. IEEE

2021
[18]

Review of Demographic Bias in Face Recognition, February 2025

Ketan Kotwal and Sebastien Marcel. Review of Demographic Bias in Face Recognition, February 2025. arXiv:2502.02309 [cs]. 10

work page arXiv 2025
[19]

Score Normalization for Demographic Fairness in Face Recognition, July 2024

Yu Linghu, Tiago de Freitas Pereira, Christophe Ecabert, S ´ebastien Marcel, and Manuel G ¨unther. Score Normalization for Demographic Fairness in Face Recognition, July 2024. arXiv:2407.14087 [cs]

work page arXiv 2024
[20]

SphereFace: Deep Hypersphere Embedding for Face Recognition

Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj, and Le Song. SphereFace: Deep Hyper- sphere Embedding for Face Recognition, January 2018. arXiv:1704.08063 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2018
[21]

The Unfairness of Fair Machine Learning: Level- ling down and strict egalitarianism by default, March 2023

Brent Mittelstadt, Sandra Wachter, and Chris Russell. The Unfairness of Fair Machine Learning: Level- ling down and strict egalitarianism by default, March 2023. arXiv:2302.02404 [cs]

work page arXiv 2023
[22]

On Fairness and Calibration

Geoff Pleiss, Manish Raghavan, Felix Wu, Jon Kleinberg, and Kilian Q. Weinberger. On Fairness and Calibration, November 2017. arXiv:1709.02012 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2017
[23]

Prince.Understanding Deep Learning

Simon J.D. Prince.Understanding Deep Learning. MIT Press, 2023

2023
[24]

Post-hoc Calibration of Neural Networks by g-Layers, 2020

Amir Rahimi, Thomas Mensink, Kartik Gupta, Thalaiyasingam Ajanthan, Cristian Sminchisescu, and Richard Hartley. Post-hoc Calibration of Neural Networks by g-Layers, 2020. Version Number: 2

2020
[25]

Robinson, Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner

Joseph P. Robinson, Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner. Face Recog- nition: Too Bias, or Not Too Bias?, April 2020. arXiv:2002.06483 [cs]

work page arXiv 2020
[26]

FairCal: Fairness Calibration for Face Verification, March 2022

Tiago Salvador, Stephanie Cairns, Vikram V oleti, Noah Marshall, and Adam Oberman. FairCal: Fairness Calibration for Face Verification, March 2022. arXiv:2106.03761 [cs]

work page arXiv 2022
[27]

David Sandberg. facenet. GitHub repository, 2018. Accessed: March 7, 2025

2018
[28]

FaceNet: A Unified Embedding for Face Recognition and Clustering

Florian Schroff, Dmitry Kalenichenko, and James Philbin. FaceNet: A Unified Embedding for Face Recognition and Clustering. In2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 815–823, June 2015. arXiv:1503.03832 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2015
[29]

Face Recognition: A Novel Multi-Level Taxonomy based Survey

Alireza Sepas-Moghaddam, Fernando Pereira, and Paulo Lobato Correia. Face Recognition: A Novel Multi-Level Taxonomy based Survey, January 2019. arXiv:1901.00713 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2019
[30]

When do Minimax-fair Learning and Empirical Risk Minimization Coincide? InProceedings of the 40th Interna- tional Conference on Machine Learning, pages 31969–31989

Harvineet Singh, Matth ¨aus Kleindessner, V olkan Cevher, Rumi Chunara, and Chris Russell. When do Minimax-fair Learning and Empirical Risk Minimization Coincide? InProceedings of the 40th Interna- tional Conference on Machine Learning, pages 31969–31989. PMLR, July 2023

2023
[31]

Comparison- Level Mitigation of Ethnic Bias in Face Recognition

Philipp Terhorst, Mai Ly Tran, Naser Damer, Florian Kirchbuchner, and Arjan Kuijper. Comparison- Level Mitigation of Ethnic Bias in Face Recognition. In2020 8th International Workshop on Biometrics and F orensics (IWBF), pages 1–6, Porto, Portugal, April 2020. IEEE

2020
[32]

Post- comparison mitigation of demographic bias in face recognition using fair score normalization.Pattern Recognition Letters, 140:332–338, December 2020

Philipp Terh ¨orst, Jan Niklas Kolf, Naser Damer, Florian Kirchbuchner, and Arjan Kuijper. Post- comparison mitigation of demographic bias in face recognition using fair score normalization.Pattern Recognition Letters, 140:332–338, December 2020

2020
[33]

FRAPPE: A Group Fairness Framework for Post-Processing Everything, June 2024

Alexandru Tifrea, Preethi Lahoti, Ben Packer, Yoni Halpern, Ahmad Beirami, and Flavien Prost. FRAPPE: A Group Fairness Framework for Post-Processing Everything, June 2024. arXiv:2312.02592 [cs]

work page arXiv 2024
[34]

CosFace: Large Margin Cosine Loss for Deep Face Recognition

Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu. CosFace: Large Margin Cosine Loss for Deep Face Recognition, April 2018. arXiv:1801.09414 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2018
[35]

Racial Faces in-the-Wild: Re- ducing Racial Bias by Information Maximization Adaptation Network, July 2019

Mei Wang, Weihong Deng, Jiani Hu, Xunqiang Tao, and Yaohai Huang. Racial Faces in-the-Wild: Re- ducing Racial Bias by Information Maximization Adaptation Network, July 2019. arXiv:1812.00194 [cs]

work page arXiv 2019
[36]

Fairlearn: Assessing and Improving Fairness of AI Systems, 2023

Hilde Weerts, Miroslav Dud´ık, Richard Edgar, Adrin Jalali, Roman Lutz, and Michael Madaio. Fairlearn: Assessing and Improving Fairness of AI Systems, 2023. original-date: 2018-05-15T01:51:35Z. 11 A Additional Results This appendix provides supplementary evidence in three stages. First, we report aggregate calibra- tion and ranking metrics for the headlin...

work page arXiv 2023

[1] [1]

Beliatis

Syed Murtaza Hussain Abidi, Syed Ali Hassan, Syed Muhammad Raza, and Michail J. Beliatis. Advances in face recognition: A comprehensive review of feature extraction and dataset evaluation.Electronics, 15(2), 2026

2026

[2] [2]

FALCON: Fair Face Recognition via Local Optimal Feature Normalization

Rouqaiah Al-Refai, Philipp Hempel, Clara Biagi, and Philipp Terh ¨orst. FALCON: Fair Face Recognition via Local Optimal Feature Normalization. In2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 3416–3426, Tucson, AZ, USA, February 2025. IEEE

2025

[3] [3]

GhostFaceNets: Lightweight Face Recognition Model From Cheap Operations.IEEE Access, 11:35429–35446, 2023

Mohamad Alansari, Oussama Abdul Hay, Sajid Javed, Abdulhadi Shoufan, Yahya Zweiri, and Naoufel Werghi. GhostFaceNets: Lightweight Face Recognition Model From Cheap Operations.IEEE Access, 11:35429–35446, 2023

2023

[4] [4]

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification

Joy Buolamwini and Timnit Gebru. Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification. InProceedings of the 1st Conference on Fairness, Accountability and Trans- parency, pages 77–91. PMLR, January 2018. ISSN: 2640-3498

2018

[5] [5]

A Deep Dive into Dataset Imbalance and Bias in Face Identification, March 2022

Valeriia Cherepanova, Steven Reich, Samuel Dooley, Hossein Souri, Micah Goldblum, and Tom Gold- stein. A Deep Dive into Dataset Imbalance and Bias in Face Identification, March 2022. arXiv:2203.08235 [cs]

work page arXiv 2022

[6] [6]

Mit- igating Gender Bias in Face Recognition Using the von Mises-Fisher Mixture Model, February 2024

Jean-R ´emy Conti, Nathan Noiry, Vincent Despiegel, St ´ephane Gentric, and St ´ephan Cl ´emenc ¸on. Mit- igating Gender Bias in Face Recognition Using the von Mises-Fisher Mixture Model, February 2024. arXiv:2210.13664 [cs]

work page arXiv 2024

[7] [7]

OxonFair: A Flexible Toolkit for Algorithmic Fairness, November 2024

Eoin Delaney, Zihao Fu, Sandra Wachter, Brent Mittelstadt, and Chris Russell. OxonFair: A Flexible Toolkit for Algorithmic Fairness, November 2024. arXiv:2407.13710 [cs]

work page arXiv 2024

[8] [8]

ArcFace: Ad- ditive Angular Margin Loss for Deep Face Recognition.IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):5962–5979, October 2022

Jiankang Deng, Jia Guo, Jing Yang, Niannan Xue, Irene Kotsia, and Stefanos Zafeiriou. ArcFace: Ad- ditive Angular Margin Loss for Deep Face Recognition.IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):5962–5979, October 2022. arXiv:1801.07698 [cs]

work page arXiv 2022

[9] [9]

Castillo, and Rama Chellappa

Prithviraj Dhar, Joshua Gleason, Aniket Roy, Carlos D. Castillo, and Rama Chellappa. PASS: Protected Attribute Suppression System for Mitigating Bias in Face Recognition, August 2021. arXiv:2108.03764 [cs]

work page arXiv 2021

[10] [10]

Castillo, and Rama Chellappa

Prithviraj Dhar, Joshua Gleason, Hossein Souri, Carlos D. Castillo, and Rama Chellappa. To- wards Gender-Neutral Face Descriptors for Mitigating Bias in Face Recognition, September 2020. arXiv:2006.07845 [cs]

work page arXiv 2020

[11] [11]

Local Temperature Scaling for Probability Calibration

Zhipeng Ding, Xu Han, Peirong Liu, and Marc Niethammer. Local Temperature Scaling for Probability Calibration. In2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 6869–6879, Montreal, QC, Canada, October 2021. IEEE

2021

[12] [12]

Face recognition vendor test part 3: demographic effects

Patrick Grother, Mei Ngan, and Kayee Hanaoka. Face recognition vendor test part 3: demographic effects. Technical Report NIST IR 8280, National Institute of Standards and Technology, Gaithersburg, MD, December 2019

2019

[13] [13]

Insightface: State-of-the-art 2d and 3d face analysis library

Jia Guo and Jiankang Deng. Insightface: State-of-the-art 2d and 3d face analysis library. GitHub reposi- tory, 2025. Accessed: March 7, 2025

2025

[14] [14]

Facial Recognition Led to Wrongful Arrests

Kashmir Hill. Facial Recognition Led to Wrongful Arrests. So Detroit Is Making Changes.The New York Times, June 2024

2024

[15] [15]

Deep Imbalanced Learning for Face Recognition and Attribute Prediction

Chen Huang, Yining Li, Chen Change Loy, and Xiaoou Tang. Deep Imbalanced Learning for Face Recognition and Attribute Prediction, April 2019. arXiv:1806.00194 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2019

[16] [16]

Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments

Gary B Huang, Manu Ramesh, Tamara Berg, and Erik Learned-Miller. Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments. Technical Report 07-49, Uni- versity of Massachusetts, Amherst, Amherst, MA, October 2007

2007

[17] [17]

FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age for Bias Measurement and Mitigation

Kimmo Karkkainen and Jungseock Joo. FairFace: Face Attribute Dataset for Balanced Race, Gender, and Age for Bias Measurement and Mitigation. In2021 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1547–1557, Waikoloa, HI, USA, January 2021. IEEE

2021

[18] [18]

Review of Demographic Bias in Face Recognition, February 2025

Ketan Kotwal and Sebastien Marcel. Review of Demographic Bias in Face Recognition, February 2025. arXiv:2502.02309 [cs]. 10

work page arXiv 2025

[19] [19]

Score Normalization for Demographic Fairness in Face Recognition, July 2024

Yu Linghu, Tiago de Freitas Pereira, Christophe Ecabert, S ´ebastien Marcel, and Manuel G ¨unther. Score Normalization for Demographic Fairness in Face Recognition, July 2024. arXiv:2407.14087 [cs]

work page arXiv 2024

[20] [20]

SphereFace: Deep Hypersphere Embedding for Face Recognition

Weiyang Liu, Yandong Wen, Zhiding Yu, Ming Li, Bhiksha Raj, and Le Song. SphereFace: Deep Hyper- sphere Embedding for Face Recognition, January 2018. arXiv:1704.08063 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2018

[21] [21]

The Unfairness of Fair Machine Learning: Level- ling down and strict egalitarianism by default, March 2023

Brent Mittelstadt, Sandra Wachter, and Chris Russell. The Unfairness of Fair Machine Learning: Level- ling down and strict egalitarianism by default, March 2023. arXiv:2302.02404 [cs]

work page arXiv 2023

[22] [22]

On Fairness and Calibration

Geoff Pleiss, Manish Raghavan, Felix Wu, Jon Kleinberg, and Kilian Q. Weinberger. On Fairness and Calibration, November 2017. arXiv:1709.02012 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2017

[23] [23]

Prince.Understanding Deep Learning

Simon J.D. Prince.Understanding Deep Learning. MIT Press, 2023

2023

[24] [24]

Post-hoc Calibration of Neural Networks by g-Layers, 2020

Amir Rahimi, Thomas Mensink, Kartik Gupta, Thalaiyasingam Ajanthan, Cristian Sminchisescu, and Richard Hartley. Post-hoc Calibration of Neural Networks by g-Layers, 2020. Version Number: 2

2020

[25] [25]

Robinson, Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner

Joseph P. Robinson, Gennady Livitz, Yann Henon, Can Qin, Yun Fu, and Samson Timoner. Face Recog- nition: Too Bias, or Not Too Bias?, April 2020. arXiv:2002.06483 [cs]

work page arXiv 2020

[26] [26]

FairCal: Fairness Calibration for Face Verification, March 2022

Tiago Salvador, Stephanie Cairns, Vikram V oleti, Noah Marshall, and Adam Oberman. FairCal: Fairness Calibration for Face Verification, March 2022. arXiv:2106.03761 [cs]

work page arXiv 2022

[27] [27]

David Sandberg. facenet. GitHub repository, 2018. Accessed: March 7, 2025

2018

[28] [28]

FaceNet: A Unified Embedding for Face Recognition and Clustering

Florian Schroff, Dmitry Kalenichenko, and James Philbin. FaceNet: A Unified Embedding for Face Recognition and Clustering. In2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 815–823, June 2015. arXiv:1503.03832 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2015

[29] [29]

Face Recognition: A Novel Multi-Level Taxonomy based Survey

Alireza Sepas-Moghaddam, Fernando Pereira, and Paulo Lobato Correia. Face Recognition: A Novel Multi-Level Taxonomy based Survey, January 2019. arXiv:1901.00713 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2019

[30] [30]

When do Minimax-fair Learning and Empirical Risk Minimization Coincide? InProceedings of the 40th Interna- tional Conference on Machine Learning, pages 31969–31989

Harvineet Singh, Matth ¨aus Kleindessner, V olkan Cevher, Rumi Chunara, and Chris Russell. When do Minimax-fair Learning and Empirical Risk Minimization Coincide? InProceedings of the 40th Interna- tional Conference on Machine Learning, pages 31969–31989. PMLR, July 2023

2023

[31] [31]

Comparison- Level Mitigation of Ethnic Bias in Face Recognition

Philipp Terhorst, Mai Ly Tran, Naser Damer, Florian Kirchbuchner, and Arjan Kuijper. Comparison- Level Mitigation of Ethnic Bias in Face Recognition. In2020 8th International Workshop on Biometrics and F orensics (IWBF), pages 1–6, Porto, Portugal, April 2020. IEEE

2020

[32] [32]

Post- comparison mitigation of demographic bias in face recognition using fair score normalization.Pattern Recognition Letters, 140:332–338, December 2020

Philipp Terh ¨orst, Jan Niklas Kolf, Naser Damer, Florian Kirchbuchner, and Arjan Kuijper. Post- comparison mitigation of demographic bias in face recognition using fair score normalization.Pattern Recognition Letters, 140:332–338, December 2020

2020

[33] [33]

FRAPPE: A Group Fairness Framework for Post-Processing Everything, June 2024

Alexandru Tifrea, Preethi Lahoti, Ben Packer, Yoni Halpern, Ahmad Beirami, and Flavien Prost. FRAPPE: A Group Fairness Framework for Post-Processing Everything, June 2024. arXiv:2312.02592 [cs]

work page arXiv 2024

[34] [34]

CosFace: Large Margin Cosine Loss for Deep Face Recognition

Hao Wang, Yitong Wang, Zheng Zhou, Xing Ji, Dihong Gong, Jingchao Zhou, Zhifeng Li, and Wei Liu. CosFace: Large Margin Cosine Loss for Deep Face Recognition, April 2018. arXiv:1801.09414 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2018

[35] [35]

Racial Faces in-the-Wild: Re- ducing Racial Bias by Information Maximization Adaptation Network, July 2019

Mei Wang, Weihong Deng, Jiani Hu, Xunqiang Tao, and Yaohai Huang. Racial Faces in-the-Wild: Re- ducing Racial Bias by Information Maximization Adaptation Network, July 2019. arXiv:1812.00194 [cs]

work page arXiv 2019

[36] [36]

Fairlearn: Assessing and Improving Fairness of AI Systems, 2023

Hilde Weerts, Miroslav Dud´ık, Richard Edgar, Adrin Jalali, Roman Lutz, and Michael Madaio. Fairlearn: Assessing and Improving Fairness of AI Systems, 2023. original-date: 2018-05-15T01:51:35Z. 11 A Additional Results This appendix provides supplementary evidence in three stages. First, we report aggregate calibra- tion and ranking metrics for the headlin...

work page arXiv 2023