pith. machine review for the scientific record. sign in

arxiv: 2604.10707 · v1 · submitted 2026-04-12 · 💻 cs.CV

Recognition: unknown

Investigating Bias and Fairness in Appearance-based Gaze Estimation

Authors on Pith no claims yet

Pith reviewed 2026-05-10 15:49 UTC · model grok-4.3

classification 💻 cs.CV
keywords gaze estimationfairnessbiasethnicitygenderappearance-basedcomputer visionbias mitigation
0
0 comments X

The pith

Gaze estimation models display notable accuracy differences across ethnicities and genders, with standard bias corrections showing limited success.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper conducts the first broad assessment of fairness in appearance-based gaze estimation by testing state-of-the-art models on groups defined by ethnicity and gender. It finds clear performance gaps using common fairness measures and tests whether usual bias reduction techniques improve things in this setting. A sympathetic reader would care because gaze systems are increasingly used in devices and applications where unequal performance could disadvantage some users. The work also releases annotations and code to help others build on this baseline. Overall it pushes the field toward more equitable designs.

Core claim

We provide the first comprehensive benchmark for fairness in appearance-based gaze estimation. By evaluating multiple state-of-the-art models with standard fairness metrics on ethnicity and gender, we identify substantial performance disparities. We further apply existing bias mitigation methods and demonstrate that they offer only limited improvements in the gaze estimation domain. These findings highlight the need for gaze-specific approaches to fairness.

What carries the argument

The fairness evaluation pipeline that applies metrics such as demographic parity and equalized odds to the error in predicted gaze directions across demographic subgroups.

If this is right

  • Models that ignore fairness will continue to underperform for certain demographic groups in real deployments.
  • Existing general bias mitigation techniques are insufficient for regression-based vision tasks like gaze estimation.
  • Future gaze estimators need to incorporate fairness considerations during training or evaluation to reduce disparities.
  • The released dataset annotations enable reproducible research on equitable gaze systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar bias patterns may exist in other appearance-based computer vision tasks such as facial recognition or pose estimation.
  • Developers of consumer devices using gaze tracking should test for demographic fairness before release.
  • New mitigation strategies tailored to continuous output spaces like gaze angles could be developed and tested using this benchmark.

Load-bearing premise

The assumption that the demographic annotations in the datasets accurately represent the populations where these systems will be used and that standard group fairness metrics are suitable measures for a continuous regression task.

What would settle it

Re-running the evaluation on a dataset with balanced demographics and using regression-specific fairness measures such as mean absolute error parity, and finding no significant disparities or effective mitigation.

Figures

Figures reproduced from arXiv: 2604.10707 by Burak Akg\"ul, Erol \c{S}ahin, Sinan Kalkan.

Figure 1
Figure 1. Figure 1: Contributions of the paper for fairness in gaze [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 3
Figure 3. Figure 3: The ethnicity and gender annotation pipeline. A [PITH_FULL_IMAGE:figures/full_fig_p003_3.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overall pipeline for bias mitigation methods and [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
read the original abstract

While appearance-based gaze estimation has achieved significant improvements in accuracy and domain adaptation, the fairness of these systems across different demographic groups remains largely unexplored. To date, there is no comprehensive benchmark quantifying algorithmic bias in gaze estimation. This paper presents the first extensive evaluation of fairness in appearance-based gaze estimation, focusing on ethnicity and gender attributes. We establish a fairness baseline by analyzing state-of-the-art models using standard fairness metrics, revealing significant performance disparities. Furthermore, we evaluate the effectiveness of existing bias mitigation strategies when applied to the gaze domain and show that their fairness contributions are limited. We summarize key insights and open issues. Overall, our work calls for research into developing robust, equitable gaze estimators. To support future research and reproducibility, we publicly release our annotations, code, and trained models at: github.com/akgulburak/gaze-estimation-fairness

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript presents the first extensive evaluation of fairness in appearance-based gaze estimation, focusing on ethnicity and gender attributes. It establishes a fairness baseline by analyzing state-of-the-art models using standard fairness metrics, revealing significant performance disparities across demographic groups. The authors evaluate the effectiveness of existing bias mitigation strategies when applied to the gaze domain and conclude that their fairness contributions are limited. Key insights and open issues are summarized, and the work calls for research into developing robust, equitable gaze estimators. Annotations, code, and trained models are publicly released to support reproducibility.

Significance. If the central findings hold after addressing metric adaptations, this work is significant for the computer vision community because gaze estimation systems are deployed in safety-critical applications such as driver monitoring and assistive technologies. Providing an empirical baseline, quantifying disparities, and testing mitigation strategies highlights an important gap. The public release of annotations, code, and models is a clear strength that enables follow-up research and reproducibility.

major comments (3)
  1. [§4.2] §4.2 (Fairness Metrics Application): The manuscript applies classification-oriented fairness metrics (demographic parity, equalized odds) to the continuous regression outputs of gaze estimation (yaw/pitch angles or gaze vectors). The text indicates these are adapted via error thresholds or binning, but provides no derivation, justification, or sensitivity analysis for the chosen thresholds/bins. This is load-bearing for the central claims of 'significant performance disparities' and 'limited fairness contributions,' as different binning choices could materially change the reported gaps.
  2. [§5] §5 (Mitigation Experiments): When evaluating bias mitigation strategies, the paper reports limited fairness improvements but does not detail whether experiments used matched hyperparameter budgets, identical training schedules, or controls that isolate fairness gains from overall accuracy changes. Without these, it is unclear whether the 'limited contributions' conclusion is robust or an artifact of experimental setup.
  3. [Table 3] Table 3 and §4.3: Subgroup performance gaps are presented without statistical significance tests, confidence intervals, or discussion of subgroup sample sizes. Given that some demographic groups may have smaller representation in the datasets, the 'significant disparities' claim requires quantification of uncertainty to be load-bearing.
minor comments (2)
  1. [Abstract] The abstract refers to 'standard fairness metrics' without naming them; listing the specific metrics (and their adaptations) in the abstract would improve clarity.
  2. [Figure 2] Figure 2: Axis labels and color legends for per-group error distributions are difficult to read at standard print size; increasing font size or adding a table of numeric values would aid interpretation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback on our manuscript. We have carefully reviewed each major comment and provide point-by-point responses below. We agree with the need for greater rigor in several areas and will revise the manuscript accordingly to strengthen the presentation of our fairness audit.

read point-by-point responses
  1. Referee: [§4.2] §4.2 (Fairness Metrics Application): The manuscript applies classification-oriented fairness metrics (demographic parity, equalized odds) to the continuous regression outputs of gaze estimation (yaw/pitch angles or gaze vectors). The text indicates these are adapted via error thresholds or binning, but provides no derivation, justification, or sensitivity analysis for the chosen thresholds/bins. This is load-bearing for the central claims of 'significant performance disparities' and 'limited fairness contributions,' as different binning choices could materially change the reported gaps.

    Authors: We thank the referee for this important observation on metric adaptation. In the original §4.2, we mapped the regression task to a binary classification setting by thresholding angular error at 5° (a value widely used in gaze estimation literature to denote acceptable accuracy for applications like driver monitoring) and applied binning to discretize yaw/pitch for parity calculations. We acknowledge the absence of explicit derivation and sensitivity analysis. In the revised manuscript, we will add a formal derivation of the adaptation, justify the 5° threshold with citations to prior gaze work, and include a sensitivity study varying thresholds (3°–10°) and bin widths. Results will show that the reported ethnic and gender disparities persist across these choices, supporting the robustness of our central claims. revision: yes

  2. Referee: [§5] §5 (Mitigation Experiments): When evaluating bias mitigation strategies, the paper reports limited fairness improvements but does not detail whether experiments used matched hyperparameter budgets, identical training schedules, or controls that isolate fairness gains from overall accuracy changes. Without these, it is unclear whether the 'limited contributions' conclusion is robust or an artifact of experimental setup.

    Authors: We appreciate the referee's call for experimental transparency. All mitigation strategies (adversarial debiasing, reweighting, etc.) were trained with the identical base architecture, optimizer, and epoch schedule as the baselines. Fairness-specific hyperparameters were selected via grid search within a fixed computational budget, with final models chosen on a validation set balancing accuracy and fairness. In the revision, we will expand §5 with a detailed description of the hyperparameter ranges, training schedules, and controls, plus an appendix table listing all settings. We will also explicitly report accuracy changes alongside fairness metrics to demonstrate that the limited gains are not artifacts of mismatched setups. revision: yes

  3. Referee: [Table 3] Table 3 and §4.3: Subgroup performance gaps are presented without statistical significance tests, confidence intervals, or discussion of subgroup sample sizes. Given that some demographic groups may have smaller representation in the datasets, the 'significant disparities' claim requires quantification of uncertainty to be load-bearing.

    Authors: We agree that uncertainty quantification is essential for the subgroup analysis. In the revised version, we will update Table 3 and §4.3 to include 95% bootstrap confidence intervals for all per-subgroup metrics, pairwise statistical significance tests (t-tests with multiple-comparison correction) for the reported gaps, and explicit discussion of subgroup sample sizes drawn from the datasets (e.g., noting larger cohorts for certain ethnicities and smaller but still analyzable groups for others). This will allow readers to assess the reliability of the disparities and address concerns about smaller subgroups. revision: yes

Circularity Check

0 steps flagged

Empirical fairness audit with no derivation chain or self-referential predictions

full rationale

The paper is an empirical evaluation that applies standard fairness metrics (demographic parity, equalized odds, etc.) to pre-trained gaze estimation models on public datasets. No mathematical derivations, fitted parameters, or predictions are defined within the work; all reported disparities and mitigation results are direct measurements from external benchmarks. The central claims rest on reproducible experiments rather than any internal reduction to self-defined quantities or self-citations. This matches the default expectation for non-circular empirical audits.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the assumption that the chosen public datasets contain reliable ethnicity and gender labels and that the selected fairness metrics are appropriate for continuous gaze regression. No new entities or free parameters are introduced.

axioms (2)
  • domain assumption Public gaze datasets contain sufficiently accurate and unbiased ethnicity/gender annotations for fairness auditing.
    Invoked when the authors compute group-wise performance disparities.
  • domain assumption Standard group fairness metrics (demographic parity, equalized odds) are meaningful for a regression output such as gaze direction.
    Used to quantify 'significant performance disparities' and to evaluate mitigation strategies.

pith-pipeline@v0.9.0 · 5450 in / 1296 out tokens · 19099 ms · 2026-05-10T15:49:42.605130+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

77 extracted references · 5 canonical work pages · 1 internal anchor

  1. [1]

    A. A. Abdelrahman, T. Hempel, A. Khalifa, A. Al-Hamadi, and L. Dinges. L2cs-net : Fine-grained gaze estimation in unconstrained environments. In2023 8th International Conference on Frontiers of Signal Processing (ICFSP), pages 98–102, 2023

  2. [2]

    M. M. Amin and B. W. Schuller. Normalise for fairness: A simple normalisation technique for fairness in regression machine learning problems.arXiv preprint arXiv:2202.00993, 2022

  3. [3]

    Balim, S

    H. Balim, S. Park, X. Wang, X. Zhang, and O. Hilliges. Efe: End- to-end frame-to-gaze estimation. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2688– 2697, 2023

  4. [4]

    Catruna, A

    A. Catruna, A. Cosma, and E. Radoi. Crossgaze: A strong method for 3d gaze estimation in the wild. In2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition (FG), pages 1–5. IEEE, 2024

  5. [5]

    L. E. Celis, V . Keswani, and N. Vishnoi. Data preprocessing to mitigate bias: A maximum entropy based approach. InInternational conference on machine learning, pages 1349–1359. PMLR, 2020

  6. [6]

    Chai and X

    J. Chai and X. Wang. Fairness with adaptive weights. InInternational conference on machine learning, pages 2853–2866. PMLR, 2022

  7. [7]

    Chen and B

    Z. Chen and B. Shi. Offset calibration for appearance-based gaze estimation via gaze decomposition. InProceedings of the IEEE/CVF winter conference on applications of computer vision, pages 270–279, 2020

  8. [8]

    Chen and B

    Z. Chen and B. E. Shi. Towards high performance low complexity calibration in appearance based gaze estimation.IEEE transactions on pattern analysis and machine intelligence, 45(1):1174–1188, 2022

  9. [9]

    Cheng, Y

    Y . Cheng, Y . Bao, and F. Lu. Puregaze: Purifying gaze feature for generalizable gaze estimation. InProceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 436–443, 2022

  10. [10]

    Cheng, S

    Y . Cheng, S. Huang, F. Wang, C. Qian, and F. Lu. A coarse- to-fine adaptive network for appearance-based gaze estimation. 34(07):10623–10630, 2020

  11. [11]

    Cheng and F

    Y . Cheng and F. Lu. Gaze estimation using transformer. In2022 26th International Conference on Pattern Recognition (ICPR), pages 3341–3347. IEEE, 2022

  12. [12]

    Cheng, F

    Y . Cheng, F. Lu, and X. Zhang. Appearance-based gaze estimation via evaluation-guided asymmetric regression. InProceedings of the European conference on computer vision (ECCV), pages 100–115, 2018

  13. [13]

    Cheng, H

    Y . Cheng, H. Wang, Y . Bao, and F. Lu. Appearance-based gaze estimation with deep learning: A review and benchmark.IEEE Trans- actions on Pattern Analysis and Machine Intelligence, 46(12):7509– 7528, 2024

  14. [14]

    Cheong, S

    J. Cheong, S. Kalkan, and H. Gunes. The hitchhiker’s guide to bias and fairness in facial affective signal processing: Overview and techniques. IEEE Signal Processing Magazine, 38(6):39–49, 2021

  15. [15]

    Cheong, S

    J. Cheong, S. Kuzucu, S. Kalkan, and H. Gunes. Towards gender fairness for mental health prediction. International Joint Conferences on Artificial Intelligence Organization, 2023

  16. [16]

    Chzhen, C

    E. Chzhen, C. Denis, M. Hebiri, L. Oneto, and M. Pontil. Fair regres- sion with wasserstein barycenters.Advances in Neural Information Processing Systems, 33:7321–7331, 2020

  17. [17]

    Feldman, S

    M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian. Certifying and removing disparate impact. Inproceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining, pages 259–268, 2015

  18. [18]

    Ghosh, A

    S. Ghosh, A. Dhall, M. Hayat, J. Knibbe, and Q. Ji. Automatic gaze analysis: A survey of deep learning based approaches.IEEE Transactions on Pattern Analysis and Machine Intelligence, 46(1):61– 84, 2023

  19. [19]

    Y . Guan, Z. Chen, W. Zeng, Z. Cao, and Y . Xiao. End-to-end video gaze estimation via capturing head-face-eye spatial-temporal interaction context.IEEE Signal Processing Letters, 30:1687–1691, 2023

  20. [20]

    Hardt, E

    M. Hardt, E. Price, and N. Srebro. Equality of opportunity in supervised learning.Advances in neural information processing systems, 29, 2016

  21. [21]

    M. Hort, Z. Chen, J. M. Zhang, M. Harman, and F. Sarro. Bias mitigation for machine learning classifiers: A comprehensive survey. ACM Journal on Responsible Computing, 1(2):1–52, 2024

  22. [22]

    M. M. Hosseini, A. P. Fard, and M. H. Mahoor. Faces of fairness: Examining bias in facial expression recognition datasets and models. arXiv preprint arXiv:2502.11049, 2025

  23. [23]

    M. M. Hosseini, A. P. Fard, and M. H. Mahoor. Faces of fairness: Examining bias in facial expression recognition datasets and models, 2025

  24. [24]

    Jianfeng and L

    L. Jianfeng and L. Shigang. Eye-model-based gaze estimation by rgb- d camera. InProceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 592–596, 2014

  25. [25]

    Jigang, B

    L. Jigang, B. S. L. Francis, and D. Rajan. Free-head appearance-based eye gaze estimation on mobile devices. In2019 International Con- ference on Artificial Intelligence in Information and Communication (ICAIIC), pages 232–237. IEEE, 2019

  26. [27]

    A. R. Joshi, X. S. Cuadros, N. Sivakumar, L. Zappella, and N. Apos- toloff. Fair sa: Sensitivity analysis for fairness in face recognition. InAlgorithmic fairness through the lens of causality and robustness workshop, pages 40–58. PMLR, 2022

  27. [28]

    Kamiran and T

    F. Kamiran and T. Calders. Data preprocessing techniques for classification without discrimination.Knowledge and information systems, 33(1):1–33, 2012

  28. [29]

    Kamishima, S

    T. Kamishima, S. Akaho, H. Asoh, and J. Sakuma. Fairness-aware classifier with prejudice remover regularizer. InJoint European con- ference on machine learning and knowledge discovery in databases, pages 35–50. Springer, 2012

  29. [30]

    Karkkainen and J

    K. Karkkainen and J. Joo. Fairface: Face attribute dataset for balanced race, gender, and age for bias measurement and mitigation. In Proceedings of the IEEE/CVF winter conference on applications of computer vision, pages 1548–1558, 2021

  30. [31]

    Kellnhofer, A

    P. Kellnhofer, A. Recasens, S. Stent, W. Matusik, and A. Torralba. Gaze360: Physically unconstrained gaze estimation in the wild. In Proceedings of the IEEE/CVF international conference on computer vision, pages 6912–6921, 2019

  31. [32]

    Krafka, A

    K. Krafka, A. Khosla, P. Kellnhofer, H. Kannan, S. Bhandarkar, W. Matusik, and A. Torralba. Eye tracking for everyone. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2176–2184, 2016

  32. [33]

    Lanillos, J

    P. Lanillos, J. F. Ferreira, and J. Dias. A bayesian hierarchy for robust gaze estimation in human–robot interaction.International Journal of Approximate Reasoning, 87:1–22, 2017

  33. [34]

    Lee and S

    H. Lee and S. Chen. Systematic bias of machine learning regression models and correction.IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025

  34. [35]

    G. Liu, Y . Yu, K. A. F. Mora, and J.-M. Odobez. A differential approach for gaze estimation.IEEE transactions on pattern analysis and machine intelligence, 43(3):1092–1099, 2019

  35. [36]

    J. Liu, J. Chi, W. Hu, and Z. Wang. 3d model-based gaze tracking via iris features with a single camera and a single light source.IEEE Transactions on Human-Machine Systems, 51(2):75–86, 2020

  36. [37]

    J. Liu, J. Chi, H. Yang, and X. Yin. In the eye of the beholder: A survey of gaze tracking techniques.Pattern Recognition, 132:108944, 2022

  37. [38]

    R. Liu, Y . Liu, H. Wang, and F. Lu. Pnp-ga+: Plug-and-play domain adaptation for gaze estimation using model variants.IEEE Transac- tions on Pattern Analysis and Machine Intelligence, 46(5):3707–3721, 2024

  38. [39]

    Y . Liu, R. Liu, H. Wang, and F. Lu. Generalizing gaze estimation with outlier-guided collaborative adaptation. InProceedings of the IEEE/CVF international conference on computer vision, pages 3835– 3844, 2021

  39. [40]

    Demographic parity: Mitigating biases in real-world data,

    O. Loukas and H.-R. Chung. Demographic parity: Mitigating biases in real-world data.arXiv preprint arXiv:2309.17347, 2023

  40. [41]

    Malladi, K

    S. Malladi, K. Lyu, A. Panigrahi, and S. Arora. On the sdes and scaling rules for adaptive gradient algorithms.Advances in Neural Information Processing Systems, 35:7697–7711, 2022

  41. [42]

    Manfei, D

    X. Manfei, D. Fralick, J. Z. Zheng, B. Wang, M. T. Xin, and F. Changyong. The differences and similarities between two-sample t-test and paired t-test.Shanghai archives of psychiatry, 29(3):184, 2017

  42. [43]

    Martinez, A

    F. Martinez, A. Carbone, and E. Pissaloux. Gaze estimation using local features and non-linear regression. In2012 19th IEEE International Conference on Image Processing, pages 1961–1964. IEEE, 2012

  43. [45]

    H. F. Menezes, A. S. Ferreira, E. T. Pereira, and H. M. Gomes. Bias and fairness in face detection. In2021 34th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pages 247–254. IEEE, 2021

  44. [46]

    Miroshnikov, K

    A. Miroshnikov, K. Kotsiopoulos, R. Franks, and A. R. Kannan. Model-agnostic bias mitigation methods with regressor distribu- tion control for wasserstein-based fairness metrics.arXiv preprint arXiv:2111.11259, 2021

  45. [47]

    Mokatren, T

    M. Mokatren, T. Kuflik, and I. Shimshoni. 3d gaze estimation using rgb-ir cameras.Sensors, 23(1):381, 2022

  46. [48]

    Mukherjee, M

    D. Mukherjee, M. Yurochkin, M. Banerjee, and Y . Sun. Two simple ways to learn individual fairness metrics from data. InInternational conference on machine learning, pages 7097–7107. PMLR, 2020

  47. [49]

    S. Park, A. Spurr, and O. Hilliges. Deep pictorial gaze estimation. In Proceedings of the European conference on computer vision (ECCV), pages 721–738, 2018

  48. [50]

    S. Park, X. Zhang, A. Bulling, and O. Hilliges. Learning to find eye region landmarks for remote gaze estimation in unconstrained settings. InProceedings of the 2018 ACM symposium on eye tracking research & applications, pages 1–10, 2018

  49. [51]

    Pathirana, S

    P. Pathirana, S. Senarath, D. Meedeniya, and S. Jayarathna. Eye gaze estimation: A survey on deep learning-based approaches.Expert Systems with Applications, 199:116894, 2022

  50. [52]

    Petersen, D

    F. Petersen, D. Mukherjee, Y . Sun, and M. Yurochkin. Post-processing for individual fairness.Advances in Neural Information Processing Systems, 34:25944–25955, 2021

  51. [53]

    Plecko and E

    D. Plecko and E. Bareinboim. Reconciling predictive and statistical parity: A causal approach. InProceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 14625–14632, 2024

  52. [54]

    Ple ˇcko, N

    D. Ple ˇcko, N. Bennett, and N. Meinshausen. fairadapt: Causal reasoning for fair data preprocessing.Journal of Statistical Software, 110:1–35, 2024

  53. [55]

    Pleiss, M

    G. Pleiss, M. Raghavan, F. Wu, J. Kleinberg, and K. Q. Weinberger. On fairness and calibration.Advances in neural information processing systems, 30, 2017

  54. [56]

    Salvador, S

    T. Salvador, S. Cairns, V . V oleti, N. Marshall, and A. Oberman. Faircal: Fairness calibration for face verification, 2022

  55. [57]

    Sarridis, C

    I. Sarridis, C. Koutlis, S. Papadopoulos, and C. Diou. Towards fair face verification: An in-depth analysis of demographic biases. In R. Meo and F. Silvestri, editors,Machine Learning and Principles and Practice of Knowledge Discovery in Databases, pages 194–208, Cham, 2025. Springer Nature Switzerland

  56. [58]

    P. K. Sharma and P. Chakraborty. A review of driver gaze estimation and application in gaze behavior understanding.Engineering Appli- cations of Artificial Intelligence, 133:108117, 2024

  57. [59]

    K. Shen, Y . Li, Z. Guo, J. Gao, and Y . Wu. Model-based 3d gaze estimation using a tof camera.Sensors, 24(4):1070, 2024

  58. [60]

    Silvia, J

    C. Silvia, J. Ray, S. Tom, P. Aldo, J. Heinrich, and A. John. A general approach to fairness with optimal transport. InProceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 3633– 3640, 2020

  59. [61]

    Uddin, H

    S. Uddin, H. Lu, A. Rahman, and J. Gao. A novel approach for assessing fairness in deployed machine learning algorithms.Scientific Reports, 14(1):17753, 2024

  60. [62]

    Ugirumurera, E

    J. Ugirumurera, E. A. Bensen, J. Severino, and J. Sanyal. Addressing bias in bagging and boosting regression models.Scientific Reports, 14(1):18452, 2024

  61. [63]

    Ustun, Y

    B. Ustun, Y . Liu, and D. Parkes. Fairness without harm: Decoupled classifiers with preference guarantees. InInternational Conference on Machine Learning, pages 6373–6382. PMLR, 2019

  62. [64]

    Wang, C.-Y

    F.-E. Wang, C.-Y . Wang, M. Sun, and S.-H. Lai. Mixfairface: Towards ultimate fairness via mixfair adapter in face recognition. InProceedings of the AAAI Conference on Artificial Intelligence, volume 37, pages 14531–14538, 2023

  63. [65]

    Wang and Q

    K. Wang and Q. Ji. Real time eye gaze tracking with 3d deformable eye-face model. InProceedings of the IEEE International Conference on Computer Vision, pages 1003–1011, 2017

  64. [66]

    Wang and W

    M. Wang and W. Deng. Mitigating bias in face recognition using skewness-aware reinforcement learning. In2020 IEEE/CVF Con- ference on Computer Vision and Pattern Recognition (CVPR), pages 9319–9328, 2020

  65. [67]

    T. Wang, J. Zhao, M. Yatskar, K.-W. Chang, and V . Ordonez. Balanced datasets are not enough: Estimating and mitigating gender bias in deep image representations. InProceedings of the IEEE/CVF international conference on computer vision, pages 5310–5319, 2019

  66. [68]

    X. Wang, J. Zhang, H. Zhang, S. Zhao, and H. Liu. Vision-based gaze estimation: A review.IEEE Transactions on Cognitive and Developmental Systems, 14(2):316–332, 2022

  67. [69]

    Y . Wang, Y . Jiang, J. Li, B. Ni, W. Dai, C. Li, H. Xiong, and T. Li. Contrastive regression for domain adaptation on gaze estimation. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 19376–19385, 2022

  68. [70]

    Z. Wang, K. Qinami, I. C. Karakozis, K. Genova, P. Nair, K. Hata, and O. Russakovsky. Towards fairness in visual recognition: Effective strategies for bias mitigation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 8919– 8928, 2020

  69. [71]

    E. Wood, T. Baltru ˇsaitis, L.-P. Morency, P. Robinson, and A. Bulling. A 3d morphable eye region model for gaze estimation. InEuropean conference on computer vision, pages 297–313. Springer, 2016

  70. [72]

    Wood and A

    E. Wood and A. Bulling. Eyetab: Model-based gaze estimation on unmodified tablet computers. InProceedings of the symposium on eye tracking research and applications, pages 207–210, 2014

  71. [73]

    T. Xu, J. White, S. Kalkan, and H. Gunes. Investigating bias and fairness in facial expression recognition. InEuropean Conference on Computer Vision, pages 506–523. Springer, 2020

  72. [74]

    X. Xu, Y . Huang, P. Shen, S. Li, J. Li, F. Huang, Y . Li, and Z. Cui. Consistent instance false positive improves fairness in face recognition. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 578–586, 2021

  73. [75]

    Y . Yang, A. Gupta, J. Feng, P. Singhal, V . Yadav, Y . Wu, P. Natarajan, V . Hedau, and J. Joo. Enhancing fairness in face detection in computer vision systems by demographic bias mitigation. InProceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, pages 813– 822, 2022

  74. [76]

    J. Yu, X. Hao, Z. Cui, P. He, and T. Liu. Boosting fairness for masked face recognition. InProceedings of the IEEE/CVF international conference on computer vision, pages 1531–1540, 2021

  75. [77]

    Yucer, F

    S. Yucer, F. Tektas, N. Al Moubayed, and T. Breckon. Racial bias within face recognition: A survey.ACM Computing Surveys, 57(4):1– 39, 2024

  76. [78]

    B. H. Zhang, B. Lemoine, and M. Mitchell. Mitigating unwanted bi- ases with adversarial learning. InProceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, pages 335–340, 2018

  77. [79]

    mixup: Beyond Empirical Risk Minimization

    H. Zhang, M. Cisse, Y . N. Dauphin, and D. Lopez-Paz. mixup: Beyond empirical risk minimization.arXiv preprint arXiv:1710.09412, 2017