CAFP: A Post-Processing Framework for Group Fairness via Counterfactual Model Averaging

Irina Arévalo; Marcos Oliva

arxiv: 2604.07009 · v1 · submitted 2026-04-08 · 💻 cs.AI · cs.LG

Pith reviewed 2026-05-10 17:23 UTC · model grok-4.3

classification 💻 cs.AI cs.LG

keywords group fairness · post-processing · counterfactual averaging · demographic parity · equalized odds · machine learning · fairness intervention

The pith

Averaging a model's predictions on factual and counterfactual inputs eliminates direct dependence on the protected attribute.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes CAFP, a post-processing framework that improves group fairness in any existing machine learning model by averaging its predictions on an input and on a version of the input where the protected attribute has been flipped. The technique requires no changes to the model training or architecture and can be applied at inference time. A sympathetic reader would care if they need to deploy a pre-trained classifier in a fairness-sensitive domain without the ability to retrain it or access protected attributes during training. The authors provide theoretical guarantees that this averaging removes direct dependence on the sensitive attribute and achieves certain fairness metrics under mild assumptions.

Core claim

CAFP operates by generating counterfactual versions of each input in which the sensitive attribute is flipped, and then averaging the model's predictions across factual and counterfactual instances. This eliminates direct dependence on the protected attribute, reduces mutual information between predictions and sensitive attributes, and provably bounds the distortion introduced relative to the original model. Under mild assumptions, CAFP achieves perfect demographic parity and reduces the equalized odds gap by at least half the average counterfactual bias.

What carries the argument

Counterfactual model averaging: averaging the original model's output on the factual input with its output on the input after flipping the value of the protected attribute.
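The averaging step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_scorer` is a hypothetical stand-in for the deployed black-box model, `cafp_predict` is a name chosen here, and the protected attribute is assumed to be binary and coded 0/1.

```python
import math
from typing import Callable, Sequence

# Hypothetical interface for a black-box scorer:
# (features, protected attribute) -> score in [0, 1].
Scorer = Callable[[Sequence[float], int], float]


def toy_scorer(x: Sequence[float], a: int) -> float:
    """Stand-in for the deployed model; the 0.8 * a term is a direct dependence on a."""
    z = 0.6 * x[0] - 0.4 * x[1] + 0.8 * a
    return 1.0 / (1.0 + math.exp(-z))


def cafp_predict(model: Scorer, x: Sequence[float], a: int) -> float:
    """Average the score on the factual input and on the attribute-flipped input."""
    factual = model(x, a)
    counterfactual = model(x, 1 - a)  # assumes a binary attribute coded 0/1
    return 0.5 * (factual + counterfactual)


if __name__ == "__main__":
    x = [1.0, 2.0]
    # The raw scores differ by group; the averaged scores do not.
    print("raw:     ", toy_scorer(x, 0), toy_scorer(x, 1))
    print("averaged:", cafp_predict(toy_scorer, x, 0), cafp_predict(toy_scorer, x, 1))
```

Because the average is symmetric in `a`, the wrapper returns the same value for both attribute values at a fixed `x`. It costs two model queries per example and still needs the true attribute at inference time to build the flipped input.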

Load-bearing premise

Realistic counterfactual inputs can be created by simply flipping the protected attribute value and the original model can be queried on these at inference time.

What would settle it

Test the averaged model on a dataset where flipping the protected attribute produces inputs that are out of distribution or implausible, and check whether the demographic parity or equalized odds guarantees still hold.
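A check of this kind can be prototyped on synthetic data before touching a real dataset. The sketch below is a toy simulation, not an experiment from the paper: `base_model`, `simulate`, and the `shift` parameter (how far the single feature's mean moves with the protected attribute) are all assumptions introduced for illustration. It compares the demographic parity gap of the raw scores and of the averaged scores when the feature is independent of the attribute (`shift=0`) and when it is correlated with it (`shift=2`).

```python
import math
import random


def base_model(x: float, a: int) -> float:
    """Hypothetical black-box scorer with a direct effect of the attribute a."""
    return 1.0 / (1.0 + math.exp(-(1.5 * x + 1.0 * a - 0.5)))


def cafp(x: float, a: int) -> float:
    """Average of the factual and attribute-flipped scores."""
    return 0.5 * (base_model(x, a) + base_model(x, 1 - a))


def dp_gap(scores, groups) -> float:
    """Absolute difference in mean score between group 1 and group 0."""
    g0 = [s for s, g in zip(scores, groups) if g == 0]
    g1 = [s for s, g in zip(scores, groups) if g == 1]
    return abs(sum(g1) / len(g1) - sum(g0) / len(g0))


def simulate(shift: float, n: int = 20000, seed: int = 0):
    """Return (raw gap, averaged gap) when the feature mean is shift * a."""
    rng = random.Random(seed)
    groups = [rng.randint(0, 1) for _ in range(n)]
    xs = [rng.gauss(shift * g, 1.0) for g in groups]
    raw = dp_gap([base_model(x, g) for x, g in zip(xs, groups)], groups)
    avg = dp_gap([cafp(x, g) for x, g in zip(xs, groups)], groups)
    return raw, avg


if __name__ == "__main__":
    for shift in (0.0, 2.0):
        raw, avg = simulate(shift)
        print(f"shift={shift}: raw gap={raw:.3f}, averaged gap={avg:.3f}")
```

In this toy setup the averaged score depends on `x` alone, so any gap that remains after averaging comes from the feature distribution differing across groups. Whether the paper's unlisted "mild assumptions" rule that case out cannot be determined from the abstract.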

Original abstract

Ensuring fairness in machine learning predictions is a critical challenge, especially when models are deployed in sensitive domains such as credit scoring, healthcare, and criminal justice. While many fairness interventions rely on data preprocessing or algorithmic constraints during training, these approaches often require full control over the model architecture and access to protected attribute information, which may not be feasible in real-world systems. In this paper, we propose Counterfactual Averaging for Fair Predictions (CAFP), a model-agnostic post-processing method that mitigates unfair influence from protected attributes without retraining or modifying the original classifier. CAFP operates by generating counterfactual versions of each input in which the sensitive attribute is flipped, and then averaging the model's predictions across factual and counterfactual instances. We provide a theoretical analysis of CAFP, showing that it eliminates direct dependence on the protected attribute, reduces mutual information between predictions and sensitive attributes, and provably bounds the distortion introduced relative to the original model. Under mild assumptions, we further show that CAFP achieves perfect demographic parity and reduces the equalized odds gap by at least half the average counterfactual bias.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CAFP averages a black-box model's predictions on each test point and its protected-attribute flip to force demographic parity by construction, but the whole thing collapses if the attribute is unavailable at inference.

CAFP's main idea is to query the original model twice per test example, once with the real protected attribute and once with it flipped, and then average the two outputs. That averaging step makes the final prediction independent of the original value of A, which immediately delivers perfect demographic parity. The paper also claims this cuts the equalized-odds gap by at least half the average counterfactual bias and keeps the distortion from the original model bounded. Those are the concrete guarantees on offer.

The method is model-agnostic and needs no retraining or architecture changes, so it fits the common case where you only have black-box access at deployment. That part is genuinely useful for practitioners who can afford an extra forward pass and still have the sensitive attribute at test time. The framing around mutual-information reduction and distortion bounds is a reasonable way to quantify the trade-off.

The soft spot is exactly the one the stress-test flags: everything rests on being able to observe A at inference and treat the flipped version as a valid input. Plenty of deployed systems withhold A for privacy or legal reasons, and the paper supplies no fallback procedure for that situation. The abstract mentions “mild assumptions” but does not list them, and without the actual equations or proof sketches it is difficult to tell how much of the equalized-odds reduction is a non-trivial derivation versus a direct consequence of the averaging definition. Experiments are not described in the abstract either, so the practical size of the distortion and the half-gap claim remain unverified here.

This paper is for people already working on inference-time fairness patches who need a lightweight option when they control the test-time inputs. A reader focused on post-processing or black-box constraints would find the construction worth examining, though the scope is narrower than the abstract suggests.

I would send it to peer review. The core procedure is simple enough that referees can check the math quickly, and the community benefits from more documented attempts at this exact deployment constraint even if the current version needs tighter assumptions and a clearer limitations section.
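The two properties the letter leans on follow from the averaging definition in a couple of lines. The notation is assumed here rather than taken from the paper: $f(x,a)$ is the base model's score, $\tilde f$ is the averaged predictor, and $a \in \{0,1\}$.

```latex
\begin{align*}
\tilde f(x,a) &= \tfrac{1}{2}\bigl[f(x,a) + f(x,1-a)\bigr], \\
\tilde f(x,0) &= \tfrac{1}{2}\bigl[f(x,0) + f(x,1)\bigr] = \tilde f(x,1), \\
\bigl|\tilde f(x,a) - f(x,a)\bigr| &= \tfrac{1}{2}\bigl|f(x,1-a) - f(x,a)\bigr|.
\end{align*}
```

The second line is symmetry in $a$: at a fixed $x$ the averaged score does not change with the attribute. The third line says the pointwise change from the original score is exactly half the per-example counterfactual difference. How the paper's distortion bound and its "half the average counterfactual bias" statement relate to this identity cannot be confirmed from the abstract alone.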

Referee Report

3 major / 2 minor

Summary. The manuscript proposes CAFP, a model-agnostic post-processing framework for group fairness. For each test input, it generates a counterfactual by flipping the protected attribute A and averages the original model's predictions on the factual and counterfactual versions. The paper claims this eliminates direct dependence on A, reduces mutual information between predictions and A, bounds distortion relative to the original model, and—under mild assumptions—achieves perfect demographic parity while reducing the equalized odds gap by at least half the average counterfactual bias.

Significance. If the theoretical claims are rigorously established and the assumptions hold in practice, CAFP would supply a simple, training-free post-processing technique applicable to any black-box classifier. This could be useful in deployment settings where retraining is infeasible. The post-processing design and the explicit bounds on distortion and information leakage are potentially valuable, but their impact is limited by the requirement for protected-attribute access at inference.

major comments (3)

[Abstract / Method] Abstract and Method section: The perfect demographic parity guarantee is obtained only by averaging f(x, A) and f(x, 1-A) at inference time. This construction mathematically forces identical output distributions across groups solely when A is observed and the flipped input is a valid query to the original model. No alternative procedure is supplied for the common case in which A is withheld at deployment; the fairness claims therefore do not hold under that realistic constraint.
[Theoretical Analysis] Theoretical Analysis section: The stated bounds on mutual-information reduction and distortion, as well as the 'at least half' reduction in the equalized-odds gap, appear to follow directly from the averaging definition itself rather than from an independent derivation. Explicit equations, proof sketches, and the precise 'mild assumptions' must be provided to demonstrate that the results are not tautological with the method's construction.
[Method] Method section: The assumption that simply flipping the value of A produces realistic counterfactuals is load-bearing for all fairness guarantees. When features are correlated with A, the counterfactual (x, 1-A) may lie far outside the data distribution, rendering the averaged prediction meaningless and invalidating the claimed bounds.

minor comments (2)

[Abstract] Abstract: The phrase 'mild assumptions' is used without enumeration; these assumptions should be stated explicitly so readers can assess their realism.
[Experiments] Throughout: No empirical results, tables, or figures are referenced in the provided abstract or summary; if experiments exist, they should be summarized to illustrate that the theoretical reductions materialize on real data.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. We address each major comment point by point below, indicating planned revisions to the manuscript where appropriate.

Referee: [Abstract / Method] Abstract and Method section: The perfect demographic parity guarantee is obtained only by averaging f(x, A) and f(x, 1-A) at inference time. This construction mathematically forces identical output distributions across groups solely when A is observed and the flipped input is a valid query to the original model. No alternative procedure is supplied for the common case in which A is withheld at deployment; the fairness claims therefore do not hold under that realistic constraint.

Authors: We agree that the exact demographic parity guarantee requires access to the protected attribute A at inference time to construct and query the counterfactual input. This is a core aspect of the post-processing design presented in the Method section. We will revise the Abstract to explicitly state the inference-time access requirement and add a dedicated paragraph in the Method section discussing deployment scenarios where A is unavailable. In those cases, CAFP cannot be applied directly, and we will note that alternative approaches (such as those relying on proxies or training-time interventions) would be needed instead.

revision: yes

Referee: [Theoretical Analysis] Theoretical Analysis section: The stated bounds on mutual-information reduction and distortion, as well as the 'at least half' reduction in the equalized-odds gap, appear to follow directly from the averaging definition itself rather than from an independent derivation. Explicit equations, proof sketches, and the precise 'mild assumptions' must be provided to demonstrate that the results are not tautological with the method's construction.

Authors: The properties do derive from the averaging construction, but the Theoretical Analysis section presents them as formal theorems obtained via probabilistic arguments applied to the averaged predictor. We will expand this section substantially by inserting the explicit equations (e.g., the mutual-information bound I(Ŷ;A) ≤ ½ I(f(X,A);A), the distortion bound in terms of total variation, and the equalized-odds gap reduction), step-by-step proof sketches, and a precise list of the mild assumptions (including that the base model is defined on the augmented feature space and that averaging is performed exactly). These additions will clarify the derivations and show they are not merely restatements of the method.

revision: yes

Referee: [Method] Method section: The assumption that simply flipping the value of A produces realistic counterfactuals is load-bearing for all fairness guarantees. When features are correlated with A, the counterfactual (x, 1-A) may lie far outside the data distribution, rendering the averaged prediction meaningless and invalidating the claimed bounds.

Authors: We acknowledge that the realism of the counterfactual inputs is a key assumption underlying the guarantees. The paper treats this as one of the 'mild assumptions' under which the bounds hold, without requiring the counterfactual to lie in the training distribution, only that the original model can evaluate it. When features are strongly correlated with A, the averaged output may indeed be less interpretable. We will revise the Method section to state the assumption more explicitly and add a new Limitations paragraph that discusses this issue, references related work on counterfactual generation, and notes that more advanced causal models could be used to improve counterfactual quality in practice.

revision: partial

Circularity Check

0 steps flagged

No significant circularity in CAFP derivation chain

The paper defines CAFP via counterfactual averaging of model outputs and separately provides a theoretical analysis deriving fairness properties (elimination of direct dependence, MI reduction, distortion bounds, perfect DP and EO gap reduction under mild assumptions). No equation or step is exhibited in which a claimed result reduces exactly to the input definition by construction, no self-citation carries the weight of the central claims, and no fitted parameters are relabeled as predictions. The analysis is presented as an independent derivation from the averaging operator plus stated assumptions, making the chain self-contained rather than tautological.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard fairness assumptions about counterfactual generation and model access at inference plus unspecified 'mild assumptions' that enable the perfect demographic parity result.

axioms (2)

domain assumption: Mild assumptions that enable perfect demographic parity after averaging.
Invoked in the abstract to support the claim of achieving perfect demographic parity.
domain assumption: Counterfactual inputs can be generated by flipping the protected attribute value.
Implicit in the method description; required for the averaging step to be well-defined.


[1] [1]

and it’s biased against blacks

Angwin, J., Larson, J., Mattu, S., Kirch- ner, L.: Machine bias: There’s software used across the country to predict future criminals. and it’s biased against blacks. ProPublica (2016) 28

work page 2016

[2] [2]

California Law Review104(3), 671– 732 (2016)

Barocas, S., Selbst, A.D.: Big data’s disparate impact. California Law Review104(3), 671– 732 (2016)

work page 2016

[3] [3]

Eubanks, V.: Automating Inequality: How High-Tech Tools Profile, Police, and Pun- ish the Poor. St. Martin’s Press, Inc., USA (2018)

work page 2018

[4] [4]

How Search Engines Reinforce Racism

Noble, S.U.: Algorithms of Oppression. How Search Engines Reinforce Racism. New York University Press, New York (2018)

work page 2018

[5] [5]

In: Friedler, S.A., Wilson, C

Buolamwini, J., Gebru, T.: Gender shades: Intersectional accuracy disparities in com- mercial gender classification. In: Friedler, S.A., Wilson, C. (eds.) Proceedings of the 1st Conference on Fairness, Accountability and Transparency. Proceedings of Machine Learn- ing Research, vol. 81, pp. 77–91. PMLR, Cambridge, MA (2018)

work page 2018

[6] [6]

Kamiran, F., Calders, T.: Data preprocessing techniques for classification without discrim- ination, vol. 33, pp. 1–33. Springer, Berlin, Heidelberg (2012)

work page 2012

[7] [7]

In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Dis- covery and Data Mining

Feldman, M., Friedler, S.A., Moeller, J., Scheidegger, C., Venkatasubramanian, S.: Certifying and removing disparate impact. In: Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Dis- covery and Data Mining. KDD ’15, pp. 259–

work page

[8] [8]

Association for Computing Machinery, New York, NY, USA (2015)

work page 2015

[9] [9]

In: Proceedings of the 26th International Con- ference on World Wide Web

Zafar, M.B., Valera, I., Gomez Rodriguez, M., Gummadi, K.P.: Fairness beyond disparate treatment & disparate impact: Learning clas- sification without disparate mistreatment. In: Proceedings of the 26th International Con- ference on World Wide Web. WWW ’17, pp. 1171–1180. International World Wide Web Conferences Steering Committee, Republic and Canton of ...

work page 2017

[10] [10]

In: Dy, J., Krause, A

Agarwal, A., Beygelzimer, A., Dudik, M., Langford, J., Wallach, H.: A reductions approach to fair classification. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 60–69. PMLR, Cam- bridge, MA (2018)

work page 2018

[11] [11]

In: Pro- ceedings of the 30th International Conference on Neural Information Processing Systems

Hardt, M., Price, E., Srebro, N.: Equality of opportunity in supervised learning. In: Pro- ceedings of the 30th International Conference on Neural Information Processing Systems. NIPS’16, pp. 3323–3331. Curran Associates Inc., Red Hook, NY, USA (2016)

work page 2016

[12] [12]

In: Proceedings of the 31st International Conference on Neural Information Process- ing Systems

Pleiss, G., Raghavan, M., Wu, F., Kleinberg, J., Weinberger, K.Q.: On fairness and calibra- tion. In: Proceedings of the 31st International Conference on Neural Information Process- ing Systems. NIPS’17, pp. 5684–5693. Curran Associates Inc., Red Hook, NY, USA (2017)

work page 2017

[13] [13]

MIT Press, Cambridge, MA (2023)

Barocas, S., Hardt, M., Narayanan, A.: Fair- ness and Machine Learning: Limitations and Opportunities. MIT Press, Cambridge, MA (2023)

work page 2023

[14] [14]

Science366(6464), 447–453 (2019)

Obermeyer, Z., Powers, B., Vogeli, C., Mul- lainathan, S.: Dissecting racial bias in an algorithm used to manage the health of pop- ulations. Science366(6464), 447–453 (2019)

work page 2019

[15] [15]

In: Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society

Raji, I.D., Buolamwini, J.: Actionable audit- ing: Investigating the impact of publicly nam- ing biased performance results of commercial ai products. In: Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society. AIES ’19, pp. 429–435. Association for Computing Machinery, New York, NY, USA (2019)

work page 2019

[16] [16]

In: Proceedings of the 3rd Innovations in The- oretical Computer Science Conference

Dwork, C., Hardt, M., Pitassi, T., Reingold, O., Zemel, R.: Fairness through awareness. In: Proceedings of the 3rd Innovations in The- oretical Computer Science Conference. ITCS ’12, pp. 214–226. Association for Computing Machinery, New York, NY, USA (2012)

work page 2012

[17] [17]

In: Proceedings of the 31st International Conference on Neu- ral Information Processing Systems

Kusner, M., Loftus, J., Russell, C., Silva, R.: Counterfactual fairness. In: Proceedings of the 31st International Conference on Neu- ral Information Processing Systems. NIPS’17, pp. 4069–4079. Curran Associates Inc., Red Hook, NY, USA (2017)

work page 2017

[18] Chiappa, S.: Path-specific counterfactual fairness. In: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence. AAAI’19/IAAI’19/EAAI’19. AAAI Press, Washington, DC (2019)

[19] Kleinberg, J., Mullainathan, S., Raghavan, M.: Inherent trade-offs in the fair determination of risk scores. In: Papadimitriou, C.H. (ed.) 8th Innovations in Theoretical Computer Science Conference (ITCS 2017). Leibniz International Proceedings in Informatics (LIPIcs), vol. 67, pp. 43:1–43:23. Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik

[20] Chouldechova, A.: Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data 5(2), 153–163 (2017)

[21] Kamiran, F., Karim, A., Zhang, X.: Decision theory for discrimination-aware classification. In: 2012 IEEE 12th International Conference on Data Mining, pp. 924–929 (2012)

[22] Roh, Y., Lee, K., Whang, S.E., Suh, C.: FairBatch: Batch Selection for Model Fairness (2021). https://arxiv.org/abs/2012.01696

[23] Mishler, A., Kennedy, E.H., Chouldechova, A.: Fairness in risk assessment instruments: Post-processing to achieve counterfactual equalized odds. In: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency. FAccT ’21, pp. 386–400. Association for Computing Machinery, New York, NY, USA (2021)

[24] Nabi, R., Shpitser, I.: Fair inference on outcomes. In: Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence. AAAI’18/IAAI’18/EAAI’18. AAAI Press, Washington, DC (2018)

[25] Kilbertus, N., Rojas-Carulla, M., Parascandolo, G., Hardt, M., Janzing, D., Schölkopf, B.: Avoiding discrimination through causal reasoning. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17, pp. 656–666. Curran Associates Inc., Red Hook, NY, USA (2017)

[26] Madras, D., Creager, E., Pitassi, T., Zemel, R.: Fairness through causal awareness: Learning causal latent-variable models for biased data. In: Proceedings of the Conference on Fairness, Accountability, and Transparency. FAT* ’19, pp. 349–358. Association for Computing Machinery, New York, NY, USA (2019)

[27] Russell, C., Kusner, M.J., Loftus, J.R., Silva, R.: When worlds collide: integrating different counterfactual assumptions in fairness. In: Proceedings of the 31st International Conference on Neural Information Processing Systems. NIPS’17, pp. 6417–6426. Curran Associates Inc., Red Hook, NY, USA (2017)

[28] Wang, M., Deng, W., Hu, J., Tao, X., Huang, Y.: Racial Faces in the Wild: Reducing Racial Bias by Information Maximization Adaptation Network. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 692–702. IEEE Computer Society, Los Alamitos, CA, USA (2019)

[29] Zhao, J., Wang, T., Yatskar, M., Ordonez, V., Chang, K.-W.: Men also like shopping: Reducing gender bias amplification using corpus-level constraints. In: Palmer, M., Hwa, R., Riedel, S. (eds.) Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pp. 2979–2989. Association for Computational Linguistics, Copenhagen, Den...

[30] Sheng, E., Chang, K.-W., Natarajan, P., Peng, N.: The woman worked as a babysitter: On biases in language generation. In: Inui, K., Jiang, J., Ng, V., Wan, X. (eds.) Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3407–34...

[31] Bahi, A., Gasmi, I., Bentrad, S., Khantouchi, R.: Mycgnn: enhancing recommendation diversity in e-commerce through mycelium-inspired graph neural network. Electronic Commerce Research, 1–31 (2024)

[32] Wachter, S., Mittelstadt, B., Russell, C.: Bias preservation in machine learning: The legality of fairness metrics under EU non-discrimination law. West Virginia Law Review 123(3), 735–790 (2021)

[33] Cover, T.M., Thomas, J.A.: Elements of Information Theory (Wiley Series in Telecommunications and Signal Processing). Wiley-Interscience, USA (2006)

[34] Menon, A.K., Williamson, R.C.: The cost of fairness in binary classification. In: Friedler, S.A., Wilson, C. (eds.) Proceedings of the 1st Conference on Fairness, Accountability and Transparency. Proceedings of Machine Learning Research, vol. 81, pp. 107–118. PMLR, Cambridge, MA (2018)

[35] Moyer, D., Gao, S., Brekelmans, R., Steeg, G.V., Galstyan, A.: Invariant representations without adversarial training. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems. NIPS’18, pp. 9102–9111. Curran Associates Inc., Red Hook, NY, USA (2018)

[36] Becker, B., Kohavi, R.: Adult. UCI Machine Learning Repository (1996)

[37] Angwin, J., Larson, J., Mattu, S., Kirchner, L.: How we analyzed the COMPAS recidivism algorithm (2016)

[38] Hofmann, H.: Statlog (German Credit Data). UCI Machine Learning Repository. DOI: https://doi.org/10.24432/C5NC77 (1994)