Counterfactually Fair Regression via Optimal Transport

J-J. Vie; M. Generali Lince; P. Loiseau; S. Gaucher

arXiv: 2605.28251 · v1 · pith:RSAEG66Vnew · submitted 2026-05-27 · 📊 stat.ML · cs.CY · cs.LG

Pith reviewed 2026-06-29 09:59 UTC · model grok-4.3

classification 📊 stat.ML · cs.CY · cs.LG

keywords counterfactual fairness · optimal transport · regression · demographic parity · post-processing · finite-sample bounds · causal machine learning

The pith

Counterfactual fairness equals demographic parity conditional on the latent variable, giving a closed-form optimal fair regressor via a barycentric quantile map.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that counterfactual fairness, defined via resampled noise, is equivalent to demographic parity conditional on the latent variable. This equivalence supplies a closed-form expression for the optimal fair regressor through a barycentric quantile map. For continuous latent variables, the paper introduces a discretized post-processing estimator with high-probability finite-sample fairness guarantees: unfairness decays at rate Õ(n^{-1/3}), and the risk bound is of the same order. The results extend to relaxed counterfactual fairness.

Core claim

Counterfactual fairness is equivalent to satisfying demographic parity conditional on the latent variable. This equivalence yields a closed-form expression of the optimal fair regressor via a barycentric quantile map. The discretized post-processing method provides finite-sample fairness guarantees at rate Õ(n^{-1/3}) with matching risk bounds and a matching lower bound on excess risk for almost fair predictions.

What carries the argument

Barycentric quantile map obtained from optimal transport that enforces conditional demographic parity while minimizing regression risk.
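
As a rough illustration of what such a map looks like, the display below writes out the standard one-dimensional Wasserstein-barycenter form, applied conditionally on the latent variable. The notation is assumed here rather than quoted from the paper: f is a base predictor, w_{s'} are group weights, and F_{s|v} and Q_{s|v} denote the conditional c.d.f. and quantile function of f(X, S) given S = s and V = v.

```latex
g^\star(x, s, v) \;=\; \sum_{s'} w_{s'} \, Q_{s' \mid v}\!\Big( F_{s \mid v}\big( f(x, s) \big) \Big)
```

Read from the inside out: a prediction is first converted to its rank within its own group at latent level v, and that rank is then mapped to the weighted average of the group quantile functions, so that every group shares the same output distribution at each v.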

If this is right

The optimal fair regressor admits an explicit closed-form expression via the barycentric quantile map.
The estimator satisfies high-probability finite-sample fairness at rate Õ(n^{-1/3}).
A matching lower bound holds for the excess risk of predictors that are almost counterfactually fair.
The same guarantees extend to relaxed counterfactual fairness.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The discretization step may be reusable for other post-processing fairness methods that involve continuous conditioning variables.
The n^{-1/3} rate implies that the sample size needed for a given fairness tolerance grows cubically in the inverse tolerance, which suggests practical limits on how small the tolerance can be made without very large datasets.
Similar optimal-transport reductions could apply to other causal fairness definitions that involve latent noise resampling.
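
The sample-size remark in the list above can be made concrete by inverting the rate. This is a back-of-envelope sketch that ignores constants and logarithmic factors (an assumption made here, not a statement from the paper), with ε denoting a target unfairness level:

```latex
\varepsilon \asymp n^{-1/3}
\quad\Longleftrightarrow\quad
n \asymp \varepsilon^{-3},
\qquad\text{so}\qquad
\frac{n(\varepsilon/2)}{n(\varepsilon)} = 2^{3} = 8 .
```

Under this reading, halving the tolerance costs roughly eight times as many samples.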

Load-bearing premise

The underlying distributions satisfy mild regularity conditions that justify the convergence of the empirical maps and the discretization approximation.

What would settle it

On data drawn from distributions satisfying the regularity conditions, check whether the post-processed estimator's conditional demographic parity violation decays more slowly than n^{-1/3} (up to logarithmic factors). A persistently slower decay would contradict the stated high-probability bound.
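
One way to run such a check is to fit the decay exponent on a log-log scale. The sketch below is illustrative only: the function name `fitted_decay_exponent` is hypothetical, and the input values are synthetic numbers generated to follow c · n^{-1/3} exactly, not results from the paper.

```python
import numpy as np

def fitted_decay_exponent(sample_sizes, unfairness):
    """Least-squares slope of log(unfairness) against log(sample size)."""
    log_n = np.log(np.asarray(sample_sizes, dtype=float))
    log_u = np.log(np.asarray(unfairness, dtype=float))
    slope, _intercept = np.polyfit(log_n, log_u, deg=1)
    return float(slope)

# Synthetic illustration (not paper data): values that follow 0.7 * n^(-1/3).
ns = np.array([250, 500, 1000, 2000, 4000, 8000])
u = 0.7 * ns ** (-1.0 / 3.0)
exponent = fitted_decay_exponent(ns, u)
```

A fitted slope at or below -1/3 is consistent with the upper bound. Only a slope clearly above -1/3, sustained as n grows, would count as evidence against it.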

Figures

Figures reproduced from arXiv: 2605.28251 by J-J. Vie, M. Generali Lince, P. Loiseau, S. Gaucher.

**Figure 1.** Structural Causal Model. Illustration. Throughout the paper, we consider the following example: S is a gender; V is a student’s intrinsic ability; X aggregates coursework signals (e.g., homework, class participation of previous year) downstream of (V, S); and Y is the future year course grade, downstream of X. Both X and Y reflect systemic biases [29]. Two students with the same V can have different Y due…

**Figure 2.** Synthetic Dataset. Left: Unfairness drops and stabilizes near the theoretical $L^*_{\mathrm{th}}$. Middle: Our method (purple) dominates WFR (gray) on the CF tradeoff frontier. Right: WFR exhibits a better DP tradeoff frontier. Section 7 (Experiments): In the main text, we evaluate on a synthetic and real-world Law School (LSAC) dataset, comparing our post-processing method against two baselines: Counterfactual Fairness (Fair K) …

**Figure 3.** LSAC: Analysis of fairness and risk trade-offs. (Left) Unfairness vs. number of intervals L, showing the bias-variance trade-off with the theoretical $L^*_{\mathrm{th}}$ indicated by the vertical dashed line. (Center) Pareto frontier of risk vs. unfairness for varying relaxation parameter α, demonstrating smooth interpolation. (Right) Pareto frontier of risk vs. Demographic Parity for different α. Shaded regions repres…

**Figure 4.** Figure 4 evaluates this empirically on synthetic data. The log-log plot demonstrates that empirical unfairness decays significantly faster than the theoretical bound, reaching near-zero levels ($U(\hat f_{L^*}) < 10^{-5}$) with just n ≈ 2000 samples. Furthermore, because our post-processor operates entirely on 1D scalars rather than high-dimensional features, it is highly scalable.

**Figure 6.** Figure 6 confirms this on synthetic data for K ∈ {3, 5, 10}: as K increases and data sparsity becomes extreme, the theoretical $L^*_{\mathrm{th}}$ dynamically adapts to locate the empirical minimum.

**Figure 7.** Discretization Trade-off on the Law School Dataset for black-box models not satisfying Properties 4.1 and 4.2. Applied to Gradient Boosting (red squares) and Random Forest (teal circles). The impact of non-smoothness dissipates as L increases, proving the method works effectively without the Lipschitz assumption on the black-box model. Minority Group Imbalance. To evaluate robustness to demographic scarcit…

**Figure 8.** Robustness to minority group imbalance ($w_{\min}$). The method maintains stable, near-zero unfairness down to $w_{\min} \approx 0.05$. At extreme scarcity, the inability to reliably estimate the minority empirical c.d.f. causes a sharp spike in approximation bias, while global RMSE drops as the majority group dominates the metric. [Adjacent plot: x-axis, error of estimation ($c = \hat L_{\mathrm{cdf}} / L_{\mathrm{cdf}}$); y-axis, unfairness.]

**Figure 10.** Robustness to unbiased proxy noise. Dual axes show that as noise σ increases, unfairness rises mildly (peaking below 0.05) while RMSE decreases. [Adjacent plot: x-axis, proxy leakage (λ); y-axes, unfairness $U(f_{L^*})$ (×10^{-1}) and RMSE.]

Original abstract

We consider the problem of learning a counterfactually fair regressor. We adopt a causal uncertainty view in which counterfactual fairness is defined with resampled noise. We focus on obtaining theoretical fairness guarantees for a new post-processing estimator. We begin by showing that counterfactual fairness is equivalent to satisfying demographic parity conditional on the latent variable. This allows us to provide a closed-form expression of the optimal fair regressor via a barycentric quantile map. In order to handle continuous latent variables, we propose a discretized post-processing method. Then, under mild regularity assumptions, we prove high-probability finite-sample fairness guarantees for our estimator, providing an unfairness decay at rate $\tilde O(n^{-1/3})$, and establishing a matching risk bound of order $\tilde O(n^{-1/3})$. We provide a matching lower bound on the excess risk of almost fair predictions. Finally, we extend our results to the setting of relaxed counterfactual fairness. We validate our approach on real-world and synthetic data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper shows counterfactual fairness equals conditional demographic parity on the latent, yielding a closed-form barycentric quantile map plus Õ(n^{-1/3}) fairness and risk bounds.

The main point is that their resampled-noise version of counterfactual fairness is equivalent to demographic parity conditional on the latent variable. That equivalence produces a closed-form optimal fair regressor via the barycentric quantile map from one-dimensional optimal transport.

They do a few things cleanly. The reduction avoids heavier causal machinery, the discretization step handles continuous latents, and they supply high-probability finite-sample bounds on unfairness decay together with a matching excess-risk bound and a lower bound. The extension to relaxed fairness is also straightforward. These are concrete additions to the post-processing literature.

The soft spots sit in the mild regularity assumptions that support both the rates and the discretization error; the paper should clarify how often those hold for typical regression distributions and whether the constants are usable. The abstract mentions validation on real and synthetic data, but without the actual tables or baseline comparisons it is hard to judge practical payoff. The proofs are stated but not shown here, so verification rests on the full derivations.

This is for researchers who want explicit rates or an OT route to fair regression rather than constraint-based training. A reader focused on theoretical post-processing tools will find the equivalence and bounds useful.

It deserves peer review; the central claims are internally consistent and the rates are new relative to the cited prior work.

Referee Report

0 major / 2 minor

Summary. The paper considers learning a counterfactually fair regressor under a resampled-noise definition of counterfactual fairness. It establishes equivalence to demographic parity conditional on the latent variable, yielding a closed-form optimal fair regressor via a barycentric quantile map. A discretized post-processing estimator is proposed for continuous latent variables. Under mild regularity assumptions, high-probability finite-sample fairness guarantees are proved with unfairness decaying at rate Õ(n^{-1/3}), along with a matching excess-risk bound of the same order, a matching lower bound on excess risk for almost-fair predictors, and an extension to relaxed counterfactual fairness. The approach is validated on synthetic and real-world data.

Significance. If the derivations hold, the work makes a solid contribution by linking counterfactual fairness to conditional demographic parity and optimal transport, delivering an explicit closed-form solution and finite-sample rates with a matching lower bound. These elements provide concrete, falsifiable guarantees that go beyond typical heuristic post-processing in fair ML, and the use of standard one-dimensional OT regularity conditions keeps the assumptions mild and interpretable.

minor comments (2)

[Abstract] The abstract and main text use \tilde O without an explicit definition on first appearance; adding a short clarification (e.g., hiding polylog factors) would improve readability.
The description of the discretization step for continuous latents would benefit from a brief remark on how the grid size is chosen in practice and its effect on the stated rates.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and accurate summary of our contributions, the assessment of significance, and the recommendation for minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity identified

The derivation begins by establishing an equivalence between counterfactual fairness (under the resampled-noise definition) and conditional demographic parity given the latent variable; this equivalence is derived rather than assumed by definition. The optimal fair regressor is then expressed via the barycentric quantile map from one-dimensional optimal transport, which is an external property applied to the equivalence. Finite-sample high-probability fairness guarantees at rate Õ(n^{-1/3}) and the matching excess-risk bound are obtained from concentration arguments and discretization under explicitly stated mild regularity conditions on the distributions. A separate lower bound on excess risk is provided for grounding. No steps reduce by construction to fitted inputs, self-citations, or renamed ansatzes; the central claims remain independent of the paper's own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on standard causal model assumptions for counterfactuals defined via resampled noise and on regularity conditions for convergence; no new entities are postulated and no free parameters are explicitly fitted beyond the discretization choice.

axioms (1)

domain assumption: Mild regularity assumptions on the underlying distributions and latent variable
Invoked to establish the high-probability finite-sample bounds and the quality of the discretized approximation.

pith-pipeline@v0.9.1-grok · 5709 in / 1326 out tokens · 27660 ms · 2026-06-29T09:59:45.695795+00:00 · methodology

Reference graph

Works this paper leans on

49 extracted references · 21 canonical work pages

[1]

Fair regression: Quantitative definitions and reduction-based algorithms

Alekh Agarwal, Miroslav Dudik, and Zhiwei Steven Wu. Fair regression: Quantitative definitions and reduction-based algorithms. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 120–129. PMLR, 09–15 Jun 2019. URL https:/...

2019
[2]

Barycenters in the Wasserstein space

Martial Agueh and Guillaume Carlier. Barycenters in the Wasserstein space. SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011. doi: 10.1137/100805741. URL https://doi.org/10.1137/100805741

work page doi:10.1137/100805741 2011
[3]

Fairness and Machine Learning: Limitations and Opportunities

Solon Barocas, Moritz Hardt, and Arvind Narayanan. Fairness and Machine Learning: Limitations and Opportunities. MIT Press, 2023

2023
[4]

D-hacking

Emily Black, Talia Gillis, and Zara Yasmine Hall. D-hacking. In Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’24, page 602–615, New York, NY, USA, 2024. Association for Computing Machinery. ISBN 9798400704505. doi: 10.1145/3630106.3658928. URL https://doi.org/10.1145/3630106.3658928

work page doi:10.1145/3630106.3658928 2024
[5]

Gender shades: Intersectional accuracy disparities in commercial gender classification

Joy Buolamwini and Timnit Gebru. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Sorelle A. Friedler and Christo Wilson, editors, Proceedings of the 1st Conference on Fairness, Accountability and Transparency, volume 81 of Proceedings of Machine Learning Research, pages 77–91. PMLR, 23–24 Feb 2018. URL https://pro...

2018
[6]

Building classifiers with independency constraints

Toon Calders, Faisal Kamiran, and Mykola Pechenizkiy. Building classifiers with independency constraints. In 2009 IEEE International Conference on Data Mining Workshops, pages 13–18, 2009. doi: 10.1109/ICDMW.2009.83

work page doi:10.1109/icdmw.2009.83 2009
[8]

Path-specific counterfactual fairness

Silvia Chiappa. Path-specific counterfactual fairness. In Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI’19/IAAI’19/EAAI’19. AAAI Press, 2019. ISBN 978-1-57735-809-1. d...

work page doi:10.1609/aaai.v33i01.33017801 2019
[9]

Fairness with continuous optimal transport

Silvia Chiappa and Aldo Pacchiano. Fairness with continuous optimal transport. arXiv preprint arXiv:2101.02084, 2021. URL https://arxiv.org/abs/2101.02084

work page arXiv 2021
[10]

A general approach to fairness with optimal transport

Silvia Chiappa, Ray Jiang, Tom Stepleton, Aldo Pacchiano, Heinrich Jiang, and John P. Cunningham. A general approach to fairness with optimal transport. 34(04):3633–3640, Apr. 2020. doi: 10.1609/aaai.v34i04.5771. URL https://ojs.aaai.org/index.php/AAAI/article/view/5771

work page doi:10.1609/aaai.v34i04.5771 2020
[11]

A minimax framework for quantifying risk-fairness trade-off in regression

Evgenii Chzhen and Nicolas Schreuder. A minimax framework for quantifying risk-fairness trade-off in regression. The Annals of Statistics, 50(4):2416–2442, August 2022. doi: 10.1214/22-AOS2198. URL https://arxiv.org/abs/2007.14265

work page doi:10.1214/22-aos2198 2022
[13]

Fair regression with Wasserstein barycenters

Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, and Massimiliano Pontil. Fair regression with Wasserstein barycenters. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 7321–7331. Curran Associates, Inc., 2020. URL https://proceedings.neurips.cc/pape...

2020
[14]

Fair regression via plug-in estimator and recalibration with statistical guarantees

Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, and Massimiliano Pontil. Fair regression via plug-in estimator and recalibration with statistical guarantees. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 19137–19148. Curran Associates, Inc., ...

2020
[15]

Counterfactual risk assessments, evaluation, and fairness

Amanda Coston, Alan Mishler, Edward H. Kennedy, and Alexandra Chouldechova. Counterfactual risk assessments, evaluation, and fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, FAT* ’20, page 582–593, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450369367. doi: 10.1145/3351095.3372851...

work page doi:10.1145/3351095.3372851 2020
[16]

Causal modeling for fairness in dynamical systems

Elliot Creager, David Madras, Toniann Pitassi, and Richard Zemel. Causal modeling for fairness in dynamical systems. In Proceedings of the 37th International Conference on Machine Learning, ICML’20. JMLR.org, 2020

2020
[17]

Demarginalizing the intersection of race and sex: A black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics

Kimberlé Crenshaw. Demarginalizing the intersection of race and sex: A black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics. In Feminist legal theories, pages 23–51. Routledge, 2013

2013
[18]

Achieving counterfactual fairness with imperfect structural causal model

Tri Duong, Qian Li, and Guandong Xu. Achieving counterfactual fairness with imperfect structural causal model. Expert Systems with Applications, 240:122411, 11 2023. doi: 10.1016/j.eswa.2023.122411

work page doi:10.1016/j.eswa.2023.122411 2023
[19]

Fairness through awareness

Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Rich Zemel. Fairness through awareness, 2011

2011
[20]

Equality of opportunity in supervised learning

Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc., 2016. URL https://proceedings.neurips.cc/paper_files/paper/2016/file/6a9659feb1216f14f7384ba499518b38-Paper.pdf

2016
[21]

Learning representations for counterfactual inference

Fredrik Johansson, Uri Shalit, and David Sontag. Learning representations for counterfactual inference. In Maria Florina Balcan and Kilian Q. Weinberger, editors, Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 3020–3029, New York, New York, USA, 20–22 Jun 2016. PMLR. URL http...

2016
[22]

Preventing fairness gerrymandering: Auditing and learning for subgroup fairness

Michael Kearns, Seth Neel, Aaron Roth, and Zhiwei Steven Wu. Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 2564–2572. PMLR, 2018. URL https://proce...

2018
[23]

Counterfactual fairness

Matt J Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL https://proceedings.neurips.cc/paper_files/paper/2017/file/a486cd...

2017
[24]

When worlds collide: Integrating different counterfactual assumptions in fairness

Matt J Kusner, Chris Russell, Joshua Loftus, and Ricardo Silva. When worlds collide: Integrating different counterfactual assumptions in fairness. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL htt...

2017
[25]

Projection to fairness in statistical learning

Thibaut Le Gouic, Jean-Michel Loubes, and Philippe Rigollet. Projection to fairness in statistical learning, 2020. URL https://arxiv.org/abs/2005.11720

work page arXiv 2020
[26]

Too relaxed to be fair

Michael Lohaus, Michael Perrot, and Ulrike Von Luxburg. Too relaxed to be fair. In Hal Daumé III and Aarti Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 6360–6369. PMLR, 13–18 Jul 2020. URL https://proceedings.mlr.press/v119/lohaus20a.html

2020
[27]

Learning for counterfactual fairness from observational data

Jing Ma, Ruocheng Guo, Aidong Zhang, and Jundong Li. Learning for counterfactual fairness from observational data. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD ’23, page 1620–1630. ACM, August 2023. doi: 10.1145/3580305.3599408. URL http://dx.doi.org/10.1145/3580305.3599408

work page doi:10.1145/3580305.3599408 2023
[28]

Survey on causal-based machine learning fairness notions

Karima Makhlouf, Sami Zhioua, and Catuscia Palamidessi. Survey on causal-based machine learning fairness notions, 2022. URL https://arxiv.org/abs/2010.09553

work page arXiv 2022
[29]

Rapid emergence of a maths gender gap in first grade

P. Martinot, B. Colnet, T. Breda, J. Sultan, L. Touitou, P. Huguet, E. Spelke, G. Dehaene-Lambertz, P. Bressoux, and S. Dehaene. Rapid emergence of a maths gender gap in first grade. Nature, 643(8073):1020–1029, 2025. doi: 10.1038/s41586-025-09126-4. URL https://doi.org/10.1038/s41586-025-09126-4

work page doi:10.1038/s41586-025-09126-4 2025
[30]

The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality

Pascal Massart. The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. The Annals of Probability, 18(3):1269–1283, July 1990. doi: 10.1214/aop/1176990746. URL https://doi.org/10.1214/aop/1176990746

work page doi:10.1214/aop/1176990746 1990
[31]

The Book of Why: The New Science of Cause and Effect

J. Pearl and D. Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018. ISBN 9780465097616. URL https://books.google.fr/books?id=BzM0DwAAQBAJ

2018
[32]

Causality

Judea Pearl. Causality. Cambridge University Press, 2 edition, 2009

2009
[33]

Causal fairness analysis

Drago Plecko and Elias Bareinboim. Causal fairness analysis, 2022. URL https://arxiv.org/abs/2207.11385

work page arXiv 2022
[34]

FairPFN: A tabular foundation model for causal fairness

Jake Robertson, Noah Hollmann, Samuel Müller, Noor Awad, and Frank Hutter. FairPFN: A tabular foundation model for causal fairness. In Forty-second International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=I8DVh2jnEA

2025
[35]

Counterfactual fairness is basically demographic parity

Lucas Rosenblatt and R Teal Witter. Counterfactual fairness is basically demographic parity. Proceedings of AAAI, 2023

2023
[36]

Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling

Filippo Santambrogio. Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling, volume 87 of Progress in Nonlinear Differential Equations and Their Applications. Birkhäuser Cham, 1 edition, 2015. ISBN 978-3-319-20827-5. doi: 10.1007/978-3-319-20828-2. Published 27 October 2015

work page doi:10.1007/978-3-319-20828-2 2015
[37]

Counterfactual fairness is not demographic parity, and other observations

Ricardo Silva. Counterfactual fairness is not demographic parity, and other observations,
[38]

URL https://arxiv.org/abs/2402.02663

work page arXiv
[39]

Towards counterfactual fairness through auxiliary variables

Bowei Tian, Ziyao Wang, Shwai He, Wanghao Ye, Guoheng Sun, Yucong Dai, Yongkai Wu, and Ang Li. Towards counterfactual fairness through auxiliary variables. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=GpUv1FvZi1

2025
[40]

Introduction to Nonparametric Estimation

Alexandre B. Tsybakov. Introduction to Nonparametric Estimation. Springer Series in Statistics. Springer, New York, NY, 2009. ISBN 978-0-387-79051-0. doi: 10.1007/b13794

work page doi:10.1007/b13794 2009
[41]

A. W. van der Vaart. Quantiles and Order Statistics, page 304–315. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 1998

1998
[42]

Counterfactual explanations and algorithmic recourses for machine learning: A review

Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan Hines, John Dickerson, and Chirag Shah. Counterfactual explanations and algorithmic recourses for machine learning: A review. ACM Comput. Surv., 56(12), October 2024. ISSN 0360-0300. doi: 10.1145/3677119. URL https://doi.org/10.1145/3677119

work page doi:10.1145/3677119 2024
[43]

Counterfactual fairness: Unidentification, bound and algorithm

Yongkai Wu, Lu Zhang, and Xintao Wu. Counterfactual fairness: Unidentification, bound and algorithm. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, pages 1438–1444. International Joint Conferences on Artificial Intelligence Organization, 7 2019. doi: 10.24963/ijcai.2019/199. URL https://doi.org/10....

work page doi:10.24963/ijcai.2019/199 2019
[44]

Zeyu Zhou, Tianci Liu, Ruqi Bai, Jing Gao, Murat Kocaoglu, and David I. Inouye. Counterfactual fairness by combining factual and counterfactual predictions. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. URL https://openreview.net/forum?id=J0Itri0UiN

2024
[45]

Counterfactually fair representation

Zhiqun Zuo, Mohammad Mahdi Khalili, and Xueru Zhang. Counterfactually fair representation. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=QZo1cge4Tc

2023
[46]

1. Latent Knowledge: The unobserved ability is drawn from a standard normal distribution: $V \sim \mathcal{N}(0, 1)$.
2. UGPA: Modeled as a linear function of race, sex, and knowledge: $\mathrm{UGPA} \sim \mathcal{N}(\mu_{\mathrm{GPA}} + w_{\mathrm{GPA}}^\top S + \lambda_{\mathrm{GPA}} V,\ \sigma^2_{\mathrm{GPA}})$.
[47]

3. LSAT Score: Modeled as a count variable (approximated via Poisson) driven by the same factors: $\mathrm{LSAT} \sim \mathrm{Poisson}(\exp(\mu_{\mathrm{LSAT}} + w_{\mathrm{LSAT}}^\top S + \lambda_{\mathrm{LSAT}} V))$.
4. First-Year Average (ZFYA): The outcome variable is also a noisy linear function: $\mathrm{ZFYA} \sim \mathcal{N}(\mu_{\mathrm{ZFYA}} + w_{\mathrm{ZFYA}}^\top S + \lambda_{\mathrm{ZFYA}} V,\ 1)$.

Inference Procedure. We use the Stan implementation provided by Kusner et al. [23]. The model is fitted using Hamiltonian Monte Carlo (HMC) with 2000 iterations...
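
The generative model in items 1 to 4 above can be forward-simulated as in the sketch below. It is only a sketch: it draws samples from the stated distributions and does not reproduce the Stan/HMC fit. The function name `sample_law_school_model`, the binary encoding of S, and every parameter value are hypothetical placeholders, not fitted estimates from the paper.

```python
import numpy as np

def sample_law_school_model(n, params, rng):
    """Forward-sample (S, V, UGPA, LSAT, ZFYA) from the four-step model."""
    n_attrs = len(params["w_gpa"])
    # Placeholder encoding (assumption): independent binary sensitive attributes.
    S = rng.integers(0, 2, size=(n, n_attrs)).astype(float)
    # 1. Latent knowledge V ~ N(0, 1).
    V = rng.normal(0.0, 1.0, size=n)
    # 2. UGPA ~ N(mu + w^T S + lambda * V, sigma^2).
    ugpa = rng.normal(params["mu_gpa"] + S @ params["w_gpa"] + params["lam_gpa"] * V,
                      params["sigma_gpa"])
    # 3. LSAT ~ Poisson(exp(mu + w^T S + lambda * V)).
    lsat = rng.poisson(np.exp(params["mu_lsat"] + S @ params["w_lsat"]
                              + params["lam_lsat"] * V))
    # 4. ZFYA ~ N(mu + w^T S + lambda * V, 1).
    zfya = rng.normal(params["mu_zfya"] + S @ params["w_zfya"] + params["lam_zfya"] * V,
                      1.0)
    return S, V, ugpa, lsat, zfya

# Hypothetical placeholder values, chosen only so that the code runs.
placeholder = {
    "mu_gpa": 3.0, "w_gpa": np.array([0.1, -0.1]), "lam_gpa": 0.3, "sigma_gpa": 0.4,
    "mu_lsat": 3.5, "w_lsat": np.array([0.05, -0.05]), "lam_lsat": 0.1,
    "mu_zfya": 0.0, "w_zfya": np.array([0.2, -0.2]), "lam_zfya": 0.5,
}
S, V, ugpa, lsat, zfya = sample_law_school_model(1000, placeholder, np.random.default_rng(0))
```

In the paper's pipeline the direction is reversed: the parameters and the latent V are inferred from observed (S, UGPA, LSAT, ZFYA), which this simulation does not attempt.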
[48]

We partition V into L equal-width intervals $\{I_\ell\}_{\ell=1}^{L}$
[49]

In each interval $\ell$ and group $s$, we collect the scores $z_i = f_{\mathrm{bb}}(x_i, s_i)$
[50]

We compute empirical quantiles $q_{\ell,s}$ on a fixed grid and form the barycenter quantiles $q_{\ell,\star} = \sum_s w_s q_{\ell,s}$
[51]

For a new test point with score $z_{\mathrm{new}}$ in group $s$ and interval $\ell$, we:
• Compute its percentile $\tau$ within the group $s$ distribution via linear interpolation.
• Map $\tau$ to the barycenter distribution: $\hat y = q_{\ell,\star}(\tau)$.

Plug-in Selection of $L^*$. We select the optimal discretization level $L^*$ using the formula derived in Theorem 1. We estimate the Lipschitz constant $L_{\mathrm{cdf}}$ using finite di...
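
The four post-processing steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `fit_fair_postprocessor` and `predict_fair` are hypothetical, the weights w_s are taken to be empirical group frequencies, groups missing from an interval are dropped and the weights renormalized, and the number of intervals `n_bins` is passed in by hand rather than chosen by the plug-in rule of Theorem 1.

```python
import numpy as np

def fit_fair_postprocessor(v, s, z, n_bins, n_grid=101):
    """Per-interval, per-group empirical quantiles and their barycenter."""
    v = np.asarray(v, dtype=float)
    s = np.asarray(s)
    z = np.asarray(z, dtype=float)
    edges = np.linspace(v.min(), v.max(), n_bins + 1)   # equal-width intervals of V
    taus = np.linspace(0.0, 1.0, n_grid)                # fixed quantile grid
    cell = np.clip(np.searchsorted(edges, v, side="right") - 1, 0, n_bins - 1)
    groups = np.unique(s).tolist()
    w = {g: float(np.mean(s == g)) for g in groups}     # assumed: w_s = group frequency
    q, q_bar = {}, {}
    for l in range(n_bins):
        present = []
        for g in groups:
            scores = z[(cell == l) & (s == g)]
            if scores.size > 0:
                q[(l, g)] = np.quantile(scores, taus)
                present.append(g)
        if present:
            # Assumption of this sketch: renormalize weights over groups seen in the cell.
            total = sum(w[g] for g in present)
            q_bar[l] = sum(w[g] / total * q[(l, g)] for g in present)
    return {"edges": edges, "taus": taus, "q": q, "q_bar": q_bar}

def predict_fair(model, v_new, s_new, z_new):
    """Map one score to the barycenter distribution of its interval."""
    n_bins = len(model["edges"]) - 1
    l = int(np.clip(np.searchsorted(model["edges"], v_new, side="right") - 1,
                    0, n_bins - 1))
    # Percentile of z_new within its group, by linear interpolation.
    # Raises KeyError if this (interval, group) pair was empty during fitting.
    tau = np.interp(z_new, model["q"][(l, s_new)], model["taus"])
    # Barycenter quantile evaluated at that percentile.
    return float(np.interp(tau, model["taus"], model["q_bar"][l]))
```

Because both the fit and the prediction act on one-dimensional scores, the cost per interval is dominated by sorting the scores inside `np.quantile`. The interpolation assumes the per-group quantile grid is nondecreasing, which holds for empirical quantiles, with ties behaving acceptably only when they are rare.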

[1] [1]

Fair regression: Quantitative defi- nitions and reduction-based algorithms

Alekh Agarwal, Miroslav Dudik, and Zhiwei Steven Wu. Fair regression: Quantitative defi- nitions and reduction-based algorithms. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors,Proceedings of the 36th International Conference on Machine Learning, volume 97 ofProceedings of Machine Learning Research, pages 120–129. PMLR, 09–15 Jun 2019. URL https:/...

2019

[2] [2]

Barycenters in the wasserstein space.SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011

Martial Agueh and Guillaume Carlier. Barycenters in the wasserstein space.SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011. doi: 10.1137/100805741. URL https: //doi.org/10.1137/100805741

work page doi:10.1137/100805741 2011

[3] [3]

MIT Press, 2023

Solon Barocas, Moritz Hardt, and Arvind Narayanan.Fairness and Machine Learning: Limitations and Opportunities. MIT Press, 2023

2023

[4] [4]

D-hacking

Emily Black, Talia Gillis, and Zara Yasmine Hall. D-hacking. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’24, page 602–615, New York, NY, USA, 2024. Association for Computing Machinery. ISBN 9798400704505. doi: 10.1145/3630106.3658928. URLhttps://doi.org/10.1145/3630106.3658928

work page doi:10.1145/3630106.3658928 2024

[5] [5]

Gender shades: Intersectional accuracy disparities in commercial gender classification

Joy Buolamwini and Timnit Gebru. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Sorelle A. Friedler and Christo Wilson, editors, Proceedings of the 1st Conference on Fairness, Accountability and Transparency, volume 81 ofProceedings of Machine Learning Research, pages 77–91. PMLR, 23–24 Feb 2018. URL https://pro...

2018

[6] [6]

Building classifiers with indepen- dency constraints

Toon Calders, Faisal Kamiran, and Mykola Pechenizkiy. Building classifiers with indepen- dency constraints. In2009 IEEE International Conference on Data Mining Workshops, pages 13–18, 2009. doi: 10.1109/ICDMW.2009.83

work page doi:10.1109/icdmw.2009.83 2009

[7] [8]

Path-specific counterfactual fairness

Silvia Chiappa. Path-specific counterfactual fairness. InProceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, AAAI’19/IAAI’19/EAAI’19. AAAI Press, 2019. ISBN 978-1-57735- 809-1. d...

work page doi:10.1609/aaai.v33i01.33017801 2019

[8] [9]

Silvia Chiappa and Aldo Pacchiano. Fairness with continuous optimal transport. arXiv preprint arXiv:2101.02084, 2021. URL https://arxiv.org/abs/2101.02084

[9] [10]

Silvia Chiappa, Ray Jiang, Tom Stepleton, Aldo Pacchiano, Heinrich Jiang, and John P. Cunningham. A general approach to fairness with optimal transport. 34(04):3633–3640, Apr. 2020. doi: 10.1609/aaai.v34i04.5771. URL https://ojs.aaai.org/index.php/AAAI/article/view/5771

[10] [11]

Evgenii Chzhen and Nicolas Schreuder. A minimax framework for quantifying risk-fairness trade-off in regression. The Annals of Statistics, 50(4):2416–2442, August 2022. doi: 10.1214/22-AOS2198. URL https://arxiv.org/abs/2007.14265

[11] [13]

Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, and Massimiliano Pontil. Fair regression with Wasserstein barycenters. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 7321–7331. Curran Associates, Inc., 2020. URL https://proceedings.neurips.cc/pape...

[12] [14]

Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, and Massimiliano Pontil. Fair regression via plug-in estimator and recalibration with statistical guarantees. In H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin, editors, Advances in Neural Information Processing Systems, volume 33, pages 19137–19148. Curran Associates, Inc., ...

[13] [15]

Amanda Coston, Alan Mishler, Edward H. Kennedy, and Alexandra Chouldechova. Counterfactual risk assessments, evaluation, and fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, FAT* ’20, pages 582–593, New York, NY, USA, 2020. Association for Computing Machinery. ISBN 9781450369367. doi: 10.1145/3351095.3372851...

[14] [16]

Elliot Creager, David Madras, Toniann Pitassi, and Richard Zemel. Causal modeling for fairness in dynamical systems. In Proceedings of the 37th International Conference on Machine Learning, ICML’20. JMLR.org, 2020

[15] [17]

Kimberlé Crenshaw. Demarginalizing the intersection of race and sex: A black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics. In Feminist legal theories, pages 23–51. Routledge, 2013

[16] [18]

Tri Duong, Qian Li, and Guandong Xu. Achieving counterfactual fairness with imperfect structural causal model. Expert Systems with Applications, 240:122411, November 2023. doi: 10.1016/j.eswa.2023.122411

[17] [19]

Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Rich Zemel. Fairness through awareness, 2011

[18] [20]

Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. In D. Lee, M. Sugiyama, U. Luxburg, I. Guyon, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc., 2016. URL https://proceedings.neurips.cc/paper_files/paper/2016/file/6a9659feb1216f14f7384ba499518b38-Paper.pdf

[19] [21]

Fredrik Johansson, Uri Shalit, and David Sontag. Learning representations for counterfactual inference. In Maria Florina Balcan and Kilian Q. Weinberger, editors, Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 3020–3029, New York, New York, USA, 20–22 Jun 2016. PMLR. URL http...

[20] [22]

Michael Kearns, Seth Neel, Aaron Roth, and Zhiwei Steven Wu. Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 2564–2572. PMLR, 2018. URL https://proce...

[21] [23]

Matt J Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL https://proceedings.neurips.cc/paper_files/paper/2017/file/a486cd...

[22] [24]

Matt J Kusner, Chris Russell, Joshua Loftus, and Ricardo Silva. When worlds collide: Integrating different counterfactual assumptions in fairness. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL htt...

[23] [25]

Thibaut Le Gouic, Jean-Michel Loubes, and Philippe Rigollet. Projection to fairness in statistical learning, 2020. URL https://arxiv.org/abs/2005.11720

[24] [26]

Michael Lohaus, Michael Perrot, and Ulrike Von Luxburg. Too relaxed to be fair. In Hal Daumé III and Aarti Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 6360–6369. PMLR, 13–18 Jul 2020. URL https://proceedings.mlr.press/v119/lohaus20a.html

[25] [27]

Jing Ma, Ruocheng Guo, Aidong Zhang, and Jundong Li. Learning for counterfactual fairness from observational data. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD ’23, pages 1620–1630. ACM, August 2023. doi: 10.1145/3580305.3599408. URL http://dx.doi.org/10.1145/3580305.3599408

[26] [28]

Karima Makhlouf, Sami Zhioua, and Catuscia Palamidessi. Survey on causal-based machine learning fairness notions, 2022. URL https://arxiv.org/abs/2010.09553

[27] [29]

P. Martinot, B. Colnet, T. Breda, J. Sultan, L. Touitou, P. Huguet, E. Spelke, G. Dehaene-Lambertz, P. Bressoux, and S. Dehaene. Rapid emergence of a maths gender gap in first grade. Nature, 643(8073):1020–1029, 2025. doi: 10.1038/s41586-025-09126-4. URL https://doi.org/10.1038/s41586-025-09126-4

[28] [30]

Pascal Massart. The tight constant in the Dvoretzky–Kiefer–Wolfowitz inequality. The Annals of Probability, 18(3):1269–1283, July 1990. doi: 10.1214/aop/1176990746. URL https://doi.org/10.1214/aop/1176990746

[29] [31]

J. Pearl and D. Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018. ISBN 9780465097616. URL https://books.google.fr/books?id=BzM0DwAAQBAJ

[30] [32]

Judea Pearl. Causality. Cambridge University Press, 2 edition, 2009

[31] [33]

Drago Plecko and Elias Bareinboim. Causal fairness analysis, 2022. URL https://arxiv.org/abs/2207.11385

[32] [34]

Jake Robertson, Noah Hollmann, Samuel Müller, Noor Awad, and Frank Hutter. FairPFN: A tabular foundation model for causal fairness. In Forty-second International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=I8DVh2jnEA

[33] [35]

Lucas Rosenblatt and R Teal Witter. Counterfactual fairness is basically demographic parity. Proceedings of AAAI, 2023

[34] [36]

Filippo Santambrogio. Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling, volume 87 of Progress in Nonlinear Differential Equations and Their Applications. Birkhäuser Cham, 1 edition, 2015. ISBN 978-3-319-20827-5. doi: 10.1007/978-3-319-20828-2. Published 27 October 2015

[35] [37]

Ricardo Silva. Counterfactual fairness is not demographic parity, and other observations. URL https://arxiv.org/abs/2402.02663

[37] [39]

Bowei Tian, Ziyao Wang, Shwai He, Wanghao Ye, Guoheng Sun, Yucong Dai, Yongkai Wu, and Ang Li. Towards counterfactual fairness through auxiliary variables. In The Thirteenth International Conference on Learning Representations, 2025. URL https://openreview.net/forum?id=GpUv1FvZi1

[38] [40]

Alexandre B. Tsybakov. Introduction to Nonparametric Estimation. Springer Series in Statistics. Springer, New York, NY, 2009. ISBN 978-0-387-79051-0. doi: 10.1007/b13794

[39] [41]

A. W. van der Vaart. Quantiles and Order Statistics, pages 304–315. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 1998

[40] [42]

Sahil Verma, Varich Boonsanong, Minh Hoang, Keegan Hines, John Dickerson, and Chirag Shah. Counterfactual explanations and algorithmic recourses for machine learning: A review. ACM Comput. Surv., 56(12), October 2024. ISSN 0360-0300. doi: 10.1145/3677119. URL https://doi.org/10.1145/3677119

[41] [43]

Yongkai Wu, Lu Zhang, and Xintao Wu. Counterfactual fairness: Unidentification, bound and algorithm. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI-19, pages 1438–1444. International Joint Conferences on Artificial Intelligence Organization, 7 2019. doi: 10.24963/ijcai.2019/199. URL https://doi.org/10....

[42] [44]

Zeyu Zhou, Tianci Liu, Ruqi Bai, Jing Gao, Murat Kocaoglu, and David I. Inouye. Counterfactual fairness by combining factual and counterfactual predictions. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. URL https://openreview.net/forum?id=J0Itri0UiN

[43] [45]

Zhiqun Zuo, Mohammad Mahdi Khalili, and Xueru Zhang. Counterfactually fair representation. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https://openreview.net/forum?id=QZo1cge4Tc

Appendix

Table of Contents

A Useful background
A.1 Additional notations ...

1. Latent Knowledge: The unobserved ability is drawn from a standard normal distribution: $V \sim \mathcal{N}(0, 1)$.
2. UGPA: Modeled as a linear function of race, sex, and knowledge: $\mathrm{UGPA} \sim \mathcal{N}(\mu_{\mathrm{GPA}} + w_{\mathrm{GPA}}^{\top} S + \lambda_{\mathrm{GPA}} V,\ \sigma_{\mathrm{GPA}}^{2})$.
3. LSAT Score: Modeled as a count variable (approximated via Poisson) driven by the same factors: $\mathrm{LSAT} \sim \mathrm{Poisson}(\exp(\mu_{\mathrm{LSAT}} + w_{\mathrm{LSAT}}^{\top} S + \lambda_{\mathrm{LSAT}} V))$.
4. First-Year Average (ZFYA): The outcome variable is also a noisy linear function: $\mathrm{ZFYA} \sim \mathcal{N}(\mu_{\mathrm{ZFYA}} + w_{\mathrm{ZFYA}}^{\top} S + \lambda_{\mathrm{ZFYA}} V,\ 1)$.

Inference Procedure. We use the Stan implementation provided by Kusner et al. [23]. The model is fitted using Hamiltonian Monte Carlo (HMC) with 2000 iterations...
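The generative model above can be illustrated by forward sampling. The following is only a sketch under stated assumptions: the function name `sample_law_school`, the `params` dictionary, its coefficient values, and the binary encoding of S are all hypothetical, and the sketch does not reproduce the Stan/HMC inference step, which estimates these coefficients from data.

```python
import numpy as np

def sample_law_school(n, params, rng):
    """Forward-sample (S, V, UGPA, LSAT, ZFYA) from the four-step model.

    `params` holds hypothetical coefficients; in the source they are
    inferred with Stan, which this sketch does not attempt.
    """
    d = len(params["w_gpa"])
    # Sensitive attributes S: illustrative binary indicators (assumption).
    S = rng.integers(0, 2, size=(n, d)).astype(float)
    # 1. Latent knowledge V ~ N(0, 1).
    V = rng.standard_normal(n)
    # 2. UGPA ~ N(mu + w^T S + lambda * V, sigma^2).
    ugpa = rng.normal(
        params["mu_gpa"] + S @ params["w_gpa"] + params["lam_gpa"] * V,
        params["sigma_gpa"],
    )
    # 3. LSAT ~ Poisson(exp(mu + w^T S + lambda * V)).
    lsat = rng.poisson(
        np.exp(params["mu_lsat"] + S @ params["w_lsat"] + params["lam_lsat"] * V)
    )
    # 4. ZFYA ~ N(mu + w^T S + lambda * V, 1).
    zfya = rng.normal(
        params["mu_zfya"] + S @ params["w_zfya"] + params["lam_zfya"] * V, 1.0
    )
    return S, V, ugpa, lsat, zfya
```

Sampling from the fitted model in this way is one possible route to synthetic data in which the latent variable V is known, which the post-processing step requires.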

1. We partition $V$ into $L$ equal-width intervals $\{I_\ell\}_{\ell=1}^{L}$.
2. In each interval $\ell$ and group $s$, we collect the scores $z_i = f_{bb}(x_i, s_i)$.
3. We compute empirical quantiles $q_{\ell,s}$ on a fixed grid and form the barycenter quantiles $q_{\ell,\star} = \sum_s w_s q_{\ell,s}$.
4. For a new test point with score $z_{\mathrm{new}}$ in group $s$ and interval $\ell$, we:
   • Compute its percentile $\tau$ within the group $s$ distribution via linear interpolation.
   • Map $\tau$ to the barycenter distribution: $\hat{y} = q_{\ell,\star}(\tau)$.

Plug-in Selection of $L^*$. We select the optimal discretization level $L^*$ using the formula derived in Theorem 1. We estimate the Lipschitz constant $L_{\mathrm{cdf}}$ using finite di...
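The discretized post-processing steps described above (partition V, collect scores per interval and group, average the group quantiles, then map a new score through its percentile) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the quantile grid, and the choice of weights w_s as group shares within each interval are assumptions, and the plug-in selection of L* is not covered.

```python
import numpy as np

def interval_index(v, lo, hi, n_bins):
    """Index of the equal-width interval of [lo, hi] containing v (clipped)."""
    idx = np.floor((np.asarray(v, dtype=float) - lo) / (hi - lo) * n_bins).astype(int)
    return np.clip(idx, 0, n_bins - 1)

def fit_barycenter_quantiles(z, s, v, n_bins, grid_size=99):
    """Empirical quantiles per (interval, group) and their weighted average."""
    taus = np.linspace(0.01, 0.99, grid_size)  # fixed quantile grid (assumed)
    lo, hi = float(v.min()), float(v.max())
    bins = interval_index(v, lo, hi, n_bins)
    q, q_star = {}, {}
    for l in range(n_bins):
        in_bin = bins == l
        groups, counts = np.unique(s[in_bin], return_counts=True)
        if groups.size == 0:
            continue  # empty interval: nothing to fit
        for g in groups:
            q[l, g] = np.quantile(z[in_bin & (s == g)], taus)
        w = counts / counts.sum()  # assumed weights: group shares in the interval
        q_star[l] = sum(w_g * q[l, g] for w_g, g in zip(w, groups))
    return {"taus": taus, "lo": lo, "hi": hi, "n_bins": n_bins,
            "q": q, "q_star": q_star}

def predict_fair(z_new, s_new, v_new, fit):
    """Map one score to the barycenter distribution of its interval."""
    l = int(interval_index(v_new, fit["lo"], fit["hi"], fit["n_bins"]))
    # Percentile of the score within its own group, by linear interpolation.
    tau = np.interp(z_new, fit["q"][l, s_new], fit["taus"])
    # Read off the barycenter quantile at that percentile.
    return float(np.interp(tau, fit["taus"], fit["q_star"][l]))
```

In this sketch, two points in the same interval that sit at the same percentile of their respective group distributions receive the same prediction, which is the conditional demographic parity property the procedure targets. Scores outside the fitted quantile range are clamped to the grid endpoints by `np.interp`.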