Long-term Fairness with Selective Labels
Pith reviewed 2026-05-22 07:45 UTC · model grok-4.3
The pith
A decomposition of true fairness into observed fairness plus bounded prediction bias yields sufficient conditions for long-term fairness under selective labels.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The true fairness measure under long-term dynamics and selective labels decomposes exactly into the fairness computed from observed labels plus a bias term induced by the label predictor. When the predictor supplies well-calibrated estimates, the bias can be bounded from observable quantities alone, producing sufficient conditions that replace the missing fairness term and thereby allow enforcement of true fairness without direct access to all labels.
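The decomposition can be illustrated numerically. The sketch below is not the paper's construction. It assumes, for illustration only, that the fairness measure is a between-group gap in mean label and that unrevealed labels are filled in with the predictor's probability. The names `gap` and `decompose` and the synthetic rates are hypothetical. Because the gap is linear in the labels, the true gap equals the gap on the imputed labels plus the gap of the prediction residuals on unrevealed individuals.

```python
import random

def gap(values, groups):
    """Mean of `values` in group 0 minus mean in group 1 (assumed fairness measure)."""
    means = []
    for g in (0, 1):
        sel = [v for v, a in zip(values, groups) if a == g]
        means.append(sum(sel) / len(sel))
    return means[0] - means[1]

def decompose(y_true, y_pred, revealed, groups):
    """Return (true gap, observed gap, bias term) for one batch."""
    # Observed labels: the true label where revealed, the predictor's estimate elsewhere.
    y_obs = [y if r else p for y, p, r in zip(y_true, y_pred, revealed)]
    # Predictor residual, nonzero only where the label stayed hidden.
    resid = [0.0 if r else y - p for y, p, r in zip(y_true, y_pred, revealed)]
    return gap(y_true, groups), gap(y_obs, groups), gap(resid, groups)

# Synthetic batch (hypothetical rates, chosen only so that the groups differ).
rng = random.Random(0)
n = 2000
groups = [i % 2 for i in range(n)]
y_true = [1 if rng.random() < (0.6 if g == 0 else 0.4) else 0 for g in groups]
y_pred = [min(1.0, max(0.0, 0.5 + 0.3 * (y - 0.5) + rng.gauss(0, 0.1))) for y in y_true]
revealed = [rng.random() < (0.7 if g == 0 else 0.3) for g in groups]

true_gap, observed_gap, bias = decompose(y_true, y_pred, revealed, groups)
print(f"true={true_gap:.4f} observed={observed_gap:.4f} bias={bias:.4f}")
```

In this toy setting the identity holds for any predictor, calibrated or not. Calibration only matters once the bias term, which depends on hidden labels, has to be bounded from observable quantities.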
What carries the argument
Decomposition of the fairness measure into observed fairness and label-prediction bias, with the bias term controlled by the predictor model's reported confidence.
If this is right
- Algorithms that treat observed labels as complete fail to guarantee long-term fairness once selective revelation is taken into account.
- Sufficient conditions for true fairness become expressible using only observable data and the predictor's scalar confidence.
- A reinforcement-learning policy built on these conditions reaches fairness and utility levels statistically indistinguishable from an oracle with full label access in controlled semisynthetic trials.
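One way such a sufficient condition can arise is sketched below. The symbols are assumptions for illustration and are not taken from the paper: $F$ is the true fairness measure, $\hat F_{\mathrm{obs}}$ its observed counterpart, $B$ the prediction-bias term, $\epsilon$ a bound on $|B|$ derived from the predictor's confidence, and $\delta$ the fairness tolerance.

```latex
\begin{aligned}
F &= \hat F_{\mathrm{obs}} + B, \qquad |B| \le \epsilon, \\
|F| &\le |\hat F_{\mathrm{obs}}| + |B| \;\le\; |\hat F_{\mathrm{obs}}| + \epsilon, \\
|\hat F_{\mathrm{obs}}| + \epsilon \le \delta \;&\Longrightarrow\; |F| \le \delta .
\end{aligned}
```

Every quantity to the left of the final implication is observable, so the condition can be checked without the hidden labels. It is only as trustworthy as the bound $\epsilon$.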
Where Pith is reading between the lines
- The same decomposition could be applied to other partially observed dynamic systems such as medical treatment allocation or content recommendation where outcomes are revealed only after positive decisions.
- Joint training of the label predictor and the decision policy might tighten the bias bounds and improve sample efficiency.
- Empirical stress tests on real credit or hiring datasets would reveal whether the independence of the bias bound from policy dynamics holds outside semisynthetic regimes.
Load-bearing premise
The label predictor supplies well-calibrated confidence values whose bias term can be bounded independently of the evolving policy and population dynamics.
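Calibration on the revealed labels can at least be measured. The following is a minimal sketch of a binned expected calibration error, a standard diagnostic that the source does not say the paper uses. The function name and the bin count are assumptions. The metric can only be evaluated where labels were revealed, which is why the premise above is load-bearing for the unrevealed region.

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """Binned ECE: bin-size-weighted average of |mean label - mean predicted probability|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        mean_y = sum(y for _, y in members) / len(members)
        ece += len(members) / n * abs(mean_y - mean_p)
    return ece

# Only individuals who received a positive decision have a label to compare against.
revealed_probs = [0.9, 0.9, 0.9, 0.9]
revealed_labels = [1, 1, 0, 0]
print(expected_calibration_error(revealed_probs, revealed_labels))
```

A small value on revealed data is necessary for the premise but not sufficient, since it says nothing about individuals who were never accepted.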
What would settle it
A setting in which the predictor's confidence values are miscalibrated yet the derived sufficient conditions are reported as satisfied, producing a measurable gap between the algorithm's claimed fairness and the true long-term fairness computed from fully observed labels.
Original abstract
Long-term fairness algorithms aim to satisfy fairness beyond static and short-term notions by accounting for the dynamics between decision-making policies and population behavior. Most previous approaches evaluate performance and fairness measures from observable features and a label, which is assumed to be fully observed. However, in scenarios such as hiring or lending, the labels (e.g., ability to repay the loan) are selective labels as they are only revealed based on positive decisions (e.g., when a loan is granted). In this paper, we study long-term fairness in the selective labels setting and analytically show that naive solutions do not guarantee fairness. To address this gap, we then introduce a novel framework that leverages both the observed data and a label predictor model to estimate the true fairness measure value by decomposing it into the observed fairness and bias from label predictions. This allows us to derive sufficient conditions to satisfy true fairness from observable quantities by using the confidence in the predictor model. Finally, we rely on our theoretical results to propose a novel reinforcement learning algorithm for effective long-term fair decision-making with selective labels. In semisynthetic environments, the proposed algorithm reached comparable fairness and performance to an agent with oracle access to the true labels.
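The selective-labels mechanism is easy to reproduce in a toy setting. The sketch below uses invented numbers and hypothetical names (`reveal`, `naive_rate`). It shows only that averaging the revealed labels misstates a group's qualification rate when decisions correlate with the label, which is the kind of failure the abstract attributes to naive solutions.

```python
def reveal(y_true, decisions):
    """Selective labels: a label is visible only after a positive decision."""
    return [y if d == 1 else None for y, d in zip(y_true, decisions)]

def naive_rate(labels):
    """Qualification rate computed from visible labels only, ignoring hidden ones."""
    seen = [y for y in labels if y is not None]
    return sum(seen) / len(seen)

# Toy group of 10 applicants: 4 qualified, 6 not (invented data).
y_true    = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
# A policy that accepts three qualified applicants and one unqualified applicant.
decisions = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

labels = reveal(y_true, decisions)
true_rate = sum(y_true) / len(y_true)
print(naive_rate(labels), true_rate)  # prints 0.75 0.4
```

If two groups are accepted at different rates, this distortion differs between them, so a fairness measure computed on revealed labels alone can look satisfied while the true measure is not.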
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper examines long-term fairness under selective labels, where outcomes are observed only for positive decisions. It analytically proves that naive approaches fail to ensure fairness, proposes a decomposition of the true fairness measure into observed fairness plus bias from a label predictor, derives sufficient conditions for true fairness in terms of observable quantities and predictor confidence, and introduces an RL algorithm whose semisynthetic performance matches an oracle with access to true labels.
Significance. If the derivation holds, the work provides a concrete way to achieve long-term fairness without requiring full label observability, by leveraging a predictor to bound the missing bias term. The analytical failure proof for naive methods and the RL formulation that reaches oracle-level fairness and utility in semisynthetic settings are clear strengths. The result would be of interest to the fairness-in-ML community provided the calibration assumption survives policy-induced shifts.
major comments (1)
- [Section 4] Section 4 (derivation after Eq. (3)): the move from the decomposition identity to the sufficient conditions replaces the unobserved fairness term with a function of the predictor's confidence. This replacement is valid only if the predictor's calibration error (or bias bound) remains controlled independently of the distribution shifts induced by the evolving policy. In the selective-labels regime the observed data are already conditioned on past positive decisions, so any change in policy alters the feature-label distribution seen by the predictor; the manuscript supplies no argument or bound showing that calibration error stays controlled under such shifts.
minor comments (1)
- [Abstract and Experiments] The abstract and experimental section should explicitly state the base datasets, how the selective-label mechanism is simulated, and whether the label predictor is retrained on data generated by the learned policy or only on the initial fixed environment.
Simulated Author's Rebuttal
We thank the referee for the constructive review and for highlighting the potential impact of this work on long-term fairness under selective labels. We address the single major comment below.
- Referee: [Section 4] Section 4 (derivation after Eq. (3)): the move from the decomposition identity to the sufficient conditions replaces the unobserved fairness term with a function of the predictor's confidence. This replacement is valid only if the predictor's calibration error (or bias bound) remains controlled independently of the distribution shifts induced by the evolving policy. In the selective-labels regime the observed data are already conditioned on past positive decisions, so any change in policy alters the feature-label distribution seen by the predictor; the manuscript supplies no argument or bound showing that calibration error stays controlled under such shifts.
Authors: We thank the referee for this precise observation. The identity decomposition itself holds by definition and does not depend on calibration. The sufficient conditions that follow replace the unobserved bias term with a bound derived from the predictor's reported confidence; this step implicitly treats the predictor as calibrated on the data distribution induced by the current policy. The manuscript does not supply an explicit argument or bound establishing that calibration error remains controlled when the policy evolves and thereby changes the feature-label distribution seen by the predictor. We agree this is a substantive modeling assumption that warrants explicit discussion. In the revised manuscript we will (i) state the assumption clearly after Eq. (3), (ii) add a remark on its implications for long-term deployment, and (iii) outline a practical safeguard (periodic recalibration of the predictor on newly observed selective labels) to keep the bound valid in practice. These additions will appear in Section 4 and in the discussion of the RL algorithm.
Revision: yes
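The periodic recalibration mentioned in the rebuttal could take many forms. As one hedged illustration, the sketch below uses histogram binning, a standard post-hoc calibration technique that the rebuttal does not name. `fit_histogram_binning` and `recalibrate` are hypothetical helpers, and the data are invented. The procedure only sees labels revealed under the current policy, so by itself it does not answer the referee's concern about the unrevealed region.

```python
def fit_histogram_binning(probs, labels, n_bins=10):
    """Map each probability bin to the empirical label rate among newly revealed labels."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)
        sums[idx] += y
        counts[idx] += 1
    # Empty bins are stored as None so that raw scores pass through unchanged.
    return [s / c if c else None for s, c in zip(sums, counts)]

def recalibrate(p, bin_rates):
    """Replace a raw score by its bin's empirical rate, if that bin was observed."""
    idx = min(int(p * len(bin_rates)), len(bin_rates) - 1)
    rate = bin_rates[idx]
    return p if rate is None else rate

# Labels revealed since the last recalibration round (invented data).
new_probs = [0.92, 0.95, 0.91, 0.97, 0.15]
new_labels = [1, 0, 1, 0, 0]
bin_rates = fit_histogram_binning(new_probs, new_labels)
print(recalibrate(0.93, bin_rates), recalibrate(0.55, bin_rates))
```

Passing raw scores through for unobserved bins is a design choice made here for simplicity. A deployment would more plausibly flag such scores as uncertified rather than trust them.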
Circularity Check
No significant circularity detected; derivation relies on external predictor model.
The paper decomposes the true fairness measure into an observed component plus a bias term attributable to the label predictor (Section 4, post-Eq. (3)), then substitutes a function of the predictor's reported confidence to obtain sufficient conditions on observable quantities. This step treats the label predictor as an independent, externally provided model whose calibration properties are assumed rather than fitted inside the fairness objective or derived from the target metric. No equation reduces the claimed sufficient conditions to the input fairness value by algebraic identity or by renaming a fitted parameter. No self-citations, uniqueness theorems, or ansatzes imported from prior author work are invoked as load-bearing steps. The derivation therefore remains self-contained once the external predictor and its bounded-bias assumption are granted.
Axiom & Free-Parameter Ledger
free parameters (1)
- predictor confidence threshold
axioms (1)
- domain assumption: The label predictor produces confidence scores that can be used to bound the difference between predicted and true labels in the unobserved region.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Theorem 3.2 (Observed Disparity Decomposition) and Theorem 3.6 (sufficient conditions via IPW error bounds ϵ_i)
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Assumption 3.4 (overlap) and Renyi-divergence regularization L_Renyi
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.