Fairness under uncertainty in sequential decisions
Pith reviewed 2026-05-09 22:20 UTC · model grok-4.3
The pith
A taxonomy of model, feedback, and prediction uncertainty shows how uneven distributions across groups in sequential decisions produce compounding exclusion that uncertainty-aware policies can mitigate.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Policies that ignore the unobserved space in sequential decisions under uneven uncertainty systematically disadvantage under-represented groups through compounding exclusion and reduced access, while also imposing unrealized losses on decision makers. Formalizing model and feedback uncertainty with counterfactuals and reinforcement learning, together with the additional category of prediction uncertainty, supplies the diagnostic tools needed to surface these effects. Uncertainty-aware exploration strategies can reduce outcome variance for disadvantaged groups while preserving expected utility.
What carries the argument
Three-category taxonomy of uncertainty (model, feedback, prediction) in sequential decisions, formalized via counterfactual logic and reinforcement learning to trace how selective feedback creates self-reinforcing disparities.
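To illustrate why selective feedback leaves the unobserved space open, here is a minimal sketch of worst-case (no-assumption) bounds on a group's repayment rate when outcomes are only seen for approved applicants. The function name, the group labels, and all counts are hypothetical and are not taken from the paper; the paper's own formalization via counterfactual logic and reinforcement learning may differ.

```python
def repayment_rate_bounds(n_approved, n_repaid, n_denied):
    """Worst-case bounds on a group's repayment rate under selective feedback.

    Outcomes are observed only for approved applicants; the counterfactual
    outcome of every denied applicant is unknown.
    """
    total = n_approved + n_denied
    if total == 0:
        raise ValueError("no applicants recorded for this group")
    lower = n_repaid / total               # every denied applicant would have defaulted
    upper = (n_repaid + n_denied) / total  # every denied applicant would have repaid
    return lower, upper


# Hypothetical counts: both groups repay 80% of the loans actually granted,
# but the minority group has been denied far more often.
majority = repayment_rate_bounds(n_approved=900, n_repaid=720, n_denied=100)
minority = repayment_rate_bounds(n_approved=40, n_repaid=32, n_denied=60)

majority_width = majority[1] - majority[0]
minority_width = minority[1] - minority[0]
print(majority, minority)
```

With these made-up counts the observed repayment rate is identical across groups, yet the interval of rates compatible with the data is much wider for the group that was denied more often. That gap is one simple way to see feedback uncertainty being unevenly distributed.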
If this is right
- Sequential systems can audit fairness risks by measuring how uncertainty is distributed across demographic groups rather than only tracking average outcomes.
- Uncertainty-aware exploration changes fairness metrics in simulations with varying bias levels, showing that selective feedback is not incidental noise but a driver of disparity.
- Decision makers can reduce unrealized gains and losses by explicitly modeling unobserved counterfactuals instead of relying on observed data alone.
- The taxonomy equips governance processes to distinguish incidental noise from uncertainty-driven unfairness in online applications.
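The audit idea in the first bullet can be sketched as follows: instead of comparing only average outcomes, report how many outcomes were actually observed per group and how uncertain the resulting estimate is. The log format, the function name, and the use of a binomial standard error are assumptions made for illustration, not the paper's procedure.

```python
import math
from collections import defaultdict


def uncertainty_by_group(log):
    """Summarize observed outcomes and estimate uncertainty per group.

    log: iterable of (group, approved, repaid); repaid is None when the
    applicant was denied, because the outcome was never observed.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [observed outcomes, repayments]
    for group, approved, repaid in log:
        stats = counts[group]             # registers the group even if never approved
        if approved:
            stats[0] += 1
            stats[1] += int(repaid)
    report = {}
    for group, (n, k) in counts.items():
        if n == 0:
            report[group] = {"n_observed": 0, "rate": None, "std_error": math.inf}
            continue
        p = k / n
        report[group] = {
            "n_observed": n,
            "rate": p,
            "std_error": math.sqrt(p * (1 - p) / n),  # binomial standard error
        }
    return report


# Hypothetical decision log: same observed repayment rate, very different coverage.
log = (
    [("A", True, True)] * 80 + [("A", True, False)] * 20 + [("A", False, None)] * 20
    + [("B", True, True)] * 8 + [("B", True, False)] * 2 + [("B", False, None)] * 40
)
report = uncertainty_by_group(log)
print(report)
```

In this toy log both groups show an observed rate of 0.8, so an audit of averages alone would report no difference. The standard error for group B is roughly three times that of group A, which is the kind of disparity the taxonomy is meant to surface.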
Where Pith is reading between the lines
- Extending the approach to real lending or hiring platforms would require integrating uncertainty estimates into existing reinforcement learning pipelines to test long-term access effects.
- Neighbouring problems such as dynamic pricing or content recommendation could adopt the same taxonomy to check whether selective feedback creates parallel exclusion loops.
- A testable extension is to apply the framework to recidivism or medical triage data and measure whether uncertainty-aware policies alter group-level outcome variance over multiple decision rounds.
Load-bearing premise
That the three uncertainty categories are exhaustive and non-overlapping, and that counterfactual logic plus reinforcement learning suffice to diagnose and correct the resulting harms without missing interactions.
What would settle it
A controlled simulation or real deployment in which uncertainty-aware exploration policies produce no reduction—or an increase—in outcome variance for disadvantaged groups relative to standard policies, or in which observed disparities are better explained by factors outside the three-category taxonomy.
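One way to state this test compactly, as a sketch whose notation is assumed here rather than taken from the paper: let \pi_0 be the standard policy, \pi_u the uncertainty-aware policy, O an individual's outcome, G = d the disadvantaged group, U the decision maker's utility, and \delta \ge 0 a tolerance on utility loss. The claim predicts that both conditions below hold.

```latex
\[
\operatorname{Var}_{\pi_u}\!\left[\,O \mid G = d\,\right]
  < \operatorname{Var}_{\pi_0}\!\left[\,O \mid G = d\,\right]
\qquad\text{and}\qquad
\mathbb{E}_{\pi_u}[U] \;\ge\; \mathbb{E}_{\pi_0}[U] - \delta .
\]
```

Evidence against the claim would be a setting with unevenly distributed uncertainty in which the first inequality fails, that is, the variance for the disadvantaged group is unchanged or larger under \pi_u.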
Original abstract
Fair machine learning (ML) methods help identify and mitigate the risk that algorithms encode or automate social injustices. Algorithmic approaches alone cannot resolve structural inequalities, but they can support socio-technical decision systems by surfacing discriminatory biases, clarifying trade-offs, and enabling governance. Although fairness is well studied in supervised learning, many real ML applications are online and sequential, with prior decisions informing future ones. Each decision is taken under uncertainty due to unobserved counterfactuals and finite samples, with dire consequences for under-represented groups, systematically under-observed due to historical exclusion and selective feedback. A bank cannot know whether a denied loan would have been repaid, and may have less data on marginalized populations. This paper introduces a taxonomy of uncertainty in sequential decision-making -- model, feedback, and prediction uncertainty -- providing shared vocabulary for assessing systems where uncertainty is unevenly distributed across groups. We formalize model and feedback uncertainty via counterfactual logic and reinforcement learning, and illustrate harms to decision makers (unrealized gains/losses) and subjects (compounding exclusion, reduced access) of policies that ignore the unobserved space. Algorithmic examples show it is possible to reduce outcome variance for disadvantaged groups while preserving institutional objectives (e.g. expected utility). Experiments on data simulated with varying bias show how unequal uncertainty and selective feedback produce disparities, and how uncertainty-aware exploration alters fairness metrics. The framework equips practitioners to diagnose, audit, and govern fairness risks. Where uncertainty drives unfairness rather than incidental noise, accounting for it is essential to fair and effective decision-making.
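The bank example in the abstract can be turned into a toy simulation. The sketch below is not the paper's experiment: the group names, rates, prior counts, threshold, and the upper-confidence-bound (UCB) style bonus are all assumptions chosen to show how selective feedback can lock a group out and how an uncertainty-aware rule can undo that.

```python
import math
import random


def simulate(policy, rounds=2000, seed=0):
    """Run a toy lending loop; policy is "greedy" or "ucb" (hypothetical names)."""
    rng = random.Random(seed)
    true_rate = {"majority": 0.8, "minority": 0.8}  # equal true repayment rates
    # Prior record per group: [observed outcomes, repayments]. The minority
    # starts with few, unluckily poor observations (3 of 5 repaid).
    stats = {"majority": [200, 160], "minority": [5, 3]}
    approvals = {"majority": 0, "minority": 0}
    threshold = 0.7  # approve when the score reaches this value
    for t in range(rounds):
        group = "majority" if rng.random() < 0.8 else "minority"
        n, k = stats[group]
        score = k / n  # estimated repayment rate from observed outcomes only
        if policy == "ucb":
            # Optimism bonus that shrinks as the group accumulates observations.
            score += math.sqrt(2 * math.log(t + 2) / n)
        if score >= threshold:
            approvals[group] += 1
            stats[group][0] += 1  # the outcome is observed only after approval
            stats[group][1] += int(rng.random() < true_rate[group])
    estimates = {g: k / n for g, (n, k) in stats.items()}
    return approvals, estimates


greedy_approvals, greedy_estimates = simulate("greedy")
ucb_approvals, ucb_estimates = simulate("ucb")
print("greedy:", greedy_approvals, greedy_estimates)
print("ucb:   ", ucb_approvals, ucb_estimates)
```

Under the greedy rule the minority estimate starts at 0.6, below the 0.7 threshold, so no minority applicant is approved and no new outcome is ever observed. The estimate therefore stays at 0.6 even though the true rate is 0.8. The bonus in the "ucb" variant is large when a group has few observations, which forces early approvals and gives the estimate a chance to move toward the true rate.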
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a taxonomy of uncertainty in sequential decision-making consisting of model, feedback, and prediction uncertainty. It formalizes model and feedback uncertainty via counterfactual logic and reinforcement learning, illustrates harms to decision makers and subjects from policies that ignore unobserved outcomes, provides algorithmic examples demonstrating that outcome variance for disadvantaged groups can be reduced while preserving institutional objectives such as expected utility, and reports experiments on simulated data with varying bias showing how unequal uncertainty and selective feedback produce disparities and how uncertainty-aware exploration alters fairness metrics. The framework is intended to equip practitioners to diagnose, audit, and govern fairness risks where uncertainty is unevenly distributed.
Significance. If the taxonomy and formalizations hold, the work supplies a shared vocabulary and diagnostic lens for fairness issues that arise specifically in online and sequential settings, an area where supervised-learning fairness methods are known to be insufficient. The constructed algorithmic examples are a strength, as they demonstrate in principle that variance reduction for disadvantaged groups need not trade off against expected utility. The simulations, though illustrative, highlight compounding effects of selective feedback. These elements together could support more targeted fairness audits in high-stakes sequential domains.
minor comments (4)
- [Abstract] The claim that 'experiments on data simulated with varying bias show how unequal uncertainty and selective feedback produce disparities' is stated without any quantitative metrics, sample sizes, bias parameters, or specific fairness measures (e.g., demographic parity, equalized odds); adding these details would allow readers to assess the magnitude and robustness of the reported alterations.
- [Formalization] The formalization section references counterfactual logic and reinforcement learning for model and feedback uncertainty but does not display the key definitions or equations; including them (even if standard) would clarify how the taxonomy maps onto existing tools and avoid reliance on high-level description alone.
- [Algorithmic examples] The statement that variance reduction is achieved 'while preserving institutional objectives' would be strengthened by an explicit statement of the objective function or constraint used in each example, so that readers can verify the claimed lack of trade-off.
- [Taxonomy] The manuscript would benefit from a short discussion of potential overlaps or gaps between the three uncertainty categories (model, feedback, prediction), even if only to note that the taxonomy is intended as a practical rather than exhaustive partition.
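For readers unfamiliar with the fairness measures named in the first comment, the sketch below computes a demographic parity gap and an equalized odds gap from a list of decisions. The record format, function names, and data are hypothetical. Note that equalized odds needs the true outcome for every individual, which under selective feedback is only available in simulation.

```python
def rate(flags):
    """Mean of a list of 0/1 flags; NaN when the list is empty."""
    return sum(flags) / len(flags) if flags else float("nan")


def demographic_parity_gap(records, a, b):
    """Absolute difference in approval rates between groups a and b.

    records: list of (group, decision, outcome) with 0/1 decision and outcome.
    """
    def selection_rate(g):
        return rate([d for grp, d, _ in records if grp == g])

    return abs(selection_rate(a) - selection_rate(b))


def equalized_odds_gap(records, a, b):
    """Larger of the true-positive-rate and false-positive-rate gaps."""
    def conditional_rate(g, y):
        return rate([d for grp, d, out in records if grp == g and out == y])

    tpr_gap = abs(conditional_rate(a, 1) - conditional_rate(b, 1))
    fpr_gap = abs(conditional_rate(a, 0) - conditional_rate(b, 0))
    return max(tpr_gap, fpr_gap)


# Hypothetical simulated records: (group, decision, true outcome).
records = (
    [("A", 1, 1)] * 6 + [("A", 0, 1)] * 2 + [("A", 1, 0)] * 1 + [("A", 0, 0)] * 1
    + [("B", 1, 1)] * 3 + [("B", 0, 1)] * 5 + [("B", 0, 0)] * 2
)
dp_gap = demographic_parity_gap(records, "A", "B")
eo_gap = equalized_odds_gap(records, "A", "B")
print(dp_gap, eo_gap)
```

Reporting numbers of this kind alongside the bias parameters and sample sizes would address the referee's request for quantitative detail.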
Simulated Author's Rebuttal
We thank the referee for the positive summary of the manuscript and the recommendation for minor revision. We are pleased that the taxonomy of uncertainty, the formalizations via counterfactual logic and reinforcement learning, the algorithmic examples showing variance reduction without sacrificing expected utility, and the simulations on selective feedback were viewed as strengths that could support targeted fairness audits in sequential domains.
Circularity Check
No significant circularity
The paper presents a conceptual taxonomy of uncertainty (model, feedback, prediction) in sequential decisions, formalized using standard counterfactual logic and reinforcement learning concepts. It illustrates harms via examples and reports experiments on simulated data with varying bias. No load-bearing derivation, prediction, or result is shown to reduce by construction to fitted parameters, self-citations, or ansatzes that presuppose the target fairness outcomes. The central contribution is the taxonomy and diagnostic framing itself, which does not rely on equations that equate outputs to inputs by definition. This is a standard non-circular framework paper.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Counterfactual logic and reinforcement learning provide an adequate formalization of model and feedback uncertainty in sequential decisions.
invented entities (1)
- Taxonomy of model, feedback, and prediction uncertainty (no independent evidence)
A Synthetic data
The underlying phenomenon takes the following form: $Y = f(X) + \varepsilon$, $X_{\text{obs}} = g(X)$, $Y_{\text{obs}} = h(Y)$, where $X$ denotes t...