Price of Fairness in Short-Term and Long-Term Algorithmic Selections
Pith reviewed 2026-05-08 09:59 UTC · model grok-4.3
The pith
Short-term fairness constraints in repeated selections can cause large utility losses even when groups start nearly identical, while simple long-term investment policies can remove disparities at low cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In a stylized model of repeated individual selections that influence population distributions, optimal short-term fair policies can incur a high Price of Fairness even when initial group distributions are nearly identical. In contrast, simple investment policies can make long-term group disparities vanish while maintaining a low Price of Fairness.
What carries the argument
The Price of Fairness (PoF) ratio applied separately to short-term and long-term group fairness definitions inside a sequential selection model that updates population distributions after each round.
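The PoF ratio itself is not spelled out on this page. One common convention in the fairness literature normalizes the best fair utility by the unconstrained optimum; a minimal sketch of that convention (the paper's exact definition may differ):

```python
def price_of_fairness(opt_utility: float, fair_opt_utility: float) -> float:
    """One common convention: PoF = 1 - Fair-OPT / OPT, the fraction of
    achievable utility given up by restricting to fair policies.
    0 means fairness is free; values near 1 mean it is very costly."""
    if opt_utility <= 0:
        raise ValueError("OPT must be positive for the ratio to be meaningful")
    return 1.0 - fair_opt_utility / opt_utility

# Best unconstrained policy earns 10.0; best fair policy earns 7.5.
print(price_of_fairness(10.0, 7.5))  # → 0.25
```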
If this is right
- Short-term fairness rules can force decision-makers to accept substantially lower total utility.
- Long-term group disparities are not fixed and can be reduced by policies that allocate opportunities with future distributions in mind.
- The same fairness target produces very different utility costs depending on whether it is enforced only in the current period or across multiple periods.
- Investment-style selection rules achieve both low Price of Fairness and vanishing long-run disparities in the model.
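A toy simulation (my own construction, not the paper's model) of the qualitative pattern in the last two bullets: a myopic policy maximizes utility but lets the group gap grow, while an investment-style policy that deliberately selects the lagging group narrows the gap at a modest utility cost.

```python
import random

def simulate(rounds: int, invest: float, seed: int = 0):
    """Toy dynamics: two groups with mean qualifications `means`; each
    round one group is selected, utility accrues equal to its current
    mean, and selection raises the selected group's mean by 0.01.
    `invest` is the probability of picking the currently worse-off group."""
    rng = random.Random(seed)
    means = [1.0, 0.8]  # nearly identical starting groups
    utility = 0.0
    for _ in range(rounds):
        lo, hi = (0, 1) if means[0] < means[1] else (1, 0)
        pick = lo if rng.random() < invest else hi
        utility += means[pick]
        means[pick] += 0.01  # selection improves the selected group
    return utility, abs(means[0] - means[1])

u_myopic, gap_myopic = simulate(200, invest=0.0)   # always pick the leader
u_invest, gap_invest = simulate(200, invest=0.7)   # favor the lagging group
# The investing run ends with a much smaller gap and a somewhat lower utility.
```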
Where Pith is reading between the lines
- Practitioners may want to measure how current selections shift group compositions before locking in short-term fairness rules.
- The framework suggests testing whether real hiring or lending systems display the same short-term cost versus long-term gain pattern.
- Extending the model to include noisy observations of group membership could show how robust the low long-term PoF result remains.
Load-bearing premise
The analysis rests on a simplified model of how selections change future population distributions together with particular definitions of short-term and long-term group fairness.
What would settle it
A real data set in which optimal short-term fair policies consistently show a low Price of Fairness despite nearly identical groups, or in which long-term investment policies fail to shrink disparities, would contradict the central claims.
Figures
Original abstract
Algorithmic decision-making in high-stakes settings can have profound impacts on individuals and populations. While much prior work studies fairness in static settings, recent results show that enforcing static fairness constraints may exacerbate long-run disparities. Motivated by this tension, we study a stylized sequential selection problem in which a decision-maker repeatedly selects individuals, affecting both immediate utility and the population distribution over time. We introduce notions of group fairness for both the short and long term and theoretically analyze the trade-off between fairness and utility via the Price of Fairness (PoF). We characterize optimal and fair policies in the short term and show that the PoF can be large even when group distributions are nearly identical. In contrast, we show that long-term disparities can vanish under simple investment policies that achieve a low PoF. We also empirically validate these theoretical observations using both synthetic and real datasets.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies a stylized sequential selection problem in which a decision-maker repeatedly selects individuals from groups, affecting both immediate utility and future population distributions via a dynamical model. It defines notions of short-term and long-term group fairness, characterizes optimal and fair policies in the short term, proves that the Price of Fairness (PoF) can be large even when group distributions are nearly identical, and shows that long-term disparities can vanish under simple investment policies that achieve low PoF. These theoretical results are empirically validated on synthetic and real datasets.
Significance. If the derivations hold, the work offers a precise theoretical treatment of fairness-utility trade-offs in dynamic settings, distinguishing short-term costs from long-term benefits. The explicit policy constructions, fully specified transition rules, and empirical validation on both synthetic and real data strengthen the contribution; the vanishing-disparity result under low-PoF policies is particularly noteworthy for its potential policy relevance.
major comments (2)
- [§3.2] The short-term PoF characterization: the proof that PoF remains large even for nearly identical distributions (e.g., when the Wasserstein distance between group distributions approaches zero) relies on the specific form of the selection utility and the fairness constraint; it is unclear whether this holds under alternative utility functions or when the short-term horizon includes even one transition step from Eq. (2).
- [§4.2] The long-term vanishing result: the claim that disparities vanish under the proposed investment policy with low PoF is shown for the given transition kernel, but the analysis does not address the sensitivity to the investment cost parameter; if this cost exceeds a threshold (implicit in the proof), the low-PoF guarantee may fail while still satisfying long-term fairness.
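The mechanism behind the first comment can be seen in a deliberately crude quota example (my construction, not the paper's; the paper's stronger claim is that PoF stays large even as the two distributions converge, which needs its specific adversarial construction). Equal per-group selection counts force picks from one group's lower tail:

```python
# Hypothetical values; the equal-count quota stands in for the fairness constraint.
a_vals = [1.0] * 10             # group A: ten individuals of value 1.0
b_vals = [1.0] * 6 + [0.1] * 4  # group B: slightly thinner upper tail
k = 16                          # total number of selections

# Unconstrained optimum: the k highest values overall.
opt = sum(sorted(a_vals + b_vals, reverse=True)[:k])

# "Fair" policy: equal counts per group, forcing two picks from B's lower tail.
fair = (sum(sorted(a_vals, reverse=True)[:k // 2])
        + sum(sorted(b_vals, reverse=True)[:k // 2]))

pof = 1 - fair / opt  # ≈ 0.11 here; shrinking the tail gap shrinks this PoF,
                      # which is why the near-identical case is nontrivial.
```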
minor comments (3)
- [§2] The notation for group distributions and transition probabilities is introduced in §2 but reused with slight variations in later sections; a single consolidated table of symbols would improve readability.
- [Experiments] In the experimental section, the real-world dataset is referenced only by name without detailing preprocessing steps or exact fairness metric implementations; this should be expanded for reproducibility.
- [Figure 3] Figure 3 caption does not specify the exact parameter values used for the 'nearly identical' distribution case, making it hard to verify the visual claim against the theorem statement.
Simulated Author's Rebuttal
We thank the referee for the positive assessment and the recommendation for minor revision. The comments help clarify the scope of our results. We address each major comment below and will update the manuscript accordingly.
Point-by-point responses
- Referee: [§3.2] The short-term PoF characterization: the proof that PoF remains large even for nearly identical distributions (e.g., when the Wasserstein distance between group distributions approaches zero) relies on the specific form of the selection utility and the fairness constraint; it is unclear whether this holds under alternative utility functions or when the short-term horizon includes even one transition step from Eq. (2).
Authors: We agree that the short-term PoF lower bound is derived for the linear utility u(x) = x and the specific group fairness constraint. This utility is standard for modeling selection quality in the literature, and the proof exploits the fact that fairness forces selection from the lower tail even when distributions are close in Wasserstein distance. We will add a remark in §3.2 noting that the result extends qualitatively to any strictly increasing utility (via a similar adversarial construction) but does not claim universality for arbitrary utilities. Regarding the horizon, short-term fairness is defined to apply only to the immediate selection step; the one-step transition in Eq. (2) is used only to define the long-term dynamics. We will explicitly state this separation in the revised text to avoid ambiguity. revision: partial
- Referee: [§4.2] The long-term vanishing result: the claim that disparities vanish under the proposed investment policy with low PoF is shown for the given transition kernel, but the analysis does not address the sensitivity to the investment cost parameter; if this cost exceeds a threshold (implicit in the proof), the low-PoF guarantee may fail while still satisfying long-term fairness.
Authors: We thank the referee for this observation. The vanishing-disparity result and the accompanying low-PoF bound are proven for the given transition kernel when the per-step investment cost lies below an implicit threshold that keeps the policy both fair and near-optimal. We will revise §4.2 to make this threshold explicit, derive the condition under which the low-PoF guarantee continues to hold, and add a short sensitivity analysis showing how PoF scales with the cost parameter. This addition will clarify the regime in which the policy remains attractive. revision: yes
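The threshold behavior the rebuttal promises to make explicit can be sketched generically (all numbers here are hypothetical; the paper's threshold comes out of its proof, not this arithmetic): with a per-round investment cost c subtracted from the fair policy's utility, PoF grows linearly in c, so a fixed low-PoF budget is exhausted at a definite cost threshold.

```python
def pof_at_cost(c: float, base: float = 1.0, invest_value: float = 0.9) -> float:
    """Hypothetical sensitivity sketch: the myopic optimum earns `base` per
    round and the investing policy earns `invest_value - c`, so the resulting
    PoF = 1 - (invest_value - c) / base is linear in the cost c."""
    return 1.0 - (invest_value - c) / base

# A low-PoF budget of 0.2 holds only while c <= 0.1 in this parameterization.
```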
Circularity Check
No significant circularity in derivation chain
Full rationale
The paper defines a fully specified stylized sequential selection model with explicit transition rules for how selections and investments update group distributions. Short-term optimal and fair policies are characterized directly from the model's utility and fairness definitions, and the PoF bounds follow from these characterizations without reducing to fitted parameters or self-referential constructions. Long-term vanishing-disparity results are obtained from simple investment policies constructed within the same model. No load-bearing self-citations, ansatzes smuggled via prior work, or renamings of known results appear in the derivation steps; all claims remain self-contained against the stated assumptions and explicit constructions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: a stylized model of how selections affect future group distributions