GESD: Beyond Outcome-Oriented Fairness

Gideon Popoola; John Sheppard

arxiv: 2605.15295 · v1 · pith:OKMTJRGFnew · submitted 2026-05-14 · 💻 cs.LG · cs.AI · cs.CY

Pith reviewed 2026-05-19 16:26 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CY

keywords procedural fairness · explainable AI · machine learning fairness · explanation stability · group disparity · multi-objective optimization · model-agnostic metrics

The pith

GESD measures fairness by tracking how consistently machine learning models explain their predictions across demographic subgroups.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Group-level Explanation Stability Disparity (GESD) as a metric focused on procedural fairness rather than just final outcomes. It quantifies differences in the stability, robustness, and sensitivity of explanations generated for different protected groups. The authors embed GESD into a multi-objective optimization called FEU that balances model utility, traditional outcome fairness, and this new explanation fairness. A sympathetic reader would value this because high-stakes systems like loan or hiring models could then be audited for whether their reasoning processes treat groups equitably, not only whether the results match.

Core claim

GESD is an explainer-agnostic and model-agnostic metric that computes group-wise disparities in explanation stability, robustness, and sensitivity for a protected category. When incorporated into the FEU optimization framework, it jointly improves utility, outcome-based fairness, and explanation-based fairness on benchmark datasets, extending fairness analysis from decisions alone to the underlying explanations.

What carries the argument

Group-level Explanation Stability Disparity (GESD), which aggregates differences in explanation quality metrics (stability, robustness, sensitivity) across subgroups of a protected attribute.
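The text available here does not give the paper's formula for GESD, so the following is only a minimal sketch of the general idea: score how much each explanation moves under small input perturbations, average that score per group, and report the gap between groups. The function names, the Gaussian noise model, the L2 distance, and the max-minus-min aggregation are all assumptions made for illustration. The sketch covers a single instability score rather than the three properties (stability, robustness, sensitivity) the paper names.

```python
import numpy as np

def explanation_instability(explain, X, sigma=0.1, n_perturb=20, seed=0):
    """Mean L2 distance between each point's explanation and the
    explanations of noisy copies of that point (assumed stability proxy)."""
    rng = np.random.default_rng(seed)
    base = explain(X)                      # shape (n_samples, n_features)
    total = np.zeros(len(X))
    for _ in range(n_perturb):
        noisy = X + rng.normal(0.0, sigma, size=X.shape)
        total += np.linalg.norm(explain(noisy) - base, axis=1)
    return total / n_perturb               # shape (n_samples,)

def gesd_sketch(explain, X, groups, **kwargs):
    """Gap between the most and least unstable group (hypothetical
    aggregation; the paper's own aggregation may differ)."""
    scores = explanation_instability(explain, X, **kwargs)
    group_means = {g.item(): float(scores[groups == g].mean())
                   for g in np.unique(groups)}
    gap = max(group_means.values()) - min(group_means.values())
    return gap, group_means

# Toy explainer: gradient of a logistic model f(x) = sigmoid(w . x).
w = np.array([1.5, -2.0])

def gradient_explainer(X):
    s = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (s * (1.0 - s))[:, None] * w

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
groups = np.repeat([0, 1], 100)            # two protected subgroups
gap, per_group = gesd_sketch(gradient_explainer, X, groups)
print(gap, per_group)
```

Because `explain` is passed in as a plain callable, the sketch is agnostic to both the model and the explainer in the same sense the core claim describes; any attribution method that returns one vector per input could be substituted.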

If this is right

Practitioners can now diagnose bias in the reasoning steps of a model rather than only in its final classifications.
Optimization routines can be extended to penalize large explanation disparities without sacrificing predictive accuracy.
Audits of deployed systems gain a concrete, quantitative handle on whether explanations themselves are equitable.
The same framework can be applied to any black-box model and any post-hoc explainer.
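To make the three-way trade-off concrete, the sketch below filters a set of candidate models down to the non-dominated ones when every objective is minimised. This page does not state which optimizer FEU uses (the reference list includes NSGA-II and NSGA-III), so this is a plain Pareto-dominance filter under assumed objectives, not the authors' procedure. The candidate values are made up for illustration.

```python
import numpy as np

def pareto_front(objectives):
    """Return indices of non-dominated rows; every column is minimised."""
    F = np.asarray(objectives, dtype=float)
    keep = []
    for i in range(len(F)):
        dominated = False
        for j in range(len(F)):
            # Row j dominates row i if it is no worse everywhere
            # and strictly better on at least one objective.
            if j != i and np.all(F[j] <= F[i]) and np.any(F[j] < F[i]):
                dominated = True
                break
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical candidates:
# (1 - accuracy, outcome disparity, explanation disparity)
candidates = [
    (0.20, 0.10, 0.05),
    (0.18, 0.12, 0.04),
    (0.25, 0.15, 0.09),  # worse than the first row on every objective
    (0.22, 0.05, 0.06),
]
print(pareto_front(candidates))  # → [0, 1, 3]
```

Only the third candidate is discarded; the other three each win on at least one objective, which is why a multi-objective method returns a set of trade-offs rather than a single best model.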

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If GESD becomes standard, regulators could require explanation-stability reports alongside accuracy and demographic-parity checks.
The metric might surface new failure modes where a model achieves outcome fairness only by giving unstable or contradictory reasons to one group.
Extending GESD to longitudinal settings could reveal whether explanation fairness drifts over time as models are retrained.

Load-bearing premise

Disparities in how stably, robustly, and sensitively a model explains its outputs for different groups serve as a direct signal of procedural unfairness.

What would settle it

A controlled study in which models with large measured GESD values produce explanations that human experts rate as equally trustworthy and consistent across groups, while models with low GESD show no such advantage.

Figures

Figures reproduced from arXiv: 2605.15295 by Gideon Popoola, John Sheppard.

**Figure 3.** FEU Stability on Recidivism Dataset. It appears the results favor EOD over DP. This is due to an inherent trade-off between different fairness metrics and the difficulty in optimizing multiple fairness metrics. Overall, however, the outcome-oriented results show less bias on both metrics on all the datasets with our method. For the procedural-oriented metrics, FEU yields a higher (but statistically insignif…

**Figure 2.** Reweighing Stability on Recidivism Dataset

Original abstract

Machine learning (ML) algorithms are increasingly deployed in high-stakes decision-making domains such as loan approvals, hiring, and recidivism predictions. While existing fairness metrics (e.g., statistical parity, equal opportunity) effectively quantify outcome-oriented disparities, they offer limited insight into the procedure or explanation behind biased decisions. To address this gap, we propose Group-level Explanation Stability Disparity (GESD), a procedural-oriented fairness metric that measures disparities in the stability, robustness, and sensitivity of model explanations across different subgroups in a protected category. GESD is explainer-agnostic, model-agnostic, and extends the scope of fairness analyses to the level of explainability. We further integrate GESD into a multi-objective optimization framework that jointly optimizes for utility, outcome-based fairness, and explanation-based fairness called FEU (Fairness–Explainability–Utility). Empirical results on multiple benchmark datasets show that GESD effectively captures group-wise discrepancies in explanation quality, and that FEU improves both utility and fairness over state-of-the-art methods. By bridging outcome-based and explanation-based fairness, GESD offers a comprehensive tool for diagnosing and mitigating bias in predictive modeling. Our code and datasets are available on GitHub: https://github.com/horlahsunbo/GESD
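The two outcome-oriented metrics the abstract names can be written down in a few lines. The sketch below uses their standard textbook forms (difference in positive-prediction rates for statistical parity, difference in true-positive rates for equal opportunity); the function names and the toy arrays are illustrative assumptions, and the paper's own implementation may differ in details such as sign or normalisation.

```python
import numpy as np

def demographic_parity_difference(y_pred, groups):
    """Largest gap in positive-prediction rate between groups."""
    rates = [y_pred[groups == g].mean() for g in np.unique(groups)]
    return float(max(rates) - min(rates))

def equal_opportunity_difference(y_true, y_pred, groups):
    """Largest gap in true-positive rate between groups.
    Assumes every group contains at least one positive label."""
    tprs = []
    for g in np.unique(groups):
        positives = (groups == g) & (y_true == 1)
        tprs.append(y_pred[positives].mean())
    return float(max(tprs) - min(tprs))

# Toy data: both groups receive positive predictions at the same rate,
# but the true positives in group 0 are recognised only half the time.
y_true = np.array([1, 1, 0, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])

print(demographic_parity_difference(y_pred, groups))         # → 0.0
print(equal_opportunity_difference(y_true, y_pred, groups))  # → 0.5
```

The toy data satisfies statistical parity exactly while violating equal opportunity, which illustrates why the two outcome metrics can pull in different directions when optimised together.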

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

GESD adds a stability-based fairness metric but the central claim looks vulnerable to subgroup distribution differences driving the results.

The main thing to know is that this paper defines GESD as a metric for group differences in explanation stability, robustness, and sensitivity, then folds it into the FEU multi-objective optimizer that also tracks utility and standard outcome fairness. They present it as explainer- and model-agnostic and release code on GitHub, which is useful for anyone who wants to inspect or reuse the implementation. The motivation is clear: outcome metrics miss whether the reasoning behind decisions is consistent across groups, and this tries to fill that gap with something procedural. If the experiments actually isolate explanation quality, the framework could give practitioners a concrete way to trade off the three objectives on benchmark data.

The soft spot is the stress-test concern. Stability measures usually involve input perturbations whose scale depends on feature variances and densities. When protected subgroups have different marginal distributions, the same procedure will produce systematically different explanation variability even if the model and explainer are fixed. The abstract asserts model-agnosticism and effective capture of explanation discrepancies without describing an explicit correction or matching step for distribution shift. If the full paper does not include ablations that hold distributions constant or normalize the perturbations, the empirical claim weakens because part of what GESD measures may simply be data heterogeneity. The summary also gives no numbers, error bars, or baseline details, so the reported improvements over prior methods are hard to gauge from the available text.

This paper is for the fairness-plus-XAI crowd who already work with explanation tools and want to extend group fairness beyond outcomes. A reader comfortable with stability metrics would get the most value and could test the confound themselves. It deserves peer review so referees can check the experimental controls and see whether the gains survive proper distribution adjustments.
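The confound described in the letter can be made concrete with a small check: express a single, uniform noise scale in units of each group's own feature spread. The sketch below is an assumed diagnostic, not something the paper reports; the helper name, the synthetic data, and the choice of per-feature standard deviation as the measure of spread are all illustrative.

```python
import numpy as np

def relative_perturbation_size(X, groups, sigma):
    """Uniform noise scale sigma divided by each group's per-feature
    standard deviation (hypothetical diagnostic)."""
    return {g.item(): sigma / X[groups == g].std(axis=0)
            for g in np.unique(groups)}

# Synthetic data: group 0 is tightly clustered, group 1 is spread out.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.5, size=(500, 2)),
    rng.normal(0.0, 2.0, size=(500, 2)),
])
groups = np.repeat([0, 1], 500)

sizes = relative_perturbation_size(X, groups, sigma=0.1)
for g, ratio in sizes.items():
    print(g, ratio)
```

With the same `sigma`, the perturbation is a much larger fraction of the typical spread in group 0 than in group 1, so any stability score built on it compares the groups under unequal conditions before the model or explainer is even involved.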

Referee Report

2 major / 1 minor

Summary. The paper proposes Group-level Explanation Stability Disparity (GESD), a procedural-oriented fairness metric measuring disparities in explanation stability, robustness, and sensitivity across protected subgroups. It integrates GESD into the FEU multi-objective optimization framework that jointly optimizes utility, outcome-based fairness, and explanation-based fairness. The authors claim that GESD captures group-wise discrepancies in explanation quality and that FEU improves both utility and fairness over state-of-the-art methods on benchmark datasets, with code and data released on GitHub.

Significance. If the central claims hold after addressing the noted concerns, the work would be significant for extending fairness analysis beyond outcome metrics to procedural aspects of explanations in high-stakes domains. The public release of code and datasets is a clear strength supporting reproducibility.

major comments (2)

[Abstract] Abstract: The claim that 'GESD effectively captures group-wise discrepancies in explanation quality' and that 'FEU improves both utility and fairness over state-of-the-art methods' is asserted without any quantitative results, error bars, data-split details, or baseline comparisons visible in the text. This absence leaves the empirical validation of the central contribution unsupported and is load-bearing for the paper's contribution.
[Section 3] Section 3 (GESD definition): GESD is defined directly via disparities in stability, robustness, and sensitivity, which are typically computed through input perturbations or sampling sensitive to feature statistics. No explicit correction for inter-group marginal distribution differences (e.g., variances or supports) is described, so observed disparities may reflect data heterogeneity rather than explanation quality. This assumption is load-bearing for the explainer- and model-agnostic claims and the assertion that GESD isolates procedural fairness.

minor comments (1)

[Abstract] Abstract: The GitHub link is provided; confirm that the repository includes full experimental scripts, hyperparameter settings, and exact dataset preprocessing steps to enable reproduction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below with clarifications and planned revisions to improve the manuscript's rigor and clarity.

Referee: [Abstract] Abstract: The claim that 'GESD effectively captures group-wise discrepancies in explanation quality' and that 'FEU improves both utility and fairness over state-of-the-art methods' is asserted without any quantitative results, error bars, data-split details, or baseline comparisons visible in the text. This absence leaves the empirical validation of the central contribution unsupported and is load-bearing for the paper's contribution.

Authors: We agree that the abstract would benefit from more explicit ties to the quantitative evidence. The full manuscript reports these results in Sections 4 and 5, including tables with mean performance metrics and standard deviations across multiple data splits (e.g., 5-fold cross-validation) as well as direct comparisons to baselines. To address the concern, we will revise the abstract to include concise references to key empirical outcomes and pointers to the relevant tables and sections, making the support for the central claims more immediately visible.

revision: yes

Referee: [Section 3] Section 3 (GESD definition): GESD is defined directly via disparities in stability, robustness, and sensitivity, which are typically computed through input perturbations or sampling sensitive to feature statistics. No explicit correction for inter-group marginal distribution differences (e.g., variances or supports) is described, so observed disparities may reflect data heterogeneity rather than explanation quality. This assumption is load-bearing for the explainer- and model-agnostic claims and the assertion that GESD isolates procedural fairness.

Authors: This is a substantive point. The current GESD formulation applies uniform perturbation strategies across groups without explicit normalization for differences in feature marginals, which could allow data heterogeneity to influence the measured disparities. We will revise Section 3 to explicitly discuss this potential confounding factor, introduce a normalized variant of GESD that accounts for group-specific variances and supports, and add supporting experiments to demonstrate that the reported disparities remain after such adjustments. These changes will strengthen the justification for the procedural fairness interpretation and the model-agnostic claims.

revision: yes
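The rebuttal does not specify what the normalized variant would look like. One possible reading, sketched below purely as an assumption, is to scale the perturbation noise by each group's own per-feature standard deviation and to clip perturbed points to that group's observed range, so that variances and supports are both respected. The function names, the clipping rule, and the max-minus-min aggregation are hypothetical.

```python
import numpy as np

def group_scaled_noise(X, groups, rel_sigma=0.1, seed=0):
    """Noisy copy of X. Noise std is rel_sigma times each group's own
    per-feature std, and results are clipped to that group's range."""
    rng = np.random.default_rng(seed)
    noisy = X.astype(float).copy()
    for g in np.unique(groups):
        mask = groups == g
        Xg = noisy[mask]                       # copy of this group's rows
        scale = rel_sigma * Xg.std(axis=0)     # group-specific variance
        Xg_noisy = Xg + rng.normal(size=Xg.shape) * scale
        # Stay inside the group's observed support.
        noisy[mask] = np.clip(Xg_noisy, Xg.min(axis=0), Xg.max(axis=0))
    return noisy

def normalized_gap(explain, X, groups, rel_sigma=0.1, n_perturb=20):
    """Gap in mean explanation instability between groups, measured
    under group-scaled perturbations (hypothetical aggregation)."""
    base = explain(X)
    scores = np.zeros(len(X))
    for k in range(n_perturb):
        noisy = group_scaled_noise(X, groups, rel_sigma, seed=k)
        scores += np.linalg.norm(explain(noisy) - base, axis=1)
    scores /= n_perturb
    means = [scores[groups == g].mean() for g in np.unique(groups)]
    return float(max(means) - min(means))
```

Comparing this gap with one computed under a uniform noise scale would be one way to run the ablation the referee asks for: a disparity that disappears after group scaling points to data heterogeneity, while one that persists is harder to attribute to the marginals alone.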

Circularity Check

0 steps flagged

No significant circularity: GESD and FEU are definitional proposals evaluated empirically

The paper defines GESD directly as a metric that quantifies disparities in explanation stability, robustness, and sensitivity across protected subgroups, then embeds it in the FEU multi-objective optimizer. No equations or claims reduce a derived quantity back to a fitted input or self-referential definition by construction. Empirical results on benchmark datasets are presented as external validation rather than a closed loop. The derivation chain is therefore self-contained: the metric is introduced by definition, the framework combines it with existing utility and fairness terms, and performance is assessed on held-out data without the central claims collapsing into the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Based on abstract only; no explicit free parameters, axioms, or invented entities are detailed beyond standard assumptions in ML fairness and explainability.

pith-pipeline@v0.9.0 · 5768 in / 1207 out tokens · 74113 ms · 2026-05-19T16:26:09.804352+00:00 · methodology

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 1 internal anchor

[1] S. Barocas and A. D. Selbst, “Big data’s disparate impact,” Calif. L. Rev., vol. 104, p. 671, 2016.
[2] G. Popoola and J. Sheppard, “Investigating and mitigating the performance–fairness tradeoff via protected-category sampling,” Electronics, vol. 13, no. 15, p. 3024, 2024.
[3] F. Calmon, D. Wei, B. Vinzamuri, K. Natesan Ramamurthy, and K. R. Varshney, “Optimized pre-processing for discrimination prevention,” Advances in Neural Information Processing Systems, vol. 30, 2017.
[4] A. Chouldechova, “Fair prediction with disparate impact: A study of bias in recidivism prediction instruments,” Big Data, vol. 5, no. 2, pp. 153–163, 2017.
[5] M. Hardt, E. Price, and N. Srebro, “Equality of opportunity in supervised learning,” Advances in Neural Information Processing Systems, vol. 29, 2016.
[6] S. Corbett-Davies, E. Pierson, A. Feller, S. Goel, and A. Huq, “Algorithmic decision making and the cost of fairness,” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2017, pp. 797–806.
[7] D. Alvarez-Melis and T. S. Jaakkola, “On the robustness of interpretability methods,” arXiv preprint arXiv:1806.08049, 2018.
[8] Z. C. Lipton, “The mythos of model interpretability: In machine learning, the concept of interpretability is both important and slippery,” Queue, vol. 16, no. 3, pp. 31–57, 2018.
[9] P. A. Grabowicz, N. Perello, and A. Mishra, “Marrying fairness and explainability in supervised learning,” in Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022, pp. 1905–1916.
[10] D. Alvarez Melis and T. Jaakkola, “Towards robust interpretability with self-explaining neural networks,” Advances in Neural Information Processing Systems, vol. 31, 2018.
[11] Y. Zhao, Y. Wang, and T. Derr, “Fairness and explainability: Bridging the gap towards fair model explanations,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 9, 2023, pp. 11363–11371.
[12] Z. Wang, Q. Zeng, W. Lin, M. Jiang, and K. C. Tan, “Generating diagnostic and actionable explanations for fair graph neural networks,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 19, 2024, pp. 21690–21698.
[13] G. Montavon, W. Samek, and K.-R. Müller, “Methods for interpreting and understanding deep neural networks,” Digital Signal Processing, vol. 73, pp. 1–15, 2018.
[14] U. Bhatt, A. Weller, and J. M. Moura, “Evaluating and aggregating feature-based model explanations,” in Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence. International Joint Conferences on Artificial Intelligence Organization, 2020, pp. 3016–3022.
[15] Z. Chen, V. Subhash, M. Havasi, W. Pan, and F. Doshi-Velez, “What makes a good explanation?: A harmonized view of properties of explanations,” arXiv preprint arXiv:2211.05667, 2022.
[16] A. U. Rehman, A. Nadeem, and M. Z. Malik, “Fair feature subset selection using multiobjective genetic algorithm,” in Proceedings of the Genetic and Evolutionary Computation Conference Companion, 2022, pp. 360–363.
[17] K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Transactions on Evolutionary Computation, vol. 6, no. 2, pp. 182–197, 2002.
[18] A. Asuncion, D. Newman et al., “UCI machine learning repository,” 2007.
[19] K. L. Jordan and T. L. Freiburger, “The effect of race/ethnicity on sentencing: Examining sentence type, jail length, and prison length,” Journal of Ethnicity in Criminal Justice, vol. 13, no. 3, pp. 179–196, 2015.
[20] P. Cortez and A. M. G. Silva, “Using data mining to predict secondary school student performance,” 2008.
[21] B. H. Zhang, B. Lemoine, and M. Mitchell, “Mitigating unwanted biases with adversarial learning,” in Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, 2018, pp. 335–340.
[22] A. Agarwal, A. Beygelzimer, M. Dudík, J. Langford, and H. Wallach, “A reductions approach to fair classification,” in International Conference on Machine Learning. PMLR, 2018, pp. 60–69.
[23] F. Kamiran and T. Calders, “Data preprocessing techniques for classification without discrimination,” Knowledge and Information Systems, vol. 33, no. 1, pp. 1–33, 2012.
[24] S. Wei and M. Niethammer, “The fairness-accuracy Pareto front,” Statistical Analysis and Data Mining, vol. 15, no. 3, pp. 287–302, June 2022.
[25] D. A. Tarzanagh, B. Hou, B. Tong, Q. Long, and L. Shen, “Fairness-aware class imbalanced learning on multiple subgroups,” in Uncertainty in Artificial Intelligence. PMLR, 2023, pp. 2123–2133.
[26] J. Dai, S. Upadhyay, U. Aivodji, S. H. Bach, and H. Lakkaraju, “Fairness via explanation quality: Evaluating disparities in the quality of post hoc explanations,” in Proceedings of the 2022 AAAI/ACM Conference on AI, Ethics, and Society, 2022, pp. 203–214.
[27] D. Slack, A. Hilgard, S. Singh, and H. Lakkaraju, “Reliable post hoc explanations: Modeling uncertainty in explainability,” Advances in Neural Information Processing Systems, vol. 34, pp. 9391–9404, 2021.
[28] K. Deb and H. Jain, “An evolutionary many-objective optimization algorithm using reference-point-based nondominated sorting approach, Part I: Solving problems with box constraints,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 4, pp. 577–601, 2013.
[29] H. Jain and K. Deb, “An evolutionary many-objective optimization algorithm using reference-point based nondominated sorting approach, Part II: Handling constraints and extending to an adaptive approach,” IEEE Transactions on Evolutionary Computation, vol. 18, no. 4, pp. 602–622, 2014.
