Fairness of Classifiers in the Presence of Constraints between Features
Pith reviewed 2026-05-09 19:15 UTC · model grok-4.3
The pith
A classifier decision is fair if it has a prime-implicant explanation containing no protected feature once constraints among features are taken into account.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A decision is fair exactly when there exists a prime implicant for that decision that contains no protected feature, where the implicant is computed with respect to the given constraints on the feature values. Ignoring the constraints can reverse the fairness status of the same decision. Three classifier-level fairness properties are defined: every decision has only fair explanations, every decision has at least one fair explanation, and the outcome is invariant under changes to protected features that respect the constraints. The paper determines how these three properties relate to one another and the complexity of deciding whether a given classifier satisfies each property.
What carries the argument
The constraint-adjusted prime implicant, which acts as the minimal reason for a decision and serves as the test for whether the decision can be explained without reference to any protected feature.
If this is right
- If every prime-implicant explanation of a decision is fair, then the decision satisfies the strongest of the three classifier fairness definitions.
- A classifier may satisfy the 'at least one fair explanation' definition while failing the 'only fair explanations' definition.
- Outcome invariance to changes in protected features is related to but not identical with the explanation-based fairness notions.
- Verifying any of the three fairness properties for a given classifier belongs to a specific complexity class determined by the paper.
Where Pith is reading between the lines
- The explanation-based test could be applied to audit already-trained classifiers by computing prime implicants on their input-output behavior without retraining.
- When constraints are only partially known or must be inferred from data, the fairness verdict may become unreliable or require additional uncertainty handling.
- The same constraint-adjusted implicant idea might be transferred to other minimal-explanation frameworks such as minimal sufficient reasons or abductive explanations.
Load-bearing premise
Prime implicants constitute a faithful and adequate mechanism for explaining why the classifier reached its decision, and the relevant constraints between features are known completely and can be incorporated into the implicant computation without creating new ambiguities.
What would settle it
A small, fully specified classifier together with a set of feature constraints for which there exists a decision that has a protected-feature-free prime implicant under the constrained definition, yet altering the protected features while obeying the constraints changes the classifier output.
Figures
read the original abstract
In Machine Learning, an accepted definition of fairness of a decision taken by a classifier is that it should not depend on protected features, such as gender. Unfortunately, when constraints exist between features, such dependencies can be obscured by the constraints. To avoid this problem, we propose that a decision be considered fair if it has a fair explanation. We define a fair explanation as a prime-implicant reason for the decision that does not contain any protected feature (where the constraints are taken into account in the definition of prime-implicant). Surprisingly, ignoring constraints can completely change the fairness of a decision (according to this definition) even in the absence of constraints between protected and unprotected features. Three possible definitions of fairness of a classifier are that for all its decisions (1) there are only fair explanations, (2) there is at least one fair explanation, or (3) changing protected features does not change the outcome. We identify the relationships between these different definitions of fairness and study the computational complexity of testing fairness of classifiers.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes that a decision by a classifier is fair if it admits a fair explanation, defined as a prime implicant for the decision that excludes all protected features once inter-feature constraints are incorporated into the implicant computation. It shows that ignoring constraints can reverse a fairness verdict even when no constraints directly link protected and unprotected features. Three classifier-level fairness notions are introduced—universal fair explanations, existential fair explanations, and invariance of the outcome to changes in protected features—and the logical relationships among them are derived together with complexity results for verifying each notion.
Significance. If the derivations and complexity bounds hold, the work supplies a logically grounded, constraint-sensitive refinement of explanation-based fairness that addresses a common limitation in existing definitions. The observation that constraints alone can flip fairness verdicts is a useful theoretical insight with potential implications for auditing systems in domains with known feature dependencies. The explicit relationships among the three fairness notions and the complexity classification provide a clear map for future algorithmic work. The manuscript's definitional clarity and focus on prime implicants as the explanatory primitive are strengths, though practical adoption will depend on the availability of accurate constraint models.
major comments (1)
- The central claim that ignoring constraints can alter fairness even without protected-unprotected constraints rests on how constraints prune the space of valid assignments and thereby change the set of prime implicants. The manuscript should supply an explicit small example (with the constraint set, the classifier function, and the two prime-implicant sets) immediately after the definition of fair explanation so that readers can verify the flip without reconstructing the argument.
minor comments (2)
- Notation for prime implicants under constraints should be introduced once and used consistently; currently the abstract and later sections appear to overload the same symbol for the unconstrained and constrained cases.
- The complexity results would be easier to compare if a single table listed the three fairness notions together with their complexity classes and the proof techniques used.
Simulated Author's Rebuttal
We thank the referee for the careful reading, positive assessment of the work, and the constructive suggestion to improve clarity. We address the single major comment below and will revise the manuscript to incorporate the requested example.
read point-by-point responses
-
Referee: The central claim that ignoring constraints can alter fairness even without protected-unprotected constraints rests on how constraints prune the space of valid assignments and thereby change the set of prime implicants. The manuscript should supply an explicit small example (with the constraint set, the classifier function, and the two prime-implicant sets) immediately after the definition of fair explanation so that readers can verify the flip without reconstructing the argument.
Authors: We agree that an explicit example placed immediately after the definition of fair explanation would help readers verify the claim without having to reconstruct the argument from later sections. In the revised manuscript we will insert a small, self-contained example at that location. The example will include: (i) a concrete constraint set over the features, (ii) a simple classifier function, (iii) the set of prime implicants computed while respecting the constraints, and (iv) the set of prime implicants obtained when constraints are ignored. This will demonstrate that the fairness verdict for a given decision can reverse even though no constraint directly links a protected feature to an unprotected one. revision: yes
Circularity Check
No significant circularity
full rationale
The paper introduces three definitions of classifier fairness grounded in prime-implicant explanations that exclude protected features once constraints are folded into the implicant computation. The relationships among these definitions and the complexity results for testing them are direct logical consequences of the formulation; no equation or claim reduces by construction to a fitted parameter, a self-citation chain, or an ansatz imported from prior work by the same authors. The claimed effect of constraints on fairness is shown by explicit example and follows from the pruning of valid assignments, without any definitional loop or renaming of known results.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Prime implicants provide a suitable minimal explanation for a classifier decision
- domain assumption All relevant constraints between features are known and can be incorporated into the definition of prime implicants
Reference graph
Works this paper leans on
-
[1]
Axiomatic foundations of explainability
Leila Amgoud and Jonathan Ben-Naim. Axiomatic foundations of explainability. In Luc De Raedt, ed- itor,IJCAI, pages 636–642. ijcai.org, 2022.doi: 10.24963/IJCAI.2022/90
-
[2]
Cambridge University Press, 2009
Sanjeev Arora and Boaz Barak.Compu- tational Complexity - A Modern Approach. Cambridge University Press, 2009. URL: http://www.cambridge.org/catalogue/ catalogue.asp?isbn=9780521424264
work page 2009
-
[3]
On the computational intelligibility of boolean classifiers
Gilles Audemard, Steve Bellart, Louenas Bounia, Frédéric Koriche, Jean-Marie Lagniez, and Pierre Marquis. On the computational intelligibility of boolean classifiers. In Meghyn Bienvenu, Gerhard Lakemeyer, and Esra Erdem, editors,KR, pages 74– 86, 2021.doi:10.24963/KR.2021/8
-
[4]
Deriving provably correct explanations for decision trees: The impact of domain theories
Gilles Audemard, Jean-Marie Lagniez, Pierre Mar- quis, and Nicolas Szczepanski. Deriving provably correct explanations for decision trees: The impact of domain theories. InIJCAI, pages 3688–3696. ij- cai.org, 2024. URL:https://www.ijcai.org/ proceedings/2024/408
work page 2024
-
[5]
Model interpretability through the lens of computational complexity
Pablo Barceló, Mikaël Monet, Jorge Pérez, and Bernardo Subercaseaux. Model interpretability through the lens of computational complexity. In Hugo Larochelle, Marc’Aurelio Ranzato, Raia Had- sell, Maria-Florina Balcan, and Hsuan-Tien Lin, edi- tors,NeurIPS, 2020
work page 2020
- [6]
-
[7]
Richard Berk, Hoda Heidari, Shahin Jabbari, Michael Kearns, and Aaron Roth. Fairness in criminal justice risk assessments: The state of the art.Sociological Methods & Research, 50(1):3–44, 2021
work page 2021
-
[8]
Fairness in machine learning: Lessons from political philosophy
Reuben Binns. Fairness in machine learning: Lessons from political philosophy. In Sorelle A. Friedler and Christo Wilson, editors,Conference on Fairness, Ac- countability and Transparency, FAT, volume 81 of Proceedings of Machine Learning Research, pages 149–159. PMLR, 2018
work page 2018
-
[9]
Alexandra Chouldechova. Fair prediction with dis- parate impact: A study of bias in recidivism prediction instruments.Big Data, 5(2):153–163, 2017
work page 2017
-
[10]
Martin C. Cooper and Leila Amgoud. Abductive ex- planations of classifiers under constraints: Complex- ity and properties. In Kobi Gal, Ann Nowé, Grze- gorz J. Nalepa, Roy Fairstein, and Roxana Radulescu, editors,ECAI, pages 469–476. IOS Press, 2023
work page 2023
-
[11]
Martin C. Cooper and João Marques-Silva. On the tractability of explaining decisions of classi- fiers. In Laurent D. Michel, editor,CP, volume 210 ofLIPIcs, pages 21:1–21:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2021.doi:10. 4230/LIPICS.CP.2021.21
work page 2021
-
[12]
Algorithmic decision making and the cost of fairness
Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. Algorithmic decision making and the cost of fairness. InProceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 797–
-
[13]
Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard S. Zemel. Fairness through awareness. In Shafi Goldwasser, editor,Innovations in Theoretical Computer Science, pages 214–226. ACM, 2012
work page 2012
-
[14]
Fairness testing: testing software for discrimi- nation
Sainyam Galhotra, Yuriy Brun, and Alexandra Me- liou. Fairness testing: testing software for discrimi- nation. In Eric Bodden, Wilhelm Schäfer, Arie van Deursen, and Andrea Zisman, editors,Proceedings of the 2017 11th Joint Meeting on Foundations of Soft- ware Engineering, ESEC/FSE, pages 498–510. ACM, 2017
work page 2017
-
[15]
Sufficient reasons for classifier decisions in the presence of domain con- straints
Niku Gorji and Sasha Rubin. Sufficient reasons for classifier decisions in the presence of domain con- straints. InAAAI, pages 5660–5667. AAAI Press, 2022.doi:10.1609/AAAI.V36I5.20507
-
[16]
The case for process fairness in learning: Feature selection for fair decision making
Nina Grgic-Hlaca, Muhammad Bilal Zafar, Krishna P Gummadi, and Adrian Weller. The case for process fairness in learning: Feature selection for fair decision making. InNIPS Symposium on Machine Learning and the Law, 2016
work page 2016
-
[17]
Nina Grgic-Hlaca, Muhammad Bilal Zafar, Krishna P. Gummadi, and Adrian Weller. Beyond distributive fairness in algorithmic decision making: Feature se- lection for procedurally fair learning. In Sheila A. McIlraith and Kilian Q. Weinberger, editors,AAAI, pages 51–60. AAAI Press, 2018
work page 2018
-
[18]
Equality of opportunity in supervised learning
Moritz Hardt, Eric Price, and Nati Srebro. Equality of opportunity in supervised learning. In Daniel D. Lee, Masashi Sugiyama, Ulrike von Luxburg, Isabelle Guyon, and Roman Garnett, editors,NIPS, pages 3315–3323, 2016
work page 2016
-
[19]
Cooper, Mohamed Siala, Emmanuel Hebrard, and João Marques-Silva
Alexey Ignatiev, Martin C. Cooper, Mohamed Siala, Emmanuel Hebrard, and João Marques-Silva. To- wards formal fairness in machine learning. In Helmut Simonis, editor,CP, volume 12333 ofLNCS, pages 846–867. Springer, 2020
work page 2020
-
[20]
SAT-based rigorous explanations for decision lists
Alexey Ignatiev and João Marques-Silva. SAT-based rigorous explanations for decision lists. In Chu-Min Li and Felip Manyà, editors,Theory and Applications of Satisfiability Testing - SAT, volume 12831 ofLNCS, pages 251–269. Springer, 2021.doi:10.1007/ 978-3-030-80223-3\_18
work page 2021
-
[21]
On explaining random forests with SAT
Yacine Izza and João Marques-Silva. On explaining random forests with SAT. In Zhi-Hua Zhou, editor, IJCAI, pages 2584–2591. ijcai.org, 2021.doi:10. 24963/IJCAI.2021/356
work page 2021
-
[22]
Davinder Kaur, Suleyman Uslu, Kaley J. Rittichier, and Arjan Durresi. Trustworthy artificial intelligence: A review.ACM Comput. Surv., 55(2), 2022
work page 2022
-
[23]
Avoiding discrimination through causal reasoning
Niki Kilbertus, Mateo Rojas-Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, and Bernhard Schölkopf. Avoiding discrimination through causal reasoning. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fer- gus, S. V . N. Vishwanathan, and Roman Garnett, edi- tors,NIPS, pages 656–666, 2017
work page 2017
-
[24]
Kleinberg, Sendhil Mullainathan, and Man- ish Raghavan
Jon M. Kleinberg, Sendhil Mullainathan, and Man- ish Raghavan. Inherent trade-offs in the fair determi- nation of risk scores. In Christos H. Papadimitriou, editor,8th Innovations in Theoretical Computer Sci- ence Conference, ITCS, volume 67 ofLIPIcs, pages 43:1–43:23. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017
work page 2017
-
[25]
Matt J. Kusner, Joshua R. Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. In Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V . N. Vishwanathan, and Ro- man Garnett, editors,NIPS, pages 4066–4076, 2017
work page 2017
-
[26]
Delivering trustworthy AI through formal XAI
João Marques-Silva and Alexey Ignatiev. Delivering trustworthy AI through formal XAI. InAAAI, pages 12342–12350. AAAI Press, 2022.doi:10.1609/ AAAI.V36I11.21499
work page 2022
-
[27]
A survey on bias and fairness in machine learning.ACM Comput
Ninareh Mehrabi, Fred Morstatter, Nripsuta Saxena, Kristina Lerman, and Aram Galstyan. A survey on bias and fairness in machine learning.ACM Comput. Surv., 54(6), 2021
work page 2021
-
[28]
Razieh Nabi and Ilya Shpitser. Fair inference on out- comes. In Sheila A. McIlraith and Kilian Q. Wein- berger, editors,AAAI, pages 1931–1940. AAAI Press, 2018
work page 1931
-
[29]
Ethics guidelines for trustworthy AI
High-Level Expert Group on Artificial Intelligence (AI HLEG) set up by the European Commission. Ethics guidelines for trustworthy AI. https://digital- strategy.ec.europa.eu/en/library/ethics-guidelines- trustworthy-ai, 2019
work page 2019
-
[30]
Explaining decisions in ML models: A parameterized complexity analy- sis
Sebastian Ordyniak, Giacomo Paesani, Mateusz Rychlicki, and Stefan Szeider. Explaining decisions in ML models: A parameterized complexity analy- sis. In Pierre Marquis, Magdalena Ortiz, and Maurice Pagnucco, editors,KR, 2024.doi:10.24963/KR. 2024/53
work page doi:10.24963/kr 2024
-
[31]
Nripsuta Ani Saxena, Karen Huang, Evan DeFilippis, Goran Radanovic, David C. Parkes, and Yang Liu. How do fairness definitions fare?: Examining public attitudes towards algorithmic definitions of fairness. In Vincent Conitzer, Gillian K. Hadfield, and Shannon Vallor, editors,AAAI/ACM Conference on AI, Ethics, and Society, AIES, pages 99–106. ACM, 2019
work page 2019
-
[32]
Foster, Nicholas Mat- tei, and John P
Candice Schumann, Jeffrey S. Foster, Nicholas Mat- tei, and John P. Dickerson. We need fairness and ex- plainability in algorithmic hiring. In Amal El Fal- lah Seghrouchni, Gita Sukthankar, Bo An, and Neil Yorke-Smith, editors,AAMAS, pages 1716–1720. In- ternational Foundation for Autonomous Agents and Multiagent Systems, 2020
work page 2020
-
[33]
A symbolic approach to explaining bayesian net- work classifiers
Andy Shih, Arthur Choi, and Adnan Darwiche. A symbolic approach to explaining bayesian net- work classifiers. In Jérôme Lang, editor,IJCAI, pages 5103–5111. ijcai.org, 2018.doi:10.24963/ IJCAI.2018/708
work page 2018
-
[34]
Fairness definitions ex- plained
Sahil Verma and Julia Rubin. Fairness definitions ex- plained. In Yuriy Brun, Brittany Johnson, and Alexan- dra Meliou, editors,Proceedings of the International Workshop on Software Fairness, FairWare@ICSE, pages 1–7. ACM, 2018
work page 2018
-
[35]
Procedural fairness in machine learning.CoRR, abs/2404.01877, 2024
Ziming Wang, Changwu Huang, and Xin Yao. Procedural fairness in machine learning.CoRR, abs/2404.01877, 2024
-
[36]
Stuckey, Nina Narodytska, and João Marques-Silva
Jinqiang Yu, Alexey Ignatiev, Peter J. Stuckey, Nina Narodytska, and João Marques-Silva. Eliminating the impossible, whatever remains must be true: On ex- tracting and applying background knowledge in the context of formal explanations. In Brian Williams, Yiling Chen, and Jennifer Neville, editors,AAAI, pages 4123–4131. AAAI Press, 2023.doi:10. 1609/AAAI....
work page 2023
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.