arxiv: 2605.16041 · v1 · pith:LTEXOPHKnew · submitted 2026-05-15 · 📊 stat.ML · cs.LG

Explainable AI Isn't Enough! Rethinking Algorithmic Contestability

Pith reviewed 2026-05-19 19:04 UTC · model grok-4.3

classification 📊 stat.ML cs.LG
keywords algorithmic contestability · explainable AI · machine learning decisions · recourse · predictive multiplicity · feature values · overruling evidence · EU legislation

The pith

Standard XAI tools like counterfactuals only flag nearby errors and fall short of providing grounds to overturn algorithmic decisions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Machine learning systems decide on loans, hiring, and similar high-stakes matters, yet people lack clear ways to challenge mistakes. The paper defines contestability as starting from the presumption that a decision could be wrong and then gathering evidence to reverse it. Recourse, by contrast, takes the decision as given and suggests feature changes the individual could make to obtain a different outcome. The paper argues that common explanation methods reveal only inconsistencies in the neighborhood of the individual and cannot justify reversing the decision at hand on their own. It identifies three stronger forms of evidence that can show a decision is indefensible by the decision maker's own standards: predictive multiplicity, incorrect feature values, and neglected overruling evidence. It further connects the framework to existing EU legal provisions and argues that individuals already hold some legal rights to these forms of evidence.

Core claim

Contestability operates as the complement to recourse by presuming a decision may be incorrect and seeking evidence to challenge and potentially overturn it; standard XAI methods supply only neighborhood-level checks that fall short of this, whereas predictive multiplicity, incorrect feature values, and neglected overruling evidence each render the decision normatively indefensible according to the decision maker's own ethical standards.
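The neighborhood-level limitation can be made concrete with a small sketch. It is illustrative only and not taken from the paper: `quirky_model`, the income thresholds, and the helper names are hypothetical, and the monotonicity rule ("more income never hurts") is assumed to be correct. Even under that assumption, a conflict between an individual and a nearby point shows that at least one of the two decisions is wrong, without saying which one.

```python
from typing import Callable, Dict, List

Applicant = Dict[str, float]
Model = Callable[[Applicant], int]  # 1 = approve, 0 = reject


def quirky_model(x: Applicant) -> int:
    """Hypothetical non-monotone scorer: approves a narrow low-income band."""
    return int(x["income"] >= 50 or 30 <= x["income"] < 35)


def violates_monotonicity(model: Model, x: Applicant,
                          neighbor: Applicant, feature: str) -> bool:
    """True if the neighbor has less of `feature` yet gets a better outcome."""
    return neighbor[feature] < x[feature] and model(neighbor) > model(x)


def single_flip_repairs(model: Model, x: Applicant,
                        neighbor: Applicant) -> List[Dict[str, int]]:
    """Both ways to remove the conflict by changing exactly one decision."""
    return [
        # Option A: raise the individual's decision to match the neighbor's.
        {"individual": model(neighbor), "neighbor": model(neighbor)},
        # Option B: lower the neighbor's decision to match the individual's.
        {"individual": model(x), "neighbor": model(x)},
    ]


individual = {"income": 40.0}  # rejected by quirky_model
neighbor = {"income": 32.0}    # approved despite the lower income

conflict = violates_monotonicity(quirky_model, individual, neighbor, "income")
repairs = single_flip_repairs(quirky_model, individual, neighbor)
print(conflict)  # True
print(repairs)   # approve both, or reject both
```

Both repairs satisfy the monotonicity rule, so the rule alone cannot tell the rejected individual that their own decision is the erroneous one. This is the sense in which such checks locate an error in the neighborhood rather than at the decision itself.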

What carries the argument

The operational definition of contestability, which shifts focus from validating the decision to collecting evidence that demonstrates its indefensibility under the decision maker's standards.

If this is right

  • Decisions become open to reversal once predictive multiplicity is established.
  • Incorrect feature values supply direct grounds for contesting the outcome.
  • Neglected overruling evidence can make continuation of the decision indefensible.
  • Existing EU legislation already provides individuals rights to some of these evidence forms.
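As a rough illustration of the first bullet, predictive multiplicity for a single individual can be checked by asking whether any model that performs about as well as the deployed one would have decided differently. The sketch below is a minimal toy under stated assumptions, not the paper's procedure: the candidate rules, the validation data, and the tolerance `epsilon` are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Applicant = Dict[str, float]
Model = Callable[[Applicant], int]  # 1 = approve, 0 = reject
Labeled = Sequence[Tuple[Applicant, int]]

# Hypothetical candidate models; none of these come from the paper.
MODELS: Dict[str, Model] = {
    "income_rule": lambda x: int(x["income"] >= 50),
    "debt_rule": lambda x: int(x["debt"] <= 20),
    "always_reject": lambda x: 0,
}

# Toy validation data on which the two rules are equally accurate (5 of 6).
DATA: Labeled = [
    ({"income": 60.0, "debt": 10.0}, 1),
    ({"income": 40.0, "debt": 30.0}, 0),
    ({"income": 70.0, "debt": 15.0}, 1),
    ({"income": 30.0, "debt": 25.0}, 0),
    ({"income": 55.0, "debt": 25.0}, 1),
    ({"income": 45.0, "debt": 15.0}, 1),
]


def accuracy(model: Model, data: Labeled) -> float:
    """Fraction of labeled cases the model decides correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


def near_optimal(models: Dict[str, Model], data: Labeled,
                 epsilon: float) -> List[str]:
    """Names of models whose accuracy is within epsilon of the best one."""
    scores = {name: accuracy(m, data) for name, m in models.items()}
    best = max(scores.values())
    return [name for name, score in scores.items() if best - score <= epsilon]


def dissenting_models(deployed: str, models: Dict[str, Model], data: Labeled,
                      individual: Applicant, epsilon: float) -> List[str]:
    """Near-optimal models that disagree with the deployed decision."""
    decision = models[deployed](individual)
    return [name for name in near_optimal(models, data, epsilon)
            if models[name](individual) != decision]


applicant = {"income": 45.0, "debt": 18.0}
dissent = dissenting_models("income_rule", MODELS, DATA, applicant, epsilon=0.05)
print(dissent)  # ['debt_rule']
```

Here the deployed rule rejects the applicant while an equally accurate rule approves, so the rejection depends on an arbitrary choice among comparable models. How large `epsilon` should be, and which model class counts as comparable, are design decisions the sketch leaves open.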

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Systems could be required to surface alternative model predictions as a routine part of decision documentation.
  • Contestability checks might extend to auditing whether input data was verified at decision time.
  • Legal frameworks could evolve to mandate logging of potential overruling rules alongside each decision.
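The input-verification idea in the second bullet could look roughly like the following sketch: re-score the individual with verified feature values and report whether the decision changes. Everything here is hypothetical, including `loan_model`, its thresholds, and the report fields; it only illustrates the shape such an audit might take.

```python
from typing import Any, Callable, Dict

Applicant = Dict[str, float]
Model = Callable[[Applicant], int]  # 1 = approve, 0 = reject


def loan_model(x: Applicant) -> int:
    """Hypothetical scorer used only for this illustration."""
    return int(x["income"] >= 50 and x["debt"] <= 20)


def audit_feature_values(model: Model, recorded: Applicant,
                         verified: Applicant) -> Dict[str, Any]:
    """Compare the decision on recorded inputs with the decision on verified ones."""
    # Verified values override recorded ones; unverified features stay as recorded.
    corrected = {**recorded, **verified}
    changed = {
        name: (recorded.get(name), value)
        for name, value in verified.items()
        if recorded.get(name) != value
    }
    original, rescored = model(recorded), model(corrected)
    return {
        "changed": changed,
        "original_decision": original,
        "corrected_decision": rescored,
        "decision_flips": original != rescored,
    }


recorded = {"income": 45.0, "debt": 10.0}  # income was entered incorrectly
verified = {"income": 54.0}                # value taken from a verified source
report = audit_feature_values(loan_model, recorded, verified)
print(report["changed"])         # {'income': (45.0, 54.0)}
print(report["decision_flips"])  # True
```

A flipped decision means the original outcome rested on an incorrect feature value, which is one of the three evidence types the paper names. A decision that does not flip leaves this particular ground for contesting unavailable.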

Load-bearing premise

The three types of evidence render decisions normatively indefensible according to the decision maker's own ethical standards.

What would settle it

A documented case in which one of the three evidence types is supplied yet the decision maker upholds the original outcome as consistent with its own standards would show the claim does not hold.

Figures

Figures reproduced from arXiv: 2605.16041 by Gunnar König, Kristof Meding, Timo Freiesleben.

Figure 1: XAI explanations may conflict with human intuitions. For example, (1) a counterfactual explanation may conflict with continuity rules, (2) a LIME explanation may conflict with monotonicity rules, and (3) anchors may contradict reason rules. Even under the strong assumption that these intuitions are correct, i.e. the rules apply, such conflicts imply an error only somewhere in the neighborhood of the explai…
Abstract

Machine learning systems increasingly make life-changing decisions about individuals, such as loan approvals, hiring, and cheating detection, raising a pressing question: how can individuals respond to negative decisions made by these opaque systems? While explainable artificial intelligence (XAI) has largely focused on algorithmic recourse -- helping individuals change their features to obtain a desired outcome -- the parallel problem of algorithmic contestability -- helping individuals review and correct erroneous algorithmic decisions -- has received far less attention, despite its central ethical and legal importance. We trace this neglect to the absence of clear formal definitions and a systematic operationalization of contestability as an algorithmic problem. To address it, we propose an operational definition of contestability as a natural complement to recourse: contestability starts from the presumption that a decision may be incorrect and focuses on identifying evidence to challenge and potentially overturn it, whereas recourse assumes the decision is valid and instead provides pathways for changing it. We show that standard XAI explanations, such as counterfactuals, LIME, or Anchors, even when combined with human intuitions about decision continuity or monotonicity, reveal only errors in the neighborhood of the individual, but provide insufficient grounds for overturning the decision at hand. Going thus beyond traditional XAI, we identify three types of evidence warranting reversal according to the decision maker's own ethical standards: predictive multiplicity, incorrect feature values, and neglected overruling evidence. We argue that these render decisions normatively indefensible and thus successfully contestable. Finally, we analyze how existing EU legislation connects to our framework and argue that individuals already hold some legal rights to these forms of evidence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper claims that standard XAI techniques (counterfactuals, LIME, Anchors) are insufficient for algorithmic contestability because they only surface local errors and do not supply grounds for overturning a decision. It introduces an operational definition of contestability as the complement to recourse: contestability presumes the decision may be incorrect and seeks evidence to challenge it, while recourse assumes validity and seeks feature changes. The authors identify three evidence types—predictive multiplicity, incorrect feature values, and neglected overruling evidence—that they argue render a decision normatively indefensible according to the decision maker’s own ethical standards, and they link the framework to existing EU legal rights.

Significance. If the central conceptual distinction and the sufficiency of the three evidence types hold, the work would usefully reorient XAI research from individual-level recourse toward contestability as a distinct algorithmic and legal problem. The explicit contrast with recourse and the mapping to EU legislation supply a clear framing that could guide both technical development and policy discussion.

major comments (1)
  1. [Section proposing the operational definition of contestability] In the section proposing the operational definition of contestability, the claim that the three evidence types (predictive multiplicity, incorrect feature values, neglected overruling evidence) render decisions normatively indefensible according to the decision maker’s own ethical standards is not accompanied by any mechanism for eliciting those standards or demonstrating their violation in a concrete case. This assumption is load-bearing for the sufficiency argument yet remains implicit.
minor comments (1)
  1. [Abstract and introduction] The abstract and introduction would benefit from a short table contrasting the assumptions, goals, and required evidence of recourse versus contestability to make the central distinction immediately visible to readers.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and insightful comments, which help clarify the scope and implications of our framework. We address the major comment below.

  1. Referee: In the section proposing the operational definition of contestability, the claim that the three evidence types (predictive multiplicity, incorrect feature values, neglected overruling evidence) render decisions normatively indefensible according to the decision maker’s own ethical standards is not accompanied by any mechanism for eliciting those standards or demonstrating their violation in a concrete case. This assumption is load-bearing for the sufficiency argument yet remains implicit.

    Authors: We thank the referee for identifying this point. Our framework is conceptual and normative: it posits that the three evidence types directly contradict the decision maker's own stated or implied standards (e.g., using accurate inputs, considering all relevant evidence, or acknowledging model uncertainty), thereby rendering the decision indefensible under those standards without requiring external imposition of values. We link this to existing EU legal rights, where decision makers must already articulate and adhere to their criteria. That said, we agree that practical deployment would benefit from explicit discussion of elicitation mechanisms. We will revise the manuscript to add a paragraph in the operational definition section outlining how decision makers' standards can be elicited via documented policies, ethical guidelines, or compliance records, and how violations can be demonstrated in concrete cases (e.g., by cross-referencing against those records). This strengthens the framework without altering its core claims.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity: operational definition and evidence types are constructed from external normative standards

The paper proposes a new operational definition of contestability as the complement to recourse and then identifies three evidence types (predictive multiplicity, incorrect feature values, neglected overruling evidence) that are argued to warrant reversal under the decision maker's own ethical standards. This construction does not reduce any claimed result to a fitted parameter, self-referential loop, or prior self-citation by construction. The central steps rely on external ethical/legal benchmarks and logical argument rather than internal redefinition or statistical forcing, rendering the derivation self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The framework rests on domain assumptions about ethical standards and legal connections rather than new mathematical entities or fitted parameters.

axioms (2)
  • domain assumption: Contestability begins from the presumption that a decision may be incorrect, in contrast to recourse which assumes validity.
    This premise is introduced to differentiate the new concept from existing recourse work.
  • domain assumption: Evidence types render decisions normatively indefensible according to the decision maker's own ethical standards.
    Central to arguing that the evidence warrants reversal.
invented entities (1)
  • algorithmic contestability as operational complement to recourse (no independent evidence)
    purpose: To focus on identifying evidence to challenge and overturn decisions rather than change features.
    New framing proposed to address the identified gap in XAI.

pith-pipeline@v0.9.0 · 5824 in / 1478 out tokens · 40003 ms · 2026-05-19T19:04:18.674887+00:00 · methodology
