The Complexity of Auditing Disclosure-Robust Defeasible Explanations
Pith reviewed 2026-06-26 18:39 UTC · model grok-4.3
The pith
Verifying that a sufficient reason remains valid after any later disclosures is coNP-complete while finding the smallest such robust core is Σ₂^p-complete.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that auditing disclosure-robust defeasible explanations produces a four-cell complexity landscape in the size of the boundary atlas: prediction and standing anchors are in P via direct scans, verifying that a reason core is robust is coNP-complete, the decision version of minimal certified disclosure is NP-complete, and deciding existence of a robust core of size at most theta is Σ₂^p-complete, with the accepted case reaching the second level of the polynomial hierarchy.
What carries the argument
The boundary atlas of entry anchors and exit defeaters, which encodes the classifier so that prediction, standing anchors, and a reason's defeater frontier are obtained by polynomial-time scans and subset minimization without iterative fixpoint computation.
If this is right
- Prediction and standing anchors are read by direct polynomial-time scans of the atlas.
- Verifying robustness of a given reason core is coNP-complete.
- Deciding existence of a robust core of size at most theta is Σ₂^p-complete.
- The decision version of minimal certified disclosure is NP-complete while its optimization version is fixed-parameter tractable in the number of excluded worlds without defeaters.
- On standard tabular datasets with small Boolean abstraction the governing parameters stay small, keeping exact robust auditing tractable, while adversarial reductions produce linear-size robust cores.
Where Pith is reading between the lines
- Exact auditing may therefore remain practical only inside the small-parameter regime observed on tabular data.
- For general defeater cases the optimization version being open leaves room for further parameterized analysis.
- The Σ₂^p result suggests that heuristic or approximate auditing procedures will be needed once robust cores exceed low single digits.
Load-bearing premise
The defeasible classifier can be compiled into an explicit boundary atlas of entry anchors and exit defeaters such that all necessary scans and subset minimizations run in polynomial time without needing iterative fixpoint computation.
What would settle it
An explicit counterexample classifier whose boundary atlas makes robustness verification require a quantified Boolean formula beyond coNP, or whose smallest robust core size cannot be decided by an NP oracle.
Figures
read the original abstract
A formal explanation certifies a prediction with a subset-minimal sufficient reason. Under incremental disclosure, however, evidence arrives field by field, and a normally sufficient reason can be overturned by later information. We study the smallest reason core that remains sufficient under all admissible later disclosures; its size is the robustness radius. We compile a defeasible classifier into an explicit boundary atlas of entry anchors and exit defeaters, and chart the complexity of auditing it (all statements are in the atlas size). Prediction and standing anchors are read by polynomial-time scans of the atlas, without iterative fixpoint computation; a reason's defeater frontier is obtained by scanning and subset-minimizing the defeaters above it. But verifying that a reason core is robust is coNP-complete, and deciding whether a robust core of size at most theta exists is $\Sigma_2^p$-complete -- a four-cell P / coNP-complete / NP-complete / $\Sigma_2^p$-complete landscape, with the accepted (A(t)=1) case reaching the second level of the polynomial hierarchy. The decision version of minimal certified disclosure is NP-complete; its optimization version is fixed-parameter tractable in the number of excluded worlds without defeaters, with the general-defeater case open. On exact audits of depth-limited decision trees over standard tabular datasets under a deliberately small Boolean abstraction, the governing parameters fall in a small-parameter regime (robust cores in the low single digits), so exact robust auditing is tractable in these audited cubes; on adversarial instances built from our reductions the hardness bites, with robust cores of size Theta(n). To our knowledge this is the first $\Sigma_2^p$-complete audit query for disclosure-robust formal explanations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that compiling a defeasible classifier to an explicit boundary atlas of entry anchors and exit defeaters reduces auditing tasks for disclosure-robust explanations to a clean complexity landscape parameterized by atlas size: prediction and standing anchors are in P via direct scans, a reason's defeater frontier is obtained by subset-minimization, verifying robustness of a core is coNP-complete, existence of a robust core of size at most θ is Σ₂^p-complete, the decision version of minimal certified disclosure is NP-complete, and its optimization version is FPT in the number of excluded worlds without defeaters (general-defeater case open). It further reports that on depth-limited decision trees over tabular data the governing parameters are small, making exact audits tractable, while adversarial instances from the reductions exhibit hardness with Θ(n)-sized cores. This is presented as the first Σ₂^p-complete audit query for disclosure-robust formal explanations.
Significance. If the atlas compilation and reductions hold, the work supplies the first Σ₂^p-complete audit query in this setting together with an explicit four-cell P/coNP/NP/Σ₂^p landscape and an FPT result for one optimization variant. The avoidance of iterative fixpoint computation via direct atlas scans and subset-minimization is a concrete technical contribution. The empirical observation that robust cores remain small on standard datasets under the chosen Boolean abstraction supplies a practical counterpart to the hardness results.
minor comments (4)
- [§3] §3 (Boundary Atlas Construction): the precise polynomial-time bound for compiling an arbitrary defeasible classifier into the atlas is stated only at high level; an explicit statement of the compilation complexity (or a reference to an appendix containing the algorithm) would strengthen the claim that all subsequent results are parameterized solely by atlas size.
- [Theorem 4.3] Theorem 4.3 (Σ₂^p-completeness): the reduction sketch mentions guessing a core of size ≤θ and verifying robustness in coNP, but does not explicitly name the Σ₂^p machine or the witness length; adding one sentence clarifying the quantifier alternation would improve readability.
- [Table 1] Table 1 (Empirical results): the column reporting 'robust core size' should include the corresponding atlas size for each dataset so that readers can directly compare against the 'small-parameter regime' claim in the abstract.
- [§4.2] Notation: the symbol θ for the size bound is introduced without an explicit definition in the complexity statements; a one-line reminder in §4.2 would prevent any ambiguity with the robustness radius defined earlier.
Simulated Author's Rebuttal
We thank the referee for their positive summary of the manuscript, recognition of the four-cell complexity landscape, the first Σ₂^p-complete audit query result, the FPT result, and the empirical observations on decision trees. The recommendation of minor revision is noted. No specific major comments were raised in the report.
Circularity Check
No significant circularity
full rationale
The paper establishes complexity results (P, coNP-complete, NP-complete, Σ₂^p-complete) for auditing tasks on a compiled boundary atlas via explicit polynomial-time scans for anchors and subset-minimization for defeater frontiers, followed by standard polynomial-hierarchy reductions for the hardness claims. No derivation step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the atlas is defined independently, and all statements are parameterized by atlas size with reductions that are externally verifiable. The central claims remain independent of the paper's own inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Standard assumptions of computational complexity theory including the structure of the polynomial hierarchy and known completeness results for NP, coNP and Σ₂^p.
Reference graph
Works this paper leans on
-
[1]
Proceedings of the AAAI Conference on Artificial Intelligence , volume=
Abduction-Based Explanations for Machine Learning Models , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=. 2019 , doi=
2019
-
[2]
AIxIA 2020 -- Advances in Artificial Intelligence , series=
From Contrastive to Abductive Explanations and Back Again , author=. AIxIA 2020 -- Advances in Artificial Intelligence , series=. 2021 , doi=
2020
-
[3]
Journal of Artificial Intelligence Research , volume=
A Knowledge Compilation Map , author=. Journal of Artificial Intelligence Research , volume=. 2002 , doi=
2002
-
[4]
Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning (KR) , pages=
Monotonicity and Noise-Tolerance in Case-Based Reasoning with Abstract Argumentation , author=. Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning (KR) , pages=. 2021 , doi=
2021
-
[5]
Proceedings of the AAAI Conference on Artificial Intelligence , volume=
Anchors: High-Precision Model-Agnostic Explanations , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=. 2018 , doi=
2018
-
[6]
Artificial Intelligence , volume=
A Theory of Diagnosis from First Principles , author=. Artificial Intelligence , volume=. 1987 , doi=
1987
-
[7]
Computational Models of Argument (COMMA) , publisher=
Explanation for Case-Based Reasoning via Abstract Argumentation , author=. Computational Models of Argument (COMMA) , publisher=. 2016 , doi=
2016
-
[8]
Proceedings of the AAAI Conference on Artificial Intelligence , volume=
Delivering Trustworthy AI through Formal XAI , author=. Proceedings of the AAAI Conference on Artificial Intelligence , volume=. 2022 , doi=
2022
-
[9]
Journal of Artificial Intelligence Research , volume=
On Tackling Explanation Redundancy in Decision Trees , author=. Journal of Artificial Intelligence Research , volume=. 2022 , doi=
2022
-
[10]
Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI) , pages=
A Symbolic Approach to Explaining Bayesian Network Classifiers , author=. Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI) , pages=. 2018 , doi=
2018
-
[11]
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) , pages=
``Why Should I Trust You?'': Explaining the Predictions of Any Classifier , author=. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD) , pages=. 2016 , doi=
2016
-
[12]
Proceedings of the 17th International Conference on Principles of Knowledge Representation and Reasoning (KR) , pages=
On Tractable XAI Queries based on Compiled Representations , author=. Proceedings of the 17th International Conference on Principles of Knowledge Representation and Reasoning (KR) , pages=. 2020 , doi=
2020
-
[13]
Reasoning Web
Logic-Based Explainability in Machine Learning , author=. Reasoning Web. Causality, Explanations and Declarative Knowledge , series=. 2023 , doi=
2023
-
[14]
Advances in Neural Information Processing Systems (NeurIPS) , year=
A Unified Approach to Interpreting Model Predictions , author=. Advances in Neural Information Processing Systems (NeurIPS) , year=
-
[15]
Stockmeyer, Larry J. , year=. The polynomial-time hierarchy , volume=. Theoretical Computer Science , publisher=. doi:10.1016/0304-3975(76)90061-x , number=
-
[16]
The Minimum Equivalent DNF Problem and Shortest Implicants , volume=
Umans, Christopher , year=. The Minimum Equivalent DNF Problem and Shortest Implicants , volume=. Journal of Computer and System Sciences , publisher=. doi:10.1006/jcss.2001.1775 , number=
-
[17]
The Computational Complexity of Understanding Binary Classifier Decisions , volume=
Waeldchen, Stephan and Macdonald, Jan and Hauch, Sascha and Kutyniok, Gitta , year=. The Computational Complexity of Understanding Binary Classifier Decisions , volume=. doi:10.1613/jair.1.12359 , journal=
-
[18]
Advances in Neural Information Processing Systems (NeurIPS) , year=
Model Interpretability through the Lens of Computational Complexity , author=. Advances in Neural Information Processing Systems (NeurIPS) , year=
-
[19]
ACM SIGKDD Explorations Newsletter , volume=
OpenML: networked science in machine learning , author=. ACM SIGKDD Explorations Newsletter , volume=. 2014 , doi=
2014
-
[20]
Scaling Up the Accuracy of Naive-
Kohavi, Ron , booktitle=. Scaling Up the Accuracy of Naive-
-
[21]
Human Pathology , volume=
Computer-derived nuclear features distinguish malignant from benign breast cytology , author=. Human Pathology , volume=. 1995 , doi=
1995
-
[22]
Scikit-learn: Machine Learning in
Pedregosa, Fabian and Varoquaux, Ga. Scikit-learn: Machine Learning in. Journal of Machine Learning Research , volume=
-
[23]
Audemard, G.; Koriche, F.; and Marquis, P. 2020. On Tractable XAI Queries based on Compiled Representations. In Proceedings of the 17th International Conference on Principles of Knowledge Representation and Reasoning (KR), 838--849
2020
- [24]
-
[25]
C yras, K.; Satoh, K.; and Toni, F. 2016. Explanation for Case-Based Reasoning via Abstract Argumentation. In Computational Models of Argument (COMMA), 243--254. IOS Press
2016
-
[26]
Darwiche, A.; and Marquis, P. 2002. A Knowledge Compilation Map. Journal of Artificial Intelligence Research, 17: 229--264
2002
-
[27]
Ignatiev, A.; Narodytska, N.; Asher, N.; and Marques-Silva, J. 2021. From Contrastive to Abductive Explanations and Back Again. In AIxIA 2020 -- Advances in Artificial Intelligence, LNCS, 335--355. Springer
2021
-
[28]
Ignatiev, A.; Narodytska, N.; and Marques-Silva, J. 2019. Abduction-Based Explanations for Machine Learning Models. Proceedings of the AAAI Conference on Artificial Intelligence, 33(01): 1511--1519
2019
-
[29]
Izza, Y.; Ignatiev, A.; and Marques-Silva, J. 2022. On Tackling Explanation Redundancy in Decision Trees. Journal of Artificial Intelligence Research, 75: 261--321
2022
-
[30]
Kohavi, R. 1996. Scaling Up the Accuracy of Naive- B ayes Classifiers: A Decision-Tree Hybrid. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD), 202--207
1996
-
[31]
A Unified Approach to Interpreting Model Predictions
Lundberg, S. M.; and Lee, S.-I. 2017. A Unified Approach to Interpreting Model Predictions. In Advances in Neural Information Processing Systems (NeurIPS). ArXiv:1705.07874
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[32]
Marques-Silva, J. 2023. Logic-Based Explainability in Machine Learning. In Reasoning Web. Causality, Explanations and Declarative Knowledge, LNCS, 24--104. Springer
2023
-
[33]
Marques-Silva, J.; and Ignatiev, A. 2022. Delivering Trustworthy AI through Formal XAI. Proceedings of the AAAI Conference on Artificial Intelligence, 36(11): 12342--12350
2022
-
[34]
Paulino-Passos, G.; and Toni, F. 2021. Monotonicity and Noise-Tolerance in Case-Based Reasoning with Abstract Argumentation. In Proceedings of the 18th International Conference on Principles of Knowledge Representation and Reasoning (KR), 508--518
2021
-
[35]
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; Vanderplas, J.; Passos, A.; Cournapeau, D.; Brucher, M.; Perrot, M.; and Duchesnay, \'E . 2011. Scikit-learn: Machine Learning in P ython. Journal of Machine Learning Research, 12: 2825--2830
2011
-
[36]
Reiter, R. 1987. A Theory of Diagnosis from First Principles. Artificial Intelligence, 32(1): 57--95
1987
-
[37]
T.; Singh, S.; and Guestrin, C
Ribeiro, M. T.; Singh, S.; and Guestrin, C. 2016. ``Why Should I Trust You?'': Explaining the Predictions of Any Classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 1135--1144
2016
-
[38]
T.; Singh, S.; and Guestrin, C
Ribeiro, M. T.; Singh, S.; and Guestrin, C. 2018. Anchors: High-Precision Model-Agnostic Explanations. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1)
2018
-
[39]
Shih, A.; Choi, A.; and Darwiche, A. 2018. A Symbolic Approach to Explaining Bayesian Network Classifiers. In Proceedings of the 27th International Joint Conference on Artificial Intelligence (IJCAI), 5103--5111
2018
-
[40]
Stockmeyer, L. J. 1976. The polynomial-time hierarchy. Theoretical Computer Science, 3(1): 1--22
1976
-
[41]
Umans, C. 2001. The Minimum Equivalent DNF Problem and Shortest Implicants. Journal of Computer and System Sciences, 63(4): 597--611
2001
-
[42]
N.; Bischl, B.; and Torgo, L
Vanschoren, J.; van Rijn, J. N.; Bischl, B.; and Torgo, L. 2014. OpenML: networked science in machine learning. ACM SIGKDD Explorations Newsletter, 15(2): 49--60
2014
-
[43]
Waeldchen, S.; Macdonald, J.; Hauch, S.; and Kutyniok, G. 2021. The Computational Complexity of Understanding Binary Classifier Decisions. Journal of Artificial Intelligence Research, 70
2021
-
[44]
H.; Street, W
Wolberg, W. H.; Street, W. N.; Heisey, D. M.; and Mangasarian, O. L. 1995. Computer-derived nuclear features distinguish malignant from benign breast cytology. Human Pathology, 26(7): 792--796
1995
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.