Finding Minimum-Cost Explanations for Predictions made by Tree Ensembles
Pith reviewed 2026-05-24 09:27 UTC · model grok-4.3
The pith
An efficient oracle and m-MARCO adaptation compute minimum-cost minimal explanations for tree ensemble predictions orders of magnitude faster than prior methods.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a highly efficient oracle that can determine the correctness of explanations, surpassing the runtime performance of current state-of-the-art alternatives by several orders of magnitude when computing minimal explanations. We adapt an algorithm called MARCO (calling it m-MARCO) for the purpose of computing a single minimum explanation per prediction, and demonstrate an overall speedup factor of two compared to the MARCO algorithm which enumerates all minimal explanations. In several cases there are more than 100,000 minimal explanations to choose from for a single prediction, yet only a small portion are also minimum and the minimum explanations are significantly less verbose.
What carries the argument
The m-MARCO algorithm, an adaptation of MARCO that inherits its completeness and correctness properties while optimizing for a user-defined cost function to return one minimum explanation per prediction, backed by a fast oracle that checks explanation validity for tree ensembles.
Load-bearing premise
The adapted m-MARCO algorithm inherits the completeness and correctness properties of the original MARCO while correctly optimizing for a user-defined cost function rather than merely enumerating all minimal explanations.
What would settle it
A concrete prediction from a tree ensemble together with an explanation that the oracle labels correct but that a human or alternative verifier shows does not actually imply the model's output.
Figures
read the original abstract
The ability to explain why a machine learning model arrives at a particular prediction is crucial when used as decision support by human operators of critical systems. The provided explanations must be provably correct, and preferably without redundant information, called minimal explanations. In this paper, we aim at finding explanations for predictions made by tree ensembles that are not only minimal, but also minimum with respect to a cost function. To this end, we first present a highly efficient oracle that can determine the correctness of explanations, surpassing the runtime performance of current state-of-the-art alternatives by several orders of magnitude when computing minimal explanations. Secondly, we adapt an algorithm called MARCO from related works (calling it m-MARCO) for the purpose of computing a single minimum explanation per prediction, and demonstrate an overall speedup factor of two compared to the MARCO algorithm which enumerates all minimal explanations. Finally, we study the obtained explanations from a range of use cases, leading to further insights of their characteristics. In particular, we observe that in several cases, there are more than 100,000 minimal explanations to choose from for a single prediction. In these cases, we see that only a small portion of the minimal explanations are also minimum, and that the minimum explanations are significantly less verbose, hence motivating the aim of this work.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to introduce a highly efficient oracle for verifying the correctness of explanations for predictions from tree ensembles, outperforming prior state-of-the-art by orders of magnitude on minimal explanations; an adaptation m-MARCO of the MARCO algorithm that computes one minimum-cost (rather than merely minimal) explanation per prediction with a reported 2× overall speedup; and empirical observations across use cases showing that many predictions admit >100,000 minimal explanations, only a small fraction of which are minimum-cost, and that the minimum-cost ones are significantly less verbose.
Significance. If the oracle's runtime claims and the correctness/optimality of m-MARCO hold, the work would provide a practical advance for generating provably correct, cost-optimal minimal explanations for ensemble models in safety-critical settings. The empirical finding that minimal explanations are often extremely numerous and that minimum-cost ones are markedly shorter supplies a concrete motivation for moving beyond enumeration to optimization.
major comments (2)
- [Abstract / m-MARCO section] Abstract and the m-MARCO description: the central claim that m-MARCO returns a global minimum-cost explanation (not merely a minimal one) is load-bearing for the paper's title and contribution, yet the manuscript supplies no explicit argument showing how the adaptation inherits optimality guarantees from MARCO or implements cost-bounded pruning/branch-and-bound; standard MARCO enumerates all minimal hitting sets, so replacing it with a single-output procedure requires a correctness argument that the returned set is minimal under the user cost function rather than a locally low-cost one.
- [Experimental evaluation of the oracle] Oracle performance claims: the assertion that the new oracle surpasses prior state-of-the-art by “several orders of magnitude” when computing minimal explanations is central to the first contribution, but the experimental section must report the precise baselines, instance sizes, and statistical controls (error bars, number of runs) that support this; without them the speedup cannot be assessed as load-bearing evidence.
minor comments (2)
- [Introduction / Preliminaries] Notation for the cost function and the precise definition of “minimum explanation” versus “minimal explanation” should be introduced early and used consistently throughout.
- [Use-case study] The observation that “only a small portion of the minimal explanations are also minimum” would benefit from quantitative reporting (e.g., fraction or distribution) rather than qualitative description.
Simulated Author's Rebuttal
We thank the referee for the careful review and constructive feedback. We address each major comment below and will make the indicated revisions to strengthen the manuscript.
read point-by-point responses
-
Referee: [Abstract / m-MARCO section] Abstract and the m-MARCO description: the central claim that m-MARCO returns a global minimum-cost explanation (not merely a minimal one) is load-bearing for the paper's title and contribution, yet the manuscript supplies no explicit argument showing how the adaptation inherits optimality guarantees from MARCO or implements cost-bounded pruning/branch-and-bound; standard MARCO enumerates all minimal hitting sets, so replacing it with a single-output procedure requires a correctness argument that the returned set is minimal under the user cost function rather than a locally low-cost one.
Authors: We agree that the manuscript does not contain an explicit correctness argument establishing that m-MARCO returns a global minimum-cost explanation. The current description of the adaptation is high-level and does not detail how cost information is used for pruning or why the single output is guaranteed to be optimal rather than locally minimal. In the revision we will add a dedicated subsection that supplies this argument, describing the cost-bounded modifications to the MARCO enumeration procedure. revision: yes
-
Referee: [Experimental evaluation of the oracle] Oracle performance claims: the assertion that the new oracle surpasses prior state-of-the-art by “several orders of magnitude” when computing minimal explanations is central to the first contribution, but the experimental section must report the precise baselines, instance sizes, and statistical controls (error bars, number of runs) that support this; without them the speedup cannot be assessed as load-bearing evidence.
Authors: The referee correctly notes that the experimental section is missing the requested details needed to evaluate the reported speedups. We will revise the section to explicitly list the prior state-of-the-art baselines, report the exact sizes of the tree ensembles and test instances, and add statistical controls including the number of runs and error bars. revision: yes
Circularity Check
No circularity; algorithmic claims are self-contained
full rationale
The paper presents an efficient oracle for verifying explanation correctness and an adaptation (m-MARCO) of the external MARCO algorithm to compute one minimum-cost explanation. These are runtime and enumeration improvements with no mathematical derivations, fitted parameters, or self-definitions that reduce results to inputs by construction. Inheritance of completeness/correctness is asserted from cited related works (not self-citations), and no load-bearing step equates a 'prediction' or 'minimum' to its own definition. The analysis concerns external benchmarks (speedup factors, number of explanations) rather than quantities defined circularly.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The oracle correctly decides validity of any candidate explanation for tree-ensemble predictions.
- domain assumption The original MARCO algorithm's enumeration properties transfer directly to the minimum-cost variant.
Reference graph
Works this paper leans on
-
[1]
Audemard, G., Bellart, S., Bounia, L., Koriche, F., Lagniez, J.-M., \ Marquis, P. 2022 . Trading complexity for sparsity in random forest explanations \ In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence . AAAI Press
work page 2022
-
[2]
Boumazouza, R., Cheikh-Alili, F., Mazure, B., \ Tabia, K. 2021 . ASTERYX: A Model-Agnostic SaT-BasEd AppRoach for SYmbolic and Score-Based EXplanations , \ 120–129. Association for Computing Machinery, New York, NY, USA
work page 2021
-
[3]
Brain, M., Tinelli, C., R \"u mmer, P., \ Wahl, T. 2015 . An automatable formal semantics for ieee-754 floating-point arithmetic \ In 2015 IEEE 22nd Symposium on Computer Arithmetic , \ 160--167. IEEE
work page 2015
-
[4]
Breiman, L. 2001 . Random forests \ Machine learning , 45\/ (1), 5--32
work page 2001
-
[5]
Chen, T. \ \ Guestrin, C. 2016 . Xgboost: A scalable tree boosting system \ In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , KDD '16, \ 785–794, New York, NY, USA. Association for Computing Machinery
work page 2016
-
[6]
Chinneck, J. W. \ \ Dravnieks, E. W. 1991 . Locating minimal infeasible constraint sets in linear programs \ ORSA Journal on Computing , 3\/ (2), 157--168
work page 1991
-
[7]
Cousot, P. \ \ Cousot, R. 1977 . Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints \ In Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of Programming Languages (POPL) , \ 238--252, Los Angeles, California. ACM Press, New York, NY
work page 1977
-
[8]
Darwiche, A. \ \ Hirth, A. 2020 . On the reasons behind decisions \ In In Proceedings of the 24th European Conference on Artificial Intelligence (ECAI) , \ 712--720. IOS Press
work page 2020
-
[9]
Dillig, I., Dillig, T., McMillan, K. L., \ Aiken, A. 2012 . Minimum satisfying assignments for smt \ In International Conference on Computer Aided Verification , \ 394--409. Springer
work page 2012
-
[10]
Friedman, J. H. 2001 . Greedy function approximation: A gradient boosting machine \ The Annals of Statistics , 29\/ (5), 1189--1232
work page 2001
-
[11]
Gunning, D., Vorm, E., Wang, J. Y., \ Turek, M. 2021 . DARPA 's explainable AI ( XAI ) program: A retrospective \ Applied AI Letters , 2\/ (4)
work page 2021
-
[12]
Hadji Misheva, B., Jaggi, D., Posth, J.-A., Gramespacher, T., \ Osterrieder, J. 2021 . Audience-dependent explanations for ai-based risk management tools: A survey \ Frontiers in Artificial Intelligence , 4
work page 2021
-
[13]
Ignatiev, A., Izza, Y., Stuckey, P. J., \ Marques-Silva, J. 2022 . Using maxsat for efficient explanations of tree ensembles \ In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence . AAAI Press
work page 2022
-
[14]
Ignatiev, A. \ \ Marques-Silva, J. 2021 . Sat-based rigorous explanations for decision lists \ In Li, C.-M. \ \ Many \`a , F. , Theory and Applications of Satisfiability Testing -- SAT 2021 , \ 251--269, Cham. Springer International Publishing
work page 2021
-
[15]
Ignatiev, A., Narodytska, N., \ Marques-Silva, J. 2019 . Abduction-based explanations for machine learning models \ In Proceedings of the AAAI Conference on Artificial Intelligence , 33, \ 1511--1519
work page 2019
-
[16]
Ignatiev, A., Previti, A., \ Marques-Silva, J. 2016 . On finding minimum satisfying assignments \ In International Conference on Principles and Practice of Constraint Programming , \ 287--297. Springer
work page 2016
-
[17]
Irsoy, O., Yildiz, O. T., \ Alpaydin, E. 2012 . Soft decision trees \ In International Conference on Pattern Recognition . IEEE
work page 2012
-
[18]
Izza, Y., Ignatiev, A., \ Marques-Silva, J. 2022 . On tackling explanation redundancy in decision trees \ Journal of Artificial Intelligence Research , 75 , 61--321
work page 2022
-
[19]
Izza, Y. \ \ Silva, J. M. 2021 . On explaining random forests with sat \ In 30th International Joint Conference on Artificial Intelligence . International Joint Conferences on Artifical Intelligence (IJCAI)
work page 2021
-
[20]
Kaur, D., Uslu, S., Rittichier, K. J., \ Durresi, A. 2022 . Trustworthy artificial intelligence: A review \ ACM Computing Surveys (CSUR) , 55\/ (2), 1--38
work page 2022
-
[21]
La Malfa, E., Zbrzezny, A., Michelmore, R., Paoletti, N., \ Kwiatkowska, M. 2021 . On guaranteed optimal robust explanations for nlp models \ In 30th International Joint Conference on Artificial Intelligence , \ 2658--2665. International Joint Conferences on Artifical Intelligence (IJCAI)
work page 2021
-
[22]
Liffiton, M. H. \ \ Malik, A. 2013 . Enumerating infeasibility: Finding multiple MUS es quickly \ In International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research , \ 160--175. Springer
work page 2013
-
[23]
H., Previti, A., Malik, A., \ Marques-Silva, J
Liffiton, M. H., Previti, A., Malik, A., \ Marques-Silva, J. 2016 . Fast, flexible mus enumeration \ Constraints , 21\/ (2), 223--250
work page 2016
-
[24]
C., Ignatiev, A., \ Narodytska, N
Marques-Silva, J., Gerspacher, T., Cooper, M. C., Ignatiev, A., \ Narodytska, N. 2021 . Explanations for monotonic classifiers \ In International Conference on Machine Learning , \ 7469--7479. PMLR
work page 2021
-
[25]
Marquis, P. 1991 . Extending abduction from propositional to first-order logic \ In International Workshop on Fundamentals of Artificial Intelligence Research , \ 141--155. Springer
work page 1991
-
[26]
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al. 2011 . Scikit-learn: Machine learning in P ython \ Journal of machine learning research , 12 , 2825--2830
work page 2011
-
[27]
Previti, A. \ \ Marques-Silva, J. 2013 . Partial mus enumeration \ In Twenty-Seventh AAAI Conference on Artificial Intelligence
work page 2013
-
[28]
Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V., \ Gulin, A. 2018 . Catboost: unbiased boosting with categorical features \ In Advances in Neural Information Processing Systems (NIPS)
work page 2018
-
[29]
Quine, W. V. 1952 . The problem of simplifying truth functions \ The American mathematical monthly , 59\/ (8), 521--531
work page 1952
-
[30]
Ranzato, F. \ \ Zanella, M. 2020 . Abstract interpretation of decision tree ensemble classifiers \ In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence , \ 5478--5486. AAAI Press
work page 2020
-
[31]
Ras, G., Xie, N., van Gerven, M., \ Doran, D. 2022 . Explainable deep learning: A field guide for the uninitiated \ Journal of Artificial Intelligence Research , 73 , 329--397
work page 2022
-
[32]
Shih, A., Choi, A., \ Darwiche, A. 2018 . A symbolic approach to explaining bayesian network classifiers \ In Proceedings of the 27th International Joint Conference on Artificial Intelligence , \ 5103--5111
work page 2018
-
[33]
Shih, A., Choi, A., \ Darwiche, A. 2019 . Compiling bayesian network classifiers into decision graphs \ In Proceedings of the AAAI Conference on Artificial Intelligence , 33, \ 7966--7974
work page 2019
-
[34]
Shwartz-Ziv, R. \ \ Armon, A. 2022 . Tabular data: Deep learning is not all you need \ Information Fusion , 81 , 84--90
work page 2022
-
[35]
T \"o rnblom, J. \ \ Nadjm-Tehrani, S. 2019 . An abstraction-refinement approach to formal verification of tree ensembles \ In International Conference on Computer Safety, Reliability, and Security (SAFECOMP) , \ 301--313. Springer
work page 2019
-
[36]
T \"o rnblom, J. \ \ Nadjm-Tehrani, S. 2021 . Scaling up memory-efficient formal verification tools for tree ensembles \ arXiv preprint , arXiv:2105.02595
-
[37]
Wang, F., Wang, Q., Nie, F., Yu, W., \ Wang, R. 2018 . Efficient tree classifiers for large scale datasets \ Neurocomputing , 284
work page 2018
-
[38]
" write newline "" before.all 'output.state := FUNCTION fin.entry add.period write newline FUNCTION new.block output.state before.all = 'skip after.block 'output.state := if FUNCTION new.sentence output.state after.block = 'skip output.state before.all = 'skip after.sentence 'output.state := if if FUNCTION not #0 #1 if FUNCTION and 'skip pop #0 if FUNCTIO...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.