pith. sign in

arxiv: 2303.09271 · v2 · submitted 2023-03-16 · 💻 cs.LG · cs.AI

Finding Minimum-Cost Explanations for Predictions made by Tree Ensembles

Pith reviewed 2026-05-24 09:27 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords explanationstree ensemblesminimal explanationsminimum costMARCO algorithmmachine learning interpretabilityoracleensemble models
0
0 comments X

The pith

An efficient oracle and m-MARCO adaptation compute minimum-cost minimal explanations for tree ensemble predictions orders of magnitude faster than prior methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to produce explanations for tree ensemble predictions that are both minimal and minimum with respect to a user-defined cost function. It first introduces an oracle that verifies explanation correctness far more quickly than existing approaches when finding minimal explanations. It then adapts the MARCO algorithm into m-MARCO to return one minimum-cost explanation per prediction rather than enumerating all minimal ones, yielding a factor-of-two overall speedup. The authors also examine real use cases and report that individual predictions can have over 100,000 minimal explanations, of which only a small share achieve minimum cost and all of which are markedly less verbose.

Core claim

We present a highly efficient oracle that can determine the correctness of explanations, surpassing the runtime performance of current state-of-the-art alternatives by several orders of magnitude when computing minimal explanations. We adapt an algorithm called MARCO (calling it m-MARCO) for the purpose of computing a single minimum explanation per prediction, and demonstrate an overall speedup factor of two compared to the MARCO algorithm which enumerates all minimal explanations. In several cases there are more than 100,000 minimal explanations to choose from for a single prediction, yet only a small portion are also minimum and the minimum explanations are significantly less verbose.

What carries the argument

The m-MARCO algorithm, an adaptation of MARCO that inherits its completeness and correctness properties while optimizing for a user-defined cost function to return one minimum explanation per prediction, backed by a fast oracle that checks explanation validity for tree ensembles.

Load-bearing premise

The adapted m-MARCO algorithm inherits the completeness and correctness properties of the original MARCO while correctly optimizing for a user-defined cost function rather than merely enumerating all minimal explanations.

What would settle it

A concrete prediction from a tree ensemble together with an explanation that the oracle labels correct but that a human or alternative verifier shows does not actually imply the model's output.

Figures

Figures reproduced from arXiv: 2303.09271 by Emil Karlsson, John T\"ornblom, Simin Nadjm-Tehrani.

Figure 1
Figure 1. Figure 1: A fictive bank-loan application processing system, where a positive outcome in￾dicates that the applicant should be denied a loan. Suppose the system processes a particular applicant with a low income, low education, no criminal records, and a couple of missed payments, and the system denies the applicant a bank-loan. When explaining to the applicant why the requested loan is denied, information about crim… view at source ↗
Figure 2
Figure 2. Figure 2: A Hasse diagram illustrating the power set lattice of a triple of constraints, where crossed over elements are unsatisfiable subsets, and bold elements are maximal satisfiable subsets. 2.3.2 The MARCO Algorithm MARCO (short for Mapping Regions of Constraints) is an algorithm that identifies the frontier between satisfiable and unsatisfiable subsets of constraints. It was originally pre￾sented in two indepe… view at source ↗
Figure 3
Figure 3. Figure 3: A decision tree (to the left) partition an input space into disjoint regions (to the right), and associates each region with an output value. decision rules in decision trees may be non-linear and multivariate. Although researchers have demonstrated that decision trees with non-linear (Irsoy et al., 2012) and multivari￾ate (Wang et al., 2018) decision rules can be useful, state-of-the-art implementations o… view at source ↗
Figure 4
Figure 4. Figure 4: Progression of the MaxSAT and Abstract Interpretation approach when comput￾ing minimal explanations, where the depth of trees are doubled compared to the original experiments presented by Ignatiev et al. (2022). solvers (e.g., Dillig et al., 2012), and recently used for computing a minimum explanation in the context of neural networks (La Malfa et al., 2021). Both algorithms are invoked by a program capabl… view at source ↗
Figure 5
Figure 5. Figure 5: Progression of the branch & bound approach (Algorithm 9) and the m-MARCO approach (Algorithm 7) when computing minimum explanations in parallel. 0 200 400 600 Accumulated thread time (h) 0.0 0.2 0.4 0.6 0.8 1.0 Fraction of solved instances Minimal Minimum Enumerate [PITH_FULL_IMAGE:figures/full_fig_p026_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Progress made while computing minimal explanations (Algorithm 6), minimum explanations (Algorithm 7), and when enumerating all minimal explanations (Al￾gorithm 3). a single minimum explanation. For more detailed results, see Appendix C where we list the elapsed runtime and number of timeouts for each individual model. 6.2 Characteristics of Explanations In this section, we take a closer look at the explana… view at source ↗
Figure 7
Figure 7. Figure 7: A boxplot illustrating costs of different explanations. [PITH_FULL_IMAGE:figures/full_fig_p028_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: A sequence diagram exemplifying how functions are invoked (solid arrows) and returned from (dashed arrows) in Algorithm 8 when all property checkers return P ass. Text associated with arrows describe an action taking place at a particular point in the sequence, with parameters enclosed by parentheses. Appendix B. Minimum Explanations using Branch and Bound In this section, we present an additional algorith… view at source ↗
read the original abstract

The ability to explain why a machine learning model arrives at a particular prediction is crucial when used as decision support by human operators of critical systems. The provided explanations must be provably correct, and preferably without redundant information, called minimal explanations. In this paper, we aim at finding explanations for predictions made by tree ensembles that are not only minimal, but also minimum with respect to a cost function. To this end, we first present a highly efficient oracle that can determine the correctness of explanations, surpassing the runtime performance of current state-of-the-art alternatives by several orders of magnitude when computing minimal explanations. Secondly, we adapt an algorithm called MARCO from related works (calling it m-MARCO) for the purpose of computing a single minimum explanation per prediction, and demonstrate an overall speedup factor of two compared to the MARCO algorithm which enumerates all minimal explanations. Finally, we study the obtained explanations from a range of use cases, leading to further insights of their characteristics. In particular, we observe that in several cases, there are more than 100,000 minimal explanations to choose from for a single prediction. In these cases, we see that only a small portion of the minimal explanations are also minimum, and that the minimum explanations are significantly less verbose, hence motivating the aim of this work.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to introduce a highly efficient oracle for verifying the correctness of explanations for predictions from tree ensembles, outperforming prior state-of-the-art by orders of magnitude on minimal explanations; an adaptation m-MARCO of the MARCO algorithm that computes one minimum-cost (rather than merely minimal) explanation per prediction with a reported 2× overall speedup; and empirical observations across use cases showing that many predictions admit >100,000 minimal explanations, only a small fraction of which are minimum-cost, and that the minimum-cost ones are significantly less verbose.

Significance. If the oracle's runtime claims and the correctness/optimality of m-MARCO hold, the work would provide a practical advance for generating provably correct, cost-optimal minimal explanations for ensemble models in safety-critical settings. The empirical finding that minimal explanations are often extremely numerous and that minimum-cost ones are markedly shorter supplies a concrete motivation for moving beyond enumeration to optimization.

major comments (2)
  1. [Abstract / m-MARCO section] Abstract and the m-MARCO description: the central claim that m-MARCO returns a global minimum-cost explanation (not merely a minimal one) is load-bearing for the paper's title and contribution, yet the manuscript supplies no explicit argument showing how the adaptation inherits optimality guarantees from MARCO or implements cost-bounded pruning/branch-and-bound; standard MARCO enumerates all minimal hitting sets, so replacing it with a single-output procedure requires a correctness argument that the returned set is minimal under the user cost function rather than a locally low-cost one.
  2. [Experimental evaluation of the oracle] Oracle performance claims: the assertion that the new oracle surpasses prior state-of-the-art by “several orders of magnitude” when computing minimal explanations is central to the first contribution, but the experimental section must report the precise baselines, instance sizes, and statistical controls (error bars, number of runs) that support this; without them the speedup cannot be assessed as load-bearing evidence.
minor comments (2)
  1. [Introduction / Preliminaries] Notation for the cost function and the precise definition of “minimum explanation” versus “minimal explanation” should be introduced early and used consistently throughout.
  2. [Use-case study] The observation that “only a small portion of the minimal explanations are also minimum” would benefit from quantitative reporting (e.g., fraction or distribution) rather than qualitative description.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and constructive feedback. We address each major comment below and will make the indicated revisions to strengthen the manuscript.

read point-by-point responses
  1. Referee: [Abstract / m-MARCO section] Abstract and the m-MARCO description: the central claim that m-MARCO returns a global minimum-cost explanation (not merely a minimal one) is load-bearing for the paper's title and contribution, yet the manuscript supplies no explicit argument showing how the adaptation inherits optimality guarantees from MARCO or implements cost-bounded pruning/branch-and-bound; standard MARCO enumerates all minimal hitting sets, so replacing it with a single-output procedure requires a correctness argument that the returned set is minimal under the user cost function rather than a locally low-cost one.

    Authors: We agree that the manuscript does not contain an explicit correctness argument establishing that m-MARCO returns a global minimum-cost explanation. The current description of the adaptation is high-level and does not detail how cost information is used for pruning or why the single output is guaranteed to be optimal rather than locally minimal. In the revision we will add a dedicated subsection that supplies this argument, describing the cost-bounded modifications to the MARCO enumeration procedure. revision: yes

  2. Referee: [Experimental evaluation of the oracle] Oracle performance claims: the assertion that the new oracle surpasses prior state-of-the-art by “several orders of magnitude” when computing minimal explanations is central to the first contribution, but the experimental section must report the precise baselines, instance sizes, and statistical controls (error bars, number of runs) that support this; without them the speedup cannot be assessed as load-bearing evidence.

    Authors: The referee correctly notes that the experimental section is missing the requested details needed to evaluate the reported speedups. We will revise the section to explicitly list the prior state-of-the-art baselines, report the exact sizes of the tree ensembles and test instances, and add statistical controls including the number of runs and error bars. revision: yes

Circularity Check

0 steps flagged

No circularity; algorithmic claims are self-contained

full rationale

The paper presents an efficient oracle for verifying explanation correctness and an adaptation (m-MARCO) of the external MARCO algorithm to compute one minimum-cost explanation. These are runtime and enumeration improvements with no mathematical derivations, fitted parameters, or self-definitions that reduce results to inputs by construction. Inheritance of completeness/correctness is asserted from cited related works (not self-citations), and no load-bearing step equates a 'prediction' or 'minimum' to its own definition. The analysis concerns external benchmarks (speedup factors, number of explanations) rather than quantities defined circularly.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the assumption that the oracle is correct for all candidate explanations and that the cost function is well-defined and independent of the enumeration process. No free parameters or invented entities are introduced.

axioms (2)
  • domain assumption The oracle correctly decides validity of any candidate explanation for tree-ensemble predictions.
    This is the load-bearing premise for the claimed orders-of-magnitude speedup when searching for minimal explanations.
  • domain assumption The original MARCO algorithm's enumeration properties transfer directly to the minimum-cost variant.
    Invoked when claiming that m-MARCO returns a true minimum without enumerating all minimal sets.

pith-pipeline@v0.9.0 · 5770 in / 1410 out tokens · 42288 ms · 2026-05-24T09:27:55.027129+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages

  1. [1]

    Audemard, G., Bellart, S., Bounia, L., Koriche, F., Lagniez, J.-M., \ Marquis, P. 2022 . Trading complexity for sparsity in random forest explanations \ In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence . AAAI Press

  2. [2]

    Boumazouza, R., Cheikh-Alili, F., Mazure, B., \ Tabia, K. 2021 . ASTERYX: A Model-Agnostic SaT-BasEd AppRoach for SYmbolic and Score-Based EXplanations , \ 120–129. Association for Computing Machinery, New York, NY, USA

  3. [3]

    Brain, M., Tinelli, C., R \"u mmer, P., \ Wahl, T. 2015 . An automatable formal semantics for ieee-754 floating-point arithmetic \ In 2015 IEEE 22nd Symposium on Computer Arithmetic , \ 160--167. IEEE

  4. [4]

    Breiman, L. 2001 . Random forests \ Machine learning , 45\/ (1), 5--32

  5. [5]

    \ \ Guestrin, C

    Chen, T. \ \ Guestrin, C. 2016 . Xgboost: A scalable tree boosting system \ In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , KDD '16, \ 785–794, New York, NY, USA. Association for Computing Machinery

  6. [6]

    Chinneck, J. W. \ \ Dravnieks, E. W. 1991 . Locating minimal infeasible constraint sets in linear programs \ ORSA Journal on Computing , 3\/ (2), 157--168

  7. [7]

    \ \ Cousot, R

    Cousot, P. \ \ Cousot, R. 1977 . Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints \ In Proceedings of the 4th ACM SIGACT-SIGPLAN symposium on Principles of Programming Languages (POPL) , \ 238--252, Los Angeles, California. ACM Press, New York, NY

  8. [8]

    \ \ Hirth, A

    Darwiche, A. \ \ Hirth, A. 2020 . On the reasons behind decisions \ In In Proceedings of the 24th European Conference on Artificial Intelligence (ECAI) , \ 712--720. IOS Press

  9. [9]

    L., \ Aiken, A

    Dillig, I., Dillig, T., McMillan, K. L., \ Aiken, A. 2012 . Minimum satisfying assignments for smt \ In International Conference on Computer Aided Verification , \ 394--409. Springer

  10. [10]

    Friedman, J. H. 2001 . Greedy function approximation: A gradient boosting machine \ The Annals of Statistics , 29\/ (5), 1189--1232

  11. [11]

    Y., \ Turek, M

    Gunning, D., Vorm, E., Wang, J. Y., \ Turek, M. 2021 . DARPA 's explainable AI ( XAI ) program: A retrospective \ Applied AI Letters , 2\/ (4)

  12. [12]

    Hadji Misheva, B., Jaggi, D., Posth, J.-A., Gramespacher, T., \ Osterrieder, J. 2021 . Audience-dependent explanations for ai-based risk management tools: A survey \ Frontiers in Artificial Intelligence , 4

  13. [13]

    J., \ Marques-Silva, J

    Ignatiev, A., Izza, Y., Stuckey, P. J., \ Marques-Silva, J. 2022 . Using maxsat for efficient explanations of tree ensembles \ In Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence . AAAI Press

  14. [14]

    \ \ Marques-Silva, J

    Ignatiev, A. \ \ Marques-Silva, J. 2021 . Sat-based rigorous explanations for decision lists \ In Li, C.-M. \ \ Many \`a , F. , Theory and Applications of Satisfiability Testing -- SAT 2021 , \ 251--269, Cham. Springer International Publishing

  15. [15]

    Ignatiev, A., Narodytska, N., \ Marques-Silva, J. 2019 . Abduction-based explanations for machine learning models \ In Proceedings of the AAAI Conference on Artificial Intelligence , 33, \ 1511--1519

  16. [16]

    Ignatiev, A., Previti, A., \ Marques-Silva, J. 2016 . On finding minimum satisfying assignments \ In International Conference on Principles and Practice of Constraint Programming , \ 287--297. Springer

  17. [17]

    T., \ Alpaydin, E

    Irsoy, O., Yildiz, O. T., \ Alpaydin, E. 2012 . Soft decision trees \ In International Conference on Pattern Recognition . IEEE

  18. [18]

    Izza, Y., Ignatiev, A., \ Marques-Silva, J. 2022 . On tackling explanation redundancy in decision trees \ Journal of Artificial Intelligence Research , 75 , 61--321

  19. [19]

    \ \ Silva, J

    Izza, Y. \ \ Silva, J. M. 2021 . On explaining random forests with sat \ In 30th International Joint Conference on Artificial Intelligence . International Joint Conferences on Artifical Intelligence (IJCAI)

  20. [20]

    J., \ Durresi, A

    Kaur, D., Uslu, S., Rittichier, K. J., \ Durresi, A. 2022 . Trustworthy artificial intelligence: A review \ ACM Computing Surveys (CSUR) , 55\/ (2), 1--38

  21. [21]

    La Malfa, E., Zbrzezny, A., Michelmore, R., Paoletti, N., \ Kwiatkowska, M. 2021 . On guaranteed optimal robust explanations for nlp models \ In 30th International Joint Conference on Artificial Intelligence , \ 2658--2665. International Joint Conferences on Artifical Intelligence (IJCAI)

  22. [22]

    Liffiton, M. H. \ \ Malik, A. 2013 . Enumerating infeasibility: Finding multiple MUS es quickly \ In International Conference on Integration of Constraint Programming, Artificial Intelligence, and Operations Research , \ 160--175. Springer

  23. [23]

    H., Previti, A., Malik, A., \ Marques-Silva, J

    Liffiton, M. H., Previti, A., Malik, A., \ Marques-Silva, J. 2016 . Fast, flexible mus enumeration \ Constraints , 21\/ (2), 223--250

  24. [24]

    C., Ignatiev, A., \ Narodytska, N

    Marques-Silva, J., Gerspacher, T., Cooper, M. C., Ignatiev, A., \ Narodytska, N. 2021 . Explanations for monotonic classifiers \ In International Conference on Machine Learning , \ 7469--7479. PMLR

  25. [25]

    Marquis, P. 1991 . Extending abduction from propositional to first-order logic \ In International Workshop on Fundamentals of Artificial Intelligence Research , \ 141--155. Springer

  26. [26]

    Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., et al. 2011 . Scikit-learn: Machine learning in P ython \ Journal of machine learning research , 12 , 2825--2830

  27. [27]

    \ \ Marques-Silva, J

    Previti, A. \ \ Marques-Silva, J. 2013 . Partial mus enumeration \ In Twenty-Seventh AAAI Conference on Artificial Intelligence

  28. [28]

    V., \ Gulin, A

    Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A. V., \ Gulin, A. 2018 . Catboost: unbiased boosting with categorical features \ In Advances in Neural Information Processing Systems (NIPS)

  29. [29]

    Quine, W. V. 1952 . The problem of simplifying truth functions \ The American mathematical monthly , 59\/ (8), 521--531

  30. [30]

    \ \ Zanella, M

    Ranzato, F. \ \ Zanella, M. 2020 . Abstract interpretation of decision tree ensemble classifiers \ In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence , \ 5478--5486. AAAI Press

  31. [31]

    Ras, G., Xie, N., van Gerven, M., \ Doran, D. 2022 . Explainable deep learning: A field guide for the uninitiated \ Journal of Artificial Intelligence Research , 73 , 329--397

  32. [32]

    Shih, A., Choi, A., \ Darwiche, A. 2018 . A symbolic approach to explaining bayesian network classifiers \ In Proceedings of the 27th International Joint Conference on Artificial Intelligence , \ 5103--5111

  33. [33]

    Shih, A., Choi, A., \ Darwiche, A. 2019 . Compiling bayesian network classifiers into decision graphs \ In Proceedings of the AAAI Conference on Artificial Intelligence , 33, \ 7966--7974

  34. [34]

    \ \ Armon, A

    Shwartz-Ziv, R. \ \ Armon, A. 2022 . Tabular data: Deep learning is not all you need \ Information Fusion , 81 , 84--90

  35. [35]

    \ \ Nadjm-Tehrani, S

    T \"o rnblom, J. \ \ Nadjm-Tehrani, S. 2019 . An abstraction-refinement approach to formal verification of tree ensembles \ In International Conference on Computer Safety, Reliability, and Security (SAFECOMP) , \ 301--313. Springer

  36. [36]

    \ \ Nadjm-Tehrani, S

    T \"o rnblom, J. \ \ Nadjm-Tehrani, S. 2021 . Scaling up memory-efficient formal verification tools for tree ensembles \ arXiv preprint , arXiv:2105.02595

  37. [37]

    Wang, F., Wang, Q., Nie, F., Yu, W., \ Wang, R. 2018 . Efficient tree classifiers for large scale datasets \ Neurocomputing , 284

  38. [38]

    write newline

    " write newline "" before.all 'output.state := FUNCTION fin.entry add.period write newline FUNCTION new.block output.state before.all = 'skip after.block 'output.state := if FUNCTION new.sentence output.state after.block = 'skip output.state before.all = 'skip after.sentence 'output.state := if if FUNCTION not #0 #1 if FUNCTION and 'skip pop #0 if FUNCTIO...