pith. sign in

arxiv: 2605.06561 · v1 · submitted 2026-05-07 · 💻 cs.LG

Optimal Counterfactual Search in Tree Ensembles: A Study Across Modeling and Solution Paradigms

Pith reviewed 2026-05-08 12:21 UTC · model grok-4.3

classification 💻 cs.LG
keywords counterfactual explanationstree ensemblesconstraint programmingMaxSATMILPoptimal recourseexplainable AIdecision trees
0
0 comments X

The pith

Constraint programming finds optimal minimal counterfactuals for tree ensembles more reliably than MaxSAT or MILP across most regimes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a constraint programming method called CPCF that encodes tree splits as fixed interval domains to compute the smallest feature changes needed to flip a model decision while respecting plausibility constraints. A sympathetic reader would care because suboptimal counterfactuals can assign unnecessarily costly recommendations to some individuals while giving others feasible recourse, undermining trust in automated decisions. The authors compare this CP approach against an extended MaxSAT formulation and the leading MILP method on ten datasets and three ensemble types, measuring scalability and performance under different distance metrics. They find that CP delivers the strongest overall results, with MaxSAT excelling on hard-voting forests and MILP staying competitive when inferences are precomputed and split counts remain moderate.

Core claim

We present CPCF, a compact finite-domain constraint programming formulation for optimal counterfactual search in tree ensembles. Numerical features are represented as interval domains induced by the split thresholds present in the trees, while discrete features use their native domains; this structure supports multiple distance objectives without any continuous search over split boundaries. When placed against an extended MaxSAT solver for soft-voting ensembles and the current state-of-the-art MILP formulation, CP achieves the best overall performance. The experiments further reveal clear regimes: CP is most versatile, MaxSAT handles hard-voting ensembles particularly efficiently, and MILP,

What carries the argument

CPCF, a constraint programming formulation that encodes numerical features as fixed interval domains induced by all split thresholds in the ensemble and discrete features with native finite domains, allowing direct optimization of distance objectives without boundary search.

If this is right

  • CP serves as the default solver for optimal counterfactual search on tree ensembles in most practical settings due to its versatility and strong anytime performance.
  • MaxSAT should be preferred when the ensemble uses hard voting, as it exploits the Boolean structure more efficiently than the other paradigms.
  • MILP remains competitive for amortized inference pipelines where the model is fixed in advance and the number of distinct split levels stays moderate.
  • Performance differences across paradigms grow with ensemble size and the choice of distance metric, so practitioners can match the solver to the voting type and split complexity.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The interval-domain encoding may transfer to other combinatorial explanation tasks, such as finding minimal interventions in causal graphs derived from tree models.
  • Tool developers could maintain a small portfolio of solvers and route queries based on observed voting type and split count to improve average response time.
  • Exact optimality matters for fairness: if any method systematically returns higher-cost explanations for certain demographic groups, the choice of paradigm directly affects equitable recourse.

Load-bearing premise

All relevant split thresholds can be pre-extracted into fixed interval domains without missing optimal solutions that would require crossing or adjusting boundaries dynamically.

What would settle it

A dataset and trained tree ensemble where the true minimal-distance counterfactual requires a feature value strictly between two pre-extracted thresholds, causing the CP solver to return a strictly higher-distance valid solution or declare no solution exists.

Figures

Figures reproduced from arXiv: 2605.06561 by Awa Khouna, Julien Ferry, Thibaut Vidal, Youssouf Emine.

Figure 1
Figure 1. Figure 1: Decision diagram summarizing the best-performing optimal counterfactual formulation for view at source ↗
Figure 2
Figure 2. Figure 2: Cactus plots for the default ensemble configuration, aggregated over all ten datasets. Each view at source ↗
Figure 3
Figure 3. Figure 3: Anytime solution quality for CPCF and OCEAN for the default ensemble configuration, aggregated over all ten datasets and restricted to queries solved to optimality by both methods (which covers more than 99% of the queries for all three ensemble types). The curve reports the mean normalized cost of the best incumbent objective; lower is better. time includes both model-building time and solver time, where … view at source ↗
Figure 4
Figure 4. Figure 4: Median total time to optimality (error bars indicate the first and third quartiles) with view at source ↗
Figure 5
Figure 5. Figure 5: Distribution of total times to optimality for view at source ↗
read the original abstract

Trust in counterfactual explanations depends critically on whether their recommended changes are truly minimal: suboptimal explanations may vastly overshoot the actual changes needed to alter a decision, and heuristic errors can affect individuals unevenly, giving some users relevant recourse while assigning others unnecessarily costly recommendations. Consequently, we study the problem of computing optimal counterfactual explanations for tree ensembles under plausibility and actionability constraints. This is a combinatorial problem: for a fixed model, counterfactual search boils down to selecting consistent branching decisions and threshold-defined regions under a distance objective. We exploit this structure through CPCF, a constraint programming (CP) formulation in which numerical features are encoded as interval domains induced by split thresholds, while discrete features retain native finite-domain representations. This yields a compact finite-domain formulation that supports multiple distance objectives without continuous split-boundary search. We then place CPCF in a broader comparison across mathematical programming paradigms: we extend a maximum Boolean satisfiability (MaxSAT) formulation, originally designed for hard-voting random forests, to soft-voting ensembles, and compare against the current state-of-the-art mixed-integer linear programming (MILP) optimal approach. Across ten datasets and three types of tree ensembles, we analyze scalability, anytime performance, and sensitivity to distance metrics. We observe that CP achieves the best overall performance. More importantly, our results identify regimes in which the specific strengths of each paradigm make it best suited: CP is most versatile overall, MaxSAT handles hard-voting ensembles particularly well, and MILP remains competitive in amortized inference settings with a moderate number of split levels.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript introduces CPCF, a constraint programming formulation for optimal counterfactual explanations in tree ensembles. Numerical features are encoded via fixed interval domains induced by pre-extracted split thresholds, while discrete features use native finite domains; this supports multiple distance objectives (L1/L2) without continuous boundary search. The CP approach is compared against an extended MaxSAT formulation (now handling soft-voting) and the state-of-the-art MILP method across ten datasets and three ensemble types (random forests, gradient boosting, XGBoost). Experiments evaluate scalability, anytime performance, and sensitivity to distance metrics, concluding that CP achieves the best overall performance while MaxSAT excels on hard-voting ensembles and MILP remains competitive in amortized settings with moderate split levels.

Significance. If the empirical results hold, the work delivers a rigorous, multi-paradigm comparison that identifies concrete regimes where each solver is preferable, providing actionable guidance for deploying optimal counterfactual search in practice. The consistent findings across datasets and the exploitation of tree structure for compact finite-domain encodings strengthen the contribution to explainable AI and combinatorial optimization in ML. Strengths include direct experimental validation of performance claims without reliance on self-referential derivations.

minor comments (2)
  1. [§3.2] §3.2: The extension of the MaxSAT formulation to soft-voting ensembles is described at a high level; adding a short pseudocode snippet or explicit objective transformation would improve reproducibility.
  2. [Table 4, Figure 3] Table 4 and Figure 3: The anytime curves for CP and MILP on datasets with >20 split levels would benefit from explicit annotation of the point at which each method first reaches the optimal solution.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and accurate summary of our work on CPCF, the multi-paradigm comparison, and the recommendation to accept the manuscript. The review correctly identifies the key strengths of the compact finite-domain CP encoding and the actionable guidance provided by the experimental regimes.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper is an empirical comparison of CP, MaxSAT, and MILP solvers for optimal counterfactual search in tree ensembles. All performance claims (CP most versatile overall, regime-specific strengths) rest on direct experimental measurements across ten datasets and three ensemble types, with no mathematical derivation or first-principles result that reduces to fitted inputs or self-referential definitions. The CPCF encoding pre-extracts split thresholds into fixed interval domains; this is a standard, non-circular modeling step that exactly captures the piecewise-constant structure of trees without missing optima, as any counterfactual must lie in one such cell. No self-citation chains, uniqueness theorems, or ansatzes are load-bearing for the central results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the standard assumption that decision tree splits induce fixed, exhaustive interval domains for numerical features and that distance objectives can be expressed directly over these domains.

axioms (1)
  • domain assumption Tree ensemble splits define fixed, exhaustive interval domains for all numerical features
    Invoked when encoding numerical features as intervals induced by split thresholds

pith-pipeline@v0.9.0 · 5593 in / 1123 out tokens · 25812 ms · 2026-05-08T12:21:10.960372+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

37 extracted references · 2 canonical work pages

  1. [1]

    Proceedings of the 38th International Conference on Machine Learning , pages =

    Axel Parmentier and Thibaut Vidal , title =. Proceedings of the 38th International Conference on Machine Learning , pages =

  2. [2]

    Proceedings of the 28th European Conference on Artificial Intelligence , year =

    Alesya Raevskaya and Tuomo Lehtonen , title =. Proceedings of the 28th European Conference on Artificial Intelligence , year =

  3. [3]

    Harvard Jounal of Law & Technology , volume =

    Counterfactual explanations without opening the black box: Automated decisions and the GDPR , author =. Harvard Jounal of Law & Technology , volume =. 2017 , publisher =

  4. [4]

    ACM Computing Surveys , volume =

    A survey of algorithmic recourse: contrastive explanations and consequential recommendations , author =. ACM Computing Surveys , volume =

  5. [5]

    International conference on artificial intelligence and statistics , pages =

    Model-agnostic counterfactual explanations for consequential decisions , author =. International conference on artificial intelligence and statistics , pages =

  6. [6]

    & Hines, K

    Sahil Verma and John P. Dickerson and Keegan Hines , title =. arXiv preprint arXiv:2010.10596 , year =

  7. [7]

    ACM Computing Surveys , volume =

    Counterfactual explanations and algorithmic recourses for machine learning: A review , author =. ACM Computing Surveys , volume =. 2024 , publisher =

  8. [8]

    2019 , eprint =

    Towards Realistic Individual Recourse and Actionable Explanations in Black-Box Decision Making Systems , author =. 2019 , eprint =

  9. [9]

    FACE: Feasible and Actionable Counterfactual Explanations , booktitle =

    Poyiadzi, Rafael and Sokol, Kacper and Santos-Rodriguez, Raul and De Bie, Tijl and Flach, Peter , year =. FACE: Feasible and Actionable Counterfactual Explanations , booktitle =

  10. [10]

    2020 , eprint =

    Interpretable Counterfactual Explanations Guided by Prototypes , author =. 2020 , eprint =

  11. [11]

    arXiv preprint arXiv:2101.10123 , year =

    Conditional generative models for counterfactual explanations , author =. arXiv preprint arXiv:2101.10123 , year =

  12. [12]

    ICML Workshop on Algorithmic Recourse , year =

    CounterNet: End-to-end training of counterfactual aware predictions , author =. ICML Workshop on Algorithmic Recourse , year =

  13. [13]

    Uncertainty in Artificial Intelligence , pages =

    Countergan: Generating counterfactuals for real-time recourse and interpretability using residual gans , author =. Uncertainty in Artificial Intelligence , pages =

  14. [14]

    1984 , publisher =

    Classification and Regression Trees , author =. 1984 , publisher =

  15. [15]

    Random Forests , volume =

    Breiman, Leo , journal =. Random Forests , volume =

  16. [16]

    Friedman , journal =

    Jerome H. Friedman , journal =. Greedy Function Approximation: A Gradient Boosting Machine , volume =

  17. [17]

    Interpretable Predictions of Tree-based Ensembles via Actionable Feature Tweaking , booktitle =

    Tolomei, Gabriele and Silvestri, Fabrizio and Haines, Andrew and Lalmas, Mounia , year =. Interpretable Predictions of Tree-based Ensembles via Actionable Feature Tweaking , booktitle =

  18. [18]

    Proceedings of the Thirty-Sixth AAAI conference on artificial intelligence , volume =

    FOCUS: Flexible optimizable counterfactual explanations for tree ensembles , author =. Proceedings of the Thirty-Sixth AAAI conference on artificial intelligence , volume =

  19. [19]

    Proceedings of the 22nd

    Xgboost: A scalable tree boosting system , author=. Proceedings of the 22nd

  20. [20]

    2006 , publisher=

    Handbook of constraint programming , author=. 2006 , publisher=

  21. [21]

    2008 Eighth

    Isolation forest , author=. 2008 Eighth

  22. [22]

    Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence , author =

    Very Fast, Approximate Counterfactual Explanations for Decision Forests , volume =. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence , author =. 2023 , pages =

  23. [23]

    2025 , issn =

    Efficient and effective counterfactual explanations for random forests , journal =. 2025 , issn =

  24. [24]

    Computer Aided Verification , volume =

    Armin Biere and Tobias Faller and Katalin Fazekas and Mathias Fleury and Nils Froleyks and Florian Pollitt , editor =. Computer Aided Verification , volume =

  25. [25]

    CP-SAT , version =

    Laurent Perron and Frédéric Didier , organization =. CP-SAT , version =

  26. [26]

    Scikit-learn: Machine Learning in Python , journal =

    Fabian Pedregosa and Ga. Scikit-learn: Machine Learning in Python , journal =. 2011 , volume =

  27. [27]

    2026 , eprint=

    Counterfactual Maps: What They Are and How to Find Them , author=. 2026 , eprint=

  28. [28]

    Data Mining and Knowledge Discovery , volume=

    Counterfactual explanations and how to find them: literature review and benchmarking , author=. Data Mining and Knowledge Discovery , volume=. 2024 , publisher=

  29. [29]

    Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining , pages=

    Optimal action extraction for random forests and boosted trees , author=. Proceedings of the 21th ACM SIGKDD international conference on knowledge discovery and data mining , pages=

  30. [30]

    Information Fusion , volume=

    Random forest explainability using counterfactual sets , author=. Information Fusion , volume=. 2020 , publisher=

  31. [31]

    SMT Workshop 2015 , year=

    PySMT: a solver-agnostic library for fast prototyping of SMT-based algorithms , author=. SMT Workshop 2015 , year=

  32. [32]

    2008 , booktitle =

    De Moura, Leonardo and Bj. 2008 , booktitle =

  33. [33]

    Translating Pseudo-Boolean Constraints into

    Een, Niklas and Sörensson, Niklas , year =. Translating Pseudo-Boolean Constraints into

  34. [34]

    Theory and Applications of Satisfiability Testing -- SAT 2015 , volume=

    Philipp, Tobias and Steinke, Peter , pages=. Theory and Applications of Satisfiability Testing -- SAT 2015 , volume=. 2015 , isbn=

  35. [35]

    and Martins, Pedro and Schiffer, Maximilian and Serra, Thiago and Vidal, Thibaut , title =

    Florio, Alexandre M. and Martins, Pedro and Schiffer, Maximilian and Serra, Thiago and Vidal, Thibaut , title =. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence , volume=. 2023 , isbn =

  36. [36]

    2010 , howpublished =

    Charytanowicz, Magorzata and Niewczas, Jerzy and Kulczycki, Piotr and Kowalski, Piotr and Lukasik, Szymon , title =. 2010 , howpublished =

  37. [37]

    1990 , howpublished =

    Wolberg, WIlliam , title =. 1990 , howpublished =