Optimal Recourse Summaries via Bi-Objective Decision Tree Learning

Athanasios Voulodimos; Giorgos Stamou; Ioannis Chatzis; Jason Liartis

arxiv: 2605.07598 · v2 · pith:XPMPOXGZnew · submitted 2026-05-08 · 💻 cs.LG

Optimal Recourse Summaries via Bi-Objective Decision Tree Learning

Ioannis Chatzis , Jason Liartis , Athanasios Voulodimos , Giorgos Stamou This is my paper

Pith reviewed 2026-05-22 10:36 UTC · model grok-4.3

classification 💻 cs.LG

keywords recourse summariesactionable recoursedecision tree learningPareto frontbi-objective optimizationmachine learning fairnessglobal auditing

0 comments

The pith

SOGAR learns recourse summaries by solving a bi-objective decision tree problem to recover the full Pareto front of effectiveness versus cost.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to build global recourse summaries that split the population into subgroups and give every member of a subgroup the same action to flip an unfavorable classifier decision. It casts the design of those summaries as a bi-objective optimization task solved by learning optimal decision trees, with one objective pushing for actions that work for more people and the other pushing for lower-cost actions. Computing the entire Pareto front produces every non-dominated solution, so a practitioner can choose any desired balance after training finishes instead of locking in a single compromise during learning. This matters because local recourse actions do not aggregate cleanly for system-wide checks on fairness or bias, and earlier summary techniques forced users into one fixed trade-off that might not fit their needs.

Core claim

SOGAR formulates recourse summary learning as an optimal decision tree learning problem and finds the Pareto front, the complete set of solutions where improving one objective necessarily worsens the other. Using shallow axis-parallel decision trees and sparse leaf actions, SOGAR produces stable, low-cost, and effective recourse summaries that outperform existing approaches across effectiveness and cost metrics.

What carries the argument

Bi-objective optimal decision tree learning that enumerates the Pareto front of population partitions and shared leaf actions for recourse summaries.

If this is right

Any point on the effectiveness-cost trade-off can be selected after training without retraining the model.
The resulting summaries assign consistent actions within each population subgroup, supporting global auditing.
The summaries achieve better joint performance on both objectives than previous single-objective or heuristic approaches.
Stability of the summaries follows directly from the global partition produced by the decision tree.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same bi-objective tree search could be applied to other partitioning tasks that balance multiple fairness or utility criteria.
Deeper or oblique trees might yield finer partitions on complex data, though the paper restricts itself to shallow axis-parallel ones.
The Pareto front view makes it possible to study how the optimal recourse action changes as the cost budget is tightened.

Load-bearing premise

Shallow axis-parallel decision trees with sparse leaf actions are expressive enough to produce stable global recourse summaries that improve on prior methods in both effectiveness and cost at the same time.

What would settle it

A benchmark dataset where SOGAR's Pareto summaries fail to dominate prior summary methods on effectiveness while keeping cost equal or lower would falsify the performance claim.

Figures

Figures reproduced from arXiv: 2605.07598 by Athanasios Voulodimos, Giorgos Stamou, Ioannis Chatzis, Jason Liartis.

**Figure 1.** Figure 1: SOGAR vs. baselines on Employee Attrition. Closest to (0, 0) is better. Blue dots show the full SOGAR trade-off; other methods produce a single solution each. To address this, we introduce Summaries of Optimal and Global Actionable Recourse (SOGAR), which formulates recourse summary learning as a bi-objective optimal decision tree problem, the objectives being the recourse cost and loss. Each summary is … view at source ↗

**Figure 2.** Figure 2: The figure depicts the end-to-end pipeline of SOGAR. From left to right, the process starts [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Pareto front on Employee Attrition (LightGBM, depth 3). Blue dashed line: train Pareto front. Orange dots: held-out test metrics. Average Euclidean distance: 0.04 ± 0.02 . Experiments presented in this section were conducted on a machine with i9-14900KS CPU (24 cores), 128GB RAM, and RTX 4090 GPU (24GB VRAM). The SOGAR optimization task is implemented as a STreeD task in C++17. The remainder of the implem… view at source ↗

**Figure 4.** Figure 4: Loss and cost disparity across the Pareto front. The cost gap is persistent while the loss gap [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗

**Figure 5.** Figure 5: Invalidity (∆I = 0.093) across the Pareto front. Female group average is consistently above male across the entire front. Findings. SOGAR generates 4,142 Paretooptimal solutions. Across the front, 96.4% of solutions exhibit higher invalidity for females than males on training instances (mean gap ∆I = +0.093; test instances: 95.9%, +0.081) [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗

**Figure 6.** Figure 6: Solutions recovered by SOGAR on Bank (LightGBM) under different timeout values. Blue: [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗

read the original abstract

Actionable Recourse provides individuals with actions they can take to change an unfavorable classifier outcome. While useful at the instance level, it is ill-suited for global auditing and bias detection, since aggregating local actions is costly and often inconsistent. Recourse Summaries address this limitation by partitioning the population and assigning one shared action per subgroup, enabling comparison across subgroups. Designing summaries involves a fundamental trade-off between recourse effectiveness and recourse cost, which existing methods do not adequately address. We introduce Summaries of Optimal and Global Actionable Recourse (SOGAR), which formulates recourse summary learning as an optimal decision tree learning problem and finds the Pareto front -- the complete set of solutions where improving one objective necessarily worsens the other. SOGAR enables post-hoc selection of the desired trade-off without retraining. Using shallow axis-parallel decision trees and sparse leaf actions, SOGAR produces stable, low-cost, and effective recourse summaries that outperform existing approaches across effectiveness and cost metrics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces SOGAR, which formulates recourse summary learning as a bi-objective optimal decision tree learning problem. It computes the Pareto front over the trade-off between recourse effectiveness and cost using shallow axis-parallel decision trees with sparse per-leaf actions, enabling post-hoc selection of the desired balance without retraining and claiming to produce stable summaries that outperform prior methods on both objectives.

Significance. If the central claims hold, the work would offer a principled mechanism for generating global recourse summaries suitable for classifier auditing and bias detection, with the Pareto-front property providing flexibility that existing single-objective methods lack. The bi-objective optimal-tree formulation is a clear technical contribution that avoids repeated retraining.

major comments (2)

[Abstract] Abstract: the claim that SOGAR 'outperform[s] existing approaches across effectiveness and cost metrics' is presented without any reference to datasets, baselines, quantitative deltas, error bars, or ablation studies; because the outperformance is load-bearing for the contribution, this absence prevents evaluation of whether the reported Pareto front is actually superior.
[Method] Method formulation (as described in the abstract): the approach rests on the assumption that shallow axis-parallel splits plus sparse leaf actions suffice to partition the input space into regions where a single action is near-optimal. In regimes with non-linear interactions or high-dimensional dependence, leaves may contain heterogeneous subgroups, forcing either high-cost or low-effectiveness actions and yielding a Pareto front that can be dominated by methods employing richer partition families.

minor comments (1)

[Abstract] Abstract: the acronym SOGAR is defined after its first use; expanding it on first appearance would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below, indicating planned revisions where appropriate while defending the core contributions on substantive grounds.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that SOGAR 'outperform[s] existing approaches across effectiveness and cost metrics' is presented without any reference to datasets, baselines, quantitative deltas, error bars, or ablation studies; because the outperformance is load-bearing for the contribution, this absence prevents evaluation of whether the reported Pareto front is actually superior.

Authors: The abstract is a concise summary of the paper's main claims and results. The full experimental evaluation, including the specific datasets (Adult, German Credit, and others), baseline methods (e.g., single-objective recourse summary approaches), quantitative deltas with standard deviations across multiple runs, and ablation studies on tree depth and action sparsity, is reported in the Experiments section with accompanying tables and figures. To improve self-containment of the abstract without altering its length substantially, we will add a brief clause referencing the key empirical outcomes on these benchmarks. revision: yes
Referee: [Method] Method formulation (as described in the abstract): the approach rests on the assumption that shallow axis-parallel splits plus sparse leaf actions suffice to partition the input space into regions where a single action is near-optimal. In regimes with non-linear interactions or high-dimensional dependence, leaves may contain heterogeneous subgroups, forcing either high-cost or low-effectiveness actions and yielding a Pareto front that can be dominated by methods employing richer partition families.

Authors: We acknowledge that restricting to shallow axis-parallel trees and sparse leaf actions is a deliberate modeling choice that prioritizes interpretability and tractability of the bi-objective optimization. This family may indeed leave residual heterogeneity in leaves under strong non-linear or high-dimensional interactions, potentially yielding a Pareto front that could be improved by richer partitions. However, the optimization explicitly searches for the best tree and actions within the chosen family, and our results show it produces non-dominated fronts relative to prior single-objective methods on the evaluated recourse tasks. We will add a limitations paragraph discussing this assumption and noting that extensions to oblique or deeper trees remain compatible with the bi-objective framework. revision: partial

Circularity Check

0 steps flagged

No circularity: new bi-objective DT formulation presented as modeling choice

full rationale

The paper formulates recourse summary learning as a bi-objective optimal decision tree problem to compute the Pareto front of effectiveness vs. cost trade-offs. This is introduced as a direct optimization setup using shallow axis-parallel trees and sparse leaf actions, without any quoted reduction of outputs to fitted inputs, self-definitional equations, or load-bearing self-citations that would make the central result equivalent to its premises by construction. The abstract and described approach treat the tree structure and action sparsity as explicit design decisions rather than derived quantities, and the outperformance claim rests on empirical comparison rather than algebraic equivalence to prior results. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The approach relies on standard assumptions of decision tree learning and multi-objective optimization; no new free parameters, axioms, or invented entities are introduced in the abstract description.

pith-pipeline@v0.9.0 · 5707 in / 1047 out tokens · 38160 ms · 2026-05-22T10:36:11.730658+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We define the optimization task o=⟨g,t,≻,⊕,c,s0⟩ ... comparison operator ≻ is the strict Pareto dominance ... combining operator ⊕ is element-wise addition
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean LogicNat ≃ Nat recovery unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

recourse summary tree optimization is a separable optimization task ... enabling the use of their dynamic programming framework, named STreeD

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 2 internal anchors

[1]

Strong optimal classification trees.arXiv preprint arXiv:2103.15965, 2021

Sina Aghaei, Andrés Gómez, and Phebe Vayanos. Strong optimal classification trees.arXiv preprint arXiv:2103.15965, 2021

work page arXiv 2021
[2]

Learning optimal decision trees using caching branch-and-bound search

Gaël Aglin, Siegfried Nijssen, and Pierre Schaus. Learning optimal decision trees using caching branch-and-bound search. InProceedings of the AAAI conference on artificial intelligence, volume 34, pages 3146–3153, 2020

work page 2020
[3]

Barry Becker and Ronny Kohavi. Adult. UCI Machine Learning Repository, 1996. DOI: https://doi.org/10.24432/C5XW20

work page doi:10.24432/c5xw20 1996
[4]

Optimal classification trees.Machine Learning, 106(7):1039–1082, July 2017

Dimitris Bertsimas and Jack Dunn. Optimal classification trees.Machine Learning, 106(7):1039–1082, July 2017

work page 2017
[5]

Interpretable Clustering via Optimal Trees

Dimitris Bertsimas, Agni Orfanoudaki, and Holly Wiberg. Interpretable clustering via optimal trees.arXiv preprint arXiv:1812.00539, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[6]

Counterfactual metarules for local and global recourse.arXiv preprint arXiv:2405.18875, 2024

Tom Bewley, Salim I Amoukou, Saumitra Mishra, Daniele Magazzeni, and Manuela Veloso. Counterfactual metarules for local and global recourse.arXiv preprint arXiv:2405.18875, 2024

work page arXiv 2024
[7]

Classification and regression trees

L Breiman, JH Friedman, R Olshen, and CJ Stone. Classification and regression trees. 1984

work page 1984
[8]

Xgboost: A scalable tree boosting system

Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, page 785–794, New York, NY , USA, 2016. Association for Computing Machinery

work page 2016
[9]

Counterfac- tuals and causability in explainable artificial intelligence: Theory, algorithms, and applications

Yu-Liang Chou, Catarina Moreira, Peter Bruza, Chun Ouyang, and Joaquim Jorge. Counterfac- tuals and causability in explainable artificial intelligence: Theory, algorithms, and applications. Information Fusion, 81:59–83, May 2022

work page 2022
[10]

Murtree: Optimal decision trees via dynamic programming and search.Journal of Machine Learning Research, 23(26):1–47, 2022

Emir Demirovi´c, Anna Lukina, Emmanuel Hebrard, Jeffrey Chan, James Bailey, Christopher Leckie, Kotagiri Ramamohanarao, and Peter J Stuckey. Murtree: Optimal decision trees via dynamic programming and search.Journal of Machine Learning Research, 23(26):1–47, 2022

work page 2022
[11]

A survey of methods for explaining black box models.ACM computing surveys (CSUR), 51(5):1–42, 2018

Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models.ACM computing surveys (CSUR), 51(5):1–42, 2018

work page 2018
[12]

Statlog (German Credit Data)

Hans Hofmann. Statlog - german credit data. UCI Machine Learning Repository, 1994. DOI: https://doi.org/10.24432/C5NC77

work page doi:10.24432/c5nc77 1994
[13]

Optimal sparse decision trees.Advances in neural information processing systems, 32, 2019

Xiyang Hu, Cynthia Rudin, and Margo Seltzer. Optimal sparse decision trees.Advances in neural information processing systems, 32, 2019

work page 2019
[14]

Laurent Hyafil and Ronald L. Rivest. Constructing optimal binary decision trees is np-complete. Information Processing Letters, 5(1):15–17, 1976. 10

work page 1976
[15]

Counterfactual explanation trees: Transparent and consistent actionable recourse with decision trees

Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, and Yuichi Ike. Counterfactual explanation trees: Transparent and consistent actionable recourse with decision trees. InInternational Conference on Artificial Intelligence and Statistics, pages 1846–1870. PMLR, 2022

work page 2022
[16]

Algorithmic recourse: from counterfactual explanations to interventions

Amir-Hossein Karimi, Bernhard Schölkopf, and Isabel Valera. Algorithmic recourse: from counterfactual explanations to interventions. InProceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21, page 353–362, New York, NY , USA, March 2021. Association for Computing Machinery

work page 2021
[17]

Glance: Global actions in a nutshell for counterfactual explainability, 2025

Loukas Kavouras, Eleni Psaroudaki, Konstantinos Tsopelas, Dimitrios Rontogiannis, Nikolaos Theologitis, Dimitris Sacharidis, Giorgos Giannopoulos, Dimitrios Tomaras, Kleopatra Markou, Dimitrios Gunopulos, Dimitris Fotakis, and Ioannis Emiris. Glance: Global actions in a nutshell for counterfactual explainability, 2025

work page 2025
[18]

Lightgbm: A highly efficient gradient boosting decision tree.Advances in neural information processing systems, 30, 2017

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. Lightgbm: A highly efficient gradient boosting decision tree.Advances in neural information processing systems, 30, 2017

work page 2017
[19]

Globe-ce: A translation-based approach for global counterfactual explanations

Dan Ley, Saumitra Mishra, and Daniele Magazzeni. Globe-ce: A translation-based approach for global counterfactual explanations. (arXiv:2305.17021), December 2023. arXiv:2305.17021 [cs]

work page arXiv 2023
[20]

Generalized and scalable optimal sparse decision trees

Jimmy Lin, Chudi Zhong, Diane Hu, Cynthia Rudin, and Margo Seltzer. Generalized and scalable optimal sparse decision trees. (arXiv:2006.08690), November 2022. arXiv:2006.08690 [cs]

work page arXiv 2006
[21]

A Unified Approach to Interpreting Model Predictions

Scott Lundberg and Su-In Lee. A unified approach to interpreting model predictions. (arXiv:1705.07874), November 2017. arXiv:1705.07874 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2017
[22]

Fast sparse decision tree optimization via reference ensembles

Hayden McTavish, Chudi Zhong, Reto Achermann, Ilias Karimalis, Jacques Chen, Cynthia Rudin, and Margo Seltzer. Fast sparse decision tree optimization via reference ensembles. In Proceedings of the AAAI conference on artificial intelligence, volume 36, pages 9604–9613, 2022

work page 2022
[23]

S. Moro, P. Rita, and P. Cortez. Bank marketing. UCI Machine Learning Repository, 2014. DOI: https://doi.org/10.24432/C5K306

work page doi:10.24432/c5k306 2014
[24]

Learning optimal decision trees with sat

Nina Narodytska, Alexey Ignatiev, Filipe Pereira, and Joao Marques-Silva. Learning optimal decision trees with sat. InInternational Joint Conference on Artificial Intelligence 2018, pages 1362–1368. Association for the Advancement of Artificial Intelligence (AAAI), 2018

work page 2018
[25]

Mining optimal decision trees from itemset lattices

Siegfried Nijssen and Elisa Fromont. Mining optimal decision trees from itemset lattices. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 530–539, 2007

work page 2007
[26]

5: Programs for Machine Learning

J Ross Quinlan.C4. 5: Programs for Machine Learning. Morgan Kaufmann, 1993

work page 1993
[27]

Beyond individualized recourse: Interpretable and interactive summaries of actionable recourses.Advances in Neural Information Processing Systems, 33:12187–12198, 2020

Kaivalya Rawal and Himabindu Lakkaraju. Beyond individualized recourse: Interpretable and interactive summaries of actionable recourses.Advances in Neural Information Processing Systems, 33:12187–12198, 2020

work page 2020
[28]

why should i trust you?

Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. " why should i trust you?" explaining the predictions of any classifier. InProceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1135–1144, 2016

work page 2016
[29]

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.Nature machine intelligence, 1(5):206–215, 2019

Cynthia Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.Nature machine intelligence, 1(5):206–215, 2019

work page 2019
[30]

Ibm hr analytics employee attrition & performance dataset

Pavan Subhash. Ibm hr analytics employee attrition & performance dataset. https://www. kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset ,

work page
[31]

Accessed: 2025-11-04

work page 2025
[32]

Actionable recourse in linear classification

Berk Ustun, Alexander Spangher, and Yang Liu. Actionable recourse in linear classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, page 10–19, New York, NY , USA, January 2019. Association for Computing Machinery. 11

work page 2019
[33]

Necessary and sufficient con- ditions for optimal decision trees using dynamic programming.Advances in Neural Information Processing Systems, 36:9173–9212, 2023

Jacobus van der Linden, Mathijs de Weerdt, and Emir Demirovi´c. Necessary and sufficient con- ditions for optimal decision trees using dynamic programming.Advances in Neural Information Processing Systems, 36:9173–9212, 2023

work page 2023
[34]

Counterfactual explanations without opening the black box: Automated decisions and the gdpr, 2018

Sandra Wachter, Brent Mittelstadt, and Chris Russell. Counterfactual explanations without opening the black box: Automated decisions and the gdpr, 2018

work page 2018
[35]

The four-fifths rule is not disparate impact: a woeful tale of epistemic trespassing in algorithmic fairness

Elizabeth Anne Watkins and Jiahao Chen. The four-fifths rule is not disparate impact: a woeful tale of epistemic trespassing in algorithmic fairness. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, pages 764–775, 2024

work page 2024
[36]

translates

Rui Zhang, Rui Xin, Margo Seltzer, and Cynthia Rudin. Optimal sparse survival trees.Proceed- ings of machine learning research, 238:352, 2024. A Proof of Proposition 1 We prove Proposition 1 by showing that recourse summary tree optimization is aseparableoptimiza- tion task in the framework of [32]. By their Theorem 4.7 this guarantees that dynamic progra...

work page 2024
[37]

50) is necessary to control the search space on a dataset with over 30,000 training instances and 161 binarised features

The larger minimum leaf size for Adult (500 vs. 50) is necessary to control the search space on a dataset with over 30,000 training instances and 161 binarised features. Ablations on depth, sparsity, and bins are reported in Appendix D. Table 3: SOGAR default hyperparameters for all main experiments. Parameter Value Max depthd3 Max nodesm7 Min leaf sizeℓ5...

work page
[38]

Depth 2 still outperforms all baselines on the invalidity metric across every dataset and classifier combination, making it a practical default when compute is constrained

to 0.57 (depth 2) while runtime drops from 1,235 s to 51 s, a 24× speed-up for a 9.6% quality degradation. Depth 2 still outperforms all baselines on the invalidity metric across every dataset and classifier combination, making it a practical default when compute is constrained. The solutions are also maximally interpretable, producing exactly four subgro...

work page

[1] [1]

Strong optimal classification trees.arXiv preprint arXiv:2103.15965, 2021

Sina Aghaei, Andrés Gómez, and Phebe Vayanos. Strong optimal classification trees.arXiv preprint arXiv:2103.15965, 2021

work page arXiv 2021

[2] [2]

Learning optimal decision trees using caching branch-and-bound search

Gaël Aglin, Siegfried Nijssen, and Pierre Schaus. Learning optimal decision trees using caching branch-and-bound search. InProceedings of the AAAI conference on artificial intelligence, volume 34, pages 3146–3153, 2020

work page 2020

[3] [3]

Barry Becker and Ronny Kohavi. Adult. UCI Machine Learning Repository, 1996. DOI: https://doi.org/10.24432/C5XW20

work page doi:10.24432/c5xw20 1996

[4] [4]

Optimal classification trees.Machine Learning, 106(7):1039–1082, July 2017

Dimitris Bertsimas and Jack Dunn. Optimal classification trees.Machine Learning, 106(7):1039–1082, July 2017

work page 2017

[5] [5]

Interpretable Clustering via Optimal Trees

Dimitris Bertsimas, Agni Orfanoudaki, and Holly Wiberg. Interpretable clustering via optimal trees.arXiv preprint arXiv:1812.00539, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[6] [6]

Counterfactual metarules for local and global recourse.arXiv preprint arXiv:2405.18875, 2024

Tom Bewley, Salim I Amoukou, Saumitra Mishra, Daniele Magazzeni, and Manuela Veloso. Counterfactual metarules for local and global recourse.arXiv preprint arXiv:2405.18875, 2024

work page arXiv 2024

[7] [7]

Classification and regression trees

L Breiman, JH Friedman, R Olshen, and CJ Stone. Classification and regression trees. 1984

work page 1984

[8] [8]

Xgboost: A scalable tree boosting system

Tianqi Chen and Carlos Guestrin. Xgboost: A scalable tree boosting system. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, page 785–794, New York, NY , USA, 2016. Association for Computing Machinery

work page 2016

[9] [9]

Counterfac- tuals and causability in explainable artificial intelligence: Theory, algorithms, and applications

Yu-Liang Chou, Catarina Moreira, Peter Bruza, Chun Ouyang, and Joaquim Jorge. Counterfac- tuals and causability in explainable artificial intelligence: Theory, algorithms, and applications. Information Fusion, 81:59–83, May 2022

work page 2022

[10] [10]

Murtree: Optimal decision trees via dynamic programming and search.Journal of Machine Learning Research, 23(26):1–47, 2022

Emir Demirovi´c, Anna Lukina, Emmanuel Hebrard, Jeffrey Chan, James Bailey, Christopher Leckie, Kotagiri Ramamohanarao, and Peter J Stuckey. Murtree: Optimal decision trees via dynamic programming and search.Journal of Machine Learning Research, 23(26):1–47, 2022

work page 2022

[11] [11]

A survey of methods for explaining black box models.ACM computing surveys (CSUR), 51(5):1–42, 2018

Riccardo Guidotti, Anna Monreale, Salvatore Ruggieri, Franco Turini, Fosca Giannotti, and Dino Pedreschi. A survey of methods for explaining black box models.ACM computing surveys (CSUR), 51(5):1–42, 2018

work page 2018

[12] [12]

Statlog (German Credit Data)

Hans Hofmann. Statlog - german credit data. UCI Machine Learning Repository, 1994. DOI: https://doi.org/10.24432/C5NC77

work page doi:10.24432/c5nc77 1994

[13] [13]

Optimal sparse decision trees.Advances in neural information processing systems, 32, 2019

Xiyang Hu, Cynthia Rudin, and Margo Seltzer. Optimal sparse decision trees.Advances in neural information processing systems, 32, 2019

work page 2019

[14] [14]

Laurent Hyafil and Ronald L. Rivest. Constructing optimal binary decision trees is np-complete. Information Processing Letters, 5(1):15–17, 1976. 10

work page 1976

[15] [15]

Counterfactual explanation trees: Transparent and consistent actionable recourse with decision trees

Kentaro Kanamori, Takuya Takagi, Ken Kobayashi, and Yuichi Ike. Counterfactual explanation trees: Transparent and consistent actionable recourse with decision trees. InInternational Conference on Artificial Intelligence and Statistics, pages 1846–1870. PMLR, 2022

work page 2022

[16] [16]

Algorithmic recourse: from counterfactual explanations to interventions

Amir-Hossein Karimi, Bernhard Schölkopf, and Isabel Valera. Algorithmic recourse: from counterfactual explanations to interventions. InProceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT ’21, page 353–362, New York, NY , USA, March 2021. Association for Computing Machinery

work page 2021

[17] [17]

Glance: Global actions in a nutshell for counterfactual explainability, 2025

Loukas Kavouras, Eleni Psaroudaki, Konstantinos Tsopelas, Dimitrios Rontogiannis, Nikolaos Theologitis, Dimitris Sacharidis, Giorgos Giannopoulos, Dimitrios Tomaras, Kleopatra Markou, Dimitrios Gunopulos, Dimitris Fotakis, and Ioannis Emiris. Glance: Global actions in a nutshell for counterfactual explainability, 2025

work page 2025

[18] [18]

Lightgbm: A highly efficient gradient boosting decision tree.Advances in neural information processing systems, 30, 2017

Guolin Ke, Qi Meng, Thomas Finley, Taifeng Wang, Wei Chen, Weidong Ma, Qiwei Ye, and Tie-Yan Liu. Lightgbm: A highly efficient gradient boosting decision tree.Advances in neural information processing systems, 30, 2017

work page 2017

[19] [19]

Globe-ce: A translation-based approach for global counterfactual explanations

Dan Ley, Saumitra Mishra, and Daniele Magazzeni. Globe-ce: A translation-based approach for global counterfactual explanations. (arXiv:2305.17021), December 2023. arXiv:2305.17021 [cs]

work page arXiv 2023

[20] [20]

Generalized and scalable optimal sparse decision trees

Jimmy Lin, Chudi Zhong, Diane Hu, Cynthia Rudin, and Margo Seltzer. Generalized and scalable optimal sparse decision trees. (arXiv:2006.08690), November 2022. arXiv:2006.08690 [cs]

work page arXiv 2006

[21] [21]

A Unified Approach to Interpreting Model Predictions

Scott Lundberg and Su-In Lee. A unified approach to interpreting model predictions. (arXiv:1705.07874), November 2017. arXiv:1705.07874 [cs]

work page internal anchor Pith review Pith/arXiv arXiv 2017

[22] [22]

Fast sparse decision tree optimization via reference ensembles

Hayden McTavish, Chudi Zhong, Reto Achermann, Ilias Karimalis, Jacques Chen, Cynthia Rudin, and Margo Seltzer. Fast sparse decision tree optimization via reference ensembles. In Proceedings of the AAAI conference on artificial intelligence, volume 36, pages 9604–9613, 2022

work page 2022

[23] [23]

S. Moro, P. Rita, and P. Cortez. Bank marketing. UCI Machine Learning Repository, 2014. DOI: https://doi.org/10.24432/C5K306

work page doi:10.24432/c5k306 2014

[24] [24]

Learning optimal decision trees with sat

Nina Narodytska, Alexey Ignatiev, Filipe Pereira, and Joao Marques-Silva. Learning optimal decision trees with sat. InInternational Joint Conference on Artificial Intelligence 2018, pages 1362–1368. Association for the Advancement of Artificial Intelligence (AAAI), 2018

work page 2018

[25] [25]

Mining optimal decision trees from itemset lattices

Siegfried Nijssen and Elisa Fromont. Mining optimal decision trees from itemset lattices. In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 530–539, 2007

work page 2007

[26] [26]

5: Programs for Machine Learning

J Ross Quinlan.C4. 5: Programs for Machine Learning. Morgan Kaufmann, 1993

work page 1993

[27] [27]

Beyond individualized recourse: Interpretable and interactive summaries of actionable recourses.Advances in Neural Information Processing Systems, 33:12187–12198, 2020

Kaivalya Rawal and Himabindu Lakkaraju. Beyond individualized recourse: Interpretable and interactive summaries of actionable recourses.Advances in Neural Information Processing Systems, 33:12187–12198, 2020

work page 2020

[28] [28]

why should i trust you?

Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. " why should i trust you?" explaining the predictions of any classifier. InProceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1135–1144, 2016

work page 2016

[29] [29]

Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.Nature machine intelligence, 1(5):206–215, 2019

Cynthia Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.Nature machine intelligence, 1(5):206–215, 2019

work page 2019

[30] [30]

Ibm hr analytics employee attrition & performance dataset

Pavan Subhash. Ibm hr analytics employee attrition & performance dataset. https://www. kaggle.com/datasets/pavansubhasht/ibm-hr-analytics-attrition-dataset ,

work page

[31] [31]

Accessed: 2025-11-04

work page 2025

[32] [32]

Actionable recourse in linear classification

Berk Ustun, Alexander Spangher, and Yang Liu. Actionable recourse in linear classification. In Proceedings of the Conference on Fairness, Accountability, and Transparency, FAT* ’19, page 10–19, New York, NY , USA, January 2019. Association for Computing Machinery. 11

work page 2019

[33] [33]

Necessary and sufficient con- ditions for optimal decision trees using dynamic programming.Advances in Neural Information Processing Systems, 36:9173–9212, 2023

Jacobus van der Linden, Mathijs de Weerdt, and Emir Demirovi´c. Necessary and sufficient con- ditions for optimal decision trees using dynamic programming.Advances in Neural Information Processing Systems, 36:9173–9212, 2023

work page 2023

[34] [34]

Counterfactual explanations without opening the black box: Automated decisions and the gdpr, 2018

Sandra Wachter, Brent Mittelstadt, and Chris Russell. Counterfactual explanations without opening the black box: Automated decisions and the gdpr, 2018

work page 2018

[35] [35]

The four-fifths rule is not disparate impact: a woeful tale of epistemic trespassing in algorithmic fairness

Elizabeth Anne Watkins and Jiahao Chen. The four-fifths rule is not disparate impact: a woeful tale of epistemic trespassing in algorithmic fairness. InProceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, pages 764–775, 2024

work page 2024

[36] [36]

translates

Rui Zhang, Rui Xin, Margo Seltzer, and Cynthia Rudin. Optimal sparse survival trees.Proceed- ings of machine learning research, 238:352, 2024. A Proof of Proposition 1 We prove Proposition 1 by showing that recourse summary tree optimization is aseparableoptimiza- tion task in the framework of [32]. By their Theorem 4.7 this guarantees that dynamic progra...

work page 2024

[37] [37]

50) is necessary to control the search space on a dataset with over 30,000 training instances and 161 binarised features

The larger minimum leaf size for Adult (500 vs. 50) is necessary to control the search space on a dataset with over 30,000 training instances and 161 binarised features. Ablations on depth, sparsity, and bins are reported in Appendix D. Table 3: SOGAR default hyperparameters for all main experiments. Parameter Value Max depthd3 Max nodesm7 Min leaf sizeℓ5...

work page

[38] [38]

Depth 2 still outperforms all baselines on the invalidity metric across every dataset and classifier combination, making it a practical default when compute is constrained

to 0.57 (depth 2) while runtime drops from 1,235 s to 51 s, a 24× speed-up for a 9.6% quality degradation. Depth 2 still outperforms all baselines on the invalidity metric across every dataset and classifier combination, making it a practical default when compute is constrained. The solutions are also maximally interpretable, producing exactly four subgro...

work page