pith. sign in

arxiv: 2606.00202 · v1 · pith:X5RA6FVLnew · submitted 2026-05-29 · 💻 cs.LG · cs.AI

From Rashomon Theory to PRAXIS: Efficient Decision Tree Rashomon Sets

Pith reviewed 2026-06-28 23:12 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords Rashomon setsdecision treesapproximation algorithmsmodel uncertaintyinterpretable machine learningsparse modelsrobust decision making
0
0 comments X

The pith

PRAXIS approximates the full Rashomon set of sparse decision trees using orders of magnitude less time and memory.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PRAXIS as an algorithm that approximates the Rashomon set—the collection of all near-optimal sparse decision trees—for a given dataset and objective. Prior exact methods require too much memory and runtime to handle realistic problems, restricting Rashomon sets to toy cases. PRAXIS delivers large efficiency gains while recovering nearly every model from the complete set on the datasets examined. This shift makes it practical to examine model diversity, quantify uncertainty, and inject domain preferences into model selection. A reader would care because these sets support robust, preference-aware decisions instead of committing to one model.

Core claim

PRAXIS is an algorithm to approximate the Rashomon set of sparse decision trees that achieves orders of magnitude improvement in runtime and memory usage while regularly recovering almost all of the models present in the full Rashomon set, thereby enabling scalable modeling of these sets on real-world datasets.

What carries the argument

The PRAXIS algorithm for approximating Rashomon sets of sparse decision trees by efficient search that avoids exhaustive enumeration.

If this is right

  • Researchers can now compute Rashomon sets for larger, real-world datasets instead of only small problems.
  • Practitioners gain a practical way to quantify model uncertainty and diversity for a fixed objective.
  • Domain knowledge can be incorporated into decisions by selecting among many near-optimal trees rather than one.
  • Downstream tasks such as robust prediction or preference-based model choice become feasible at scale.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method may extend naturally to other interpretable model classes once similar search structures are identified.
  • Widespread adoption could shift standard practice from single-model selection to set-based analysis in high-stakes domains.
  • It opens the possibility of new tools that let users interactively explore and filter large Rashomon sets.

Load-bearing premise

The approximation quality seen on the validation datasets will hold for other real-world data and the recovered models will retain the diversity and uncertainty properties of the exact set.

What would settle it

A new dataset on which PRAXIS recovers substantially fewer than almost all models from the exact Rashomon set or where the approximated set loses key diversity properties.

Figures

Figures reproduced from arXiv: 2606.00202 by Cynthia Rudin, Hayden McTavish, Margo Seltzer, Varun Babbar, Zakk Heile.

Figure 1
Figure 1. Figure 1: An illustration of PRAXIS and other Rashomon set algorithms on the News dataset (Fernandes et al., 2015b), λ = 0.02, ε = 0.03, depth = 5. PRAXIS runs orders of magnitude faster than competitors while ensuring near-perfect recall relative to optimal methods. find a model that maximizes fairness, obeys causal hypothe￾ses, uses certain features, and/or follows specific structural constraints. These modeling g… view at source ↗
Figure 2
Figure 2. Figure 2: Frontier cut for internal node nd′ . The path to the node is shaded in orange. The nodes that are summed in the frontier cut are depicted in blue. for a node N of a given tree corresponds to those subprob￾lems not already encountered along the path from the root to node N. Theorem 3.5 tells us that if, for each internal node of a tree, proxy algorithm completions along that node’s frontier cut remain withi… view at source ↗
Figure 3
Figure 3. Figure 3: Approximation quality of non-optimal methods for the 4 datasets where optimal methods either take > 40 hours or > 200 GB of RAM. Each plot shows the number of depth 5 trees with objective value within an x factor of the minimum objective found by any method. The dashed vertical line shows the Rashomon bound estimated as (1 + ϵ) times the minimum objective found by any method. lustration, on News with λ = 0… view at source ↗
Figure 4
Figure 4. Figure 4: Runtime vs. fraction of the Rashomon set (λ = 0.005, εmult = 0.03) recovered when stopping SORTeD early. The majority of time is spent computing the optimal tree, while early stopping yields only a small fraction of the set. 101 [PITH_FULL_IMAGE:figures/full_fig_p101_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Adult. Example near-optimal trees with accuracy and number of leaves shown below each tree. 103 [PITH_FULL_IMAGE:figures/full_fig_p103_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Bank. Example near-optimal trees with accuracy and number of leaves shown below each tree. 104 [PITH_FULL_IMAGE:figures/full_fig_p104_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Bike. Example near-optimal trees with accuracy and number of leaves shown below each tree. 105 [PITH_FULL_IMAGE:figures/full_fig_p105_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Churn. Example near-optimal trees with accuracy and number of leaves shown below each tree. 106 [PITH_FULL_IMAGE:figures/full_fig_p106_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Compas. Example near-optimal trees with accuracy and number of leaves shown below each tree. 107 [PITH_FULL_IMAGE:figures/full_fig_p107_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Covertype. Example near-optimal trees with accuracy and number of leaves shown below each tree. 108 [PITH_FULL_IMAGE:figures/full_fig_p108_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Droid. Example near-optimal trees with accuracy and number of leaves shown below each tree. 109 [PITH_FULL_IMAGE:figures/full_fig_p109_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Heloc. Example near-optimal trees with accuracy and number of leaves shown below each tree. (a) Accuracy: 82.9%, Leaves: 7 (b) Accuracy: 83.3%, Leaves: 8 (c) Accuracy: 83.6%, Leaves: 9 (d) Accuracy: 83.0%, Leaves: 8 [PITH_FULL_IMAGE:figures/full_fig_p110_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Magic. Example near-optimal trees with accuracy and number of leaves shown below each tree. 110 [PITH_FULL_IMAGE:figures/full_fig_p110_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Shopping. Example near-optimal trees with accuracy and number of leaves shown below each tree. 111 [PITH_FULL_IMAGE:figures/full_fig_p111_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: Spambase. Example near-optimal trees with accuracy and number of leaves shown below each tree. 112 [PITH_FULL_IMAGE:figures/full_fig_p112_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Student. Example near-optimal trees with accuracy and number of leaves shown below each tree. 113 [PITH_FULL_IMAGE:figures/full_fig_p113_16.png] view at source ↗
read the original abstract

Standard machine learning pipelines often admit many near-optimal models. These "Rashomon sets" pose a range of challenges and opportunities for uncertainty-aware, robust decision making. They allow users to incorporate domain knowledge and preferences that would otherwise be difficult to specify directly in an objective, and they quantify diversity among valid models for a given training dataset and objective function. However, computation of Rashomon sets, even for simple, interpretable model classes such as sparse decision trees, continues to require immense memory and runtime resources. We present PRAXIS, an algorithm to approximate this Rashomon set with orders of magnitude improvement in runtime and memory usage. We validate that PRAXIS regularly recovers almost all of the full Rashomon set. PRAXIS allows researchers and practitioners to scalably model the Rashomon set for real-world datasets. Code for PRAXIS is available at https://github.com/zakk-h/PRAXIS

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces PRAXIS, an algorithm for efficiently approximating Rashomon sets of sparse decision trees. It claims orders-of-magnitude improvements in runtime and memory over exact enumeration methods while regularly recovering nearly the entire Rashomon set on validation datasets, thereby enabling scalable use of Rashomon sets for real-world data.

Significance. If the approximation quality and preservation of model diversity hold, the work would meaningfully advance practical access to Rashomon sets beyond toy scales, supporting uncertainty-aware and preference-incorporating decision making with interpretable models. The public release of code is a clear strength that aids reproducibility.

major comments (1)
  1. The abstract and validation claims rest on recovery rates and efficiency gains, yet no explicit definition or section details the precise recovery metric (e.g., exact set overlap, model equivalence under some tolerance) or the diversity/uncertainty preservation tests; without this, it is difficult to assess whether the reported 'almost all' recovery supports the downstream claims about modeling the full Rashomon set.
minor comments (2)
  1. Notation for the Rashomon set and the approximation parameters should be introduced with a dedicated preliminary section or table for clarity.
  2. The experimental section would benefit from explicit reporting of dataset characteristics (size, feature types) and hardware used for the runtime comparisons.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for highlighting the need for greater clarity on the recovery metric. We will revise the manuscript to explicitly define this metric and related evaluation details, strengthening the presentation of our validation results.

read point-by-point responses
  1. Referee: The abstract and validation claims rest on recovery rates and efficiency gains, yet no explicit definition or section details the precise recovery metric (e.g., exact set overlap, model equivalence under some tolerance) or the diversity/uncertainty preservation tests; without this, it is difficult to assess whether the reported 'almost all' recovery supports the downstream claims about modeling the full Rashomon set.

    Authors: We agree that an explicit definition is necessary. In the revised manuscript we will add a new subsection (tentatively 5.2) titled 'Recovery Metric and Evaluation Protocol' that defines recovery rate as the fraction of models in the exact Rashomon set (enumerated via the baseline method) that appear identically in the PRAXIS output, using exact structural equivalence of the decision trees (no tolerance). We will also report the complementary false-positive rate (models returned by PRAXIS but absent from the exact set). Regarding diversity and uncertainty preservation, the current experiments already compare the range of leaf predictions and the coverage of the Pareto front of accuracy-complexity trade-offs between the exact and approximated sets; we will make these comparisons explicit and add a short paragraph explaining why exact-set overlap on the tested data implies that downstream uses (preference incorporation, uncertainty quantification) remain supported. These additions directly address the concern that 'almost all' recovery may not suffice for the modeling claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper presents PRAXIS as a new algorithmic approximation method for Rashomon sets of decision trees, with claims of runtime/memory gains and empirical recovery rates validated on datasets. No derivation chain, equations, or self-citations appear in the provided abstract or context that reduce a claimed result to a fitted input or self-referential definition by construction. The central contribution is an empirical algorithm whose performance is assessed externally rather than defined tautologically from its own outputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no free parameters, axioms, or invented entities are specified in the provided text.

pith-pipeline@v0.9.1-grok · 5701 in / 1000 out tokens · 18525 ms · 2026-06-28T23:12:51.244337+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

17 extracted references · 7 canonical work pages · 1 internal anchor

  1. [1]

    Searching for exotic particles in high-energy physics with deep learning.Nature Communications, 5:4308, 2014

    doi: 10.1038/ncomms5308. URL http://dx. doi.org/10.1038/ncomms5308. Bao, M., Zhou, A., Zottola, A. S., Brubach, B., Desmarais, S., Horowitz, S. A., Lum, K., and Venkatasubramanian, S. It’s compaslicated: The messy relationship between rai datasets and algorithmic fairness benchmarks. In Proceedings of the Thirty-Fifth Conference on Neural In- formation Pr...

  2. [2]

    Blackard

    DOI: https://doi.org/10.24432/C50K5N. Blanc, G., Lange, J., Pabbaraju, C., Sullivan, C., Tan, L.- Y ., and Tiwari, M. Harnessing the power of choices in decision tree learning. InAdvances in Neural Information Processing Systems, volume 36, 2024. Bock, R. MAGIC Gamma Telescope. UCI Machine Learning Repository, 2004. DOI: https://doi.org/10.24432/C52C8B. B...

  3. [3]

    TabArena: A Living Benchmark for Machine Learning on Tabular Data

    doi: 10.15585/mmwr.mm6643a2. URL https: //doi.org/10.15585/mmwr.mm6643a2. Cattral, R. and Oppacher, F. Poker Hand. UCI Machine Learning Repository, 2002. DOI: https://doi.org/10.24432/C5KW38. Chaouki, A., Read, J., and Bifet, A. Branches: Efficiently seeking optimal sparse decision trees via ao. InForty- second International Conference on Machine Learning...

  4. [4]

    10 Beyond Rational Illusion: Behaviorally Realistic Strategic Classification Horowitz, G

    URL https://community.fico.com/s/ explainable-machine-learning-challenge . FICO Explainable Machine Learning Challenge. Guyon, I., Sun-Hosoya, L., Boull ´e, M., Escalante, H. J., Escalera, S., Liu, Z., Jajetic, D., Ray, B., Saeed, M., Sebag, M., Statnikov, A., Tu, W., and Viegas, E. Analysis of the AutoML challenge series 2015-2018. InAutoML, Springer ser...

  5. [5]

    Malani, P

    PMLR, 2020. Malani, P. N., Kullgren, J., and Solway, E. National poll on healthy aging (NPHA), United States, april 2017, 2019. URL https://doi.org/10.3886/ ICPSR37305.v1. Available from the Inter-university Consortium for Political and Social Research and UCI Machine Learning Repository. Marcoulides, G. A.Discovering Knowledge in Data: An Introduction to...

  6. [6]

    Mazumder, R., Meng, X., and Wang, H

    Available in UCI Machine Learning Repository. Mazumder, R., Meng, X., and Wang, H. Quant-BnB: A scal- able branch-and-bound method for optimal decision trees with continuous features. InInternational Conference on Machine Learning, volume 162, pp. 15255–15277. PMLR, 17–23 Jul 2022. McTavish, H., Zhong, C., Achermann, R., Karimalis, I., Chen, J., Rudin, C....

  7. [7]

    org/CorpusID:261516162

    URL https://api.semanticscholar. org/CorpusID:261516162. Sullivan, C., Tiwari, M., and Thrun, S. Maptree: Beating “optimal” decision trees with bayesian decision trees. In Proceedings of the AAAI Conference on Artificial Intelli- gence, volume 38, pp. 9019–9026, 2024. Thrun, S. The MONK’s problems-a performance comparison of different learning algorithms,...

  8. [8]

    DOI: https://doi.org/10.24432/C5V312. Wnek, J. MONK’s Problems. UCI Ma- chine Learning Repository, 1993. DOI: https://doi.org/10.24432/C5R30R. Xin, R., Zhong, C., Chen, Z., Takagi, T., Seltzer, M., and Rudin, C. Exploring the whole Rashomon set of sparse decision trees. InAdvances in Neural Information Pro- cessing Systems, volume 35, pp. 14071–14084, 202...

  9. [9]

    A single copy of the original dataset

  10. [10]

    the dependency graph structure, which can be constructed to include only nodes with a constant amount of storage space, where there are no more nodes in the dependency graph than there are nodes (split nodes or leaves) across the whole Rashomon set

  11. [11]

    Since the number of leaves and internal nodes is bounded by twice the number of leaves, we know object (2) isO(P t∈R |t|)

    Row indices for currently active subproblems in the dependency graph. Since the number of leaves and internal nodes is bounded by twice the number of leaves, we know object (2) isO(P t∈R |t|). We know object (1) is O(nk), so combining them we have proven our claimed memory complexity. (Note that (3) is also withinO(nk)so does not affect the complexity). ....

  12. [12]

    the returned AND/OR graphGcontains (represents) the proxy treef px, (Gis nonempty)

  13. [13]

    In particular, the root node satisfies G.min objective≤P(D, d)

    the minimum objective stored at the root OR-node satisfies G.min objective≤P(D, d). In particular, the root node satisfies G.min objective≤P(D, d). Proof.We argue by induction on the depthd. Base case (d= 0).Thenf px is a leaf predicting some labelb∈ {0,1}, (denote this label wlog asb ′) hence P(D,0) =C b′,(16) 25 Efficient Decision Tree Rashomon Sets one...

  14. [14]

    A pure greedy (information-gain) depth-dtree achieves accuracy at most 1 2 +δ

  15. [15]

    Every treeTreturned byP RAXIS(S, d, γ, ε abs)achieves accuracy at least1−δ. Proof. Fix δ >0 and define η:=δ/(1 +ε mult). By Theorem A.7 of Babbar et al. (2025), for this η and depth d there exist D and n such that, with high probability over S∼ D n: (i) LICKETYSPLIT returns a depth- d tree with error at most η, and (ii) pure greedy achieves error at least...

  16. [16]

    the returned AND/OR graphGcontains the ProxyWrapper treef pw(D, d)

  17. [17]

    Proof.Claim.If PRAXIS is invoked on(D, d)withε abs ≥P(D, d), then the returned graph containsf pw(D, d)

    In particular, G.min objective≤Obj(f pw(D, d), D, γ)≤Obj(f px, D, γ). Proof.Claim.If PRAXIS is invoked on(D, d)withε abs ≥P(D, d), then the returned graph containsf pw(D, d). Base case:d= 0. By construction, PROXYWRAPPERreturns a leaf (in fact, the majority leaf). Since εabs ≥P(D,0) and PRAXIS inserts all feasible leaf actions whose objective is ≤ε abs, t...