Treant: Training Evasion-Aware Decision Trees
Pith reviewed 2026-05-25 11:12 UTC · model grok-4.3
The pith
Treant trains decision trees to stay accurate while resisting evasion attacks by minimizing a threat-model loss at each split.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Treant is a decision tree learning algorithm that, on the basis of a formal threat model, minimizes an evasion-aware loss function at each step of the tree construction. It is based on two key technical ingredients, robust splitting and attack invariance, which jointly guarantee the soundness of the learning process. Experimental results on three publicly available datasets show that Treant generates decision tree ensembles that are at the same time accurate and nearly insensitive to evasion attacks, outperforming state-of-the-art adversarial learning techniques.
What carries the argument
Robust splitting and attack invariance, which together ensure each tree split accounts for evasion possibilities while preserving the validity of the learning process.
If this is right
- Treant ensembles achieve comparable accuracy to standard trees on clean inputs.
- The ensembles exhibit low error rates under evasion attacks matching the threat model.
- The method outperforms prior adversarial training approaches on the evaluated datasets.
- The focus remains on non-perceptual tasks where decision trees are already effective.
Where Pith is reading between the lines
- The same loss-minimization idea might transfer to other tree-based ensembles such as random forests if the threat model is kept fixed.
- Combining Treant with post-training defenses could further strengthen resistance when the initial threat model is only approximate.
- The retained interpretability of the resulting trees could support auditing in domains that require both robustness and explainability.
Load-bearing premise
The formal threat model used during training accurately represents the evasion attacks that will occur at test time.
What would settle it
Training and testing Treant trees on the public datasets against evasion attacks that follow the assumed threat model but finding no gain in robustness compared with ordinary trees.
Figures
read the original abstract
Despite its success and popularity, machine learning is now recognized as vulnerable to evasion attacks, i.e., carefully crafted perturbations of test inputs designed to force prediction errors. In this paper we focus on evasion attacks against decision tree ensembles, which are among the most successful predictive models for dealing with non-perceptual problems. Even though they are powerful and interpretable, decision tree ensembles have received only limited attention by the security and machine learning communities so far, leading to a sub-optimal state of the art for adversarial learning techniques. We thus propose Treant, a novel decision tree learning algorithm that, on the basis of a formal threat model, minimizes an evasion-aware loss function at each step of the tree construction. Treant is based on two key technical ingredients: robust splitting and attack invariance, which jointly guarantee the soundness of the learning process. Experimental results on three publicly available datasets show that Treant is able to generate decision tree ensembles that are at the same time accurate and nearly insensitive to evasion attacks, outperforming state-of-the-art adversarial learning techniques.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes Treant, a decision-tree ensemble learning algorithm that incorporates a formal threat model into the training process by minimizing an evasion-aware loss at each split. It relies on two technical ingredients—robust splitting and attack invariance—to guarantee soundness, and reports that the resulting models achieve both high accuracy and strong resistance to evasion attacks, outperforming prior adversarial learning methods on three public datasets.
Significance. If the threat model is representative, Treant would fill a notable gap by supplying a sound, interpretable alternative to gradient-based adversarial training for non-perceptual tasks. The algorithmic framing (robust splitting plus attack invariance) is a concrete strength that could be reused beyond the specific threat model chosen here.
major comments (2)
- [§3] §3 (Threat Model): The central soundness claim and the reported robustness gains rest on the assumption that the chosen perturbation sets and attacker objectives match the evasion attacks that will actually occur at test time. No empirical validation or sensitivity analysis is provided showing that the model is neither under- nor over-approximating real attacker capabilities on the three evaluation datasets; if the mismatch is material, both the formal guarantee and the empirical outperformance become non-transferable.
- [§5] §5 (Experiments): The abstract and experimental claims assert that Treant ensembles are “nearly insensitive” while remaining accurate, yet the text supplies neither error bars on the robustness metrics nor basic dataset statistics (class balance, feature ranges, attack success rates under the threat model). Without these, it is impossible to judge whether the reported gains over SOTA are statistically reliable or driven by particular dataset characteristics.
minor comments (2)
- Notation for the evasion-aware loss and the attack-invariance property should be introduced with a single, self-contained definition rather than being referenced piecemeal across sections.
- Figure captions and axis labels in the experimental plots would benefit from explicit mention of the threat-model parameters used for each curve.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below.
read point-by-point responses
-
Referee: [§3] §3 (Threat Model): The central soundness claim and the reported robustness gains rest on the assumption that the chosen perturbation sets and attacker objectives match the evasion attacks that will actually occur at test time. No empirical validation or sensitivity analysis is provided showing that the model is neither under- nor over-approximating real attacker capabilities on the three evaluation datasets; if the mismatch is material, both the formal guarantee and the empirical outperformance become non-transferable.
Authors: The soundness guarantees and reported results hold with respect to the explicitly defined threat model in §3. We selected the perturbation sets based on standard practices for the given datasets. We agree that further discussion of potential mismatches would strengthen the presentation. In revision we will add a dedicated paragraph in §3 discussing the threat-model assumptions and a limited sensitivity analysis on one dataset. revision: partial
-
Referee: [§5] §5 (Experiments): The abstract and experimental claims assert that Treant ensembles are “nearly insensitive” while remaining accurate, yet the text supplies neither error bars on the robustness metrics nor basic dataset statistics (class balance, feature ranges, attack success rates under the threat model). Without these, it is impossible to judge whether the reported gains over SOTA are statistically reliable or driven by particular dataset characteristics.
Authors: We will add error bars (standard deviation over 10 independent runs) to all robustness and accuracy figures in the revised experimental section. We will also insert a new table in §5.1 reporting class balance, feature ranges, and attack success rates under the threat model for each dataset. revision: yes
Circularity Check
No circularity: algorithmic construction with independent threat model and empirical validation
full rationale
The paper describes Treant as a tree-construction algorithm that minimizes an evasion-aware loss derived from an explicitly stated formal threat model, using the two ingredients of robust splitting and attack invariance to ensure soundness. No equations, fitted parameters, or predictions are presented that reduce by construction to the same data or to a self-citation chain; the derivation is a standard optimization procedure whose soundness claim is conditional on the threat model matching reality (an external assumption, not a definitional loop). Central performance claims rest on experimental results on three public datasets rather than on any self-referential renaming or imported uniqueness theorem. This is the normal non-circular case for an algorithmic adversarial-learning paper.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
L. Huang, A. D. Joseph, B. Nelson, B. I. P. Rubinstein, and J. D. Tygar, “Adversarial machine learning,” in AISec, 2011, pp. 43–58
work page 2011
-
[2]
Wild patterns: Ten years after the rise of adversarial machine learning,
B. Biggio and F. Roli, “Wild patterns: Ten years after the rise of adversarial machine learning,” Pattern Recognition, vol. 84, pp. 317– 331, 2018
work page 2018
-
[3]
Evasion attacks against machine learning at test time,
B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli, “Evasion attacks against machine learning at test time,” in ECML PKDD, 2013, pp. 387–402
work page 2013
-
[4]
Deep neural networks are easily fooled: High confidence predictions for unrecognizable images,
A. M. Nguyen, J. Yosinski, and J. Clune, “Deep neural networks are easily fooled: High confidence predictions for unrecognizable images,” in CVPR, 2015, pp. 427–436
work page 2015
-
[5]
The limitations of deep learning in adversarial settings,
N. Papernot, P. D. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami, “The limitations of deep learning in adversarial settings,” in EuroS&P, 2016, pp. 372–387
work page 2016
-
[6]
Deepfool: A simple and accurate method to fool deep neural networks,
S. Moosavi-Dezfooli, A. Fawzi, and P. Frossard, “Deepfool: A simple and accurate method to fool deep neural networks,” in CVPR, 2016, pp. 2574–2582
work page 2016
-
[7]
D. Lowd and C. Meek, “Adversarial learning,” in SIGKDD, 2005, pp. 641–647
work page 2005
-
[8]
Support vector machines under adversarial label noise,
B. Biggio, B. Nelson, and P. Laskov, “Support vector machines under adversarial label noise,” in ACML, 2011, pp. 97–112
work page 2011
-
[9]
Intriguing properties of neural networks,
C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. J. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” in ICLR, 2014
work page 2014
-
[10]
Explaining and harnessing adversarial examples,
I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” in ICLR, 2015
work page 2015
-
[11]
Chollet, Deep Learning with Python , 1st ed
F. Chollet, Deep Learning with Python , 1st ed. Greenwich, CT, USA: Manning Publications Co., 2017
work page 2017
-
[12]
Interpretable predictions of tree-based ensembles via actionable feature tweaking,
G. Tolomei, F. Silvestri, A. Haines, and M. Lalmas, “Interpretable predictions of tree-based ensembles via actionable feature tweaking,” in SIGKDD, 2017, pp. 465–474
work page 2017
-
[13]
Towards deep learning models resistant to adversarial attacks,
A. Madry, A. Makelov, L. Schmidt, D. Tsipras, and A. Vladu, “Towards deep learning models resistant to adversarial attacks,” in ICLR, 2018
work page 2018
-
[14]
E. B. Hunt, J. Marin, and P. J. Stone, “Experiments in induction,” 1966
work page 1966
-
[15]
L. Breiman, “Random forests,” Machine Learning , vol. 45, no. 1, pp. 5–32, 2001
work page 2001
-
[16]
Combining email models for false positive reduction,
S. Hershkop and S. J. Stolfo, “Combining email models for false positive reduction,” in SIGKDD, 2005, pp. 98–107
work page 2005
-
[17]
Using an ensemble of one-class SVM classifiers to harden payload-based anomaly detection systems,
R. Perdisci, G. Gu, and W. Lee, “Using an ensemble of one-class SVM classifiers to harden payload-based anomaly detection systems,” in ICDM, 2006, pp. 488–498
work page 2006
-
[18]
T. P. Tran, P. Tsai, and T. Jan, “An adjustable combination of linear regression and modified probabilistic neural network for anti-spam filtering,” in ICPR, 2008, pp. 1–4
work page 2008
-
[19]
Multiple classifier systems for robust classifier design in adversarial environments,
B. Biggio, G. Fumera, and F. Roli, “Multiple classifier systems for robust classifier design in adversarial environments,” Int. J. Machine Learning & Cybernetics, vol. 1, no. 1-4, pp. 27–41, 2010
work page 2010
-
[20]
Adversarial example defense: Ensembles of weak defenses are not strong,
W. He, J. Wei, X. Chen, N. Carlini, and D. Song, “Adversarial example defense: Ensembles of weak defenses are not strong,” in WOOT, 2017
work page 2017
-
[21]
Principles of risk minimization for learning theory,
V . Vapnik, “Principles of risk minimization for learning theory,” in Advances in neural information processing systems , 1992, pp. 831–838
work page 1992
-
[22]
L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees. Wadsworth, 1984
work page 1984
-
[23]
Constructing optimal binary decision trees is np-complete,
L. Hyafil and R. L. Rivest, “Constructing optimal binary decision trees is np-complete,” Inf. Process. Lett. , vol. 5, no. 1, pp. 15–17, 1976
work page 1976
-
[24]
Automatic construction of decision trees from data: A multi-disciplinary survey,
S. K. Murthy, “Automatic construction of decision trees from data: A multi-disciplinary survey,” Data Min. Knowl. Discov. , vol. 2, no. 4, pp. 345–389, 1998
work page 1998
-
[25]
Greedy function approximation: a gradient boosting machine,
J. H. Friedman, “Greedy function approximation: a gradient boosting machine,” Annals of statistics , pp. 1189–1232, 2001
work page 2001
-
[26]
Near-optimal evasion of convex- inducing classifiers,
B. Nelson, B. I. P. Rubinstein, L. Huang, A. D. Joseph, S. Lau, S. J. Lee, S. Rao, A. Tran, and J. D. Tygar, “Near-optimal evasion of convex- inducing classifiers,” in AISTATS, 2010, pp. 549–556
work page 2010
-
[27]
Security evaluation of pattern classifiers under attack,
B. Biggio, G. Fumera, and F. Roli, “Security evaluation of pattern classifiers under attack,” IEEE Trans. Knowl. Data Eng. , vol. 26, no. 4, pp. 984–996, 2014
work page 2014
-
[28]
Practical evasion of a learning-based classifier: A case study,
N. Srndic and P. Laskov, “Practical evasion of a learning-based classifier: A case study,” in S&P, 2014, pp. 197–211
work page 2014
-
[29]
Evasion and hardening of tree ensemble classifiers,
A. Kantchelian, J. D. Tygar, and A. D. Joseph, “Evasion and hardening of tree ensemble classifiers,” in ICML, 2016, pp. 2387–2396
work page 2016
-
[30]
Towards evaluating the robustness of neural networks,
N. Carlini and D. A. Wagner, “Towards evaluating the robustness of neural networks,” in S&P, 2017, pp. 39–57
work page 2017
-
[31]
Evading classifiers by morphing in the dark,
H. Dang, Y . Huang, and E. Chang, “Evading classifiers by morphing in the dark,” in CCS, 2017, pp. 119–133
work page 2017
-
[32]
Support vector machines under adversarial label contamination,
H. Xiao, B. Biggio, B. Nelson, H. Xiao, C. Eckert, and F. Roli, “Support vector machines under adversarial label contamination,” Neurocomput- ing, vol. 160, pp. 53–62, 2015
work page 2015
-
[33]
Towards deep neural network architectures robust to adversarial examples,
S. Gu and L. Rigazio, “Towards deep neural network architectures robust to adversarial examples,” in ICLR, Workshop Track Proceedings, 2015
work page 2015
-
[34]
Distillation as a defense to adversarial perturbations against deep neural networks,
N. Papernot, P. D. McDaniel, X. Wu, S. Jha, and A. Swami, “Distillation as a defense to adversarial perturbations against deep neural networks,” in S&P, 2016, pp. 582–597
work page 2016
-
[35]
Robust decision trees against adversarial examples,
H. Chen, H. Zhang, D. S. Boning, and C. Hsieh, “Robust decision trees against adversarial examples,” in ICML, 2019, pp. 1122–1131
work page 2019
-
[36]
S. Nawar and A. Mouazen, “Comparison between random forests, artificial neural networks and gradient boosted machines methods of on- line vis-nir spectroscopy measurements of soil total nitrogen and total carbon,” Sensors, vol. 17, no. 10, p. 2428, 2017
work page 2017
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.