Splitting criteria for ordinal decision trees: an experimental study
Pith reviewed 2026-05-23 07:03 UTC · model grok-4.3
The pith
The Ordinal Gini splitting criterion reduces mean absolute error by more than 3.02% compared to standard Gini in decision trees for ordinal classification.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes through extensive experiments that the Ordinal Gini criterion is the strongest among the ordinal splitting methods tested, delivering more than a 3.02% reduction in mean absolute error relative to the conventional Gini criterion across 45 ordinal classification datasets.
What carries the argument
Ordinal splitting criteria, particularly Ordinal Gini (OGini), adjust the node-impurity calculation to account for the ordered distances between class labels rather than treating the classes as unrelated.
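To make the contrast concrete, here is a minimal sketch of a nominal and an order-aware impurity. It assumes one common formulation of the ordinal Gini index from the literature, the sum over classes of F_q (1 - F_q) with F_q the cumulative class proportion; the paper's exact definition of OGini may differ. The function names and the toy nodes are hypothetical.

```python
from collections import Counter


def class_proportions(labels, n_classes):
    """Proportion of each class 0..n_classes-1 among the labels in a node."""
    counts = Counter(labels)
    return [counts.get(q, 0) / len(labels) for q in range(n_classes)]


def gini(labels, n_classes):
    """Nominal Gini impurity: 1 - sum_q p_q^2. Class order plays no role."""
    return 1.0 - sum(p * p for p in class_proportions(labels, n_classes))


def ordinal_gini(labels, n_classes):
    """Assumed cumulative form: sum_q F_q * (1 - F_q), F_q = p_0 + ... + p_q."""
    total, cumulative = 0.0, 0.0
    # The final cumulative proportion covers every class, so it is skipped.
    for p in class_proportions(labels, n_classes)[:-1]:
        cumulative += p
        total += cumulative * (1.0 - cumulative)
    return total


# Hypothetical nodes on a 4-class scale (labels 0..3), made up for illustration.
adjacent = [0, 0, 1, 1]  # mixes two neighbouring classes
distant = [0, 0, 3, 3]   # mixes the two extreme classes

print(gini(adjacent, 4), gini(distant, 4))
print(ordinal_gini(adjacent, 4), ordinal_gini(distant, 4))
```

Under this assumed form, the nominal criterion scores both toy nodes identically, whereas the cumulative criterion penalizes the node that mixes distant classes more heavily.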
If this is right
- Decision trees using OGini achieve lower mean absolute error than those using Gini on ordinal tasks.
- The performance advantage appears consistently across multiple ordinal evaluation metrics on the tested datasets.
- The other ordinal criteria, Weighted Information Gain (WIG) and Ranking Impurity (RI), also improve over the nominal baselines (Gini and information gain), but by less than OGini does.
- Providing the full code and datasets allows direct verification and extension of these comparisons.
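A splitting criterion enters tree induction at the point where candidate thresholds are scored. The sketch below is one plausible way to plug an impurity function into a single-feature threshold search; it is not the authors' released code. `best_threshold`, the impurity definitions (the cumulative ordinal form is an assumption), and the toy data are all hypothetical.

```python
from collections import Counter


def gini(labels, n_classes):
    """Nominal Gini impurity (n_classes is unused; kept for a uniform signature)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def ordinal_gini(labels, n_classes):
    """Assumed cumulative form: sum_q F_q * (1 - F_q) over all but the last class."""
    counts, n = Counter(labels), len(labels)
    total, cumulative = 0.0, 0.0
    for q in range(n_classes - 1):
        cumulative += counts.get(q, 0) / n
        total += cumulative * (1.0 - cumulative)
    return total


def best_threshold(feature, labels, n_classes, impurity):
    """Return (threshold, impurity_decrease) of the best split `x <= threshold`."""
    n = len(labels)
    parent = impurity(labels, n_classes)
    pairs = sorted(zip(feature, labels))
    best = (None, 0.0)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        weighted = (len(left) * impurity(left, n_classes)
                    + len(right) * impurity(right, n_classes)) / n
        gain = parent - weighted
        if gain > best[1]:  # strict comparison: the first of tied splits wins
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2.0, gain)
    return best


# Toy data, made up for illustration: three classes on one feature.
x = [1, 2, 3, 4, 5, 6]
y = [1, 1, 0, 0, 2, 2]

print(best_threshold(x, y, 3, gini))
print(best_threshold(x, y, 3, ordinal_gini))
```

On this toy data the nominal criterion is indifferent between isolating class 1 and isolating class 2, so the first candidate wins. The assumed ordinal form prefers the split that keeps the adjacent classes 0 and 1 together.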
Where Pith is reading between the lines
- If the superiority holds, standard machine learning libraries could incorporate OGini as a default option for ordered targets.
- Similar order-aware adjustments might improve other tree induction algorithms beyond basic decision trees.
- Domains with naturally ordered outcomes, such as risk assessment, may see practical gains from switching splitting rules.
Load-bearing premise
The collection of 45 public datasets and the specific error measures used are representative of ordinal classification problems in general.
What would settle it
Finding an ordinal dataset or set of problems where the standard Gini criterion yields a lower mean absolute error than OGini would challenge the superiority claim.
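A check of this kind could be sketched as follows, assuming class labels are integer-coded in rank order and that the reported figure is a relative reduction in MAE (the page does not spell out how the 3.02% is aggregated). The predictions below are made up for illustration and are not results from the paper.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE between integer-coded ordinal labels (class q coded as q)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and equally long")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction_pct(baseline_mae, candidate_mae):
    """Percentage by which candidate_mae undercuts baseline_mae."""
    return 100.0 * (baseline_mae - candidate_mae) / baseline_mae


# Hypothetical test labels and predictions from two trees on one dataset.
y_true = [0, 1, 2, 3, 4, 2, 1, 3]
gini_pred = [0, 2, 2, 1, 4, 2, 0, 3]
ogini_pred = [0, 1, 2, 2, 4, 3, 1, 3]

gini_mae = mean_absolute_error(y_true, gini_pred)
ogini_mae = mean_absolute_error(y_true, ogini_pred)
print(gini_mae, ogini_mae, relative_reduction_pct(gini_mae, ogini_mae))
```

A negative relative reduction on a dataset would mean the Gini tree had the lower MAE there, which is the kind of counterexample described above.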
Original abstract
Ordinal Classification (OC) addresses those classification tasks where the labels exhibit a natural order. Unlike nominal classification, which treats all classes as mutually exclusive and unordered, OC takes the ordinal relationship into account, producing more accurate and relevant results. This is particularly critical in applications where the magnitude of classification errors has significant consequences. Despite this, OC problems are often tackled using nominal methods, leading to suboptimal solutions. Although decision trees are among the most popular classification approaches, ordinal tree-based approaches have received less attention when compared to other classifiers. This work provides a comprehensive survey of ordinal splitting criteria, standardising the notations used in the literature to enhance clarity and consistency. Three ordinal splitting criteria, Ordinal Gini (OGini), Weighted Information Gain (WIG), and Ranking Impurity (RI), are compared to the nominal counterparts of the first two (Gini and information gain), by incorporating them into a decision tree classifier. An extensive repository considering 45 publicly available OC datasets is presented, supporting the first experimental comparison of ordinal and nominal splitting criteria using well-known OC evaluation metrics. The results have been statistically analysed, highlighting that OGini stands out as the best ordinal splitting criterion to date, reducing the mean absolute error achieved by Gini by more than 3.02%. To promote reproducibility, all source code developed, a detailed guide for reproducing the results, the 45 OC datasets, and the individual results for all the evaluated methodologies are provided.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper surveys ordinal splitting criteria for decision trees, standardizes notations from the literature, and experimentally compares three ordinal criteria (OGini, WIG, RI) to their nominal counterparts (Gini, information gain) by embedding them in decision trees. Using 45 public ordinal classification datasets and standard OC metrics, it reports that OGini yields the best performance, reducing mean absolute error by more than 3.02% relative to Gini, and supplies full reproducibility artifacts including code, datasets, and per-run results.
Significance. If the observed ranking is robust, the work supplies a useful standardized benchmark for ordinal tree induction and identifies a practically strong default criterion. The explicit release of all source code, a reproduction guide, the 45 datasets, and individual results constitutes a clear strength for reproducibility.
Major comments (2)
- [Abstract] The headline claim that OGini is 'the best ordinal splitting criterion to date' and reduces MAE by more than 3.02% rests on the ranking across the 45 datasets, yet no stratification, meta-analysis, or regression by dataset properties (class cardinality, sample size, imbalance, domain) is described. Without such analysis, the generalization from this corpus to other ordinal problems remains unsupported.
- [Abstract / experimental protocol] The manuscript states that 'the results have been statistically analysed' but does not describe the exact tests, the handling of ties, or the correction for multiple comparisons across criteria and metrics. These details are load-bearing for validating the reported superiority.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback. We address each major comment below and will revise the manuscript to improve the support for our claims and the description of our experimental protocol.
- Referee: [Abstract] The headline claim that OGini is 'the best ordinal splitting criterion to date' and reduces MAE by more than 3.02% rests on the ranking across the 45 datasets, yet no stratification, meta-analysis, or regression by dataset properties (class cardinality, sample size, imbalance, domain) is described. Without such analysis, the generalization from this corpus to other ordinal problems remains unsupported.
Authors: The study is presented as an experimental benchmark on a corpus of 45 publicly available ordinal datasets that already span a range of class cardinalities, sample sizes, imbalance levels, and application domains. The headline claim is therefore scoped to performance within this corpus rather than a universal assertion. We acknowledge that explicit stratification or meta-regression would strengthen statements about broader generalization. We will add a short analysis correlating performance differences with key dataset properties (e.g., number of classes and sample size) and will revise the abstract wording to make the scope of the claim clearer.
Revision: partial
- Referee: [Abstract / experimental protocol] The manuscript states that 'the results have been statistically analysed' but does not describe the exact tests, the handling of ties, or the correction for multiple comparisons across criteria and metrics. These details are load-bearing for validating the reported superiority.
Authors: We agree that the precise statistical procedures must be documented. The analysis employed the Wilcoxon signed-rank test on per-dataset metric differences, with ties handled by the standard mid-rank method and Bonferroni correction applied across the set of pairwise comparisons. We will insert a dedicated paragraph in the experimental protocol section (and a brief reference in the abstract) that fully specifies the tests, significance level, tie handling, and multiplicity correction.
Revision: yes
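The procedure named in this response can be sketched with the standard library alone. This is an illustrative implementation, not the authors' analysis code. It assumes zero differences are dropped and uses the normal approximation with a tie-corrected variance and no continuity correction. The per-dataset MAE values are hypothetical, and in practice a library routine such as `scipy.stats.wilcoxon` could take the place of the hand-written test.

```python
import math
from collections import Counter


def midranks(values):
    """Ascending 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def wilcoxon_signed_rank(a, b):
    """Two-sided paired test on a[i] - b[i]; returns (W_plus, p_value)."""
    diffs = [x - y for x, y in zip(a, b) if x != y]  # assumption: drop zeros
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    magnitudes = [abs(d) for d in diffs]
    ranks = midranks(magnitudes)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    tie_term = sum(t ** 3 - t for t in Counter(magnitudes).values()) / 48.0
    variance = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term
    z = (w_plus - mean) / math.sqrt(variance)
    return w_plus, math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail


def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical per-dataset MAE values for two criteria (not from the paper).
gini_mae = [0.52, 0.47, 0.61, 0.39, 0.55, 0.48, 0.66, 0.43]
ogini_mae = [0.50, 0.45, 0.62, 0.36, 0.51, 0.47, 0.60, 0.41]

w_plus, p_value = wilcoxon_signed_rank(gini_mae, ogini_mae)
print(w_plus, p_value, bonferroni([p_value, 0.2, 0.03]))
```

The normal approximation is a simplification for small samples; with only a handful of datasets an exact null distribution would be the safer choice.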
Circularity Check
No circularity: direct empirical comparison on external datasets
The paper performs an experimental comparison of existing ordinal splitting criteria (OGini, WIG, RI) against nominal baselines on 45 publicly available OC datasets using standard evaluation metrics. No derivation chain exists; claims rest on observed empirical rankings rather than any quantity defined in terms of itself, a fitted parameter renamed as a prediction, or a self-citation chain. The 45 datasets are external public resources, and code is released for reproducibility. This is a standard empirical study with no load-bearing self-referential steps.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: standard non-parametric statistical tests are appropriate for comparing classifier performance across multiple datasets.
Authors: Rafael Ayllón-Gavilán (a, b), Francisco José Martínez-Estudillo (c), David Guijo-Rubio (d, ∗), César Hervás-Martínez (d), Pedro A. Gutiérrez (d). (a) Department of Clinical-Epidemiological Researc...