arxiv: 2412.13697 · v3 · submitted 2024-12-18 · 💻 cs.LG

Splitting criteria for ordinal decision trees: an experimental study

Pith reviewed 2026-05-23 07:03 UTC · model grok-4.3

classification 💻 cs.LG
keywords: ordinal classification · decision trees · splitting criteria · ordinal Gini · mean absolute error · experimental study · machine learning · impurity measures

The pith

The Ordinal Gini splitting criterion reduces mean absolute error by more than 3.02% compared to standard Gini in decision trees for ordinal classification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper surveys and standardizes splitting criteria for decision trees when class labels carry a natural order rather than being unrelated categories. It implements three order-aware criteria (Ordinal Gini, Weighted Information Gain, and Ranking Impurity) inside a common tree learner and compares them with the nominal Gini and information gain criteria on 45 public datasets. Experiments using ordinal-specific error measures show that Ordinal Gini produces the lowest average deviation from true labels. A reader would care because ordered outcomes appear in rating scales, severity levels, and ranking tasks where treating classes as equal leads to avoidable mistakes.

Core claim

The paper establishes through extensive experiments that the Ordinal Gini criterion is the strongest among the ordinal splitting methods tested, delivering more than a 3.02% reduction in mean absolute error relative to the conventional Gini criterion across 45 ordinal classification datasets.
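To make the headline metric concrete, here is a minimal sketch of how mean absolute error (MAE) on class ranks and a relative reduction between two classifiers can be computed. The labels and predictions are invented for illustration, the helper names are hypothetical, and how the paper aggregates its 3.02% figure across datasets is not specified on this page.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE over ordinal labels encoded as integer ranks (0, 1, 2, ...)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_reduction(baseline, candidate):
    """Percentage by which `candidate` lowers the error of `baseline`."""
    return 100.0 * (baseline - candidate) / baseline


# Hypothetical toy data: four ordered classes encoded as ranks 0..3.
y_true = [0, 1, 2, 3, 2, 1]
pred_nominal = [0, 3, 2, 1, 2, 1]  # two mistakes, each two ranks away
pred_ordinal = [0, 2, 2, 2, 2, 1]  # two mistakes, each one rank away

mae_nominal = mean_absolute_error(y_true, pred_nominal)
mae_ordinal = mean_absolute_error(y_true, pred_ordinal)
print(f"MAE nominal-style predictions: {mae_nominal:.3f}")
print(f"MAE ordinal-style predictions: {mae_ordinal:.3f}")
print(f"relative reduction: {relative_reduction(mae_nominal, mae_ordinal):.1f}%")
```

Both prediction vectors have the same accuracy (four of six correct), yet their MAE differs because MAE weights each mistake by its distance in rank. That is the property that makes it suitable for ordinal tasks.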

What carries the argument

Ordinal splitting criteria, particularly Ordinal Gini, adjust the calculation of node impurity to incorporate the ordered distances between class labels rather than treating the classes as unrelated.
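The difference between a nominal and an order-aware impurity can be sketched in a few lines. The ordinal measure below uses one common cumulative-distribution form, the sum of F_j(1 - F_j) over all classes but the last. This is an assumption for illustration, and the exact OGini definition used in the paper may differ. The function names and the class counts are hypothetical.

```python
from itertools import accumulate


def gini(counts):
    """Nominal Gini impurity: 1 - sum of squared class proportions."""
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)


def ordinal_gini(counts):
    """Assumed ordinal variant: sum of F_j * (1 - F_j) over cumulative
    proportions F_j, skipping the last class (where F_j is always 1)."""
    n = sum(counts)
    cumulative = list(accumulate(c / n for c in counts))
    return sum(f * (1.0 - f) for f in cumulative[:-1])


def impurity_decrease(impurity, parent, left, right):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = sum(parent)
    w_left, w_right = sum(left) / n, sum(right) / n
    return impurity(parent) - w_left * impurity(left) - w_right * impurity(right)


# Hypothetical node with 10 samples in each of four ordered classes C1..C4.
parent = [10, 10, 10, 10]
split_a = ([10, 0, 10, 0], [0, 10, 0, 10])  # children mix non-adjacent classes
split_b = ([10, 10, 0, 0], [0, 0, 10, 10])  # children hold contiguous classes

for name, (left, right) in (("A", split_a), ("B", split_b)):
    g = impurity_decrease(gini, parent, left, right)
    og = impurity_decrease(ordinal_gini, parent, left, right)
    print(f"split {name}: Gini decrease = {g:.3f}, ordinal decrease = {og:.3f}")
```

Under these assumptions the nominal Gini scores both splits identically (a decrease of 0.25), while the ordinal measure prefers split B (0.375 against 0.125), because its children group adjacent classes.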

If this is right

  • Decision trees using OGini achieve lower mean absolute error than those using Gini on ordinal tasks.
  • The performance advantage appears consistently across multiple ordinal evaluation metrics on the tested datasets.
  • The other ordinal criteria tested, WIG and RI, rank below OGini.
  • Providing the full code and datasets allows direct verification and extension of these comparisons.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the superiority holds, standard machine learning libraries could incorporate OGini as a default option for ordered targets.
  • Similar order-aware adjustments might improve other tree induction algorithms beyond basic decision trees.
  • Domains with naturally ordered outcomes, such as risk assessment, may see practical gains from switching splitting rules.

Load-bearing premise

The collection of 45 public datasets and the specific error measures used are representative of ordinal classification problems in general.

What would settle it

Finding an ordinal dataset or set of problems where the standard Gini criterion yields a lower mean absolute error than OGini would challenge the superiority claim.

Figures

Figures reproduced from arXiv: 2412.13697 by César Hervás-Martínez, David Guijo-Rubio, Francisco José Martínez-Estudillo, Pedro Antonio Gutiérrez, Rafael Ayllón-Gavilán.

Figure 1: Synthetic OC problem where samples are represented in a 2D feature space. These samples are […]
Figure 2: Representation of two different ordinal distributions for a problem with four classes, and associated values for the different impurity measures considered in this work.
Figure 3: Comparison of two node splits based on nominal and ordinal splitting criteria. The split A does not […] the classes within it are not contiguous. Conversely, the split B adheres to an ordinal structure: the left sub-node includes patterns for C1 and C2, and the right sub-node contains patterns for C3 and C4, preserving the order relationship…
Figure 4: Confusion matrices obtained in the test set by Gini and OGini in the […]
Original abstract

Ordinal Classification (OC) addresses those classification tasks where the labels exhibit a natural order. Unlike nominal classification, which treats all classes as mutually exclusive and unordered, OC takes the ordinal relationship into account, producing more accurate and relevant results. This is particularly critical in applications where the magnitude of classification errors has significant consequences. Despite this, OC problems are often tackled using nominal methods, leading to suboptimal solutions. Although decision trees are among the most popular classification approaches, ordinal tree-based approaches have received less attention when compared to other classifiers. This work provides a comprehensive survey of ordinal splitting criteria, standardising the notations used in the literature to enhance clarity and consistency. Three ordinal splitting criteria, Ordinal Gini (OGini), Weighted Information Gain (WIG), and Ranking Impurity (RI), are compared to the nominal counterparts of the first two (Gini and information gain), by incorporating them into a decision tree classifier. An extensive repository considering 45 publicly available OC datasets is presented, supporting the first experimental comparison of ordinal and nominal splitting criteria using well-known OC evaluation metrics. The results have been statistically analysed, highlighting that OGini stands out as the best ordinal splitting criterion to date, reducing the mean absolute error achieved by Gini by more than 3.02%. To promote reproducibility, all source code developed, a detailed guide for reproducing the results, the 45 OC datasets, and the individual results for all the evaluated methodologies are provided.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper surveys ordinal splitting criteria for decision trees, standardizes notations from the literature, and experimentally compares three ordinal criteria (OGini, WIG, RI) to their nominal counterparts (Gini, information gain) by embedding them in decision trees. Using 45 public ordinal classification datasets and standard OC metrics, it reports that OGini yields the best performance, reducing mean absolute error by more than 3.02% relative to Gini, and supplies full reproducibility artifacts including code, datasets, and per-run results.

Significance. If the observed ranking is robust, the work supplies a useful standardized benchmark for ordinal tree induction and identifies a practically strong default criterion. The explicit release of all source code, a reproduction guide, the 45 datasets, and individual results constitutes a clear strength for reproducibility.

major comments (2)
  1. [Abstract] Abstract: the headline claim that OGini is 'the best ordinal splitting criterion to date' and reduces MAE by >3.02% rests on the ranking across the 45 datasets, yet no stratification, meta-analysis, or regression by dataset properties (class cardinality, sample size, imbalance, domain) is described; without such analysis the generalization step from this corpus to other ordinal problems remains unsupported.
  2. [Abstract] Abstract / experimental protocol: the manuscript states that 'the results have been statistically analysed' but supplies no description of the exact tests, handling of ties, or correction for multiple comparisons across criteria and metrics; these details are load-bearing for validating the reported superiority.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and will revise the manuscript to improve the support for our claims and the description of our experimental protocol.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the headline claim that OGini is 'the best ordinal splitting criterion to date' and reduces MAE by >3.02% rests on the ranking across the 45 datasets, yet no stratification, meta-analysis, or regression by dataset properties (class cardinality, sample size, imbalance, domain) is described; without such analysis the generalization step from this corpus to other ordinal problems remains unsupported.

    Authors: The study is presented as an experimental benchmark on a corpus of 45 publicly available ordinal datasets that already span a range of class cardinalities, sample sizes, imbalance levels, and application domains. The headline claim is therefore scoped to performance within this corpus rather than a universal assertion. We acknowledge that explicit stratification or meta-regression would strengthen statements about broader generalization. We will add a short analysis correlating performance differences with key dataset properties (e.g., number of classes and sample size) and will revise the abstract wording to make the scope of the claim clearer. revision: partial

  2. Referee: [Abstract] Abstract / experimental protocol: the manuscript states that 'the results have been statistically analysed' but supplies no description of the exact tests, handling of ties, or correction for multiple comparisons across criteria and metrics; these details are load-bearing for validating the reported superiority.

    Authors: We agree that the precise statistical procedures must be documented. The analysis employed the Wilcoxon signed-rank test on per-dataset metric differences, with ties handled by the standard mid-rank method and Bonferroni correction applied across the set of pairwise comparisons. We will insert a dedicated paragraph in the experimental protocol section (and a brief reference in the abstract) that fully specifies the tests, significance level, tie handling, and multiplicity correction. revision: yes
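The procedure named in this simulated response (Wilcoxon signed-rank test on per-dataset differences, mid-rank ties, Bonferroni correction) can be sketched as follows. This is a pure-Python illustration, not the authors' code: the normal approximation for the p-value, the 0.05 significance level, the helper names, and the MAE values are all assumptions, and a real analysis would normally rely on a statistics library.

```python
import math
from itertools import combinations


def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test using a normal approximation.

    Zero differences are dropped and tied absolute differences receive
    mid-ranks. Ties are detected by exact equality, so round float inputs
    first if needed. Returns (w_plus, w_minus, p_value).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        mid_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mid_rank
        t = j - i + 1
        tie_term += t ** 3 - t  # feeds the tie correction of the variance
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    var = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (min(w_plus, w_minus) - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, w_minus, p_value


def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under Bonferroni correction."""
    return alpha / n_comparisons


# The five criteria compared in the paper give 10 pairwise comparisons.
criteria = ["OGini", "WIG", "RI", "Gini", "IG"]
n_pairs = len(list(combinations(criteria, 2)))
alpha_adj = bonferroni_alpha(0.05, n_pairs)  # 0.05 is an assumed level

# Hypothetical per-dataset MAE values, for illustration only.
mae_gini = [0.52, 0.31, 0.78, 0.44, 0.60, 0.25, 0.91, 0.37]
mae_ogini = [0.49, 0.30, 0.74, 0.45, 0.55, 0.25, 0.85, 0.35]

w_plus, w_minus, p = wilcoxon_signed_rank(mae_gini, mae_ogini)
print(f"W+ = {w_plus}, W- = {w_minus}, p = {p:.4f}")
print(f"adjusted alpha = {alpha_adj}, significant: {p < alpha_adj}")
```

With only a handful of datasets the normal approximation is rough, and exact tables or a library routine would be preferable. The sketch is meant only to show where tie handling and the multiplicity correction enter the calculation.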

Circularity Check

0 steps flagged

No circularity: direct empirical comparison on external datasets

Full rationale

The paper performs an experimental comparison of existing ordinal splitting criteria (OGini, WIG, RI) against nominal baselines on 45 publicly available OC datasets using standard evaluation metrics. No derivation chain exists; claims rest on observed empirical rankings rather than any quantity defined in terms of itself, a fitted parameter renamed as a prediction, or a self-citation chain. The 45 datasets are external public resources, and code is released for reproducibility. This is a standard empirical study with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work is an empirical comparison that relies on existing splitting criteria and standard machine-learning evaluation practices; no new free parameters, mathematical axioms, or invented entities are introduced.

axioms (1)
  • Domain assumption: standard non-parametric statistical tests are appropriate for comparing classifier performance across multiple datasets.
    The abstract states that results have been statistically analysed.
