From Kellgren-Lawrence to Calcium Pyrophosphate Crystal Deposition: A Soft-Labelling Framework for Knee Osteoarthritis Assessment

César Hervás-Martínez; Edoardo Cipolletta; Emilio Filippucci; Francisco Bérchez-Moreno; Luca Romeo; Maria Chiara Fiorentino; Pedro A. Gutiérrez; Riccardo Rosati; Víctor M. Vargas

arxiv: 2605.28176 · v1 · pith:GLJJL7DXnew · submitted 2026-05-27 · 💻 cs.CV

Francisco Bérchez-Moreno, Riccardo Rosati, Maria Chiara Fiorentino, Víctor M. Vargas, Edoardo Cipolletta, Emilio Filippucci, Luca Romeo, Pedro A. Gutiérrez, César Hervás-Martínez

Pith reviewed 2026-06-29 13:07 UTC · model grok-4.3

classification 💻 cs.CV

keywords knee osteoarthritis, soft labelling, deep learning, X-ray grading, Kellgren-Lawrence, CPPD, ordinal classification, medical imaging

The pith

Soft-labelling with unimodal distributions improves ordinal grading of knee osteoarthritis on X-rays over one-hot labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops an ordinal deep learning framework that replaces one-hot labels with soft unimodal probability distributions for grading knee X-rays on both the Kellgren-Lawrence and CPPD scales. It tests four formulations—binomial, beta, triangular, and exponential—on a dataset of 2172 images including 968 jointly annotated for both tasks. All strategies outperformed the conventional one-hot baseline, with the triangular formulation reaching the highest QWK of 0.796 and lowest MAE of 0.438 on CPPD grading, and the beta formulation reaching QWK of 0.777 and MAE of 0.529 on KL grading. A sympathetic reader would care because this addresses the mismatch between standard classification losses and the ordinal, uncertain nature of clinical severity scores while respecting the observed asymmetry between the two scales.

Core claim

The central claim is that an ordinal DL framework based on soft-labelling, replacing one-hot targets with unimodal probability distributions centred on the annotated grade, consistently outperforms nominal one-hot supervision for both KL and CPPD grading tasks. Specifically, the triangular formulation achieved the highest QWK and lowest MAE for CPPD (QWK = 0.796; MAE = 0.438), while the beta-based approach provided the best overall performance for KL (QWK = 0.777; MAE = 0.529; AMAE = 0.523; MMAE = 0.775), with all soft-labelling strategies demonstrating statistically significant improvements over the baseline (p < 0.001).

What carries the argument

Soft-labelling via unimodal probability distributions (binomial, beta, triangular, exponential) centred on the annotated grade, used as targets instead of one-hot vectors.
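
To make the mechanism concrete, here is a minimal sketch of one way to build a unimodal target and score a prediction against it. It is not the paper's implementation: the function names, the temperature `tau`, and the softmax-over-distance form are assumptions chosen for illustration, and the five-grade setting simply mirrors the KL scale (grades 0 to 4). The abstract does not give the exact parametrisation of any of the four formulations.

```python
import math

def exponential_soft_label(grade, num_classes, tau=1.0):
    """Unimodal target: mass decays with distance from the annotated grade.

    Hypothetical form (softmax over negative absolute distance);
    the paper's own exponential formulation may differ.
    """
    scores = [-abs(i - grade) / tau for i in range(num_classes)]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

def soft_cross_entropy(predicted, target, eps=1e-12):
    """Cross-entropy between a predicted distribution and a (soft) target."""
    return -sum(t * math.log(p + eps) for p, t in zip(predicted, target))

one_hot = [0.0, 0.0, 1.0, 0.0, 0.0]  # annotated grade 2 out of 5
soft = exponential_soft_label(grade=2, num_classes=5, tau=1.0)

# Two predictions that assign the same probability to the annotated grade.
near_miss = [0.05, 0.60, 0.25, 0.05, 0.05]  # peak one grade below the annotation
far_miss = [0.60, 0.05, 0.25, 0.05, 0.05]   # peak two grades below the annotation

loss_near_one_hot = soft_cross_entropy(near_miss, one_hot)
loss_far_one_hot = soft_cross_entropy(far_miss, one_hot)
loss_near_soft = soft_cross_entropy(near_miss, soft)
loss_far_soft = soft_cross_entropy(far_miss, soft)
```

Under the one-hot target both predictions receive the same loss, because only the probability on the annotated grade matters. Under the soft target the near miss is penalised less than the far miss, which is the ordinal behaviour the framework relies on.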

If this is right

All four soft-labelling strategies improve Quadratic Weighted Kappa and reduce Mean Absolute Error compared to one-hot labels on both grading tasks.
The triangular formulation yields the best overall metrics for CPPD grading.
The beta formulation yields the best overall metrics for KL grading, including lowest class-wise errors.
The performance gains are statistically significant at p < 0.001 across the 2172-image dataset.
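
For readers unfamiliar with the two headline metrics, the following sketch computes MAE and Quadratic Weighted Kappa from integer grade labels using their standard definitions. The function names and the toy label vectors are illustrative assumptions and have nothing to do with the paper's data or code.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute distance between true and predicted grades."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def quadratic_weighted_kappa(y_true, y_pred, num_classes):
    """Cohen's kappa with quadratic disagreement weights."""
    n = len(y_true)
    observed = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    row = [sum(observed[i]) for i in range(num_classes)]
    col = [sum(observed[i][j] for i in range(num_classes))
           for j in range(num_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            w = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += w * observed[i][j]
            # Expected count under independence of the two marginals.
            weighted_expected += w * row[i] * col[j] / n
    return 1.0 - weighted_observed / weighted_expected

# Toy labels (made up for illustration only).
y_true = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
near = [0, 1, 2, 3, 4, 1, 2, 3, 4, 3]  # every error is one grade off
far = [0, 1, 2, 3, 4, 4, 4, 0, 0, 0]   # same number of errors, larger distances

qwk_perfect = quadratic_weighted_kappa(y_true, y_true, 5)
qwk_near = quadratic_weighted_kappa(y_true, near, 5)
qwk_far = quadratic_weighted_kappa(y_true, far, 5)
mae_near = mean_absolute_error(y_true, near)
mae_far = mean_absolute_error(y_true, far)
```

Both `near` and `far` have the same plain accuracy, yet QWK and MAE separate them, which is why these metrics rather than accuracy are used to judge ordinal graders.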

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may extend to other ordinal scoring tasks in radiology where annotation uncertainty is high.
Joint modelling of KL and CPPD could further exploit the asymmetric clinical relationship if the framework is adapted to multi-task training.
If the unimodal assumption holds across datasets, the method could lower sensitivity to inter-rater variability in clinical annotations.

Load-bearing premise

Unimodal probability distributions centred on the annotated grade accurately capture both the ordinal uncertainty of the scores and the asymmetric clinical relationship between the KL and CPPD scales.

What would settle it

A replication study on an independent set of knee X-rays where the soft-labelling models fail to achieve higher QWK or lower MAE than the one-hot baseline would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.28176 by C\'esar Herv\'as-Mart\'inez, Edoardo Cipolletta, Emilio Filippucci, Francisco B\'erchez-Moreno, Luca Romeo, Maria Chiara Fiorentino, Pedro A. Guti\'errez, Riccardo Rosati, V\'ictor M. Vargas.

**Figure 2.** Representative examples of the different Kellgren–Lawrence (KL) and Calcium …

**Figure 3.** Schematic representation of the proposed ordinal deep learning framework for …

**Figure 4.** Example of the binomial (a), exponential (b), beta (c) and triangular (d) discrete …

**Figure 5.** Combined violin and box plots of the AMAE distributions for all methodologies. …

**Figure 6.** Confusion matrices comparing the best-performing soft-labelling models for each …

**Figure 7.** Grad-CAM visualisations across all severity grades for the nominal baseline and …

**Figure 8.** Residuals with the difference of the average obtained contingency tables and …

**Figure 9.** Grad-CAM analysis of model robustness under co-occurring pathologies. The …

Original abstract

Background and objective. Conventional Deep Learning (DL) approaches for Knee Osteoarthritis (KOA) grading rely on one-hot labels, which fail to capture both the ordinal uncertainty of Kellgren–Lawrence (KL) and Calcium Pyrophosphate Deposition Disease (CPPD) severity scores and the asymmetric relationship between the two scales observed in clinical practice. Methods. We retrospectively collected 2172 knee X-ray images, including 968 radiographs jointly annotated for KL and CPPD severity. An ordinal DL framework based on soft-labelling was developed for both tasks, replacing one-hot targets with unimodal probability distributions centred on the annotated grade. Four formulations were investigated: binomial, beta, triangular, and exponential. Results. All soft-labelling strategies consistently outperformed the nominal baseline. For CPPD grading, the triangular formulation achieved the highest Quadratic Weighted Kappa (QWK) and the lowest Mean Absolute Error (MAE) (QWK = 0.796; MAE = 0.438), while the beta formulation yielded the most balanced class-wise performance considering Average MAE (AMAE) and Maximum MAE (MMAE) across classes (AMAE = 0.458; MMAE = 0.573). For KL grading, the beta-based approach provided the best overall performance, achieving the highest QWK together with the lowest MAE and class-wise errors (QWK = 0.777; MAE = 0.529; AMAE = 0.523; MMAE = 0.775). Statistical analysis demonstrated significant improvements over conventional one-hot supervision (p < 0.001).
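
The abstract reports AMAE and MMAE alongside plain MAE. A minimal sketch of the usual definitions follows: MAE is computed separately for each true class, then averaged (AMAE) or maximised (MMAE) across classes. The function names and the toy labels are assumptions for illustration, not taken from the paper.

```python
def per_class_mae(y_true, y_pred, num_classes):
    """MAE restricted to the samples of each true class (empty classes skipped)."""
    totals = [0.0] * num_classes
    counts = [0] * num_classes
    for t, p in zip(y_true, y_pred):
        totals[t] += abs(t - p)
        counts[t] += 1
    return {c: totals[c] / counts[c] for c in range(num_classes) if counts[c] > 0}

def amae(y_true, y_pred, num_classes):
    """Average of the per-class MAE values: every class counts equally."""
    values = list(per_class_mae(y_true, y_pred, num_classes).values())
    return sum(values) / len(values)

def mmae(y_true, y_pred, num_classes):
    """Worst per-class MAE: the error on the hardest class."""
    return max(per_class_mae(y_true, y_pred, num_classes).values())

# Toy imbalanced example (made up): a model that always predicts grade 0.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 10

plain_mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
average_mae = amae(y_true, y_pred, 3)
maximum_mae = mmae(y_true, y_pred, 3)
```

In this toy case plain MAE looks small because the majority class dominates, while AMAE and MMAE expose the errors on the rare grades. That is the reason class-wise metrics matter for severity scales where high grades are uncommon.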

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Soft labels beat one-hot on this knee dataset with reported p<0.001 gains, but the four fixed unimodal distributions are heuristics with no grounding in rater data or joint scale statistics.

The paper compares four standard unimodal soft-label distributions (binomial, beta, triangular, exponential) against one-hot targets on a dataset of 2172 knee X-rays, 968 of which are jointly annotated for both KL and CPPD grading. All four beat the baseline on QWK and MAE, with triangular strongest for CPPD (QWK 0.796) and beta for KL (QWK 0.777).

What stands out is the direct empirical test on a dataset that has both scales labeled on the same images, plus the reporting of class-wise AMAE and MMAE alongside the usual metrics.

The soft spots are straightforward. The distributions are chosen as fixed centered forms with no derivation from inter-rater variability, longitudinal data, or the KL-CPPD asymmetry mentioned in the background. The two tasks are trained separately, so the asymmetry is never actually modeled or tested. Any smoothing effect could produce similar gains; the uncertainty-modeling story therefore rests on an unvalidated assumption. The abstract gives no architecture, split, or exclusion details, which makes it hard to judge whether the p<0.001 result holds after proper controls.

This is useful for groups already working on ordinal grading of routine radiographs who want a quick label-encoding tweak to try. It is worth sending to peer review because the dataset size and the clean head-to-head comparison are concrete enough for referees to evaluate the methods and check the numbers.

Referee Report

3 major / 2 minor

Summary. The paper proposes a soft-labelling framework for ordinal grading of knee osteoarthritis on X-rays, replacing one-hot targets with four fixed unimodal distributions (binomial, beta, triangular, exponential) centered on the annotated KL or CPPD grade. Using 2172 images (968 jointly annotated), it reports that all soft-labelling variants outperform the one-hot baseline on QWK and MAE for both tasks, with the triangular distribution best for CPPD (QWK=0.796, MAE=0.438) and beta best for KL (QWK=0.777, MAE=0.529), all with p<0.001.

Significance. If the gains arise from faithful modeling of ordinal uncertainty rather than generic regularization, the approach could improve robustness in medical ordinal classification tasks. The use of jointly annotated cases and multiple distribution families is a positive empirical step, but the heuristic nature of the labels limits claims about capturing clinical asymmetry or uncertainty.

major comments (3)

[Methods] Methods (soft-labelling section): The four distributions are defined with fixed, hand-chosen parameters and applied independently to KL and CPPD; no derivation from inter-rater agreement data, longitudinal progression, or joint KL-CPPD statistics is provided, so the claim that they capture 'ordinal uncertainty' and 'asymmetric relationship' rests on an untested assumption.
[Results] Results and data description: The 968 jointly annotated radiographs are used only for separate per-task training; no joint model, cross-task loss, or analysis of KL-CPPD co-occurrence is presented, leaving the background claim of asymmetry unaddressed by the experiments.
[Results] Evaluation: Performance improvements are reported on QWK/MAE but no ablation compares the chosen unimodal forms against alternatives (e.g., learned label smoothing or empirical inter-rater distributions), so it is unclear whether gains exceed what standard regularization would achieve.
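
The third major comment turns on the difference between generic label smoothing and a unimodal target. The sketch below contrasts the two for a single annotated grade. The triangle-shaped target is a hypothetical stand-in, not the paper's triangular formulation, and `eps` and `width` are arbitrary illustrative values.

```python
def uniform_label_smoothing(grade, num_classes, eps=0.2):
    """Standard label smoothing: (1 - eps) * one-hot + eps / num_classes."""
    return [(1.0 - eps) * (1.0 if i == grade else 0.0) + eps / num_classes
            for i in range(num_classes)]

def triangle_shaped_label(grade, num_classes, width=2.0):
    """Hypothetical unimodal target: linear decay, zero beyond `width` grades."""
    weights = [max(0.0, 1.0 - abs(i - grade) / width) for i in range(num_classes)]
    total = sum(weights)
    return [w / total for w in weights]

def expected_grade_distance(target, grade):
    """Average distance from the annotated grade under the target distribution."""
    return sum(p * abs(i - grade) for i, p in enumerate(target))

smoothed = uniform_label_smoothing(grade=0, num_classes=5, eps=0.2)
unimodal = triangle_shaped_label(grade=0, num_classes=5, width=2.0)

distance_smoothed = expected_grade_distance(smoothed, 0)
distance_unimodal = expected_grade_distance(unimodal, 0)
```

Uniform smoothing gives the adjacent grade and the most distant grade the same mass, so it carries no ordinal information. The unimodal target concentrates its off-peak mass on neighbouring grades. An ablation of the kind the referee requests would test whether that structural difference, rather than smoothing as such, accounts for the reported gains.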

minor comments (2)

[Abstract] Abstract: The statistical test yielding p<0.001 is not named (paired t-test, Wilcoxon, etc.), and the exact data splits or cross-validation scheme are not summarized.
[Methods] Notation: The precise functional forms and parameter values for the beta, triangular, and exponential distributions should be given explicitly (e.g., as equations) rather than described only qualitatively.
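
Since the abstract does not name the statistical test, the sketch below shows one common option for paired comparisons of two methods over repeated runs: a two-sided sign-flip permutation test on the per-run differences. This is only an example of what such a test could look like, not the procedure the authors used. The function name is hypothetical and the scores are made-up numbers, not results from the paper.

```python
import random

def paired_sign_flip_test(scores_a, scores_b, num_permutations=10000, seed=0):
    """Two-sided Monte Carlo permutation test on paired differences.

    Under the null hypothesis the sign of each paired difference is
    exchangeable, so signs are flipped at random and the mean is recomputed.
    """
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(num_permutations):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped / len(diffs)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (num_permutations + 1)

# Made-up per-run QWK values for illustration only (NOT from the paper).
soft_runs = [0.78, 0.80, 0.79, 0.81, 0.77, 0.80, 0.79, 0.78, 0.80, 0.79]
one_hot_runs = [0.74, 0.75, 0.76, 0.75, 0.73, 0.76, 0.74, 0.75, 0.74, 0.75]

p_value = paired_sign_flip_test(soft_runs, one_hot_runs)
p_identical = paired_sign_flip_test(soft_runs, soft_runs)
```

Whatever test the authors chose, naming it and stating the pairing unit (folds, seeds, or images) would let readers judge how much weight the p < 0.001 figure can bear.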

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We address each major point below, indicating revisions where the manuscript will be updated to clarify claims and strengthen the evaluation.

Referee: [Methods] Methods (soft-labelling section): The four distributions are defined with fixed, hand-chosen parameters and applied independently to KL and CPPD; no derivation from inter-rater agreement data, longitudinal progression, or joint KL-CPPD statistics is provided, so the claim that they capture 'ordinal uncertainty' and 'asymmetric relationship' rests on an untested assumption.

Authors: We agree that the parameters were selected heuristically based on general ordinal properties rather than derived from inter-rater statistics, longitudinal data, or joint KL-CPPD co-occurrence in this dataset. The claims in the introduction regarding capturing ordinal uncertainty and asymmetry therefore rest on the suitability of unimodal distributions rather than empirical derivation. In the revised manuscript we will qualify these claims in the methods, introduction, and discussion, add a dedicated limitations paragraph, and emphasize that the contribution is the empirical demonstration of performance gains over one-hot labels. revision: yes
Referee: [Results] Results and data description: The 968 jointly annotated radiographs are used only for separate per-task training; no joint model, cross-task loss, or analysis of KL-CPPD co-occurrence is presented, leaving the background claim of asymmetry unaddressed by the experiments.

Authors: The jointly annotated cases were used exclusively for separate per-task training and evaluation. No joint model, cross-task loss, or co-occurrence analysis was performed, as the study scope was limited to validating soft-labelling for each grading task independently. The background reference to asymmetry draws from clinical literature rather than our results. We will revise the manuscript to remove any implication that the experiments address asymmetry and will add a future-work statement on multi-task or joint modeling. revision: yes
Referee: [Results] Evaluation: Performance improvements are reported on QWK/MAE but no ablation compares the chosen unimodal forms against alternatives (e.g., learned label smoothing or empirical inter-rater distributions), so it is unclear whether gains exceed what standard regularization would achieve.

Authors: We acknowledge that the current evaluation lacks ablations against other regularization strategies. We will add an ablation study comparing the four soft-labelling distributions against standard label smoothing (multiple epsilon values) using the same backbone and metrics. The revised results section will report these comparisons to clarify whether the unimodal forms provide benefits beyond generic smoothing. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical comparison of fixed heuristic label encodings on collected data

The paper performs an empirical study: 2172 radiographs (968 jointly annotated) are used to train ordinal DL models under one-hot vs. four fixed unimodal soft-label distributions (binomial, beta, triangular, exponential) centered on the annotated grade. Performance is measured by QWK, MAE, AMAE, MMAE with statistical tests. No equations derive a target quantity from fitted parameters within the paper; the distributions are chosen as alternative encodings rather than learned or self-referential. No self-citation chain, uniqueness theorem, or ansatz smuggling supports a central claim. The work is self-contained against external benchmarks (held-out test performance) and does not reduce any reported result to a quantity defined by its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Review based solely on abstract; no explicit free parameters, invented entities, or additional axioms described beyond standard assumptions of supervised learning on annotated medical images.

axioms (1)

Domain assumption: the 968 jointly annotated radiographs provide accurate ground-truth grades that reflect clinical severity and the observed asymmetry between scales.
The framework trains and evaluates directly against these annotations as targets.

pith-pipeline@v0.9.1-grok · 5880 in / 1091 out tokens · 48289 ms · 2026-06-29T13:07:03.704290+00:00 · methodology


work page doi:10.3390/diagnostics12030611 2022

[9] [9]

L. Si, J. Zhong, J. Huo, K. Xuan, Z. Zhuang, Y. Hu, Q. Wang, H. Zhang, W. Yao, Deep learning in knee imaging: a systematic re- view utilizing a checklist for artificial intelligence in medical imaging (claim), European Radiology 32 (2) (2022) 1353–1361.doi:10.1007/ s00330-021-08190-4

2022

[10] [10]

W. Lv, J. Peng, J. Hu, Y. Lu, Z. Zhou, H. Xu, K. Xing, X. Zhang, L. Lu, Lmsst-gcn: Longitudinal mri sub-structural texture guided graph convo- lution network for improved progression prediction of knee osteoarthri- tis, ComputerMethodsandProgramsinBiomedicine261(2025)108600. doi:10.1016/j.cmpb.2025.108600

work page doi:10.1016/j.cmpb.2025.108600 2025

[11] [11]

Hinterwimmer, I

F. Hinterwimmer, I. Lazic, C. Suren, M. T. Hirschmann, F. Pohlig, D. Rueckert, R. Burgkart, R. von Eisenhart-Rothe, Machine learning in knee arthroplasty: specific data are key—a systematic review, Knee Surgery, Sports Traumatology, Arthroscopy 30 (2) (2022) 376–388.doi: 10.1007/s00167-021-06848-6

work page doi:10.1007/s00167-021-06848-6 2022

[12] [12]

P. Chen, L. Gao, X. Shi, K. Allen, L. Yang, Fully automatic knee os- teoarthritis severity grading using deep neural networks with a novel ordinal loss, Computerized Medical Imaging and Graphics 75 (2019) 84–92.doi:10.1016/j.compmedimag.2019.06.002

work page doi:10.1016/j.compmedimag.2019.06.002 2019

[13] [13]

C. W. Yong, K. Teo, B. P. Murphy, Y. C. Hum, Y. K. Tee, K. Xia, K. W. Lai, Knee osteoarthritis severity classification with ordinal regression module, MultimediaToolsandApplications81(29)(2022)41497–41509. doi:10.1007/s11042-021-10557-0

work page doi:10.1007/s11042-021-10557-0 2022

[14] [14]

Kokkotis, S

C. Kokkotis, S. Moustakidis, E. Papageorgiou, G. Giakas, D. Tsaopou- los, Machine learning in knee osteoarthritis: A review, Osteoarthritis andCartilageOpen2(3)(2020)100069.doi:10.1016/j.ocarto.2020. 100069

work page doi:10.1016/j.ocarto.2020 2020

[15] [15]

Upadhyay, O

A. Upadhyay, O. Sawant, P. Choudhary, Detection of knee osteoarthritis stages using convolutional neural network, SN Computer Science 4 (3) (2023) 257.doi:10.1007/s42979-022-01644-6. 34

work page doi:10.1007/s42979-022-01644-6 2023

[16] [16]

Y. Wang, S. Li, B. Zhao, J. Zhang, Y. Yang, B. Li, A resnet-based approach for accurate radiographic diagnosis of knee osteoarthritis, CAAI Transactions on Intelligence Technology 7 (3) (2022) 512–521. doi:10.1049/cit2.12079

work page doi:10.1049/cit2.12079 2022

[17] [17]

M. W. Brejnebøl, P. Hansen, J. U. Nybing, R. Bachmann, U. Ratjen, I. V. Hansen, A. Lenskjold, M. Axelsen, M. Lundemann, M. Boesen, External validation of an artificial intelligence tool for radiographic knee osteoarthritis severity classification, European Journal of Radiology 150 (2022) 110249.doi:10.1016/j.ejrad.2022.110249

work page doi:10.1016/j.ejrad.2022.110249 2022

[18] [18]

S. V. Chaugule, V. Malemath, Knee osteoarthritis grading using densenet and radiographic images, SN Computer Science 4 (1) (2022) 63.doi:10.1007/s42979-022-01468-4

work page doi:10.1007/s42979-022-01468-4 2022

[19] [19]

Kalpana, G

V. Kalpana, G. H. Kumar, et al., Evaluating the efficacy of deep learn- ing models for knee osteoarthritis prediction based on kellgren-lawrence grading system, e-Prime-Advances in Electrical Engineering, Electronics and Energy 5 (2023) 100266.doi:10.1016/j.prime.2023.100266

work page doi:10.1016/j.prime.2023.100266 2023

[20] [20]

Jahan, M

M. Jahan, M. Z. Hasan, I. J. Samia, K. Fatema, M. A. H. Rony, M. S. Arefin, A. Moustafa, Koa-cctnet: An enhanced knee osteoarthri- tis grade assessment framework using modified compact convolutional transformer model, IEEE Access 12 (2024) 107719–107741.doi:10. 1109/ACCESS.2024.3435572

work page arXiv 2024

[21] [21]

Maqsood, N

S. Maqsood, N. Maqsood, S. Shahid, F. E. Subhan, M. A. Sarwar, M. Yousufi, A. Qurthobi, A. Zafar, M. A. Khan, R. Damaševičius, et al., Knee osteoarthritis network: A hybrid transformer-based ap- proach for enhanced detection and grading of knee osteoarthritis, Engi- neering Applications of Artificial Intelligence 159 (2025) 111751.doi: 10.1016/j.engappai....

work page doi:10.1016/j.engappai.2025.111751 2025

[22] [22]

Albuquerque, R

T. Albuquerque, R. Cruz, J. S. Cardoso, Ordinal losses for classification of cervical cancer risk, PeerJ Computer Science 7 (2021) e457.doi: 10.7717/peerj-cs.457

work page doi:10.7717/peerj-cs.457 2021

[23] [23]

T. T. Le Vuong, K. Kim, B. Song, J. T. Kwak, Joint categorical and ordinal learning for cancer grading in pathology images, Medical image analysis 73 (2021) 102206.doi:10.1016/j.media.2021.102206. 35

work page doi:10.1016/j.media.2021.102206 2021

[24] [24]

L. Wang, H. Wang, Y. Su, F. Lure, J. Li, A novel hybrid ordinal learning model with health care application, IEEE Transactions on Automation Science and Engineering 22 (2024) 339–352.doi:10.1109/TASE.2024. 3350894

work page doi:10.1109/tase.2024 2024

[25] [25]

Rivera-Gavilán, V

M. Rivera-Gavilán, V. M. Vargas, P. A. Gutiérrez, J. Briceño, C. Hervás-Martínez, D. Guijo-Rubio, Ordinal classification approach for donor-recipient matching in liver transplantation with circula- tory death donors, in: International Work-Conference on Artifi- cial Neural Networks, Springer, 2023, pp. 517–528.doi:10.1007/ 978-3-031-43078-7_42

2023

[26] [26]

H. L. Le, H. G. Roh, H. J. Kim, J. T. Kwak, A 3d multi-task regression and ordinal regression deep neural network for collateral imaging from dynamic susceptibility contrast-enhanced mr perfusion in acute ischemic stroke, Computer Methods and Programs in Biomedicine 225 (2022) 107071.doi:10.1016/j.cmpb.2022.107071

work page doi:10.1016/j.cmpb.2022.107071 2022

[27] [27]

X. Liu, F. Fan, L. Kong, Z. Diao, W. Xie, J. Lu, J. You, Unimodal regu- larized neuron stick-breaking for ordinal classification, Neurocomputing 388 (2020) 34–44.doi:10.1016/j.neucom.2020.01.025

work page doi:10.1016/j.neucom.2020.01.025 2020

[28] [28]

Q. Li, J. Wang, Z. Yao, Y. Li, P. Yang, J. Yan, C. Wang, S. Pu, Unimodal-concentrated loss: Fully adaptive label distribution learning for ordinal regression, in: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 20513–20522. doi:10.1109/CVPR52688.2022.01986

work page doi:10.1109/cvpr52688.2022.01986 2022

[29] [29]

V. M. Vargas, P. A. Gutiérrez, C. Hervás-Martínez, Unimodal regular- isation based on beta distribution for deep ordinal regression, Pattern Recognition 122 (2022) 108310.doi:10.1016/j.patcog.2021.108310

work page doi:10.1016/j.patcog.2021.108310 2022

[30] [30]

V. M. Vargas, P. A. Gutiérrez, R. Rosati, L. Romeo, E. Frontoni, C. Hervás-Martínez, Exponential loss regularisation for encouraging or- dinalconstrainttoshotgunstocksqualityassessment, AppliedSoftCom- puting 138 (2023) 110191.doi:10.1016/j.asoc.2023.110191

work page doi:10.1016/j.asoc.2023.110191 2023

[31] [31]

V. M. Vargas, P. A. Gutiérrez, J. Barbero-Gómez, C. Hervás-Martínez, Soft labelling based on triangular distributions for ordinal classification, 36 Information Fusion 93 (2023) 258–267.doi:10.1016/j.inffus.2023. 01.003

work page doi:10.1016/j.inffus.2023 2023

[32] [32]

V. M. Vargas, A. M. Duran-Rosal, D. Guijo-Rubio, P. A. Gutierrez, C. Hervas-Martinez, Generalised triangular distributions for ordinal deep learning: Novel proposal and optimisation, Information Sciences 648 (2023) 119606.doi:10.1016/j.ins.2023.119606

work page doi:10.1016/j.ins.2023.119606 2023

[33] [33]

J. S. Cardoso, R. P. Cruz, T. Albuquerque, Unimodal distributions for ordinal regression, IEEE Transactions on Artificial Intelligence 6 (2025) 2498–2509.doi:10.1109/TAI.2025.3549740

work page doi:10.1109/tai.2025.3549740 2025

[34] [34]

V. M. Vargas, D. Guijo-Rubio, R. Ayllón-Gavilán, A. M. Gómez- Orellana, P. A. Gutiérrez, C. Hervás-Martínez, Soft labelling for deep ordinal classification: an experimental review, IEEE Transactions on Knowledge and Data Engineering (2026).doi:10.1109/TKDE.2026. 3681678

work page doi:10.1109/tkde.2026 2026

[35] [35]

van Veldhuizen, V

V. van Veldhuizen, V. Botha, C. Lu, M. E. Cesur, K. G. Lipman, E. D. de Jong, H. Horlings, C. I. Sanchez, C. G. Snoek, L. Wessels, et al., Foundation models in medical imaging: A review and outlook, arXiv preprint arXiv:2506.09095 (2025).doi:10.48550/arXiv.2506.09095

work page doi:10.48550/arxiv.2506.09095 2025

[36] [36]

A Whitney polynomial for hype rmaps

O. Elharrouss, Y. Himeur, Y. Mahmood, S. Alrabaee, A. Ouamane, F. Bensaali, Y. Bechqito, A. Chouchane, Vits as backbones: Leveraging visiontransformersforfeatureextraction, InformationFusion118(2025) 102951.doi:10.1016/j.inffus.2025.102951

work page doi:10.1016/j.inffus.2025.102951 2025

[37] [37]

P. A. Gutiérrez, M. Perez-Ortiz, J. Sanchez-Monedero, F. Fernandez- Navarro, C. Hervas-Martinez, Ordinal regression methods: survey and experimental study, IEEE Transactions on Knowledge and Data Engi- neering 28 (1) (2015) 127–146.doi:10.1109/TKDE.2015.2457911

work page doi:10.1109/tkde.2015.2457911 2015

[38] [38]

J. Moon, P. Jadhav, S. Choi, Deep learning analysis for rheumatologic imaging: current trends, future directions, and the role of human, Jour- nal of rheumatic diseases 32 (2) (2025) 73–88.doi:10.4078/jrd.2024. 0128

work page doi:10.4078/jrd.2024 2025

[39] [39]

K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE conference on computer vision 37 and pattern recognition, 2016, pp. 770–778.doi:10.1109/CVPR.2016. 90

work page doi:10.1109/cvpr.2016 2016

[40] [40]

Gómez-Orellana, D

A. Gómez-Orellana, D. Guijo-Rubio, P. Gutiérrez, C. Hervás-Martínez, V. Vargas, ORFEO: Ordinal classifier and regressor fusion for estimating an ordinal categorical target, Eng. Applications of Artificial Intelligence 133 (2024) 108462.doi:10.1016/j.engappai.2024.108462

work page doi:10.1016/j.engappai.2024.108462 2024

[41] [41]

Bérchez-Moreno, R

F. Bérchez-Moreno, R. Ayllón-Gavilán, V. M. Vargas, D. Guijo-Rubio, C. Hervás-Martínez, J. C. Fernández, P. A. Gutiérrez, dlordinal: A python package for deep ordinal classification, Neurocomputing (2025) 129305doi:10.1016/j.neucom.2024.129305

work page doi:10.1016/j.neucom.2024.129305 2025

[42] [42]

de La Torre, D

J. de La Torre, D. Puig, A. Valls, Weighted kappa loss function for multi- class classification of ordinal data in deep learning, Pattern Recognition Letters 105 (2018) 144–154.doi:10.1016/j.patrec.2017.05.018

work page doi:10.1016/j.patrec.2017.05.018 2018

[43] [43]

Cohen, Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit., Psychological bulletin 70 (4) (1968) 213

J. Cohen, Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial credit., Psychological bulletin 70 (4) (1968) 213. doi:10.1037/h0026256

work page doi:10.1037/h0026256 1968

[44] [44]

M. J. Warrens, Cohen’s quadratically weighted kappa is higher than linearly weighted kappa for tridiagonal agreement tables, Statistical Methodology 9 (3) (2012) 440–444.doi:10.1016/j.stamet.2011.08. 006

work page doi:10.1016/j.stamet.2011.08 2012

[45] [45]

C. J. Willmott, K. Matsuura, Advantages of the mean absolute er- ror (mae) over the root mean square error (rmse) in assessing aver- age model performance, Climate research 30 (1) (2005) 79–82.doi: 10.3354/cr030079

work page doi:10.3354/cr030079 2005

[46] [46]

Baccianella, A

S. Baccianella, A. Esuli, F. Sebastiani, Evaluation measures for ordinal regression, in: 2009 Ninth international conference on intelligent systems design and applications, IEEE, 2009, pp. 283–287.doi:10.1109/ISDA. 2009.230

work page doi:10.1109/isda 2009

[47] [47]

Cruz-Ramírez, C

M. Cruz-Ramírez, C. Hervás-Martínez, J. Sánchez-Monedero, P. A. Gutiérrez, Metrics to guide a multi-objective evolutionary algorithm for ordinal classification, Neurocomputing 135 (2014) 21–31.doi: 10.1016/j.neucom.2013.05.058. 38

work page doi:10.1016/j.neucom.2013.05.058 2014

[48] [48]

J. C. Fernandez Caballero, F. J. Martinez, C. Hervas, P. A. Gutierrez, Sensitivity versus accuracy in multiclass problems using memetic pareto evolutionary neural networks, IEEE Transactions on Neural Networks 21 (5) (2010) 750–770.doi:10.1109/TNN.2010.2041468

work page doi:10.1109/tnn.2010.2041468 2010

[49] [49]

V. M. Vargas, A. M. Gómez-Orellana, P. A. Gutiérrez, C. Hervás- Martínez, D. Guijo-Rubio, Ebano: A novel ensemble based on uni- modal ordinal classifiers for the prediction of significant wave height, Knowledge-Based Systems 300 (2024) 112223.doi:10.1016/j.knosys. 2024.112223

work page doi:10.1016/j.knosys 2024

[50] [50]

Grad-CAM: visual explanations from deep networks via gradient-based localization.Proceedings of the IEEE International Conference on Com- puter Vision

R.R.Selvaraju, M.Cogswell, A.Das, R.Vedantam, D.Parikh, D.Batra, Grad-cam: Visual explanations from deep networks via gradient-based localization, in: Proceedings of the IEEE international conference on computer vision, 2017, pp. 618–626.doi:10.1109/ICCV.2017.74

work page doi:10.1109/iccv.2017.74 2017

[51] [51]

Kullback, Information theory and statistics, Courier Corporation, 1997

S. Kullback, Information theory and statistics, Courier Corporation, 1997. 39

1997