
arxiv: 2509.17661 · v2 · submitted 2025-09-22 · 📡 eess.AS · cs.SD

Comparator Loss: An Ordinal Contrastive Loss to Derive a Severity Score for Speech-based Health Monitoring

Pith reviewed 2026-05-18 14:59 UTC · model grok-4.3

keywords: comparator loss · severity score · speech monitoring · neurodegenerative disease · ordinal contrastive loss · ALSFRS-R · health progression

The pith

A comparator loss trains speech models to output continuous severity scores that respect clinical orderings and correlate with unseen disease measures.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes training a model on speech recordings to produce a scalar severity score by minimizing a comparator loss that enforces ordering constraints. These constraints can come from diagnosis labels, existing clinical scores, or simply the chronological sequence of recordings. The resulting scores distinguish subjects with neurodegenerative diseases from healthy controls and correlate with independent clinical annotations such as ALSFRS-R and ratings from speech and language therapists, even though those annotations were never shown during training. The method is designed to make fuller use of small, lightly annotated health datasets by combining signals across different metrics through shared orderings.
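As a rough illustration of how such orderings might be assembled, the sketch below builds (less severe, more severe) pairs from diagnosis labels and recording dates. The record fields (`id`, `subject`, `diagnosis`, `day`), the `HC`/`NDD` labels, and the rule that chronology is only used within NDD subjects are assumptions made for this example, not details taken from the paper.

```python
from itertools import combinations


def ordering_pairs(recordings):
    """Return (lower_id, higher_id) pairs; the first item is assumed less severe.

    Each recording is a dict with hypothetical keys:
    'id', 'subject', 'diagnosis' ('HC' or 'NDD'), and 'day' (recording time).
    """
    pairs = []
    for r, s in combinations(recordings, 2):
        if r["subject"] == s["subject"]:
            # Chronology (assumption): for an NDD subject, the later recording
            # is treated as more severe than the earlier one.
            if r["diagnosis"] == "NDD" and r["day"] != s["day"]:
                earlier, later = (r, s) if r["day"] < s["day"] else (s, r)
                pairs.append((earlier["id"], later["id"]))
        elif r["diagnosis"] != s["diagnosis"]:
            # Diagnosis: a healthy control is ordered below an NDD subject.
            control, patient = (r, s) if r["diagnosis"] == "HC" else (s, r)
            pairs.append((control["id"], patient["id"]))
    return pairs
```

Pairs from an existing clinical score could be added in the same way, by ordering two recordings whenever their annotated scores differ.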

Core claim

Minimizing the comparator loss on speech data yields a continuous severity score whose values obey ordering relations derived from diagnosis, clinical scores, or recording chronology, and these scores generalize to predict annotations not present in the training data.

What carries the argument

The comparator loss, an ordinal contrastive loss that penalizes pairs whose assigned scores violate a known ordering relation.
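A minimal sketch of such a pairwise penalty is given below. It follows the hinge form quoted from the paper further down this page, max((f(a) − f(b)) + ε, 0), read here as "a should score lower than b". The function names and the margin value 0.1 are illustrative assumptions, and the scores are plain floats standing in for model outputs.

```python
def comparator_loss(score_lower, score_higher, epsilon=0.1):
    """Hinge penalty for one ordered pair: max((f(a) - f(b)) + epsilon, 0).

    The penalty is zero once the item expected to be more severe scores
    at least `epsilon` above the item expected to be less severe.
    """
    return max((score_lower - score_higher) + epsilon, 0.0)


def mean_comparator_loss(scores, pairs, epsilon=0.1):
    """Average the pairwise penalty over (lower_id, higher_id) pairs.

    `scores` maps a recording id to its scalar score (hypothetical layout).
    """
    losses = [comparator_loss(scores[a], scores[b], epsilon) for a, b in pairs]
    return sum(losses) / len(losses)
```

A correctly ordered pair with a gap larger than the margin contributes nothing, while a reversed pair is penalized by the size of the reversal plus the margin.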

Load-bearing premise

Ordering relations derived from diagnosis or recording times supply enough signal to learn a continuous severity score that reflects actual disease progression beyond those orderings.

What would settle it

A held-out longitudinal speech dataset in which the model scores fail to rise with documented clinical worsening or show zero correlation with later ALSFRS-R values would falsify the central claim.

Original abstract

Monitoring the progression of neurodegenerative disease (NDD) has important applications in planning treatment and evaluating new medications. Whereas much work has focused on discriminating patients from healthy controls, or predicting real-world health metrics, we propose a novel measure of disease progression: the severity score, derived from a model trained to minimize what we call the comparator loss. This loss ensures scores obey an ordering relation, based on diagnosis, clinical scores, or simply chronological order of recordings. The proposed comparator loss-based system has the potential to incorporate information from disparate health metrics, critical for making full use of small health-related datasets. We show that a model trained on lightly annotated data is capable of distinguishing between subjects with NDDs and healthy controls. Our score also correlates with annotations not observed in training, such as ALSFRS-R and those of speech and language therapists.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the comparator loss, an ordinal contrastive loss that trains a model to output a continuous severity score from speech recordings for neurodegenerative disease monitoring. Ordering constraints are derived from diagnosis labels, clinical scores, or recording chronology; the resulting scores are shown to separate NDD patients from controls and to correlate with held-out annotations such as ALSFRS-R and speech-language therapist ratings.

Significance. If the central claim holds after the requested clarifications, the work would offer a practical route to leverage lightly annotated speech data for progression tracking, allowing disparate clinical signals to be combined in small health datasets and potentially supporting treatment planning and drug trials.

major comments (2)
  1. [Abstract] The claim that scores 'correlate with annotations not observed in training, such as ALSFRS-R' is presented without any numerical correlation values, p-values, confidence intervals, or controls that regress out the training ordering cues (diagnosis or chronology); this leaves open whether the correlation reflects genuine progression information or merely group separation.
  2. [Methods] The comparator loss is introduced only conceptually ('ensures scores obey an ordering relation') with no explicit equation, loss formulation, or comparison to standard contrastive or ranking losses; without this, it is impossible to verify whether the loss enforces fine-grained monotonicity within subjects or can be satisfied by a near-binary embedding when orderings are coarse.
minor comments (1)
  1. [Abstract] The abstract would benefit from a short statement of dataset size, number of subjects, and recording conditions to allow readers to gauge the scale of the reported distinctions and correlations.
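The near-binary concern in major comment 2 can be made concrete with a toy calculation. The sketch below assumes a hinge-style pairwise penalty with margin 0.1 and uses made-up recording ids and scores; it is not the authors' implementation. A two-valued score (all controls at 0, all patient recordings at 1) incurs no penalty on diagnosis-only pairs, and is penalized only once a within-subject chronological pair is included.

```python
def hinge(score_lower, score_higher, epsilon=0.1):
    """Pairwise penalty: zero when score_higher exceeds score_lower by epsilon."""
    return max((score_lower - score_higher) + epsilon, 0.0)


# Hypothetical near-binary scores: controls at 0.0, both patient visits at 1.0.
scores = {"hc_1": 0.0, "hc_2": 0.0, "ndd_visit1": 1.0, "ndd_visit2": 1.0}

# Coarse orderings from diagnosis only (control below patient).
diagnosis_pairs = [
    ("hc_1", "ndd_visit1"), ("hc_1", "ndd_visit2"),
    ("hc_2", "ndd_visit1"), ("hc_2", "ndd_visit2"),
]
# Fine-grained ordering from chronology within one patient.
chronology_pairs = [("ndd_visit1", "ndd_visit2")]


def total_loss(pairs):
    return sum(hinge(scores[a], scores[b]) for a, b in pairs)


coarse_loss = total_loss(diagnosis_pairs)   # every pair already satisfied
fine_loss = total_loss(chronology_pairs)    # tied scores violate the margin
```

Under these assumptions the binary solution is a global minimum when only diagnosis pairs are used, which is why an explicit formulation and within-subject analysis are needed to judge the claim.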

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments have helped us identify areas where additional clarity and quantitative support will strengthen the manuscript. We respond to each major comment below and have revised the manuscript to address the points raised.

Point-by-point responses
  1. Referee: [Abstract] The claim that scores 'correlate with annotations not observed in training, such as ALSFRS-R' is presented without any numerical correlation values, p-values, confidence intervals, or controls that regress out the training ordering cues (diagnosis or chronology); this leaves open whether the correlation reflects genuine progression information or merely group separation.

    Authors: We agree that the abstract would be strengthened by including quantitative results. The results section reports Pearson correlations, p-values, and confidence intervals between the severity scores and held-out annotations such as ALSFRS-R. To address the concern about training cues, we have added a control analysis that regresses out diagnosis labels and chronological order information before computing the correlations; the partial correlations with the unseen clinical annotations remain statistically significant. We will revise the abstract to include these numerical values along with a brief description of the control analysis.
    Revision: yes

  2. Referee: [Methods] The comparator loss is introduced only conceptually ('ensures scores obey an ordering relation') with no explicit equation, loss formulation, or comparison to standard contrastive or ranking losses; without this, it is impossible to verify whether the loss enforces fine-grained monotonicity within subjects or can be satisfied by a near-binary embedding when orderings are coarse.

    Authors: We thank the referee for this observation. We will expand the Methods section to provide the explicit mathematical formulation of the comparator loss (a margin-based pairwise ranking objective applied according to the available ordering constraints) together with a direct comparison to standard contrastive losses and ranking losses. The revised text will also include an explanation and example showing that, when within-subject chronological orderings are supplied, the loss enforces fine-grained monotonicity rather than permitting a near-binary solution.
    Revision: yes
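The control analysis described in response 1 could look roughly like the sketch below: regress a covariate out of both the model score and the held-out annotation, then correlate the residuals. This is an assumption-laden illustration, not the authors' analysis. It handles a single covariate (a 0/1 diagnosis indicator), all function names are hypothetical, and the numbers are invented purely to exercise the code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def residuals(y, z):
    """Residuals of y after least-squares regression on one covariate z."""
    mz, my = sum(z) / len(z), sum(y) / len(y)
    slope = (sum((a - mz) * (b - my) for a, b in zip(z, y))
             / sum((a - mz) ** 2 for a in z))
    return [b - (my + slope * (a - mz)) for a, b in zip(z, y)]


def partial_correlation(x, y, z):
    """Correlation between x and y once z has been regressed out of both."""
    return pearson(residuals(x, z), residuals(y, z))


# Invented toy values (not results from the paper).
diagnosis = [0, 0, 0, 1, 1, 1]               # 0 = control, 1 = patient
score = [0.1, 0.2, 0.3, 1.0, 1.5, 2.0]       # model severity score
annotation = [48, 48, 48, 40, 35, 30]        # held-out clinical rating
r = partial_correlation(score, annotation, diagnosis)
```

If the score carried only group information, its residuals after removing diagnosis would be close to zero and the partial correlation would be unstable or near zero; a clearly non-zero value suggests within-group signal.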

Circularity Check

0 steps flagged

No significant circularity in the derivation chain

Full rationale

The paper introduces the comparator loss as a training objective explicitly constructed to enforce ordering relations drawn from diagnosis labels, clinical scores, or chronological order. The severity score is defined as the scalar output of a model trained under this loss. This construction is intentional and transparent rather than a hidden reduction. The central empirical claims—distinguishing NDD subjects from controls and correlating with held-out annotations such as ALSFRS-R and speech-therapist ratings—are evaluated on data or metrics not used to supply the training orderings. Because these correlations constitute independent, falsifiable evidence outside the fitted orderings, the reported results do not reduce to the inputs by construction. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing steps in the provided text. The work is therefore a standard supervised embedding method whose validity rests on external validation rather than definitional equivalence.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The central claim rests on the domain assumption that available ordering information reflects underlying disease severity and that a model can generalize this to a continuous score validated on unseen metrics.

axioms (1)
  • domain assumption: Ordering relations based on diagnosis, clinical scores, or chronological order reflect true underlying disease progression.
    This premise is invoked to define the comparator loss and train the severity score model.
invented entities (2)
  • comparator loss (no independent evidence)
    purpose: Training objective that enforces ordinal relations for severity scoring
    Newly proposed loss function in this work.
  • severity score (no independent evidence)
    purpose: Continuous measure of NDD progression derived from speech
    Output of the model trained with the comparator loss.

pith-pipeline@v0.9.0 · 5705 in / 1312 out tokens · 42674 ms · 2026-05-18T14:59:12.723897+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

  • IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

    Relation between the paper passage and the cited Recognition theorem.

    the comparator loss ... $J = \max\big((f_\theta(a) - f_\theta(b)) + \epsilon,\ 0\big)$ ... ensures scores follow an ordering relation, which can be based on diagnosis, clinically annotated scores, or simply the chronological order of the recordings

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

28 extracted references · 28 canonical work pages · 1 internal anchor

  1. [1]

    INTRODUCTION Neurodegenerative disorders are becoming more prevalent globally, posing a significant health challenge. They include progressive and fatal diagnoses affecting cognition such as Alzheimer’s disease, and neuromuscular disorders such as motor neuron disease (MND), the most common subtype of which is amyotrophic lateral sclerosis (ALS). There ...

  2. [2]

    Category learning Categorical losses, such as cross-entropy, are well-established in the field of machine learning and commonly used in medical tasks

    BACKGROUND 2.1. Category learning Categorical losses, such as cross-entropy, are well-established in the field of machine learning and commonly used in medical tasks. These make no assumption on the ordering of classes, meaning there is no inductive bias encoded in the model that categories are sequential. However, cross-entropy has also been applied to o...

  3. [3]

    Their proposed model is trained to learn a low-dimensional manifold from images

    where authors introduce a contrastive loss for dimensionality reduction. Their proposed model is trained to learn a low-dimensional manifold from images. Pairs of images that are of similar categories are mapped to nearby points in that manifold. To prevent the representation space collapsing, a contrastive term is introduced, forcing image pairs from...

  4. [4]

    Model Fig

    METHOD 3.1. Model Fig. 1 illustrates the general structure of the proposed comparator model. The model is composed of an encoder that outputs an embedding given a speech waveform and a scoring network that takes this embedding and converts it into a scalar value $f_\theta(\cdot)$. We refer to this value as the ‘score’ as it can be used as a proxy for a severity s...

  5. [5]

    Data We used the Anne Rowling Neurological Speech Corpus [20] for training and evaluation of the proposed system

    EVALUATION 4.1. Data We used the Anne Rowling Neurological Speech Corpus [20] for training and evaluation of the proposed system. The corpus contains speech recordings of 780 individuals, grouped according to their documented diagnosis. We used data from the healthy control group (HC) of 170 individuals and the motor neuron disease (MND) group of 103....

  6. [6]

    While it is not yet possible to use the scores analysed here to confirm progression of disease using longitudinal data, the system compares favourably with strong baselines

    CONCLUSIONS A new loss has been proposed; the predictions of models trained with it correlate significantly with clinical judgments. While it is not yet possible to use the scores analysed here to confirm progression of disease using longitudinal data, the system compares favourably with strong baselines. As the design of the loss enables the exploitati...

  7. [7]

    Alize J Ferrari et al., “Global incidence, prevalence, years lived with disability, disability-adjusted life-years, and healthy life expectancy for 371 diseases and injuries in 204 countries and territories and 811 subnational locations, 1990–2021: a systematic analysis for the Global Burden of Disease Study 2021,” The Lancet, vol. 403, no. 10440, pp. 2...

  8. [8]

    A systematic review and narrative analysis of digital speech biomarkers in motor neuron disease,

    Molly Bowden et al., “A systematic review and narrative analysis of digital speech biomarkers in motor neuron disease,” NPJ Digital Medicine, vol. 6, no. 1, pp. 228, 2023

  9. [9]

    Artificial intelligence, speech, and language processing approaches to monitoring Alzheimer’s disease: a systematic review,

    Sofia De la Fuente Garcia, Craig W Ritchie, and Saturnino Luz, “Artificial intelligence, speech, and language processing approaches to monitoring Alzheimer’s disease: a systematic review,” Journal of Alzheimer’s Disease, vol. 78, no. 4, pp. 1547–1574, 2020

  10. [10]

    Alzheimer’s dementia recognition through spontaneous speech: The ADReSS Challenge,

    Saturnino Luz et al., “Alzheimer’s dementia recognition through spontaneous speech: The ADReSS Challenge,” in Proc. Interspeech, Shanghai, China, 2020

  11. [11]

    The ALSFRS-R: a revised ALS functional rating scale that incorporates assessments of respiratory function,

    Jesse M Cedarbaum et al., “The ALSFRS-R: a revised ALS functional rating scale that incorporates assessments of respiratory function,” Journal of the Neurological Sciences, vol. 169, no. 1-2, pp. 13–21, 1999

  12. [12]

    ALS-specific cognitive and behavior changes associated with advancing disease stage in ALS,

    Christopher Crockford et al., “ALS-specific cognitive and behavior changes associated with advancing disease stage in ALS,” Neurology, vol. 91, no. 15, pp. e1370–e1380, 2018

  13. [13]

    A brief cognitive test battery to differentiate Alzheimer’s disease and frontotemporal dementia,

    Pavagada S Mathuranath et al., “A brief cognitive test battery to differentiate Alzheimer’s disease and frontotemporal dementia,” Neurology, vol. 55, no. 11, pp. 1613–1620, 2000

  14. [14]

    ALSFRS and Appel ALS scores: discordance with disease progression,

    Andrei Voustianiouk et al., “ALSFRS and Appel ALS scores: discordance with disease progression,” Muscle & Nerve: Official Journal of the American Association of Electrodiagnostic Medicine, vol. 37, no. 5, pp. 668–672, 2008

  15. [15]

    Using the ALSFRS-R in multicentre clinical trials for amyotrophic lateral sclerosis: potential limitations in current standard operating procedures,

    Jaap NE Bakers et al., “Using the ALSFRS-R in multicentre clinical trials for amyotrophic lateral sclerosis: potential limitations in current standard operating procedures,” Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration, vol. 23, no. 7-8, pp. 500–507, 2022

  16. [16]

    A further Rasch study confirms that ALSFRS-R does not conform to fundamental measurement requirements,

    Franco Franchignoni, Jessica Mandrioli, Andrea Giordano, Salvatore Ferro, and ERRALS Group, “A further Rasch study confirms that ALSFRS-R does not conform to fundamental measurement requirements,” Amyotrophic Lateral Sclerosis and Frontotemporal Degeneration, vol. 16, no. 5-6, pp. 331–337, 2015

  17. [17]

    Longitudinal modeling of depression shifts using speech and language,

    Paula Andrea Pérez-Toro et al., “Longitudinal modeling of depression shifts using speech and language,” in Proc. ICASSP, 2024, pp. 12021–12025

  18. [18]

    The PHQ-8 as a measure of current depression in the general population,

    Kurt Kroenke, Tara W. Strine, Robert L. Spitzer, Janet B.W. Williams, Joyce T. Berry, and Ali H. Mokdad, “The PHQ-8 as a measure of current depression in the general population,” Journal of Affective Disorders, vol. 114, no. 1, pp. 163–173, 2009

  19. [19]

    Efficient neural audio synthesis,

    Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, and Koray Kavukcuoglu, “Efficient neural audio synthesis,” in Proc. ICML, 2018, vol. 80, pp. 2410–2419

  20. [20]

    A neural network approach to ordinal regression,

    Jianlin Cheng, Zheng Wang, and Gianluca Pollastri, “A neural network approach to ordinal regression,” in Proc. IJCNN, 2008, pp. 1279–1284

  21. [21]

    Ensemble of deep neural network models for MOS prediction,

    Marie Kunešová, Jindřich Matoušek, Jan Lehečka, Jan Švec, Josef Michálek, Daniel Tihelka, Martin Bulín, Zdeněk Hanzlíček, and Markéta Řezáčková, “Ensemble of deep neural network models for MOS prediction,” in Proc. ICASSP, 2023, pp. 1–5

  22. [22]

    Contrastive representation learning,

    Lilian Weng, “Contrastive representation learning,” lilianweng.github.io, May 2021

  23. [23]

    Dimensionality reduction by learning an invariant mapping,

    R. Hadsell, S. Chopra, and Y. LeCun, “Dimensionality reduction by learning an invariant mapping,” in Proc. CVPR, 2006, vol. 2, pp. 1735–1742

  24. [24]

    FaceNet: A unified embedding for face recognition and clustering,

    Florian Schroff, Dmitry Kalenichenko, and James Philbin, “FaceNet: A unified embedding for face recognition and clustering,” in Proc. CVPR, June 2015

  25. [25]

    Deep metric learning via lifted structured feature embedding,

    Hyun Oh Song et al., “Deep metric learning via lifted structured feature embedding,” in Proc. CVPR, 2016

  26. [26]

    Anne Rowling Neurological Speech Corpus: clinically annotated longitudinal dataset for developing speech biomarkers in neurodegenerative disorders,

    Johnny Tam et al., “Anne Rowling Neurological Speech Corpus: clinically annotated longitudinal dataset for developing speech biomarkers in neurodegenerative disorders,” in Proc. Interspeech, 2025

  27. [27]

    TitaNet: Neural model for speaker representation with 1d depth-wise separable convolutions and global context,

    Nithin Rao Koluguri, Taejin Park, and Boris Ginsburg, “TitaNet: Neural model for speaker representation with 1d depth-wise separable convolutions and global context,” in Proc. ICASSP, 2022, pp. 8102–8106

  28. [28]

    ArcFace: Additive angular margin loss for deep face recognition,

    Jiankang Deng et al., “ArcFace: Additive angular margin loss for deep face recognition,” in Proc. CVPR, 2019, pp. 4685–4694