Something from Nothing: Data Augmentation for Robust Severity Level Estimation of Dysarthric Speech
3 Pith papers cite this work. Polarity classification is still indexing.
abstract
Dysarthric speech quality assessment (DSQA) is critical for clinical diagnostics and inclusive speech technologies. However, subjective evaluation is costly and difficult to scale, and the scarcity of labeled data limits robust objective modeling. To address this, we propose a three-stage framework that leverages unlabeled dysarthric speech and large-scale typical speech datasets to scale training. A teacher model first generates pseudo-labels for unlabeled samples, followed by weakly supervised pretraining using a label-aware contrastive learning strategy that exposes the model to diverse speakers and acoustic conditions. The pretrained model is then fine-tuned for the downstream DSQA task. Experiments on five unseen datasets spanning multiple etiologies and languages demonstrate the robustness of our approach. Our Whisper-based baseline significantly outperforms SOTA DSQA predictors such as SpICE, and the full framework achieves an average SRCC of 0.761 across unseen test datasets.
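The abstract reports results as SRCC (Spearman rank correlation coefficient) between predicted and reference severity scores. As a minimal sketch of how that metric is generally computed, not the authors' evaluation code, the following ranks both score lists (averaging ranks for ties) and takes the Pearson correlation of the ranks. The function names `average_ranks` and `srcc` and the example scores are illustrative assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def srcc(predicted, reference):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length lists with at least 2 items")
    rp, rr = average_ranks(predicted), average_ranks(reference)
    mp, mr = sum(rp) / len(rp), sum(rr) / len(rr)
    cov = sum((a - mp) * (b - mr) for a, b in zip(rp, rr))
    var_p = sum((a - mp) ** 2 for a in rp)
    var_r = sum((b - mr) ** 2 for b in rr)
    if var_p == 0 or var_r == 0:
        raise ValueError("SRCC is undefined when one input is constant")
    return cov / sqrt(var_p * var_r)


# Hypothetical model outputs versus hypothetical clinical severity ratings.
model_scores = [0.1, 0.4, 0.35, 0.8]
clinical_ratings = [1, 2, 3, 4]
print(srcc(model_scores, clinical_ratings))
```

Because SRCC depends only on rank order, a predictor can score well even when its output scale differs from the clinical rating scale, which suits comparisons across datasets with different rating protocols.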
citation-role summary
citation-polarity summary
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
representative citing papers
A three-stage pseudo-labeling and contrastive learning framework achieves average SRCC of 0.761 on five unseen dysarthric speech datasets for robust severity estimation.
Phonological subspace collapse in SSL speech representations produces aetiology-specific degradation profiles that remain stable in shape across languages and model architectures.
citing papers explorer
- Training-Free Cross-Lingual Dysarthria Severity Assessment via Phonological Subspace Analysis in Self-Supervised Speech Representations
  A training-free method quantifies dysarthria severity via d-prime scores on phonological contrasts in HuBERT embeddings, correlating with clinical ratings across 5 languages and multiple conditions.
- Something from Nothing: Data Augmentation for Robust Severity Level Estimation of Dysarthric Speech
  A three-stage pseudo-labeling and contrastive learning framework achieves average SRCC of 0.761 on five unseen dysarthric speech datasets for robust severity estimation.
- Phonological Subspace Collapse Is Aetiology-Specific and Cross-Lingually Stable: Evidence from 3,374 Speakers
  Phonological subspace collapse in SSL speech representations produces aetiology-specific degradation profiles that remain stable in shape across languages and model architectures.
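The first citing paper listed above describes d-prime scores on phonological contrasts in embeddings. This page does not give that paper's exact procedure, so the following is only a sketch of the standard d-prime separation measure under stated assumptions: embeddings for two phoneme classes are projected onto the axis joining their class means (an assumed choice of axis), and d-prime is the absolute mean difference divided by the root of the averaged sample variances. The names `project_on_contrast_axis` and `d_prime` and the toy vectors are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def project_on_contrast_axis(class_a, class_b):
    """Project both classes onto the difference of their mean vectors.

    Assumption: the contrast axis is mean(class_a) - mean(class_b).
    """
    dim = len(class_a[0])
    mean_a = [mean(v[d] for v in class_a) for d in range(dim)]
    mean_b = [mean(v[d] for v in class_b) for d in range(dim)]
    axis = [a - b for a, b in zip(mean_a, mean_b)]

    def project(vec):
        return sum(x * w for x, w in zip(vec, axis))

    return [project(v) for v in class_a], [project(v) for v in class_b]


def d_prime(scores_a, scores_b):
    """Absolute mean difference over the root of the averaged sample variances."""
    pooled_sd = sqrt((variance(scores_a) + variance(scores_b)) / 2)
    if pooled_sd == 0:
        raise ValueError("d-prime is undefined when both classes have zero variance")
    return abs(mean(scores_a) - mean(scores_b)) / pooled_sd


# Toy 2-D "embeddings" standing in for two phoneme classes.
voiced = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
voiceless = [[4.0, 0.0], [5.0, 0.0], [6.0, 0.0]]
proj_a, proj_b = project_on_contrast_axis(voiced, voiceless)
print(d_prime(proj_a, proj_b))
```

Under this reading, a lower d-prime for a contrast means the two phoneme classes overlap more in the embedding space, which is the kind of signal a severity measure of this type would rely on.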