CoSTA: Cognitive-State-Conditioned TTS Data Augmentation Using ASR Transcripts for Alzheimer's Disease Detection

Jiahong Yuan; Jiaxin Chen; Liu He; Rui Feng; Shaobo Liu; Yiming Wang; Yin-Long Liu; Yuanchao Li; Yuang Chen; Yue Li

arxiv: 2606.06170 · v1 · pith:NEFYAOLGnew · submitted 2026-06-04 · 📡 eess.AS

Yin-Long Liu, Yuanchao Li, Yiming Wang, Yue Li, Rui Feng, Jiaxin Chen, Shaobo Liu, Liu He, Yuang Chen, Jiahong Yuan, Zhen-Hua Ling

classification 📡 eess.AS

keywords: augmentation, speech, costa, data, transcripts, adress, alzheimer, cognitive-state-conditioned

Speech-based Alzheimer's Disease (AD) detection is constrained by scarce pathological speech data. To address this, we propose CoSTA, a Text-to-Speech (TTS)-based data augmentation framework. Specifically, we first develop two Cognitive-State-Conditioned (CS-Cond) TTS models by adapting CosyVoice2 and F5-TTS to synthesize speech with distinct AD and Healthy Control characteristics. Furthermore, by constructing a transcript pool comprising Manual Transcripts (MT) and 36 Automatic Speech Recognition (ASR) transcripts, we investigate the impact of text sources on TTS-based augmentation. We also perform augmentation-factor analysis and test-time augmentation. Experiments on the ADReSS dataset show that CS-Cond TTS significantly improves synthetic speech utility, and ASR-driven augmentation frequently outperforms MT-driven augmentation. Finally, CoSTA yields a 4.16% gain over the baseline, achieving an audio-only accuracy of 85.83% on the ADReSS test set and outperforming prior methods.
