Durational variability in speech and the rhythm class hypothesis
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
PSP: An Interpretable Per-Dimension Accent Benchmark for Indic Text-to-Speech
PSP decomposes TTS accent into retroflex collapse rate, aspiration fidelity, vowel-length fidelity, Tamil-zha fidelity, FAD, and prosodic signature divergence, revealing that commercial systems differ in accent fidelity in ways that WER scores do not capture.