Praxy Voice: Voice-Prompt Recovery + BUPS for Commercial-Class Indic TTS from a Frozen Non-Indic Base at Zero Commercial-Training-Data Cost
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SD (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
PSP: An Interpretable Per-Dimension Accent Benchmark for Indic Text-to-Speech
PSP decomposes TTS accent into retroflex collapse rate, aspiration fidelity, vowel-length fidelity, Tamil-zha fidelity, FAD, and prosodic signature divergence, revealing that commercial systems differ in accent fidelity in ways that WER scores do not capture.