Fine-grained Preference Optimization Improves Zero-shot Text-to-Speech
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- Post-Training Speech Enhancement Language Models with Perceptual Rewards
Post-training autoregressive speech enhancement LMs via GSPO with composite perceptual rewards from DNSMOS, WER, and UTMOS reaches SOTA on DNS2020 and outperforms single-metric variants in human evaluation.