GLASS enables composable acoustic style control in zero-shot TTS by training independent GRPO-optimized LoRA adapters on style rewards; the resulting adapters can be linearly combined.
arXiv preprint arXiv:2511.21270
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.SD (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
A two-stage post-training pipeline of SFT followed by editing-oriented GRPO on unpaired data improves speech editing consistency and zero-shot TTS quality.
Citing papers explorer
- GLASS: GRPO-Trained LoRA for Acoustic Style Steering in Zero-Shot Text-to-Speech
  GLASS enables composable acoustic style control in zero-shot TTS by training independent GRPO-optimized LoRA adapters on style rewards; the resulting adapters can be linearly combined.
- CosyEdit2: Speech-Editing-Oriented Reinforcement Learning Unlocks Better Zero-Shot TTS
  A two-stage post-training pipeline of SFT followed by editing-oriented GRPO on unpaired data improves speech editing consistency and zero-shot TTS quality.