arXiv preprint arXiv:2511.21270

Multi-Reward GRPO for Stable, Prosodic Single-Codebook TTS LLMs at Scale · 2026 · arXiv 2511.21270

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

GLASS: GRPO-Trained LoRA for Acoustic Style Steering in Zero-Shot Text-to-Speech

cs.SD · 2026-06-04 · unverdicted · novelty 6.0

GLASS enables composable acoustic style control in zero-shot TTS by training independent GRPO-optimized LoRA adapters on style rewards that can be linearly combined.

CosyEdit2: Speech-Editing-Oriented Reinforcement Learning Unlocks Better Zero-Shot TTS

cs.SD · 2026-05-25 · unverdicted · novelty 6.0

A two-stage post-training pipeline of SFT followed by editing-oriented GRPO on unpaired data improves speech editing consistency and zero-shot TTS quality.

citing papers explorer

GLASS: GRPO-Trained LoRA for Acoustic Style Steering in Zero-Shot Text-to-Speech

cs.SD · 2026-06-04 · unverdicted · none · ref 19

GLASS enables composable acoustic style control in zero-shot TTS by training independent GRPO-optimized LoRA adapters on style rewards that can be linearly combined.

CosyEdit2: Speech-Editing-Oriented Reinforcement Learning Unlocks Better Zero-Shot TTS

cs.SD · 2026-05-25 · unverdicted · none · ref 4

A two-stage post-training pipeline of SFT followed by editing-oriented GRPO on unpaired data improves speech editing consistency and zero-shot TTS quality.
