Koel-TTS: Enhancing LLM based Speech Generation with Preference Alignment and Classifier Free Guidance

· 2025 · arXiv 2502.05236

3 Pith papers cite this work.

representative citing papers

Reliable Neural-Codec Text-to-Speech by ASR Self-Verification and Distillation: Near-Zero Catastrophic Failures Across Models and Codecs

cs.SD · 2026-06-16 · unverdicted · novelty 6.0

ASR self-verification via best-of-N sampling eliminates observed catastrophic failures in multiple neural-codec TTS models, with distillation transferring most of the robustness to single-shot decoding.

DDPO-VC: Speaker De-Identification via Diffusion Denoising Policy Optimization

eess.AS · 2026-06-13 · unverdicted · novelty 6.0

DDPO-VC applies diffusion denoising policy optimization with dual-teacher rewards to improve speaker de-identification while preserving cognitive utility on dementia speech benchmarks.

Cross-modal Consistency Guidance for Robust Emotion Control in Auto-Regressive TTS Models

cs.CL · 2025-10-15 · 2 refs
