Seed-TTS models produce speech matching human naturalness and speaker similarity, with added controllability via self-distillation and reinforcement learning.
Streaming voice conversion via intermediate bottleneck features and non-streaming teacher guidance
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: eess.AS (1)
Years: 2024 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Seed-TTS: A Family of High-Quality Versatile Speech Generation Models