HuBERT: Self-supervised speech representation learning by masked prediction of hidden units. Transactions on Audio, Speech, and Language Processing, 29:3451–3460.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: eess.AS (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
OneVoice: One Model, Triple Scenarios-Towards Unified Zero-shot Voice Conversion
OneVoice unifies zero-shot voice conversion for speech, expressive, and singing scenarios in one model via MoE routing and progressive training, matching specialized performance.