KNN retrieval over WavLM representations creates synthetic source-target pairs from non-parallel data for supervised voice conversion training with a speaker loss, achieving strong results on multilingual test sets despite English-only training.
We introduce an end-to-end training framework for any-to-any zero-shot voice conversion under a non-parallel data setting.
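The retrieval step in the summary above can be illustrated with a minimal sketch. It is one plausible reading, not the authors' implementation: each source frame is replaced by the mean of its k nearest frames from a target speaker's pool, and the result is paired with the original frame as a synthetic training example. The function names (`cosine_similarity`, `knn_convert`), the use of cosine similarity, the value of `k`, and the 2-dimensional toy vectors standing in for WavLM features are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_convert(source_frames, target_pool, k=4):
    """Replace each source frame with the mean of its k most similar target frames.

    Hypothetical helper: the metric and k are assumptions, not taken from the paper.
    """
    if not target_pool:
        raise ValueError("target_pool must not be empty")
    k = min(k, len(target_pool))
    converted = []
    for frame in source_frames:
        # Rank every target frame by similarity to the current source frame.
        ranked = sorted(
            target_pool,
            key=lambda t: cosine_similarity(frame, t),
            reverse=True,
        )
        nearest = ranked[:k]
        # Average the k nearest target frames dimension by dimension.
        converted.append(
            [sum(t[d] for t in nearest) / k for d in range(len(frame))]
        )
    return converted


# Toy 2-D vectors standing in for frame-level features (assumption).
source = [[1.0, 0.0], [0.0, 1.0]]
pool = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

pseudo_target = knn_convert(source, pool, k=2)

# Each synthetic pair couples an original frame with its retrieved counterpart.
training_pairs = list(zip(source, pseudo_target))
```

In a real pipeline the frames would come from a speech encoder and the search would use a vectorized or approximate nearest-neighbour index, since the brute-force sort here costs O(n log n) per source frame.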
Cited by 1 Pith paper.
Fields: cs.SD (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
From A to B to A: Palindromic Zero-Shot Voice Conversion with Non-Parallel Data