Non-parallel voice conversion using variational autoencoders conditioned by phonetic posteriorgrams and d-vectors
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: eess.AS (1)
Years: 2019 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis
Two new embedding algorithms (similarity vector prediction and Frobenius-norm matrix matching) trained on subjective inter-speaker scores yield d-vectors more correlated with human similarity judgments and improve TTS quality for unseen speakers.