YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.SD 2
verdicts
UNVERDICTED 2
representative citing papers
MLAAD provides a large-scale multi-language synthetic audio dataset for training and evaluating audio anti-spoofing models, showing better training performance than InTheWild and FakeOrReal and alternating superiority with ASVspoof 2019 across eight test sets.
citing papers explorer
- Tibetan-TTS: Low-Resource Tibetan Speech Synthesis with Large Model Adaptation
Large-model adaptation with Tibetan text handling produces natural speech from limited data, outperforming commercial systems.
- MLAAD: The Multi-Language Audio Anti-Spoofing Dataset
MLAAD provides a large-scale multi-language synthetic audio dataset for training and evaluating audio anti-spoofing models, showing better training performance than InTheWild and FakeOrReal and alternating superiority with ASVspoof 2019 across eight test sets.