LambdaMark is the first generic radioactive audio watermark that injects multi-bit messages into semantic latent representations, achieving robustness to distortions and removal attacks even after downstream model finetuning.
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
representative citing papers
Large-model adaptation with Tibetan text handling produces natural speech from limited data, outperforming commercial systems.
MLAAD provides a large-scale multi-language synthetic audio dataset for training and evaluating audio anti-spoofing models, showing better training performance than InTheWild and FakeOrReal and alternating superiority with ASVspoof 2019 across eight test sets.
Proxy-Anchor metric learning on Wav2Vec2-BERT embeddings with architecture merging achieves 99.76% closed-set accuracy and 2.04% FPR@95 OOD detection on MLAAD v9, doubling prior OOD accuracy on v5 splits.
citing papers explorer
- MLAAD: The Multi-Language Audio Anti-Spoofing Dataset