The LJ Speech Dataset
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 2
representative citing papers
The paper creates ADD-C benchmark dataset for audio deepfake detection under codec compression and packet loss, shows baseline degradation, and demonstrates a data augmentation method that boosts robustness.
citing papers explorer
- Few-Shot Accent Synthesis for ASR with LLM-Guided Phoneme Editing
  Few-shot TTS adaptation combined with LLM-guided phoneme editing produces synthetic accented speech that improves ASR word error rates on real accented audio, even in cross-speaker and ultra-low-data settings.
- Benchmarking Audio Deepfake Detection Robustness in Real-world Communication Scenarios
  The paper creates the ADD-C benchmark dataset for audio deepfake detection under codec compression and packet loss, shows that baseline detectors degrade under these conditions, and demonstrates a data augmentation method that boosts robustness.