Fine-tuning Whisper on Swiss German speech with subtitle supervision yields an honest 25.6% WER baseline (13.8% cWER) and demonstrates that prior SOTA claims of 17% WER result from benchmark contamination allowing 13.88% WER with no dialect training.
Fine-tuning whisper on low-resource languages for real- world applications
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
method 1
citation-polarity summary
fields
cs.CL 2years
2026 2roles
method 1polarities
use method 1representative citing papers
WARDEN achieves better transcription and translation for Wardaman than larger models by separating the tasks and using Sundanese initialization plus a domain dictionary with just 6 hours of data.
citing papers explorer
-
WARDEN: Endangered Indigenous Language Transcription and Translation with 6 Hours of Training Data
WARDEN achieves better transcription and translation for Wardaman than larger models by separating the tasks and using Sundanese initialization plus a domain dictionary with just 6 hours of data.