pith. machine review for the scientific record.

arxiv: 1806.02786 · v1 · submitted 2018-06-07 · 💻 cs.CL

Recognition: unknown

Domain Adversarial Training for Accented Speech Recognition

classification 💻 cs.CL
keywords accented, domain, data, speech, training, adversarial, recognition, accent

In this paper, we propose a domain adversarial training (DAT) algorithm to alleviate the accented speech recognition problem. In order to reduce the mismatch between labeled source domain data ("standard" accent) and unlabeled target domain data (with heavy accents), we augment the learning objective for a Kaldi TDNN network with a domain adversarial training (DAT) objective to encourage the model to learn accent-invariant features. In experiments with three Mandarin accents, we show that DAT yields up to 7.45% relative character error rate reduction when we do not have transcriptions of the accented speech, compared with the baseline trained on standard accent data only. We also find a benefit from DAT when used in combination with training from automatic transcriptions on the accented data. Furthermore, we find that DAT is superior to multi-task learning for accented speech recognition.
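A common way to realize a domain adversarial objective is gradient reversal: a domain classifier is trained to tell source from target data, while the shared feature extractor receives the negated, scaled domain gradient so that its features become harder to classify by domain. The abstract does not say which formulation the authors use, so the following is only a minimal sketch on a toy scalar model, not the paper's Kaldi TDNN setup. The names `dat_gradients`, `w`, `a`, `b`, and `lam` are hypothetical, and the squared-error label loss stands in for the real ASR loss.

```python
import math


def sigmoid(z):
    """Logistic function used by the toy domain classifier."""
    return 1.0 / (1.0 + math.exp(-z))


def dat_gradients(w, a, b, x, d, y=None, lam=0.5):
    """Gradients for one sample under a toy gradient-reversal DAT objective.

    Assumed toy model (illustrative only):
      feature          f = w * x
      label predictor  y_hat = a * f, loss 0.5 * (y_hat - y)^2
      domain predictor p = sigmoid(b * f), binary cross-entropy with label d
    d is 0 for source ("standard" accent) and 1 for target (accented).
    y is None for unlabeled target samples, which then contribute only
    through the domain loss.
    """
    f = w * x  # shared feature seen by both heads

    # Domain head: ordinary gradient for its own parameter b.
    p = sigmoid(b * f)
    dz = p - d                 # d(BCE)/d(logit)
    grad_b = dz * f
    dLd_dw = dz * b * x        # domain-loss gradient reaching the extractor

    # Label head: only available when a transcription (label) exists.
    if y is None:
        grad_a = 0.0
        dLy_dw = 0.0
    else:
        err = a * f - y
        grad_a = err * f
        dLy_dw = err * a * x

    # Gradient reversal: the extractor descends the label loss but
    # ascends the domain loss, scaled by lam.
    grad_w = dLy_dw - lam * dLd_dw
    return {"w": grad_w, "a": grad_a, "b": grad_b}


# Labeled source sample and unlabeled target sample.
source = dat_gradients(w=1.0, a=1.0, b=1.0, x=1.0, d=0, y=2.0)
target = dat_gradients(w=1.0, a=1.0, b=1.0, x=1.0, d=1)
```

Setting `lam=0.0` recovers plain supervised training on the source data. Flipping the sign on the domain term (adding it instead of subtracting) would make the extractor help the domain classifier, which is closer to the multi-task learning setup that the abstract compares against.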

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Few-Shot Accent Synthesis for ASR with LLM-Guided Phoneme Editing

    cs.SD · 2026-04 · unverdicted · novelty 5.0

    Few-shot TTS adaptation combined with LLM-guided phoneme editing produces synthetic accented speech that improves ASR word error rates on real accented audio even in cross-speaker and ultra-low-data settings.