Segmenting Subtitles for Correcting ASR Segmentation Errors

Chris Kedzie; David Wan; Elena Zotkina; Elsbeth Turcan; Faisal Ladhak; Kathleen McKeown; Peter Bell; Petra Galuščáková; Zhengping Jiang

arxiv: 2104.07868 · v1 · pith:V3VEDUH2new · submitted 2021-04-16 · 💻 cs.CL

classification 💻 cs.CL

keywords acoustic, correcting, segmentation, downstream, information, model, performance, propose

Typical ASR systems segment the input audio into utterances using purely acoustic information, which may not resemble the sentence-like units that are expected by conventional machine translation (MT) systems for Spoken Language Translation. In this work, we propose a model for correcting the acoustic segmentation of ASR models for low-resource languages to improve performance on downstream tasks. We propose the use of subtitles as a proxy dataset for correcting ASR acoustic segmentation, creating synthetic acoustic utterances by modeling common error modes. We train a neural tagging model for correcting ASR acoustic segmentation and show that it improves downstream performance on MT and audio-document cross-language information retrieval (CLIR).
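
The idea of turning subtitles into synthetic acoustic utterances can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the error modes, so the two assumed here (dropping a true sentence boundary and inserting a spurious one) and all names and parameters (`flatten_with_labels`, `corrupt_boundaries`, `split_on_breaks`, `p_drop`, `p_insert`) are hypothetical. Each token carries a gold tag saying whether a sentence ends after it; a tagging model would be trained to recover those tags from the corrupted segmentation.

```python
import random
from typing import List, Tuple


def flatten_with_labels(sentences: List[List[str]]) -> Tuple[List[str], List[int]]:
    """Flatten subtitle sentences into one token stream.

    labels[i] is 1 if a sentence ends after tokens[i], else 0.
    """
    tokens: List[str] = []
    labels: List[int] = []
    for sent in sentences:
        for i, tok in enumerate(sent):
            tokens.append(tok)
            labels.append(1 if i == len(sent) - 1 else 0)
    return tokens, labels


def corrupt_boundaries(labels: List[int], rng: random.Random,
                       p_drop: float = 0.5, p_insert: float = 0.1) -> List[int]:
    """Simulate acoustic segmentation errors (assumed error modes).

    A true boundary is dropped with probability p_drop (under-segmentation);
    a spurious break is inserted at a non-boundary with probability p_insert
    (over-segmentation). The final token always closes an utterance.
    """
    noisy: List[int] = []
    for is_boundary in labels:
        if is_boundary:
            noisy.append(0 if rng.random() < p_drop else 1)
        else:
            noisy.append(1 if rng.random() < p_insert else 0)
    if noisy:
        noisy[-1] = 1
    return noisy


def split_on_breaks(tokens: List[str], breaks: List[int]) -> List[List[str]]:
    """Group tokens into utterances, closing one after every break."""
    utterances: List[List[str]] = []
    current: List[str] = []
    for tok, brk in zip(tokens, breaks):
        current.append(tok)
        if brk:
            utterances.append(current)
            current = []
    if current:
        utterances.append(current)
    return utterances
```

Under these assumptions, a training example pairs the synthetic utterances from `split_on_breaks(tokens, corrupt_boundaries(labels, rng))` with the gold `labels`, so the tagger sees acoustic-style input while learning sentence-like boundaries.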