Mispronunciation Detection and Diagnosis Without Model Training: A Retrieval-Based Approach
Mispronunciation Detection and Diagnosis (MDD) is crucial for language learning and speech therapy. Unlike conventional methods that require scoring models or training phoneme-level models, we propose a novel training-free framework that leverages retrieval techniques with a pretrained Automatic Speech Recognition model. Our method avoids phoneme-specific modeling or additional task-specific training, while still achieving accurate detection and diagnosis of pronunciation errors. Experiments on the L2-ARCTIC dataset show that our method achieves a superior F1 score of 69.60% while avoiding the complexity of model training.
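For readers unfamiliar with the metric, the F1 score reported above is the harmonic mean of precision and recall. The sketch below assumes the convention common in MDD work: a true rejection is a mispronounced phone correctly flagged, a false rejection is a correctly pronounced phone wrongly flagged, and a false acceptance is a mispronounced phone that goes undetected. The abstract does not state the exact definition used, and the function name and the counts are hypothetical, chosen only for illustration.

```python
def mdd_f1(true_rej: int, false_rej: int, false_acc: int) -> float:
    """F1 for mispronunciation detection.

    Assumed convention (not taken from the paper):
      true_rej  - mispronounced phones correctly flagged
      false_rej - correctly pronounced phones wrongly flagged
      false_acc - mispronounced phones that were missed
    """
    if true_rej == 0:
        # No correct detections: precision and recall are both zero.
        return 0.0
    precision = true_rej / (true_rej + false_rej)
    recall = true_rej / (true_rej + false_acc)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts, not results from L2-ARCTIC.
    print(f"{mdd_f1(60, 20, 30):.4f}")
```

Because F1 is a harmonic mean, it stays low unless both precision and recall are reasonably high, which is why it is often preferred over accuracy when mispronounced phones are a small share of the data.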
Forward citations
Cited by 1 Pith paper
- Beyond Acoustic Sparsity and Linguistic Bias: A Prompt-Free Paradigm for Mispronunciation Detection and Diagnosis
CROTTC-IF is a prompt-free MDD system with monotonic frame-level alignment and implicit knowledge transfer that reaches 71.77% F1 on L2-ARCTIC and 71.70% on Iqra'Eval2.