Mispronunciation Detection and Diagnosis Without Model Training: A Retrieval-Based Approach

Ha Viet Khanh; Huu Tuong Tu; Nguyen Thi Thu Trang; Nguyen Tien Cuong; Thien Van Luong; Tran Tien Dat; Vu Huan

arxiv: 2511.20107 · v1 · pith:JJRL4YXHnew · submitted 2025-11-25 · cs.CL · cs.SD · eess.AS

Huu Tuong Tu, Ha Viet Khanh, Tran Tien Dat, Vu Huan, Thien Van Luong, Nguyen Tien Cuong, Nguyen Thi Thu Trang

keywords: training, detection, diagnosis, model, method, mispronunciation, models, speech

Mispronunciation Detection and Diagnosis (MDD) is crucial for language learning and speech therapy. Unlike conventional methods that require scoring models or training phoneme-level models, we propose a novel training-free framework that leverages retrieval techniques with a pretrained Automatic Speech Recognition model. Our method avoids phoneme-specific modeling or additional task-specific training, while still achieving accurate detection and diagnosis of pronunciation errors. Experiments on the L2-ARCTIC dataset show that our method achieves a superior F1 score of 69.60% while avoiding the complexity of model training.
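
The abstract does not spell out the retrieval step, so the following is only a minimal sketch of what retrieval-based phoneme diagnosis could look like in general, not the authors' implementation. It assumes, purely for illustration, that each spoken segment has already been reduced to one embedding (in practice such features might come from a pretrained ASR encoder), that segments are aligned one-to-one with the canonical phonemes, and that a small labeled bank of reference embeddings exists. The names `BANK`, `retrieve_phoneme`, and `diagnose`, the 2-D toy vectors, and the k-nearest-neighbour majority vote are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_phoneme(query, bank, k=3):
    """Return the majority phoneme label among the k bank entries
    most similar to the query embedding (no training involved)."""
    ranked = sorted(bank, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def diagnose(canonical, segments, bank, k=3):
    """Compare the retrieved phoneme for each segment with the expected one.

    Returns (expected, retrieved, is_mispronounced) per segment.
    Assumes one embedding per canonical phoneme (hypothetical alignment).
    """
    results = []
    for expected, embedding in zip(canonical, segments):
        heard = retrieve_phoneme(embedding, bank, k)
        results.append((expected, heard, heard != expected))
    return results


# Hypothetical reference bank: toy 2-D embeddings labeled with phonemes.
BANK = [
    ([1.0, 0.0], "s"), ([0.9, 0.1], "s"), ([0.95, 0.05], "s"),
    ([0.0, 1.0], "z"), ([0.1, 0.9], "z"), ([0.05, 0.95], "z"),
]

CANONICAL = ["s", "z"]                    # what the learner should say
SEGMENTS = [[0.98, 0.02], [0.97, 0.10]]   # second segment resembles "s"

REPORT = diagnose(CANONICAL, SEGMENTS, BANK)
for expected, heard, wrong in REPORT:
    status = "mispronounced" if wrong else "ok"
    print(f"expected /{expected}/, retrieved /{heard}/: {status}")
```

In this toy run the second segment is flagged because its nearest reference embeddings are all labeled "s" while the canonical phoneme is "z". Detection is the mismatch flag, and diagnosis is the retrieved label itself.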

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Beyond Acoustic Sparsity and Linguistic Bias: A Prompt-Free Paradigm for Mispronunciation Detection and Diagnosis
eess.AS · 2026-04 · unverdicted · novelty 6.0

CROTTC-IF is a prompt-free MDD system with monotonic frame-level alignment and implicit knowledge transfer that reaches 71.77% F1 on L2-ARCTIC and 71.70% on Iqra'Eval2.