Comparative study finds that tokenization choices and SSL pretraining models produce distinct effects on French ASR when assessed with linguistic and acoustic metrics beyond CER and WER.
CamemBERT: a Tasty French Language Model,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
A Comprehensive Analysis of Tokenization and Self-Supervised Learning in End-to-End Automatic Speech Recognition applied on French Language
Comparative study finds that tokenization choices and SSL pretraining models produce distinct effects on French ASR when assessed with linguistic and acoustic metrics beyond CER and WER.