CTC-seeded variable-length edit refinement with a diffusion-based Edit Flow decoder achieves WER reductions in non-autoregressive ASR using only two inference steps plus classifier-free guidance.
Less is more: Accurate speech recognition & translation without web-scale data
3 Pith papers cite this work.
years
2026: 3
representative citing papers
BlasBench supplies an Irish-aware normalizer and scoring harness that enables reproducible ASR comparisons and exposes a 33-43 point generalization gap for fine-tuned models versus 7-10 points for massively multilingual ones.
Frame-aligned fusion of Canary and WavLM encoders, with WavLM temporally prepared via learnable strided convolution, outperforms other fusion strategies and reaches Eval RMSE 24.96 and Corr 0.796 on non-intrusive intelligibility prediction.
citing papers explorer
- BlasBench: An Open Benchmark for Irish Speech Recognition
BlasBench supplies an Irish-aware normalizer and scoring harness that enables reproducible ASR comparisons and exposes a 33-43 point generalization gap for fine-tuned models versus 7-10 points for massively multilingual ones.