An algorithm for intelligibility prediction of time-frequency weighted noisy speech
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: eess.AS (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Word-level modeling with alignment-aware acoustic fusion on a frozen Whisper model improves text-assisted intelligibility prediction metrics on the CPC3 evaluation set.
Citing papers explorer:
- Frame-Aligned Fusion of Canary and WavLM for Non-Intrusive Intelligibility Prediction of Hearing-Aid-Processed Speech
  Frame-aligned fusion of Canary and WavLM encoders, with WavLM temporally prepared via learnable strided convolution, outperforms other fusion strategies and reaches Eval RMSE 24.96 and Corr 0.796 on non-intrusive intelligibility prediction.
- Word-Level Modeling with Alignment-Aware Acoustic Fusion for Text-Assisted Intelligibility Prediction in Listeners with Hearing Loss
  Word-level modeling with alignment-aware acoustic fusion on a frozen Whisper model improves text-assisted intelligibility prediction metrics on the CPC3 evaluation set.