Assessing True Generalisability of Audio-Visual Speech Recognisers

2026 · eess.AS · arXiv 2606.07259

abstract

Current Audio-Visual Speech Recognition (AVSR) models achieve near-perfect performance on the standard LRS3 benchmark, raising concerns of adaptive overfitting. To systematically assess true generalisability, we construct a highly controlled, unseen evaluation set subsampled from the massive MultiVSR dataset. Unlike standard out-of-distribution benchmarks, our subset strictly matches the acoustic, visual, and demographic distributions of the LRS3 test set. Evaluating five state-of-the-art architectures reveals a universal performance collapse, proving that current systems fail to generalise even under strictly aligned conditions. Through a fine-grained attribute analysis across seven factors, we isolate the specific drivers of this degradation. Furthermore, we uncover a profound lexical bias, expose distinct error patterns, and surprisingly reveal that audio-visual performance even lags behind audio-only settings. We release our matched test set for future benchmarking.

representative citing papers

Assessing True Generalisability of Audio-Visual Speech Recognisers

eess.AS · 2026-06-05 · unverdicted · novelty 6.0

State-of-the-art AVSR models show substantial performance degradation on a new distribution-matched test set from MultiVSR, indicating failure to generalise beyond the LRS3 benchmark.
