SQuTR aggregates 37k queries from six text retrieval datasets, synthesizes speech from 200 speakers, adds 17 noise categories at varying SNR, and shows that even large retrieval models degrade sharply under extreme acoustic noise.
2 Pith papers cite this work.
citing papers explorer
-
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise
SQuTR aggregates 37k queries from six text retrieval datasets, synthesizes speech from 200 speakers, adds 17 noise categories at varying SNR, and shows that even large retrieval models degrade sharply under extreme acoustic noise.
-
Towards Holistic Evaluation of Large Audio-Language Models: A Comprehensive Survey
The survey introduces a four-category taxonomy for LALM evaluations and reviews benchmarks across general auditory processing, knowledge reasoning, dialogue, and fairness-safety.