Audio language models are benchmarked on five semantic and paralinguistic reasoning tasks to reveal limitations in handling spoken audio evidence, accent variation, and domain shifts.
2 Pith papers cite this work.