Not in sync: Unveiling temporal bias in audio chat models
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
Fields: cs.SD (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
Introduces a benchmark for mechanistic analysis of temporal failures in LALMs and shows attention scaling at bottleneck layers improves accuracy from 55.9% to 59.1%.
citing papers explorer
- Steering Where to Listen: Instruction-Based Activation Steering Redirects Temporal Attention in Large Audio-Language Models
  Instruction-based vector steering redirects temporal attention in LALMs to acoustically relevant regions, recovering queried sound event locations with 60.87-68.72% overlap accuracy without training.
- A Closer Look at Failure Modes in Temporal Understanding of Large Audio-Language Models
  Introduces a benchmark for mechanistic analysis of temporal failures in LALMs and shows attention scaling at bottleneck layers improves accuracy from 55.9% to 59.1%.