Sink value vectors in Omni-LLMs act as a shared bias organizing token representations, and aligning non-sink tokens to them via OutRo improves performance on video QA benchmarks.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Pith papers citing it: 1
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- On the Nature of Attention Sink that Shapes Decoding Strategy in Omni-LLMs