Contribution Weights combine attention, value magnitude, and directional alignment to measure token influence more faithfully than attention alone, and show attention sinks actively suppress information via a convex sink-rate to output-norm relationship.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
fields
cs.LG 2verdicts
UNVERDICTED 2roles
background 1polarities
background 1representative citing papers
The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.
citing papers explorer
-
Contribution Weights: A Geometrical Analysis of Self-Attention Transformers
Contribution Weights combine attention, value magnitude, and directional alignment to measure token influence more faithfully than attention alone, and show attention sinks actively suppress information via a convex sink-rate to output-norm relationship.
-
A Survey of Mamba
The paper consolidates existing research on Mamba models, their architecture variants, adaptations to different data modalities, and applications across domains.