The first survey on Attention Sink in Transformers structures the literature around fundamental utilization, mechanistic interpretation, and strategic mitigation.
arXiv preprint arXiv:2507.16018 (2025)
3 Pith papers cite this work. Polarity classification is still being indexed.
Citation roles: background (3)
Citation polarities: background (3)
Years: 2026 (3)
Representative citing papers:
- Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation
  The first survey on Attention Sink in Transformers structures the literature around fundamental utilization, mechanistic interpretation, and strategic mitigation.
- Sink-Token-Aware Pruning for Fine-Grained Video Understanding in Efficient Video LLMs
- When Sinks Help or Hurt: Unified Framework for Attention Sink in Large Vision-Language Models