arXiv preprint arXiv:2410.23079

Buzz: Beehive-structured sparse KV cache with segmented heavy hitters for efficient LLM inference · 2024 · arXiv 2410.23079

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

cs.LG · 2026-04-11 · unverdicted · novelty 7.0

The first survey on Attention Sink in Transformers structures the literature around fundamental utilization, mechanistic interpretation, and strategic mitigation.

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

cs.CL · 2025-02-16 · unverdicted · novelty 6.0

NSA is a hardware-aligned sparse attention mechanism that enables end-to-end trainable long-context modeling by combining coarse token compression with fine-grained selection.

citing papers explorer

Showing 2 of 2 citing papers.

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

cs.LG · 2026-04-11 · unverdicted · none · ref 143

The first survey on Attention Sink in Transformers structures the literature around fundamental utilization, mechanistic interpretation, and strategic mitigation.

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

cs.CL · 2025-02-16 · unverdicted · none · ref 53

NSA is a hardware-aligned sparse attention mechanism that enables end-to-end trainable long-context modeling by combining coarse token compression with fine-grained selection.
