Efficiently modeling long sequences with structured state spaces

Albert Gu, Karan Goel, Christopher Ré · 2022

2 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

KVBuffer: IO-aware Serving for Linear Attention

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

KVBuffer reduces linear attention decoding latency by up to 45% and increases speculative decoding throughput 5x by buffering keys/values for flexible chunked and parallel computation.

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

cs.AI · 2026-05-21 · unverdicted · novelty 6.0

Gated DeltaNet-2 decouples channel-wise erase and write gates in linear attention, generalizing prior DeltaNet and KDA models while showing stronger results on language modeling and long-context retrieval at 1.3B scale.

citing papers explorer

Showing 2 of 2 citing papers.

KVBuffer: IO-aware Serving for Linear Attention · cs.LG · 2026-05-18 · unverdicted · none · ref 5
Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention · cs.AI · 2026-05-21 · unverdicted · none · ref 38
