Entity Tracking in Language Models
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Standard CoT transformers are limited to TC^0 for length-generalizable reasoning but can simulate Turing machines with linear-length traces if vocabulary grows, using signpost tokens and change encodings.
citing papers explorer
- Structured-Sparse Attention for Entity Tracking with Subquadratic Sequence Complexity
  Derives a blockwise resolvent-style attention operator that exploits structured sparsity for subquadratic O(n^{4/3}d) entity tracking while matching dense accuracy.
- Barriers to Universal Reasoning With Transformers (And How to Overcome Them)
  Standard CoT transformers are limited to TC^0 for length-generalizable reasoning but can simulate Turing machines with linear-length traces if vocabulary grows, using signpost tokens and change encodings.