URL https://proceedings.mlr.press/v162/ravfogel22a.html · arXiv 2502.01432

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Dissociating Decodability and Causal Use in Bracket-Sequence Transformers

cs.CL · 2026-04-24 · unverdicted · novelty 6.0 · 2 refs

In Dyck-language transformers, attention patterns causally use top-of-stack information while residual-stream depth and distance signals are decodable yet causally inert.

citing papers explorer

Showing 1 of 1 citing paper.

Dissociating Decodability and Causal Use in Bracket-Sequence Transformers

cs.CL · 2026-04-24 · unverdicted · none · ref 12 · 2 links
