Context gating in associative memories boosts inter-memory separation and sparsity for exponential retrieval gains, admits a unique fixed point driven by direct bias and feedback, and matches in-context learning dynamics in transformers like Llama-3.
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
years: 2026 (2)
verdicts: UNVERDICTED (2)
roles: background (1)
polarities: background (1)
representative citing papers
Ordinary least squares is a special case of the single-layer linear transformer when attention parameters are set via spectral decomposition of the empirical covariance matrix.
citing papers explorer
- Context-Gated Associative Retrieval: From Theory to Transformers
  Context gating in associative memories boosts inter-memory separation and sparsity for exponential retrieval gains, admits a unique fixed point driven by direct bias and feedback, and matches in-context learning dynamics in transformers like Llama-3.
- Ordinary Least Squares is a Special Case of Transformer
  Ordinary least squares is a special case of the single-layer linear transformer when attention parameters are set via spectral decomposition of the empirical covariance matrix.