Optimal Degrees of Synaptic Connectivity
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
roles: background (1)
polarities: background (1)
representative citing papers
Coarse wiring statistics set the dynamical regime while precise connections set activity geometry in a parameter-free model of the complete larval Drosophila connectome.
Four axioms (Causality, Minimality, Separability, Stability) are formalized for latent thought representations; audits of open LLMs on 23 tasks show none satisfy all four and representations add little beyond input embeddings.
KLR Hopfield networks store up to 16-20 times their neuron count before dynamical instability from crosstalk noise causes collapse, with sharp attractor boundaries observed via morphing and SNR analysis.
citing papers explorer
Variable-Width Transformers
×-shaped variable-width transformers outperform parameter-matched uniform baselines on language modeling loss with 22% fewer FLOPs and 15% smaller KV cache.