Transformers converge pathwise to a stochastic particle system and SPDE in the scaling limit, exhibiting synchronization by noise and exponential energy dissipation when common noise is coercive relative to self-attention drift.
3 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Stochastic Scaling Limits and Synchronization by Noise in Deep Transformer Models
Transformers converge pathwise to a stochastic particle system and SPDE in the scaling limit, exhibiting synchronization by noise and exponential energy dissipation when common noise is coercive relative to self-attention drift.
- Machine Learning Green's Functions of Strongly Correlated Hubbard Models
Kernel ridge regression predicts the self-energy of 1D Hubbard models from static and dynamic mean-field features, enabling Green's functions via Dyson's equation for U/t from weak to strong coupling.
- DMFT analysis of Hopfield network with plasticity