The emergence of clusters in self-attention dynamics
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Transformer as an Euler Discretization of Score-based Variational Flow
The Transformer is recovered exactly as the forward Euler step of spherical SVFlow, with multi-head attention and MoE/FFN as approximations to its vector field.