A unified perspective on the dynamics of deep transformers. arXiv preprint arXiv:2501.18322

11 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 3

citation-polarity summary

years

2026: 9 · 2025: 2

roles

background 3

polarities

background 3

representative citing papers

Transformer-like Inference from Optimal Control

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

Derives transformer-like dual-filter inference layers from first-principles optimal control on nonlinear discrete and linear Gaussian sequence models.

Uniform Scaling Limits in AdamW-Trained Transformers

stat.ML · 2026-05-11 · unverdicted · novelty 7.0

AdamW-trained transformer hidden states and backpropagated variables converge uniformly in L2 to a forward-backward ODE system (McKean-Vlasov when non-causal) at rate O(L^{-1}+L^{-1/3}H^{-1/2}) as depth L and heads H increase, with bounds independent of token number.

Spectral Selection in Symmetric Self-Attention Dynamics

math.DS · 2026-04-28 · unverdicted · novelty 7.0

Symmetric self-attention dynamics select the dominant eigendirection of V, producing homogeneous alignment when one positive eigenvalue dominates or sign-split polarization when V is negative definite.

Preconditioned Regularized Wasserstein Proximal Sampling

stat.ML · 2025-09-01 · unverdicted · novelty 7.0

A preconditioned regularized Wasserstein proximal sampling algorithm is introduced for particle-based approximation of Gibbs distributions, featuring a PDE-derived kernel formulation and non-asymptotic convergence analysis for quadratic potentials.

Propagation of Chaos in Contextual Flow Maps

cs.LG · 2026-05-16 · unverdicted · novelty 6.0

Derives forward and backward propagation-of-chaos bounds for finite vs. infinite-context transformers modeled as contextual flow maps, achieving Wasserstein rate n^{-1/d} generally and n^{-1/2} for transformer-like cases.
