Title resolution pending

The Mean-Field Dynamics of Transformers · 2026

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

$L^2$ over Wasserstein: Statistical Analysis for Optimal Transport

math.ST · 2026-05-20 · unverdicted · novelty 7.0

Defines the L² over Wasserstein space to equip random probability measures with inherited Riemannian geometry, enabling statistical convergence results and Bayesian posterior consistency in the Wasserstein topology.

Training-Induced Escape from Token Clustering in a Mean-Field Formulation of Transformers

cs.LG · 2026-05-08 · unverdicted · novelty 5.0

Training a mean-field Transformer under L2 regularization induces an escape from attention-driven token clustering in later layers after initial clustering.

citing papers explorer

Showing 2 of 2 citing papers.

$L^2$ over Wasserstein: Statistical Analysis for Optimal Transport

math.ST · 2026-05-20 · unverdicted · none · ref 43

Defines the L² over Wasserstein space to equip random probability measures with inherited Riemannian geometry, enabling statistical convergence results and Bayesian posterior consistency in the Wasserstein topology.

Training-Induced Escape from Token Clustering in a Mean-Field Formulation of Transformers

cs.LG · 2026-05-08 · unverdicted · none · ref 12

Training a mean-field Transformer under L2 regularization induces an escape from attention-driven token clustering in later layers after initial clustering.
