Derives geodesic ridge regularization and Riemannian Gibbs Process prior for feature-learning wide neural networks, generalizing kernel-regime results via function-space axiomatization.
A mean field view of the landscape of two-layer neural networks.Proceedings of the National Academy of Sciences, 115(33): E7665–E7671
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
citation-role summary
background 1
citation-polarity summary
years
2026 3verdicts
UNVERDICTED 3roles
background 1polarities
background 1representative citing papers
Muon dynamics are equivalent to gradient flows of spectral Wasserstein distances on parameter-space measures, with the operator norm recovering the Muon geometry.
Discrete decentralized learning dynamics on manifolds converge uniformly to an overdamped Langevin SDE whose stationary states produce orthogonally disentangled, linearly separable features.
citing papers explorer
-
Muon Dynamics as a Spectral Wasserstein Flow
Muon dynamics are equivalent to gradient flows of spectral Wasserstein distances on parameter-space measures, with the operator norm recovering the Muon geometry.