New insights and perspectives on the natural gradient method. Journal of Machine Learning Research, 21(146):1–76

James Martens · 2020

5 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Canonical Regularisation of Wide Feature-Learning Neural Networks

stat.ML · 2026-05-18 · unverdicted · novelty 8.0

Derives geodesic ridge regularization and Riemannian Gibbs Process prior for feature-learning wide neural networks, generalizing kernel-regime results via function-space axiomatization.

Fast Gauss-Newton for Multiclass Cross-Entropy

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

FGN is a positive semidefinite under-approximation of the multiclass GGN obtained by exact decomposition into true-vs-rest and within-competitor terms, exact for binary classification and implemented via matrix-free conjugate gradient on a whitened row-space system.

Error whitening: Why Gauss-Newton outperforms Newton

cs.LG · 2026-05-11 · conditional · novelty 6.0

Gauss-Newton descent whitens errors by projecting Newton directions or gradients onto the tangent space, replacing JJ^T with the identity and removing parameterization distortions that affect Newton descent.

Natural Riemannian gradient for learning functional tensor networks

math.OC · 2026-04-10 · unverdicted · novelty 6.0

Natural Riemannian gradient descent enables optimization of functional tensor networks for general losses and shows improved convergence on classification tasks.

DynMuon: A Dynamic Spectral Shaping View of Muon

cs.LG · 2026-05-16 · 2 refs

citing papers explorer

Showing 3 of 3 citing papers after filters.

Canonical Regularisation of Wide Feature-Learning Neural Networks

stat.ML · 2026-05-18 · unverdicted · none · ref 28

Derives geodesic ridge regularization and Riemannian Gibbs Process prior for feature-learning wide neural networks, generalizing kernel-regime results via function-space axiomatization.

Fast Gauss-Newton for Multiclass Cross-Entropy

cs.LG · 2026-05-07 · unverdicted · none · ref 23

FGN is a positive semidefinite under-approximation of the multiclass GGN obtained by exact decomposition into true-vs-rest and within-competitor terms, exact for binary classification and implemented via matrix-free conjugate gradient on a whitened row-space system.

Natural Riemannian gradient for learning functional tensor networks

math.OC · 2026-04-10 · unverdicted · none · ref 27

Natural Riemannian gradient descent enables optimization of functional tensor networks for general losses and shows improved convergence on classification tasks.
