Efficient Subsampled Gauss-Newton and Natural Gradient Methods for Training Neural Networks

4 Pith papers cite this work. Polarity classification is still indexing.

abstract

We present practical Levenberg-Marquardt variants of Gauss-Newton and natural gradient methods for solving non-convex optimization problems that arise in training deep neural networks involving enormous numbers of variables and huge data sets. Our methods use subsampled Gauss-Newton or Fisher information matrices and either subsampled gradient estimates (fully stochastic) or full gradients (semi-stochastic), which, in the latter case, we prove convergent to a stationary point. By using the Sherman-Morrison-Woodbury formula with automatic differentiation (backpropagation) we show how our methods can be implemented to perform efficiently. Finally, numerical results are presented to demonstrate the effectiveness of our proposed methods.
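The role of the Sherman-Morrison-Woodbury formula mentioned in the abstract can be illustrated with a small NumPy sketch. It assumes a damped (Levenberg-Marquardt) system of the form (JᵀJ/m + λI) d = −g, where J is an m × n subsampled Jacobian with m much smaller than n. The names `J`, `g`, `lam`, and `lm_step_woodbury` are illustrative choices, not taken from the paper, and the sketch forms J explicitly, whereas the paper describes obtaining the required products through backpropagation.

```python
import numpy as np

def lm_step_woodbury(J, g, lam):
    """Solve (J^T J / m + lam * I) d = -g via an m x m system (hypothetical helper)."""
    m = J.shape[0]
    # Small m x m matrix that replaces the n x n damped Gauss-Newton matrix.
    K = lam * m * np.eye(m) + J @ J.T
    # Woodbury identity: (lam*I + J^T J / m)^{-1} g = (g - J^T K^{-1} J g) / lam
    y = np.linalg.solve(K, J @ g)
    return -(g - J.T @ y) / lam

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
m, n = 8, 200                      # subsample size << number of parameters
J = rng.standard_normal((m, n))
g = rng.standard_normal(n)
lam = 0.1

d = lm_step_woodbury(J, g, lam)
# Reference solution using the full n x n system, for comparison only.
d_direct = np.linalg.solve(J.T @ J / m + lam * np.eye(n), -g)
print(np.allclose(d, d_direct))
```

The point of the identity is that only an m × m linear system is solved, so the cost scales with the subsample size rather than with the number of parameters.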

citation-role summary

background 1

fields

cs.LG 4

years

2026 4

verdicts

UNVERDICTED 4

representative citing papers

Fast Gauss-Newton for Multiclass Cross-Entropy

cs.LG · 2026-05-07 · unverdicted · novelty 7.0

FGN is a positive semidefinite under-approximation of the multiclass GGN obtained by exact decomposition into true-vs-rest and within-competitor terms, exact for binary classification and implemented via matrix-free conjugate gradient on a whitened row-space system.
