
arXiv: 1301.3641 · v3 · submitted 2013-01-16 · cs.LG · cs.NE · stat.ML

Recognition: unknown

Training Neural Networks with Stochastic Hessian-Free Optimization

keywords: hessian-free, optimization, stochastic, deep, gradient, networks, training, achieves

Hessian-free (HF) optimization has been successfully used for training deep autoencoders and recurrent networks. HF uses the conjugate gradient algorithm to construct update directions through curvature-vector products that can be computed on the same order of time as gradients. In this paper we exploit this property and study stochastic HF with gradient and curvature mini-batches independent of the dataset size. We modify Martens' HF for these settings and integrate dropout, a method for preventing co-adaptation of feature detectors, to guard against overfitting. Stochastic Hessian-free optimization gives an intermediary between SGD and HF that achieves competitive performance on both classification and deep autoencoder experiments.
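The core loop the abstract describes (a mini-batch gradient, a smaller curvature mini-batch, and conjugate gradient driven only by curvature-vector products) can be sketched on a toy problem. The following is a minimal illustration in Python with NumPy, not the paper's algorithm: it uses logistic regression instead of a deep network, and the function names, batch sizes, damping value, CG iteration cap, and the choice to draw the curvature batch from the gradient batch are all assumptions made for this example. Dropout and the paper's other modifications to Martens' HF are omitted.

```python
import numpy as np


def loss_and_grad(w, X, y):
    """Mean logistic loss and its gradient for labels y in {0, 1}."""
    z = X @ w
    p = 1.0 / (1.0 + np.exp(-z))
    loss = np.mean(np.logaddexp(0.0, z) - y * z)
    grad = X.T @ (p - y) / len(y)
    return loss, grad


def curvature_vector_product(w, X, v, damping):
    """Return (H + damping * I) @ v without ever forming H.

    For logistic regression H = X^T diag(p * (1 - p)) X / n, so the product
    costs two matrix-vector multiplies, the same order as one gradient.
    """
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return X.T @ (p * (1.0 - p) * (X @ v)) / X.shape[0] + damping * v


def conjugate_gradient(matvec, b, max_iters=10, tol=1e-10):
    """Approximately solve A x = b for symmetric positive definite A,
    where A is available only through the function matvec(v) = A @ v."""
    x = np.zeros_like(b)
    r = b.copy()  # residual b - A x, with x starting at zero
    d = r.copy()  # current search direction
    rs = r @ r
    for _ in range(max_iters):
        if rs < tol:
            break
        Ad = matvec(d)
        alpha = rs / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        rs_new = r @ r
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x


def stochastic_hf_step(w, X, y, rng, grad_batch=512, curv_batch=128, damping=0.1):
    """One update: gradient on a mini-batch, curvature on a smaller one."""
    gi = rng.choice(len(y), size=grad_batch, replace=False)
    ci = rng.choice(gi, size=curv_batch, replace=False)  # assumed: subset of gi
    _, g = loss_and_grad(w, X[gi], y[gi])
    matvec = lambda v: curvature_vector_product(w, X[ci], v, damping)
    return w + conjugate_gradient(matvec, -g)


# Synthetic, noisy (non-separable) classification data, purely illustrative.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 10))
w_true = 0.5 * rng.normal(size=10)
y = (rng.random(2000) < 1.0 / (1.0 + np.exp(-(X @ w_true)))).astype(float)

w = np.zeros(10)
initial_loss, _ = loss_and_grad(w, X, y)
for _ in range(20):
    w = stochastic_hf_step(w, X, y, rng)
final_loss, _ = loss_and_grad(w, X, y)
print(f"full-data loss: {initial_loss:.4f} -> {final_loss:.4f}")
```

Both batch sizes are fixed constants, so the cost of a step does not grow with the number of training examples. The damping term keeps the linear system positive definite when the curvature estimate from a small batch is noisy.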

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Fast Gauss-Newton for Multiclass Cross-Entropy

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    FGN is a positive semidefinite under-approximation of the multiclass GGN obtained by exact decomposition into true-vs-rest and within-competitor terms, exact for binary classification and implemented via matrix-free c...