Path-SGD: Path-Normalized Optimization in Deep Neural Networks

Behnam Neyshabur; Nathan Srebro; Ruslan Salakhutdinov

arxiv: 1506.02617 · v1 · pith:CDJJKMFZnew · submitted 2015-06-08 · cs.LG · cs.CV · cs.NE · stat.ML

keywords: path-sgd, deep, geometry, networks, neural, weights, adagrad, affect

We revisit the choice of SGD for training deep neural networks by reconsidering the appropriate geometry in which to optimize the weights. We argue for a geometry invariant to rescaling of weights that does not affect the output of the network, and suggest Path-SGD, which is an approximate steepest descent method with respect to a path-wise regularizer related to max-norm regularization. Path-SGD is easy and efficient to implement and leads to empirical gains over SGD and AdaGrad.
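The rescaling invariance mentioned in the abstract can be illustrated with a small sketch. It assumes a two-layer ReLU network without biases, and it takes the squared path-norm to be the sum, over all input-to-output paths, of the product of squared weights along the path. That definition is an assumption for illustration, since the abstract does not spell out the regularizer. The helper names (`forward`, `rescale_hidden_unit`, `path_norm_sq`, `l2_norm_sq`) and the weight values are hypothetical.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(W1, W2, x):
    """Two-layer ReLU network without biases: W2 * relu(W1 * x)."""
    return matvec(W2, relu(matvec(W1, x)))

def rescale_hidden_unit(W1, W2, j, c):
    """Multiply the incoming weights of hidden unit j by c > 0
    and divide its outgoing weights by c."""
    W1s = [[w * c for w in row] if r == j else list(row)
           for r, row in enumerate(W1)]
    W2s = [[w / c if k == j else w for k, w in enumerate(row)]
           for row in W2]
    return W1s, W2s

def path_norm_sq(W1, W2):
    """Assumed squared path-norm: sum over paths of products of squared weights.
    For two layers this factorizes per hidden unit."""
    total = 0.0
    for j, row in enumerate(W1):
        in_sq = sum(w * w for w in row)
        out_sq = sum(out_row[j] ** 2 for out_row in W2)
        total += in_sq * out_sq
    return total

def l2_norm_sq(W1, W2):
    """Ordinary squared Euclidean norm of all weights, for comparison."""
    return sum(w * w for W in (W1, W2) for row in W for w in row)

# Illustrative weights: 3 inputs, 3 hidden units, 2 outputs.
W1 = [[0.5, -1.0, 0.25], [1.5, 0.5, -0.5], [-0.75, 0.25, 1.0]]
W2 = [[1.0, -0.5, 0.25], [0.5, 2.0, -1.0]]
x = [1.0, 2.0, -1.0]

W1s, W2s = rescale_hidden_unit(W1, W2, j=1, c=4.0)

print(forward(W1, W2, x), forward(W1s, W2s, x))        # same network output
print(path_norm_sq(W1, W2), path_norm_sq(W1s, W2s))    # unchanged by rescaling
print(l2_norm_sq(W1, W2), l2_norm_sq(W1s, W2s))        # changed by rescaling
```

In this sketch the rescaled weights compute the same function and have the same assumed path-norm, while their Euclidean norm differs. A plain gradient step, which is steepest descent with respect to the Euclidean geometry, is therefore sensitive to a rescaling that the network output ignores.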
