Mollifying Networks

2016 · cs.LG · arXiv:1608.04980

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

The optimization of deep neural networks can be more challenging than traditional convex optimization problems because the loss function is highly non-convex: for example, it can involve pathological landscapes such as saddle surfaces that are difficult to escape for algorithms based on simple gradient descent. In this paper, we attack the problem of optimizing highly non-convex neural networks by starting with a smoothed (or mollified) objective function whose energy landscape gradually becomes more non-convex during training. Our proposal is inspired by recent studies of continuation methods: similar to curriculum methods, we begin by learning an easier (possibly convex) objective function and let it evolve during training until it eventually returns to the original, difficult-to-optimize objective function. The complexity of the mollified networks is controlled by a single hyperparameter, which is annealed during training. We show improvements on various difficult optimization tasks and establish a relationship with recent work on continuation methods for neural networks and mollifiers.
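
The continuation idea in the abstract can be illustrated on a toy problem. The sketch below is not the paper's method (which mollifies a network, not a scalar function); it assumes a hypothetical one-dimensional loss `x**2 + A*sin(B*x)` and smooths it by convolution with a Gaussian of width `sigma`, for which the smoothed gradient has a closed form. The constants `A` and `B`, the function names, and the linear annealing schedule are all illustrative assumptions.

```python
import math

# Hypothetical toy constants: a convex bowl plus an oscillation that
# creates many local minima.
A, B = 2.0, 5.0


def loss(x):
    """Original non-convex toy objective."""
    return x * x + A * math.sin(B * x)


def mollified_grad(x, sigma):
    """Gradient of E[loss(x + sigma * xi)] with xi ~ N(0, 1).

    Gaussian smoothing damps sin(B * x) by exp(-(B * sigma)**2 / 2) and
    adds only a constant to the quadratic term, so a large sigma leaves
    an essentially convex objective and sigma = 0 recovers loss'.
    """
    damp = math.exp(-0.5 * (B * sigma) ** 2)
    return 2.0 * x + A * B * damp * math.cos(B * x)


def anneal(x0, sigma0=2.0, steps=2000, lr=0.01):
    """Gradient descent while sigma is annealed linearly from sigma0 to 0."""
    x = x0
    for t in range(steps):
        sigma = sigma0 * (1.0 - t / (steps - 1))
        x -= lr * mollified_grad(x, sigma)
    return x


if __name__ == "__main__":
    smoothed = anneal(3.0)            # start from the mollified objective
    plain = anneal(3.0, sigma0=0.0)   # sigma stays 0: plain gradient descent
    print("annealed:", smoothed, loss(smoothed))
    print("plain:   ", plain, loss(plain))
```

In this toy setting, setting `sigma0=0.0` reduces the loop to plain gradient descent, which makes it easy to compare the two from the same starting point; the single annealed parameter `sigma` plays the role that the abstract assigns to the complexity hyperparameter.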

representative citing papers

A Stochastic Composite Gradient Method with Incremental Variance Reduction

math.OC · 2019-06-24 · unverdicted · novelty 6.0

Proposes an incremental variance-reduced stochastic gradient method for minimizing smooth nonconvex composite functions that achieves optimal first-order complexity rates.

citing papers explorer

Showing 1 of 1 citing paper.

A Stochastic Composite Gradient Method with Incremental Variance Reduction
math.OC · 2019-06-24 · unverdicted · none · ref 10 · internal anchor
Proposes an incremental variance-reduced stochastic gradient method for minimizing smooth nonconvex composite functions that achieves optimal first-order complexity rates.
