Variants of RMSProp and Adagrad with Logarithmic Regret Bounds

Mahesh Chandra Mukkamala; Matthias Hein

arxiv: 1706.05507 · v2 · pith:QJ7S56Z5new · submitted 2017-06-17 · 💻 cs.LG · cs.AI · cs.CV · cs.NE · stat.ML

classification 💻 cs.LG cs.AI cs.CV cs.NE stat.ML

keywords bounds, convex, deep, gradient, networks, neural, regret, training

Adaptive gradient methods have recently become very popular, in particular because they have been shown to be useful in training deep neural networks. In this paper we analyze RMSProp, originally proposed for training deep neural networks, in the context of online convex optimization and show $\sqrt{T}$-type regret bounds. Moreover, we propose two variants, SC-Adagrad and SC-RMSProp, for which we show logarithmic regret bounds for strongly convex functions. Finally, we demonstrate experimentally that these new variants outperform other adaptive gradient techniques or stochastic gradient descent, both in the optimization of strongly convex functions and in the training of deep neural networks.
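
For orientation, here is a minimal sketch of the commonly used RMSProp update, applied to a small strongly convex quadratic. It is not the SC-Adagrad or SC-RMSProp variant proposed in the paper, whose details the abstract does not give. The function names `rmsprop_step` and `run_rmsprop`, the hyperparameter values, and the test objective are all assumptions made for illustration.

```python
import numpy as np


def rmsprop_step(theta, grad, v, lr=0.01, beta=0.9, eps=1e-8):
    """One step of the commonly used RMSProp update.

    Assumption: this is the textbook form, not the paper's SC variants.
    """
    # Exponential moving average of squared gradients, per coordinate.
    v = beta * v + (1.0 - beta) * grad ** 2
    # Scale each coordinate's step by the root of its running average.
    theta = theta - lr * grad / (np.sqrt(v) + eps)
    return theta, v


def run_rmsprop(steps=500):
    """Minimize f(theta) = 0.5 * ||theta - target||^2 (hypothetical objective)."""
    target = np.array([1.0, -2.0])
    theta = np.zeros(2)
    v = np.zeros(2)
    losses = []
    for _ in range(steps):
        losses.append(0.5 * np.sum((theta - target) ** 2))
        grad = theta - target  # gradient of the quadratic
        theta, v = rmsprop_step(theta, grad, v)
    return theta, target, losses


if __name__ == "__main__":
    theta, target, losses = run_rmsprop()
    print("final iterate:", theta)
    print("first loss:", losses[0], "last loss:", losses[-1])
```

The quadratic here is 1-strongly convex, which is the function class for which the abstract claims logarithmic regret bounds; the sketch only illustrates the per-coordinate step scaling that adaptive methods share.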
