arxiv: 1705.11159 · v1 · pith:P5FZCUYMnew · submitted 2017-05-31 · 💻 cs.LG

Reinforcement Learning for Learning Rate Control

classification 💻 cs.LG
keywords learning, network, actor, rate, rates, better, called, critic

Stochastic gradient descent (SGD), which updates the model parameters by adding a local gradient times a learning rate at each step, is widely used to train machine learning models such as neural networks. Models trained by SGD are observed to be sensitive to the learning rate, and good learning rates are problem-specific. We propose an algorithm that automatically learns learning rates using neural-network-based actor-critic methods from deep reinforcement learning (RL). In particular, we train a policy network, called the actor, to decide the learning rate at each step during training, and a value network, called the critic, to give feedback on the quality of the decision the actor made (e.g., how good the learning rate output by the actor is). The introduction of the auxiliary actor and critic networks helps the main network achieve better performance. Experiments on different datasets and network architectures show that our approach leads to better convergence of SGD than human-designed competitors.
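
The actor-critic loop described in the abstract can be illustrated with a small toy sketch. It is not the paper's implementation: the abstract does not specify the network architectures, state features, or reward, so everything below is an assumption made for illustration. The actor and critic are linear functions of a single feature (the normalized loss) rather than neural networks, the main model is a scalar minimizing f(w) = 0.5 * w^2, the reward is the normalized decrease in loss, and the name `run_episode` and all step sizes are hypothetical.

```python
import math
import random


def run_episode(actor, critic, rng, steps=30, sigma=0.5, gamma=0.9,
                actor_lr=0.05, critic_lr=0.1):
    """Minimize f(w) = 0.5 * w**2 while the actor picks each step's learning rate."""
    w = 5.0
    initial_loss = 0.5 * w * w
    state = 1.0                      # normalized loss, the only state feature
    chosen = []
    for t in range(steps):
        # Actor: Gaussian policy over z; learning rate = sigmoid(z), in (0, 1).
        # The mean is clamped for numerical safety (an assumption of this toy).
        mu = max(-5.0, min(5.0, actor[0] + actor[1] * state))
        z = mu + sigma * rng.gauss(0.0, 1.0)
        lr = 1.0 / (1.0 + math.exp(-z))
        chosen.append(lr)

        # Main model: one gradient step (the gradient of f at w is w).
        w -= lr * w
        next_state = 0.5 * w * w / initial_loss
        reward = state - next_state  # normalized decrease in loss

        # Critic: one-step TD error serves as feedback on the actor's decision.
        v = critic[0] + critic[1] * state
        v_next = 0.0 if t == steps - 1 else critic[0] + critic[1] * next_state
        td_error = reward + gamma * v_next - v
        critic[0] += critic_lr * td_error
        critic[1] += critic_lr * td_error * state

        # Actor: policy-gradient step weighted by the TD error
        # (the clamp on mu is ignored in the gradient for simplicity).
        grad_log_pi = (z - mu) / sigma ** 2
        actor[0] += actor_lr * td_error * grad_log_pi
        actor[1] += actor_lr * td_error * grad_log_pi * state
        state = next_state
    return initial_loss, 0.5 * w * w, chosen


rng = random.Random(0)
actor, critic = [0.0, 0.0], [0.0, 0.0]   # linear weights: [bias, slope]
for episode in range(20):
    initial_loss, final_loss, rates = run_episode(actor, critic, rng)
```

Squashing the actor's output through a sigmoid keeps every learning rate in (0, 1), which on this particular quadratic guarantees that each step shrinks the loss. A real training problem offers no such guarantee, which is where the critic's feedback would matter most.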

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Hyp-RL : Hyperparameter Optimization by Reinforcement Learning

    cs.LG 2019-06 unverdicted novelty 5.0

    Reinforcement learning selects hyperparameters sequentially by learning from actual future validation loss reductions and outperforms SMBO methods on 50 datasets.