pith · machine review for the scientific record

arxiv: 1611.03824 · v6 · submitted 2016-11-11 · 📊 stat.ML · cs.LG

Recognition: unknown

Learning to Learn without Gradient Descent by Gradient Descent

Authors on Pith: no claims yet
classification: 📊 stat.ML · cs.LG
keywords: descent, gradient, learn, optimizers, functions, hyper-parameter, learned, optimization
original abstract

We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.
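As a rough illustration of the approach the abstract describes, the sketch below unrolls a small LSTM optimizer on randomly sampled quadratics (a stand-in for the GP-sampled training functions the paper uses) and trains the optimizer's weights by backpropagating through the unroll. All names, dimensions, and the summed-loss training objective here are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class RNNOptimizer(nn.Module):
    """Learned black-box optimizer sketch: an LSTM cell maps the previous
    query point and its observed function value to the next query point.
    (Hypothetical architecture and sizes, not the paper's actual model.)"""
    def __init__(self, dim, hidden=32):
        super().__init__()
        self.hidden = hidden
        self.lstm = nn.LSTMCell(dim + 1, hidden)  # input: [x_t, y_t]
        self.head = nn.Linear(hidden, dim)        # output: next query x_{t+1}

    def forward(self, x, y, state):
        h, c = self.lstm(torch.cat([x, y], dim=-1), state)
        return self.head(h), (h, c)

def rollout(opt_net, f, dim, horizon=20):
    """Unroll the optimizer on a black-box function f for a fixed horizon
    and sum the observed values; meta-training minimizes this sum by
    backpropagating through the entire unrolled trajectory."""
    x = torch.zeros(1, dim)
    y = f(x)
    state = (torch.zeros(1, opt_net.hidden), torch.zeros(1, opt_net.hidden))
    total = y.sum()
    for _ in range(horizon - 1):
        x, state = opt_net(x, y, state)
        y = f(x)
        total = total + y.sum()
    return total

# Meta-training loop sketch: sample a fresh random synthetic objective
# each step (a random quadratic here, standing in for GP samples) and
# update the optimizer's weights by gradient descent on the rollout loss.
dim = 2
opt_net = RNNOptimizer(dim)
meta_opt = torch.optim.Adam(opt_net.parameters(), lr=1e-3)
for step in range(1000):
    target = torch.randn(1, dim)
    f = lambda x: ((x - target) ** 2).sum(dim=-1, keepdim=True)
    loss = rollout(opt_net, f, dim)
    meta_opt.zero_grad()
    loss.backward()
    meta_opt.step()
```

After training, the learned optimizer is deployed without gradients: the rollout simply queries the black-box objective at the points the LSTM proposes, which is what lets it handle the derivative-free tasks the abstract lists.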

This paper has not been read by Pith yet.

discussion (0)
