
arxiv: 1611.03824 · v6 · pith:X7YRPS7Znew · submitted 2016-11-11 · stat.ML · cs.LG

Learning to Learn without Gradient Descent by Gradient Descent

classification stat.ML cs.LG
keywords descent, gradient, learn, optimizers, functions, hyper-parameter, learned, optimization

We learn recurrent neural network optimizers trained on simple synthetic functions by gradient descent. We show that these learned optimizers exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks. Up to the training horizon, the learned optimizers learn to trade off exploration and exploitation, and compare favourably with heavily engineered Bayesian optimization packages for hyper-parameter tuning.
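
As a rough illustration of what a recurrent black-box optimizer looks like at query time, the sketch below rolls out a small RNN that proposes a query, observes the function value, and updates its hidden state, without ever using gradients of the objective. Everything in it is an assumption made for illustration: the function name `rnn_optimizer_rollout`, the plain tanh recurrence, the choice to feed the previous query and observation back in, and the randomly initialised weights, which stand in for parameters that would be learned by gradient descent on synthetic training functions. It is not the authors' architecture or training procedure.

```python
import numpy as np


def rnn_optimizer_rollout(f, dim, horizon, hidden_size=16, seed=0):
    """Roll out an untrained recurrent optimizer on a black-box function f.

    At each step the RNN receives the previous query x and its observed
    value y, updates its hidden state h, and proposes the next query.
    Only function evaluations are used; no gradients of f are needed.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical random weights (assumption): a trained optimizer would
    # obtain these by gradient descent on a distribution of training functions.
    W_in = rng.normal(scale=0.1, size=(hidden_size, dim + 1))
    W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
    W_out = rng.normal(scale=0.1, size=(dim, hidden_size))

    h = np.zeros(hidden_size)
    x = np.zeros(dim)          # first query: centre of the search box
    y = f(x)
    queries, values = [x], [y]

    for _ in range(horizon - 1):
        # Recurrent update from the previous (query, observation) pair.
        h = np.tanh(W_in @ np.concatenate([x, [y]]) + W_h @ h)
        # Next query, squashed into the box [-1, 1]^dim.
        x = np.tanh(W_out @ h)
        y = f(x)
        queries.append(x)
        values.append(y)

    best = int(np.argmin(values))
    return queries[best], values[best], np.array(values)


if __name__ == "__main__":
    # Toy objective (assumption): a shifted quadratic treated as a black box.
    def quadratic(x):
        return float(np.sum((x - 0.5) ** 2))

    best_x, best_y, values = rnn_optimizer_rollout(quadratic, dim=2, horizon=10)
    print("best query:", best_x, "best value:", best_y)
```

With random weights the rollout does not optimize well; the point is only the interface, in which the optimizer is a fixed-horizon sequence model whose sole feedback is the observed function values.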



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning to learn with quantum neural networks via classical neural networks

    quant-ph 2019-07 unverdicted novelty 7.0

    Classical RNNs trained on small instances provide parameter initializations for QAOA and VQE that reduce total optimization iterations and generalize across problem sizes.