Stochastic Hyperparameter Optimization through Hypernetworks

Jonathan Lorraine, David Duvenaud

classification: cs.LG

keywords: optimization, hyperparameters, weights, hypernetworks, hyperparameter, method, optimal, stochastic

Abstract

Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters. We give a method to collapse this nested optimization into joint stochastic optimization of weights and hyperparameters. Our process trains a neural network to output approximately optimal weights as a function of hyperparameters. We show that our technique converges to locally optimal weights and hyperparameters for sufficiently large hypernetworks. We compare this method to standard hyperparameter optimization strategies and demonstrate its effectiveness for tuning thousands of hyperparameters.
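The joint optimization described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a scalar linear model, synthetic data, a single hyperparameter (the log of an L2 penalty), and a linear hypernetwork `w(lam) = v * lam + b`. All names (`train_hypernetwork`, `sigma`, `lr_phi`, `lr_lam`) and the clipping range for `lam` are hypothetical choices made for illustration. Each iteration takes one hypernetwork step on the training loss at a randomly perturbed hyperparameter, then one hyperparameter step on the validation loss through the hypernetwork, instead of nesting a full weight optimization inside each hyperparameter update.

```python
import math
import random


def mse(w, data):
    """Mean squared error of the scalar linear model y ~ w * x."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


def mse_grad(w, data):
    """Derivative of mse with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in data) / len(data)


def train_hypernetwork(steps=3000, lr_phi=0.01, lr_lam=0.01, sigma=0.3, seed=0):
    rng = random.Random(seed)

    # Synthetic data (assumption): y = 2x + Gaussian noise.
    def make_data(n):
        points = []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            points.append((x, 2.0 * x + rng.gauss(0.0, 0.3)))
        return points

    train_data, val_data = make_data(40), make_data(40)

    v, b = 0.0, 0.0  # hypernetwork parameters: w(lam) = v * lam + b
    lam = 0.0        # hyperparameter: log of the L2 penalty strength
    val_before = mse(v * lam + b, val_data)

    for _ in range(steps):
        # Hypernetwork step: sample a hyperparameter near the current one and
        # lower the regularized training loss of the weight the network outputs.
        lam_s = lam + sigma * rng.gauss(0.0, 1.0)
        w = v * lam_s + b
        g_w = mse_grad(w, train_data) + 2.0 * math.exp(lam_s) * w
        v -= lr_phi * g_w * lam_s  # chain rule: dw/dv = lam_s
        b -= lr_phi * g_w          # chain rule: dw/db = 1

        # Hyperparameter step: differentiate the validation loss through the
        # hypernetwork (dw/dlam = v); clip lam to keep this toy run stable.
        w = v * lam + b
        g_lam = mse_grad(w, val_data) * v
        lam = min(1.0, max(-3.0, lam - lr_lam * g_lam))

    return lam, val_before, mse(v * lam + b, val_data)


lam, val_before, val_after = train_hypernetwork()
print(f"log-penalty: {lam:.3f}, validation MSE: {val_before:.3f} -> {val_after:.3f}")
```

A linear hypernetwork can only approximate the best-response weights locally, which is why the sketch samples hyperparameters close to the current value rather than over the whole range.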

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

On the Stability and Generalization of First-order Bilevel Minimax Optimization
cs.LG 2026-04 unverdicted novelty 7.0

Provides the first systematic generalization analysis via algorithmic stability for single-timescale and two-timescale stochastic gradient descent-ascent in bilevel minimax problems.

Fine-grained Analysis of Stability and Generalization for Stochastic Bilevel Optimization
cs.LG 2026-04 unverdicted novelty 7.0

Derives upper bounds on on-average argument stability for single- and two-timescale SGD in bilevel optimization under NC-NC, C-C, and SC-SC regimes, linking stability directly to generalization gaps.