pith. machine review for the scientific record.

arxiv: 1802.09419 · v2 · submitted 2018-02-26 · 💻 cs.LG

Recognition: unknown

Stochastic Hyperparameter Optimization through Hypernetworks

classification 💻 cs.LG
keywords: optimization, hyperparameters, weights, hypernetworks, hyperparameter, method, optimal, stochastic

Machine learning models are often tuned by nesting optimization of model weights inside the optimization of hyperparameters. We give a method to collapse this nested optimization into joint stochastic optimization of weights and hyperparameters. Our process trains a neural network to output approximately optimal weights as a function of hyperparameters. We show that our technique converges to locally optimal weights and hyperparameters for sufficiently large hypernetworks. We compare this method to standard hyperparameter optimization strategies and demonstrate its effectiveness for tuning thousands of hyperparameters.
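The core idea in the abstract, training a network to output approximately optimal weights as a function of hyperparameters, can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy setting chosen for illustration only: ridge regression with a single hyperparameter `lam` (the L2 penalty), a hypothetical linear hypernetwork `w(lam) = a + b * lam`, and hyperparameters sampled uniformly from [0, 1]. The function names (`train_hypernetwork`, `hyper_weights`, `ridge_weights`) are invented for this example. The exact ridge solution is included only as a reference point for the hypernetwork's output.

```python
import numpy as np

def loss(w, X, y, lam):
    """Regularized training loss: 0.5 * mean squared error + 0.5 * lam * ||w||^2."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2) + 0.5 * lam * np.dot(w, w)

def ridge_weights(X, y, lam):
    """Exact minimizer of `loss` for a fixed lam (used only as a reference)."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

def hyper_weights(phi, lam):
    """Hypothetical linear hypernetwork: maps lam to model weights a + b * lam."""
    a, b = phi
    return a + b * lam

def train_hypernetwork(X, y, steps=5000, lr=0.05, seed=0):
    """Fit the hypernetwork by SGD on the training loss at randomly sampled lam."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    a, b = np.zeros(d), np.zeros(d)
    for _ in range(steps):
        lam = rng.uniform(0.0, 1.0)               # sample a hyperparameter (assumed range)
        w = a + b * lam                           # weights proposed by the hypernetwork
        grad_w = X.T @ (X @ w - y) / n + lam * w  # gradient of `loss` with respect to w
        a -= lr * grad_w                          # chain rule: dw/da is the identity
        b -= lr * lam * grad_w                    # chain rule: dw/db is lam
    return a, b

# Toy data (assumed for illustration only).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=200)

phi = train_hypernetwork(X, y)
lam = 0.5
print("hypernetwork loss:", loss(hyper_weights(phi, lam), X, y, lam))
print("exact ridge loss: ", loss(ridge_weights(X, y, lam), X, y, lam))
```

Once such a hypernetwork is trained, a candidate hyperparameter can be scored on validation data by evaluating the model at `hyper_weights(phi, lam)` without retraining, which is what lets the nested optimization be replaced by a single stochastic one. A linear map is used here only to keep the sketch short; the abstract's convergence claim concerns sufficiently large hypernetworks.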



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. On the Stability and Generalization of First-order Bilevel Minimax Optimization

    cs.LG · 2026-04 · unverdicted · novelty 7.0

    Provides the first systematic generalization analysis via algorithmic stability for single-timescale and two-timescale stochastic gradient descent-ascent in bilevel minimax problems.

  2. Fine-grained Analysis of Stability and Generalization for Stochastic Bilevel Optimization

    cs.LG · 2026-04 · unverdicted · novelty 7.0

    Derives upper bounds on on-average argument stability for single- and two-timescale SGD in bilevel optimization under NC-NC, C-C, and SC-SC regimes, linking stability directly to generalization gaps.