Freeze-Thaw Bayesian Optimization

Jasper Snoek; Kevin Swersky; Ryan Prescott Adams

arxiv: 1406.3896 · v1 · pith:L3UHSUOAnew · submitted 2014-06-16 · 📊 stat.ML · cs.LG

keywords: learning, machine, training, model, bayesian, develop, method, models

In this paper we develop a dynamic form of Bayesian optimization for machine learning models with the goal of rapidly finding good hyperparameter settings. Our method uses the partial information gained during the training of a machine learning model in order to decide whether to pause training and start a new model, or resume the training of a previously-considered model. We specifically tailor our method to machine learning problems by developing a novel positive-definite covariance kernel to capture a variety of training curves. Furthermore, we develop a Gaussian process prior that scales gracefully with additional temporal observations. Finally, we provide an information-theoretic framework to automate the decision process. Experiments on several common machine learning models show that our approach is extremely effective in practice.
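The abstract mentions a positive-definite covariance kernel for training curves but does not state its form. As a minimal sketch, assume the exponential-decay form k(t, t') = beta^alpha / (t + t' + beta)^alpha; this form is not given on this page. The function names, the default values of `alpha` and `beta`, and the epoch grid below are hypothetical, chosen only for illustration. The sketch builds the Gram matrix over a few training epochs and checks positive definiteness with a small Cholesky factorization.

```python
import math


def exp_decay_kernel(t, t_prime, alpha=1.0, beta=1.0):
    """Assumed kernel form: beta^alpha / (t + t' + beta)^alpha (not stated in the abstract)."""
    return beta ** alpha / (t + t_prime + beta) ** alpha


def gram_matrix(times, alpha=1.0, beta=1.0):
    """Evaluate the kernel on every pair of training-time points."""
    return [[exp_decay_kernel(a, b, alpha, beta) for b in times] for a in times]


def cholesky(matrix):
    """Plain Cholesky factorization; raises if a pivot is not strictly positive."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


# Hypothetical epoch grid; real training curves would use many more points.
epochs = [1, 2, 3, 4]
K = gram_matrix(epochs)
chol = cholesky(K)  # succeeds only if K is numerically positive definite
```

Under this assumed form, covariance shrinks as t + t' grows, which matches the intuition that a training curve changes less and less as it flattens out late in training.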

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Open-Ended Task Discovery via Bayesian Optimization
cs.AI 2026-05 unverdicted novelty 6.0

Generate-Select-Refine is an open-ended Bayesian optimization method that generates tasks and concentrates evaluations on the best one with only logarithmic regret overhead relative to standard single-task optimization.

HARBOR: Automated Harness Optimization
cs.LG 2026-04 unverdicted novelty 6.0

HARBOR formalizes harness optimization as constrained noisy Bayesian optimization over mixed-variable spaces and reports a case study where it outperforms manual tuning on a production coding agent.

Adaptive Candidate Point Thompson Sampling for High-Dimensional Bayesian Optimization
cs.LG 2026-04 unverdicted novelty 6.0

ACTS improves Thompson sampling in high-dimensional Bayesian optimization by adaptively reducing the search space using gradients from surrogate samples to produce better maximizer samples.

Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive
cs.CL 2024-02 conditional novelty 6.0

DPOP is a new loss function that prevents DPO from lowering preferred response likelihoods and outperforms standard DPO on diverse datasets, MT-Bench, and enables Smaug-72B to exceed 80% on the Open LLM Leaderboard.

Two-stage Optimization for Machine Learning Workflow
cs.LG 2019-07 unverdicted novelty 4.0

Two-stage optimization for ML workflows that prioritizes data pipeline search over hyperparameter tuning, with time-allocation policies and a specificity metric for pruning.

A Tutorial on Bayesian Optimization
stat.ML 2018-07 unverdicted novelty 4.0

Bayesian optimization uses Gaussian process regression to build a surrogate model and acquisition functions to guide sampling for optimizing costly objective functions, including a new formal generalization of expecte...