pith. machine review for the scientific record.

arxiv: 2411.00171 · v2 · submitted 2024-10-31 · 💻 cs.LG · math.OC

Recognition: unknown

EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization

Authors on Pith: no claims yet
classification 💻 cs.LG · math.OC
keywords: lookahead, multi-step, high-dimensional, learning, optimization, bayesian, earl-bo, encoder
0 comments
read the original abstract

To avoid myopic behavior, multi-step lookahead Bayesian optimization (BO) algorithms consider the sequential nature of BO and have demonstrated promising results in recent years. However, owing to the curse of dimensionality, most of these methods make significant approximations or suffer scalability issues. This paper presents a novel reinforcement learning (RL)-based framework for multi-step lookahead BO in high-dimensional black-box optimization problems. The proposed method enhances the scalability and decision-making quality of multi-step lookahead BO by efficiently solving the sequential dynamic program of the BO process in a near-optimal manner using RL. We first introduce an Attention-DeepSets encoder to represent the state of knowledge to the RL agent and subsequently propose a multi-task fine-tuning procedure based on end-to-end (encoder-RL) on-policy learning. We evaluate the proposed method, EARL-BO (Encoder Augmented RL for BO), on synthetic benchmark functions and hyperparameter tuning problems, finding significantly improved performance compared to existing multi-step lookahead and high-dimensional BO methods.
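The abstract's key ingredient is a permutation-invariant encoder that summarizes the BO "state of knowledge" (the set of evaluated points and their objective values) into a fixed-size vector for the RL agent. The sketch below illustrates the DeepSets idea only, in plain NumPy; the function names, weight shapes, and pooling choice are illustrative assumptions, not the paper's actual Attention-DeepSets architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def deepsets_encode(history, W_phi, W_rho):
    """Hypothetical DeepSets-style encoding of a set of (x, y) observations.

    history : (n, d+1) array, each row is [x_1, ..., x_d, y]
    W_phi   : per-observation embedding weights, shape (d+1, h)
    W_rho   : post-pooling weights, shape (h, h)
    Returns a fixed-size state vector of shape (h,).
    """
    phi = np.tanh(history @ W_phi)   # embed each observation independently
    pooled = phi.mean(axis=0)        # permutation-invariant pooling over the set
    return np.tanh(pooled @ W_rho)   # aggregate pooled embedding into the state

d, h = 2, 8                          # toy input dimension and embedding width
W_phi = rng.normal(size=(d + 1, h))
W_rho = rng.normal(size=(h, h))

history = np.array([[0.1, 0.9, 1.2],
                    [0.4, 0.2, 0.7],
                    [0.8, 0.5, 2.1]])

state = deepsets_encode(history, W_phi, W_rho)
print(state.shape)  # (8,)

# Permutation invariance: reordering the observation set leaves the state unchanged,
# which is what makes a set encoder suitable for representing BO history.
assert np.allclose(state, deepsets_encode(history[[2, 0, 1]], W_phi, W_rho))
```

Because the encoding depends only on the set of observations, the same encoder can represent BO histories of any length, which is what allows an RL policy to be trained across steps and tasks.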

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. BayMOTH: Bayesian optiMizatiOn with meTa-lookahead -- a simple approacH

    cs.LG · 2026-04 · unverdicted · novelty 4.0

    BayMOTH unifies meta-Bayesian optimization with a usefulness-based fallback to lookahead, demonstrating competitive results on function optimization tasks even under low task relatedness.