Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design

2009 · cs.LG · arXiv 0912.3995

19 Pith papers cite this work. Polarity classification is still indexing.

abstract

Many applications require optimizing an unknown, noisy function that is expensive to evaluate. We formalize this task as a multi-armed bandit problem, where the payoff function is either sampled from a Gaussian process (GP) or has low RKHS norm. We resolve the important open problem of deriving regret bounds for this setting, which imply novel convergence rates for GP optimization. We analyze GP-UCB, an intuitive upper-confidence based algorithm, and bound its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design. Moreover, by bounding the latter in terms of operator spectra, we obtain explicit sublinear regret bounds for many commonly used covariance functions. In some important cases, our bounds have surprisingly weak dependence on the dimensionality. In our experiments on real sensor data, GP-UCB compares favorably with other heuristical GP optimization approaches.
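
As a rough illustration of the upper-confidence rule the abstract describes, the sketch below runs GP-UCB over a finite candidate set: each round it queries the point that maximizes the posterior mean plus a scaled posterior standard deviation. The squared-exponential kernel, its lengthscale, the noise level, the helper names (`rbf_kernel`, `gp_posterior`, `gp_ucb`), and the confidence schedule `beta_t = 2 log(|D| t^2 pi^2 / (6 delta))` are assumptions chosen for illustration, not details taken from this page.

```python
import numpy as np


def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between two sets of 1-D points (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def gp_posterior(x_obs, y_obs, x_cand, noise_var=1e-2):
    """Posterior mean and standard deviation of a zero-mean GP at x_cand."""
    K = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_cand)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mu = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - np.sum(v ** 2, axis=0)  # prior variance k(x, x) = 1 for this kernel
    return mu, np.sqrt(np.maximum(var, 0.0))


def gp_ucb(f, x_cand, n_rounds=30, delta=0.1, noise_std=0.1, seed=0):
    """Run GP-UCB on a finite candidate set; returns queried points and noisy rewards."""
    rng = np.random.default_rng(seed)
    xs, ys = [], []
    for t in range(1, n_rounds + 1):
        # Assumed confidence schedule for a finite domain of size |D| = len(x_cand).
        beta_t = 2.0 * np.log(len(x_cand) * t ** 2 * np.pi ** 2 / (6.0 * delta))
        if xs:
            mu, sigma = gp_posterior(np.array(xs), np.array(ys), x_cand, noise_std ** 2)
        else:
            # No data yet: fall back to the prior (zero mean, unit variance).
            mu, sigma = np.zeros(len(x_cand)), np.ones(len(x_cand))
        idx = int(np.argmax(mu + np.sqrt(beta_t) * sigma))  # upper confidence bound
        x_t = x_cand[idx]
        y_t = f(x_t) + noise_std * rng.normal()  # noisy evaluation of the unknown payoff
        xs.append(x_t)
        ys.append(y_t)
    return np.array(xs), np.array(ys)
```

A hypothetical call such as `gp_ucb(lambda x: -(x - 0.7) ** 2, np.linspace(0.0, 1.0, 50))` returns the sequence of queried points and their noisy rewards. The toy objective is only a stand-in; in the setting the abstract describes, each evaluation would be an expensive measurement.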

citation-role summary

background: 3 · method: 1

citation-polarity summary

background: 2 · unclear: 1 · use method: 1

representative citing papers

Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees

cs.LG · 2026-04-15 · unverdicted · novelty 8.0

RHC-UCRL is the first algorithm for safety-constrained RL under explicit adversarial dynamics, providing sub-linear regret and constraint violation guarantees by maintaining optimism over both agent and adversary policies.

Active Learning for Gaussian Process Regression Under Self-Induced Boltzmann Weights

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

AB-SID-iVAR enables Gaussian process active learning for self-induced Boltzmann distributions by closed-form approximation of the target, with high-probability error vanishing guarantees and empirical gains on PES and drug discovery tasks.

FORGE: Fragment-Oriented Ranking and Generation for Context-Aware Molecular Optimization

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

FORGE reformulates molecular optimization as context-aware fragment ranking and replacement using mined low-to-high edit pairs, outperforming larger language models and graph methods on standard benchmarks.

Many Needles in a Haystack: Active Hit Discovery for Perturbation Experiments

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

Probability-of-Hit acquisition function ranks perturbation candidates by posterior probability of threshold exceedance, with asymptotic optimality proof and up to 6.4% gains on real immunology data.

Learning myopic mixed-integer nonlinear model predictive control from expert demonstrations

eess.SY · 2026-05-08 · unverdicted · novelty 7.0

A myopic MINMPC framework learns a value function offline via inverse optimization from expert data, allowing short horizons with near-optimal performance and strict integer feasibility online for hybrid systems.

Spectral bandits

stat.ML · 2026-04-28 · unverdicted · novelty 7.0

Spectral bandits achieve scalable regret in graph-structured recommendation by using an effective dimension to learn good policies from few node evaluations.

Unleashing LLMs in Bayesian Optimization: Preference-Guided Framework for Scientific Discovery

cs.AI · 2026-05-18 · unverdicted · novelty 6.0

LGBO integrates LLM semantic preferences continuously into Bayesian optimization iterations, with a theoretical worst-case guarantee and empirical gains including 90% of best value in 6 iterations on a wet-lab battery task.

Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks

cs.LG · 2024-04-03 · unverdicted · novelty 6.0

NEON provides uncertainty-aware operator learning for composite Bayesian optimization in function spaces using a single network, achieving claimed SOTA with orders of magnitude fewer parameters than ensembles.

Data-Centric Mixed-Variable Bayesian Optimization For Materials Design

physics.comp-ph · 2019-07-04 · conditional · novelty 6.0

A mixed-variable Bayesian optimization framework based on latent variable Gaussian processes is developed and demonstrated on optimizing composition and morphology for insulating polymer nanocomposites, with an extension to multi-objective Pareto optimization.

ADKO: Agentic Decentralized Knowledge Optimization

cs.LG · 2026-05-08 · unverdicted · novelty 6.0

ADKO is a decentralized framework where agents share compact GP-derived tokens and LM insights to achieve collaborative Bayesian optimization with a decomposed regret bound that includes compression and approximation losses.

Decoupled PFNs: Identifiable Epistemic-Aleatoric Decomposition via Structured Synthetic Priors

stat.ML · 2026-05-07 · conditional · novelty 6.0

Decoupled PFNs use controllable synthetic priors to train separate latent-signal and noise heads, making epistemic-aleatoric decomposition identifiable and improving acquisition in noisy settings.

When Do We Need LLMs? A Diagnostic for Language-Driven Bandits

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

Lightweight numerical bandits on text embeddings match or exceed LLM accuracy in contextual bandits at a fraction of the cost, with an embedding-based diagnostic to choose between them.

Multi-Agent Pathfinding with Non-Unit Integer Edge Costs via Enhanced Conflict-Based Search and Graph Discretization

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

MAPFZ extends classical MAPF to non-unit integer costs on graphs with finite states, solved efficiently by CBS-NIC and Bayesian-optimized discretization, outperforming prior methods on benchmarks.

Regret-Based $(\epsilon,\delta)$-optimal Stopping Criteria for Bayesian Optimization

cs.LG · 2026-05-21 · unverdicted · novelty 5.0

The paper derives provably tighter instantaneous regret bounds for GP-UCB and proposes (ε,δ)-optimal stopping criteria for Bayesian optimization based on those bounds.

Nonparametric Learning and Earning with One-Point Feedback under Nonstationarity

cs.LG · 2026-05-20 · unverdicted · novelty 5.0

A restarting-based nonparametric online learning method for dynamic pricing with one-point revenue feedback that achieves regret bounds scaling with time horizon and total market variation.

Uncertainty-Aware Offline Data-Driven Multi-Objective Optimization

cs.NE · 2025-11-09 · unverdicted · novelty 5.0

A dual-ranking strategy improves offline data-driven multi-objective optimization by prioritizing solutions that score well on both predicted performance and low uncertainty across different surrogate models.

A Bandit Approach to Posterior Dialog Orchestration Under a Budget

cs.AI · 2019-06-22 · unverdicted · novelty 5.0

Formalizes budget-constrained posterior dialog orchestration as CABO and evaluates the approach on simulated and proprietary conversational datasets.

Bayesian Optimization of Crossbar-Based Compute-In-Memory System Design for Efficient DNN Inference

cs.ET · 2026-05-08 · unverdicted · novelty 5.0

A multi-objective Bayesian optimization framework co-optimizes CIM crossbar hardware and DNN parameters for VGG8/CIFAR-10 and VGG16/Tiny-ImageNet, achieving comparable accuracy with up to 65% smaller area and 52% lower energy.

Laser-Enhanced Contact Optimization in Silicon Photovoltaics: Mechanisms, Reliability, and Predictive Process Design

physics.app-ph · 2026-03-24 · unverdicted · novelty 3.0

A review of LECO in silicon photovoltaics that frames it as a multiphysics process and outlines a predictive workflow using regime maps and reduced state metrics for stable contact optimization.

citing papers explorer

Showing 19 of 19 citing papers.

Optimistic Policy Learning under Pessimistic Adversaries with Regret and Violation Guarantees · cs.LG · 2026-04-15 · unverdicted · none · ref 30
Active Learning for Gaussian Process Regression Under Self-Induced Boltzmann Weights · cs.LG · 2026-05-11 · unverdicted · none · ref 19
FORGE: Fragment-Oriented Ranking and Generation for Context-Aware Molecular Optimization · cs.LG · 2026-05-11 · unverdicted · none · ref 42
Many Needles in a Haystack: Active Hit Discovery for Perturbation Experiments · cs.LG · 2026-05-11 · unverdicted · none · ref 25
Learning myopic mixed-integer nonlinear model predictive control from expert demonstrations · eess.SY · 2026-05-08 · unverdicted · none · ref 51
Spectral bandits · stat.ML · 2026-04-28 · unverdicted · none · ref 54
Unleashing LLMs in Bayesian Optimization: Preference-Guided Framework for Scientific Discovery · cs.AI · 2026-05-18 · unverdicted · none · ref 2 · internal anchor
Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks · cs.LG · 2024-04-03 · unverdicted · none · ref 36 · internal anchor
Data-Centric Mixed-Variable Bayesian Optimization For Materials Design · physics.comp-ph · 2019-07-04 · conditional · none · ref 23 · internal anchor
ADKO: Agentic Decentralized Knowledge Optimization · cs.LG · 2026-05-08 · unverdicted · none · ref 3
Decoupled PFNs: Identifiable Epistemic-Aleatoric Decomposition via Structured Synthetic Priors · stat.ML · 2026-05-07 · conditional · none · ref 4
When Do We Need LLMs? A Diagnostic for Language-Driven Bandits · cs.AI · 2026-04-07 · unverdicted · none · ref 45
Multi-Agent Pathfinding with Non-Unit Integer Edge Costs via Enhanced Conflict-Based Search and Graph Discretization · cs.AI · 2026-04-07 · unverdicted · none · ref 29
Regret-Based $(\epsilon,\delta)$-optimal Stopping Criteria for Bayesian Optimization · cs.LG · 2026-05-21 · unverdicted · none · ref 20 · internal anchor
Nonparametric Learning and Earning with One-Point Feedback under Nonstationarity · cs.LG · 2026-05-20 · unverdicted · none · ref 53 · internal anchor
Uncertainty-Aware Offline Data-Driven Multi-Objective Optimization · cs.NE · 2025-11-09 · unverdicted · none · ref 4 · internal anchor
A Bandit Approach to Posterior Dialog Orchestration Under a Budget · cs.AI · 2019-06-22 · unverdicted · none · ref 21 · internal anchor
Bayesian Optimization of Crossbar-Based Compute-In-Memory System Design for Efficient DNN Inference · cs.ET · 2026-05-08 · unverdicted · none · ref 38
Laser-Enhanced Contact Optimization in Silicon Photovoltaics: Mechanisms, Reliability, and Predictive Process Design · physics.app-ph · 2026-03-24 · unverdicted · none · ref 138 · internal anchor
