BONSAI: Bayesian Optimization with Natural Simplicity and Interpretability

David Eriksson; Eytan Bakshy; Maximilian Balandat; Samuel Daulton

arxiv: 2602.07144 · v2 · submitted 2026-02-06 · 💻 cs.LG · cs.AI· stat.ML

BONSAI: Bayesian Optimization with Natural Simplicity and Interpretability

Samuel Daulton , David Eriksson , Maximilian Balandat , Eytan Bakshy This is my paper

Pith reviewed 2026-05-16 06:23 UTC · model grok-4.3

classification 💻 cs.LG cs.AIstat.ML

keywords Bayesian optimizationGaussian processesacquisition functionssparsityregret boundsinterpretabilitydefault-aware optimization

0 comments

The pith

BONSAI recovers the minimal relevant coordinate set in Bayesian optimization at zero acquisition cost while matching GP-UCB regret bounds.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Bayesian optimization routinely adjusts many parameters away from a known default even when only a few actually matter. BONSAI inserts a pruning rule that discards low-impact changes while capping the drop in acquisition value. When ARD lengthscales are known, the rule identifies exactly the relevant coordinates without any loss in the acquisition function. The resulting policy therefore inherits the standard GP-UCB regret guarantee yet returns the smallest possible number of non-default parameters. Real-world tests show the approach cuts changed parameters substantially at roughly 1.5 times the usual computation cost.

Core claim

BONSAI is a default-aware BO policy that prunes low-impact deviations from a default configuration while explicitly controlling the loss in acquisition value. It works with standard acquisition functions such as expected improvement and GP-UCB. Assuming known ARD lengthscales, BONSAI provably recovers the relevant-coordinate set at zero acquisition cost, matches the GP-UCB regret rate, and recovers the minimal-ℓ0 solution.

What carries the argument

A pruning operator on the acquisition function that uses known ARD lengthscales to retain only the minimal relevant coordinate set while bounding the loss in acquisition value.

Load-bearing premise

The zero-cost recovery of the relevant set and the matching regret bound both require that the ARD lengthscales are known in advance.

What would settle it

A Gaussian-process sample with known ARD lengthscales on which BONSAI either incurs positive acquisition loss when pruning or selects a larger set of coordinates than the true minimal relevant set.

read the original abstract

Bayesian optimization (BO) is a popular technique for sample-efficient optimization of black-box functions. In many applications, the parameters being tuned come with a carefully engineered default configuration, and practitioners only want to deviate from this default when necessary. Standard BO, however, does not aim to minimize deviation from the default and, in practice, often pushes weakly relevant parameters to the boundary of the search space. This makes it difficult to distinguish between important and spurious changes and increases the burden of vetting recommendations when the optimization objective omits relevant operational considerations. We introduce BONSAI, a default-aware BO policy that prunes low-impact deviations from a default configuration while explicitly controlling the loss in acquisition value. BONSAI is compatible with a variety of acquisition functions, including expected improvement and upper confidence bound (GP-UCB). We theoretically bound the regret incurred by BONSAI, showing that, under certain conditions, it enjoys the same no-regret property as vanilla GP-UCB. Moreover, assuming known ARD lengthscales -- the same assumption underlying GP-UCB regret bounds -- BONSAI provably recovers the relevant-coordinate set at zero acquisition cost, yielding a method that matches the GP-UCB regret rate while recovering the minimal-$\ell_0$ solution -- a guarantee not provided by prior sparse-BO methods. Across many real-world applications, we empirically find that BONSAI substantially reduces the number of non-default parameters in recommended configurations while maintaining competitive optimization performance, with little effect on wall time -- averaging only $1.5\times$ the candidate-generation cost of standard BO, compared to $7$-$34\times$ on average for prior sparse-BO methods (IR, ER, and SEBO).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

BONSAI adds an acquisition-aware pruning step to BO that matches GP-UCB regret and recovers the minimal relevant coordinates at zero cost when lengthscales are known.

read the letter

BONSAI adds a pruning rule to standard Bayesian optimization acquisition functions so that suggested points stay close to a provided default configuration. The rule controls the loss in acquisition value and comes with a regret bound that matches GP-UCB. When ARD lengthscales are known, it also recovers exactly the minimal set of relevant coordinates at no extra acquisition cost, a guarantee the paper says prior sparse BO methods lack. That combination is the main new element. The experiments show the practical payoff: across real-world tasks the method cuts the number of non-default parameters substantially while keeping optimization performance competitive with vanilla BO. Candidate generation cost rises only by a factor of 1.5 on average, far below the overhead reported for IR, ER, and SEBO. The paper is clear that the strongest guarantees rest on known lengthscales, the same assumption used for the GP-UCB regret bound, so no new fragility is introduced there. In practice lengthscales are usually estimated, and the recovery property may not hold exactly, but the reported results indicate the pruning still produces useful sparsity without large performance drops. The work targets practitioners who run BO on problems that start from engineered defaults and need recommendations that are easy to vet and implement, such as in production ML systems or engineering design. A reader who values sparse, interpretable outputs will get concrete tools and supporting evidence. I would send this to peer review. The regret analysis is stated precisely and the empirical controls look reasonable, so referees can check the derivations and experiment details without starting from scratch.

Referee Report

0 major / 2 minor

Summary. The paper introduces BONSAI, a default-aware Bayesian optimization policy that prunes low-impact deviations from a given default configuration while explicitly controlling the loss in acquisition value. It is compatible with standard acquisition functions including expected improvement and GP-UCB. The authors derive a regret bound showing that, under certain conditions, BONSAI retains the no-regret property of vanilla GP-UCB. Assuming known ARD lengthscales (the same assumption used for standard GP-UCB bounds), BONSAI provably recovers the minimal-ℓ0 relevant coordinate set at zero acquisition cost. Empirical results across real-world tasks show that BONSAI substantially reduces the number of non-default parameters while maintaining competitive optimization performance, with only modest increase in candidate-generation cost relative to prior sparse BO methods.

Significance. If the stated guarantees hold, BONSAI addresses a practical gap in Bayesian optimization by producing recommendations that are naturally sparse with respect to a default configuration, thereby improving interpretability and reducing the vetting burden without sacrificing sample efficiency. The zero-cost recovery of the minimal-ℓ0 set under known ARD lengthscales is a distinctive theoretical contribution not present in prior sparse-BO approaches. The empirical overhead of 1.5× versus 7–34× for alternatives further supports deployability in settings where default-respecting solutions are preferred.

minor comments (2)

The abstract states that BONSAI 'averages only 1.5× the candidate-generation cost'; a brief clarification of the precise timing metric (e.g., wall-clock time per iteration or total optimization time) would aid reproducibility.
In the theoretical section, the precise statement of the 'certain conditions' under which the regret bound matches GP-UCB should be cross-referenced to the corresponding theorem number for easier navigation.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their thorough reading and positive evaluation of the manuscript. We are grateful for the recommendation to accept and for highlighting the practical relevance of default-aware recommendations, the zero-cost recovery of the minimal-ℓ0 set, and the modest computational overhead relative to prior sparse BO methods.

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The derivation chain extends standard GP-UCB regret analysis under the explicitly stated known-ARD-lengthscales assumption, which is the identical external premise used for vanilla GP-UCB bounds rather than a quantity fitted or defined inside BONSAI. The zero-acquisition-cost recovery of the minimal-ℓ0 coordinate set follows directly from the pruning rule applied to the ARD kernel structure and is proven without redefining the acquisition value or regressing parameters to the target result. No self-citation is load-bearing for the central guarantees, no ansatz is smuggled, and the method remains compatible with external acquisition functions without reducing predictions to fitted inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The method builds on standard Gaussian process assumptions and the ARD kernel; the pruning rule introduces one tunable control on acquisition loss but no new free parameters are fitted inside the central claim. No new entities are postulated.

axioms (2)

standard math Gaussian process prior with ARD kernel
Invoked for both the acquisition function and the regret analysis; same assumption used in GP-UCB bounds.
domain assumption Known ARD lengthscales
Required for the zero-acquisition-cost recovery of the relevant coordinate set; stated explicitly in the abstract.

pith-pipeline@v0.9.0 · 5626 in / 1407 out tokens · 27526 ms · 2026-05-16T06:23:22.276883+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

assuming known ARD lengthscales ... BONSAI provably recovers the relevant-coordinate set at zero acquisition cost, yielding a method that matches the GP-UCB regret rate while recovering the minimal-ℓ0 solution
IndisputableMonolith/Foundation/BranchSelection.lean branch_selection unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

regret bound ... O(√(γT T) + √γT Σ ρt)

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.