Multi-Objective Bayesian Optimization via Adaptive \varepsilon-Constraints Decomposition

Sammie Katt; Samuel Kaski; Yaohong Yang

arxiv: 2604.15959 · v2 · pith:PDNMXF3Fnew · submitted 2026-04-17 · 💻 cs.LG

Multi-Objective Bayesian Optimization via Adaptive varepsilon-Constraints Decomposition

Yaohong Yang , Sammie Katt , Samuel Kaski This is my paper

Pith reviewed 2026-05-10 08:36 UTC · model grok-4.3

classification 💻 cs.LG

keywords coverageadaptivebayesianconstraintsoptimizationparetoconstraintfront

0 comments

The pith

STAGE-BO improves multi-objective Bayesian optimization by sequentially targeting the largest geometric gaps in the approximate Pareto front via epsilon-constraint decomposition solved with constrained expected improvement.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Multi-objective optimization involves finding good compromises among several conflicting goals, such as maximizing speed while minimizing cost in a design problem. When each evaluation is expensive, Bayesian optimization uses a probabilistic model to decide what to try next. The challenge grows with more objectives because the set of best compromises, called the Pareto front, becomes harder to cover evenly. Existing methods often rely on hypervolume calculations to measure progress, which can become costly or unstable as the number of objectives increases. STAGE-BO instead looks at the current approximate front and measures the largest empty geometric gaps between points. Each gap is turned into an inequality constraint that defines a subproblem. These subproblems are solved one after another using constrained expected improvement, which balances exploration and exploitation while respecting the gap constraint. The process repeats, filling gaps sequentially. Because the method works directly with gaps rather than a global volume measure, it avoids hypervolume computation entirely. The same gap-based constraint mechanism extends naturally to problems that already have constraints or user preferences. On synthetic test functions and real-world benchmarks the approach produced more uniform coverage of the front while remaining competitive on hypervolume scores compared with current baselines.

Core claim

Our approach provides a uniform Pareto coverage without hypervolume computation and naturally applies to constrained and preference-based settings.

Load-bearing premise

That identifying the largest geometric gaps in the current approximate Pareto front and converting them into inequality constraints will reliably produce uniform coverage when solved sequentially with constrained expected improvement.

Figures

Figures reproduced from arXiv: 2604.15959 by Sammie Katt, Samuel Kaski, Yaohong Yang.

**Figure 1.** Figure 1: Comparison of Pareto front approximation on ZDT1 and ZDT2 benchmarks. While qEHVI (left) and our method (right) achieve comparable hypervolume (HV), our approach yields an order-of-magnitude reduction in IGD, demonstrating significantly better uniform coverage of the Pareto front. This motivates the need for multiple metrics to assess solution quality. meaning that improving one objective may deteriorate … view at source ↗

**Figure 2.** Figure 2: Illustration of the STAGE-BO algorithm. The blue dots represent observations. The dashed orange curve depicts the sampled Pareto front approximation (Pet f ), generated via Thompson sampling and NSGA-II. The red point A = (y1, y2) is identified on P˜t f as having the maxmin distance to the existing blue observations. Assuming a schedule where f1 is targeted for optimization at this step, a constraint is es… view at source ↗

**Figure 3.** Figure 3: Comparison of preference handling strategies. Left: Preference scalarization methods (Paria et al., 2020) map preferred regions (shaded boxes) to preference weights dependent on a reference point z. Right: Our geometric approach operates without a reference point, directly targeting Pareto-optimal solutions that satisfy either the lower or upper bounds of the specified regions. 4.3. Extensions to MOBO wit… view at source ↗

**Figure 4.** Figure 4: Comparison of our method with state-of-the-art baselines on two synthetic and two real-world benchmark MOO problems. The first row reports hypervolume, and the second row reports IGD. Overall, our method achieves comparable or superior hypervolume relative to the baselines and consistently outperforms them in terms of IGD. 20 40 60 80 100 -2 -1 Log HV Difference MW7 20 40 60 80 100 -2 0 2 CONSTR 20 40 60 8… view at source ↗

**Figure 5.** Figure 5: Comparison of our method with state-of-the-art baselines on one synthetic and three real-world benchmark constrained MOO problems. The first row reports hypervolume, and the second row reports IGD. Overall, our method achieves comparable or superior hypervolume relative to the baselines and consistently outperforms them in terms of IGD. Our method consistently performs outperforms others with respect to IG… view at source ↗

**Figure 6.** Figure 6: Comparison of our method with state-of-the-art baselines on two synthetic and two real-world benchmark MOO problems with preferred regions. The first row reports hypervolume, and the second row reports IGD. Overall, our method achieves superior hypervolume relative to the baselines and consistently outperforms them in terms of IGD. brake design (d = 4, m = 2, c = 4), Gear train design (d = 4, m = 2, c = 1… view at source ↗

**Figure 7.** Figure 7: Ablation study on different parts of our method. The left panels show that cEI and computing maxmin to set the constraints are necessary. The right panels show that our method is robust to the strategy of picking the objective to optimize. 13 [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗

**Figure 8.** Figure 8: Computation time for STAGE-BO and all baselines. Our method shows high efficiency in low dimensions when m ≤ 4 and remains computationally tractable for higher-dimensional objective spaces (m ≥ 4). F. Additional Experiments Results F.1. Additional Evaluation Metrics Besides hypervolume and IGD, here we present more evaluation metrics. IGD+ (Ishibuchi et al., 2015) IGD+(Yt,Pf ) = 1 |Pf | ( X y∈Pf min y′∈Yt … view at source ↗

**Figure 9.** Figure 9: Comparison of our method with state-of-the-art baselines on two synthetic and two real-world benchmark MOO problems. The first row reports IGD+, and the second row reports fill distance. Overall, our method achieves comparable or superior performance [PITH_FULL_IMAGE:figures/full_fig_p014_9.png] view at source ↗

**Figure 10.** Figure 10: Comparison of our method with state-of-the-art baselines on one synthetic and three real-world benchmark constrained MOO problems. The first row reports feasible ratio, the second row reports IGD+, and the last row reports fill distance. Overall, our method achieves comparable or superior performance. Lastly, we show the IGD+ and fill distance results on preference-aware MOO tasks in [PITH_FULL_IMAGE:fig… view at source ↗

**Figure 11.** Figure 11: Comparison of our method with state-of-the-art baselines on two synthetic and two real-world benchmark MOO problems with preferred regions. The first row reports hypervolume, and the second row reports IGD. Overall, our method consistently outperforms baselinese in terms of IGD+ and fill distance. 10 20 30 40 50 -4 -2 0 2 4 ZDT2 Log HV Difference 10 20 30 40 50 -4 -2 0 Log IGD 10 20 30 40 50 -4 -2 0 Log I… view at source ↗

**Figure 12.** Figure 12: Comparison of our method with state-of-the-art baselines on one synthetic and one real-world benchmark MOO problems. The first row reports the results for ZDT2, and the second row reports results for Coil compression spring design. Overall, our method achieves comparable or superior hypervolume relative to the baselines and consistently outperforms them in terms of IGD and fill distance. 16 [PITH_FULL_IM… view at source ↗

read the original abstract

Multi-objective Bayesian optimization (MOBO) provides a principled framework for optimizing multiple expensive black-box functions. However, existing MOBO methods often struggle with coverage, scalability, and handling constraints and preferences. In this work we propose STAGE-BO, Sequential Targeting Adaptive Gap-Filling $\varepsilon$-Constraint Bayesian Optimization: by analyzing the coverage of the surrogate Pareto front, our method identifies the Pareto front point with the largest uncovered gap, and uses its coordinates to define adaptive constraints in $\varepsilon$-constraint method, which transforms the problem into a sequence of inequality-constrained subproblems, efficiently solved via constrained expected improvement acquisition. Our approach provides uniform Pareto coverage without hypervolume computation and naturally handles constraints and preferences. Experiments on synthetic and real-world benchmarks demonstrate superior coverage and competitive hypervolume performance against state-of-the-art baselines. Our code implementation can be found at https://github.com/YangYaohong1/STAGE-BO.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

STAGE-BO targets gaps in the current Pareto front with adaptive epsilon-constraints and constrained EI to improve coverage without hypervolume calculations.

read the letter

The paper introduces STAGE-BO as a decomposition approach for multi-objective Bayesian optimization. It identifies the largest geometric gaps on the approximate front, converts them into inequality constraints, and solves the resulting subproblems sequentially with constrained expected improvement. This is the core new piece: a gap-driven adaptive epsilon-constraint strategy that sidesteps hypervolume entirely while extending to constrained and preference-based cases without extra machinery. The experiments on synthetic benchmarks and real-world problems show competitive hypervolume values alongside better coverage metrics than the baselines they compare against. That combination is useful because hypervolume computation grows expensive with more objectives, and many applications care about filling out the front rather than just its volume. The method is presented as empirical, with no convergence proof claimed, which matches what the benchmarks can support. The gap identification step is straightforward to implement from the description, and the results hold up across the tested settings. One soft spot is that uniformity of coverage still depends on how reliably the largest-gap heuristic keeps working as the front evolves; the paper reports improved metrics but does not explore edge cases like highly non-convex fronts or very large numbers of objectives in depth. A second minor point is that the choice of gap metric and the number of sequential subproblems per iteration could affect outcomes, though the reported runs appear stable. Overall the evidence is proportionate to the claims. This paper is for people already working on MOBO who need a coverage-focused alternative that scales better than hypervolume methods and handles constraints without redesign. A reader looking for practical extensions or benchmark comparisons will find concrete value. It deserves a serious referee because the contribution is clearly stated, the experiments are reproducible from the description, and the approach addresses a genuine limitation in existing decomposition and hypervolume techniques. I would send it to peer review.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Full details unavailable from abstract; the method appears to rest on standard assumptions of Bayesian optimization and Pareto dominance.

axioms (1)

domain assumption The current approximate Pareto front can be meaningfully decomposed into geometric gaps that serve as useful inequality constraints.
Central to the gap-filling procedure described in the abstract.

pith-pipeline@v0.9.0 · 5452 in / 1261 out tokens · 56503 ms · 2026-05-10T08:36:36.695464+00:00 · methodology

Multi-Objective Bayesian Optimization via Adaptive varepsilon-Constraints Decomposition

Core claim

Load-bearing premise

discussion (0)