pith. sign in

arxiv: 2604.24537 · v1 · submitted 2026-04-27 · 💻 cs.LG · stat.ML

Stochastic simultaneous optimistic optimization

Pith reviewed 2026-05-08 04:03 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords stochastic optimizationglobal maximizationnoisy evaluationsoptimistic algorithmshierarchical partitioningbandit problemssemi-metric smoothnessfinite-time analysis
0
0 comments X

The pith

StoSOO maximizes noisy functions nearly as well as tuned algorithms without knowing the local smoothness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents StoSOO for global maximization of a function observed through noisy evaluations. It operates under the weak assumption that the function is locally smooth around one global maximum with respect to some unknown semi-metric. The method builds hierarchical partitions of the domain and selects sample points by constructing optimistic upper bounds on each partition cell. A finite-time analysis establishes that StoSOO attains performance close to that of algorithms tuned with full knowledge of the smoothness parameters. This removes the need for prior metric information while retaining near-optimal sample efficiency.

Core claim

StoSOO follows an optimistic strategy to iteratively construct upper confidence bounds over the hierarchical partitions of the function domain to decide which point to sample next. A finite-time analysis of StoSOO shows that it performs almost as well as the best specifically-tuned algorithms even though the local smoothness of the function is not known.

What carries the argument

StoSOO, which uses upper confidence bounds on hierarchical partitions to select samples without knowledge of the underlying semi-metric.

If this is right

  • Optimization problems in general spaces can proceed without separate tuning of smoothness parameters.
  • The same hierarchical optimistic strategy applies to other noisy global search tasks that admit suitable partitions.
  • Regret bounds remain controlled even when the semi-metric is discovered on the fly through sampling.
  • The approach extends earlier bandit algorithms to settings where the metric is not supplied in advance.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hierarchical partitioning can serve as a substitute for explicit metric knowledge in a wider range of stochastic search problems.
  • The method may be combined with other partitioning schemes to handle functions that are only locally smooth near multiple maxima.
  • Similar optimistic bounding ideas could yield parameter-free algorithms for related sequential decision problems.

Load-bearing premise

The function is locally smooth around one of its global maxima with respect to some unknown semi-metric, and the domain allows a hierarchical partitioning that respects this semi-metric.

What would settle it

An empirical run on a test function known to be locally smooth around its maximum, where StoSOO requires substantially more samples than the best algorithm tuned with the true semi-metric to reach the same optimization accuracy.

Figures

Figures reproduced from arXiv: 2604.24537 by Alexandra Carpentier, Michal Valko, R\'emi Munos.

Figure 1
Figure 1. Figure 1: Functions with d = 0. Left: Two-sine product function f1(x) = 1 2 (sin(13x)· sin(27x)) + 0.5. Right: Gar￾land function: f2(x) = 4x(1−x)·( 3 4 + 1 4 (1− p | sin(60x)|)). 5.2. Examples Let us consider the case of functions f defined on [0, 1]D that are locally equivalent to a polynomial of degree α around their maximum, i.e., f(x) − f(x ∗ ) = Θ(∥x − x ∗∥ α) for some α > 0, where ∥ · ∥ is any norm. The choice… view at source ↗
Figure 3
Figure 3. Figure 3: We illustrate the case of a function with different order in the upper and lower envelopes, when ℓ(x, y) = |x−y| α . Here f(x) = 1− √ x+(−x 2+ √ x)·(sin(1/x2 )+1)/2. The lower-envelope behaves like a square root whereas the upper one is quadratic. The maximum number of ℓ-balls with radius ε that can pack Xε (i.e., Euclidean balls with radius ε 1/α) is at most of order ε 1/2 /ε1/α ≤ ε −3/2 , since α ≤ 1/2 i… view at source ↗
Figure 2
Figure 2. Figure 2: Any function satisfying (11) lies in the gray area and possesses a lower- and upper-envelopes that are of same order around x ∗ . In the case when ε < η, due the upper envelope we have that: Xε ⊂ {x : cℓ(x, x∗ ) ≤ ε}, which corre￾sponds to an ℓ-ball centered in x ∗ with radius ε/c. This ball can be packed by no more than a constant number of C ′ ℓ-balls of radius ε. C ′ is necessarily finite because the do… view at source ↗
Figure 4
Figure 4. Figure 4: Functions from view at source ↗
Figure 5
Figure 5. Figure 5: displays the performance of StoSOO for differ￾ent levels of noise. We observe that as we increase the number of evaluations, the regret of StoSOO decreases. 10 50 100 500 1000 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 regret (loss) number of function evaluations 10 50 100 500 1000 −0.05 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 regret (loss) number of function evaluations 10 50 100 500 1000 0 0.05 0.1 0.15 0.2 0.2… view at source ↗
Figure 6
Figure 6. Figure 6: StoSOO (dia￾monds) vs. Stochastic DOO with ℓ1 (circles) and ℓ2 (squares) on f1. In view at source ↗
read the original abstract

We study the problem of global maximization of a function f given a finite number of evaluations perturbed by noise. We consider a very weak assumption on the function, namely that it is locally smooth (in some precise sense) with respect to some semi-metric, around one of its global maxima. Compared to previous works on bandits in general spaces (Kleinberg et al., 2008; Bubeck et al., 2011a) our algorithm does not require the knowledge of this semi-metric. Our algorithm, StoSOO, follows an optimistic strategy to iteratively construct upper confidence bounds over the hierarchical partitions of the function domain to decide which point to sample next. A finite-time analysis of StoSOO shows that it performs almost as well as the best specifically-tuned algorithms even though the local smoothness of the function is not known.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces StoSOO, an algorithm for noisy global maximization of a function that is locally smooth with respect to an unknown semi-metric around one of its global maxima. The algorithm follows an optimistic strategy by constructing upper confidence bounds over a given hierarchical partitioning of the domain and selects the next evaluation point accordingly. A finite-time analysis is provided showing that StoSOO achieves regret bounds comparable (up to small factors) to those of algorithms that require explicit knowledge of the semi-metric.

Significance. If the finite-time analysis holds, the result is significant because it removes the need for the algorithm to know the semi-metric, addressing a practical limitation of prior work on bandits in general spaces (e.g., Kleinberg et al. 2008, Bubeck et al. 2011). The analysis treats the hierarchical partitioning as given (standard in the SOO family) and uses the semi-metric only to quantify the resulting bound, which is consistent with the central claim.

minor comments (3)
  1. [Abstract] Abstract: the phrase 'almost as well' is informal; the introduction or analysis section should explicitly state the multiplicative factors appearing in the regret bound relative to the tuned baselines.
  2. [Analysis section] The finite-time analysis (presumably Theorem 1 or the main regret theorem) should include a short discussion of how the noise variance enters the concentration inequalities used for the upper confidence bounds, as this is load-bearing for the stochastic setting.
  3. [Section 2] Related work: the comparison to Bubeck et al. (2011a) could be expanded to clarify the precise improvement in the dependence on the unknown smoothness parameters.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary and recommendation of minor revision. We are pleased that the significance of StoSOO, which achieves regret bounds comparable to tuned algorithms without knowledge of the semi-metric, is recognized as addressing a practical limitation in prior work on bandits in general spaces.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation chain consists of a standard optimistic hierarchical partitioning strategy whose finite-time regret bound is obtained by applying concentration inequalities to the local smoothness assumption with respect to an unknown semi-metric. The algorithm itself never uses the semi-metric; the metric appears only in the analysis to quantify the resulting rate. No parameter is fitted to the paper's own data and then renamed as a prediction, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled in. The central claim therefore remains independent of its own outputs and is self-contained under the stated external assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the local smoothness assumption around a global maximum and the existence of a suitable hierarchical partition of the domain; no free parameters or invented entities are visible from the abstract.

axioms (1)
  • domain assumption The function is locally smooth with respect to some semi-metric around one of its global maxima
    Explicitly stated in the abstract as the weak assumption used by the algorithm.

pith-pipeline@v0.9.0 · 5434 in / 1127 out tokens · 59527 ms · 2026-05-08T04:03:57.247218+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

  1. [1]

    Finite-time Analysis of the Multiarmed Bandit Problem.Machine Learning, 47(2-3):235–256, 2002

    Auer, Peter, Cesa-Bianchi, Nicol` o, and Fischer, Paul. Finite-time Analysis of the Multiarmed Bandit Problem.Machine Learning, 47(2-3):235–256, 2002

  2. [2]

    Online Optimization of X-armed Bandits

    Bubeck, S´ ebastien, Munos, R´ emi, Stoltz, Gilles, and Szepesv´ ari, Csaba. Online Optimization of X-armed Bandits. InNeural Information Processing Systems (NeurIPS), pp. 201–208, 2008

  3. [3]

    Pure Exploration in Multi-armed Bandits Problems

    Bubeck, S´ ebastien, Munos, R´ emi, and Stoltz, Gilles. Pure Exploration in Multi-armed Bandits Problems. Algorithmic Learning Theory, pp. 23–37, 2009

  4. [4]

    Convergence rates of efficient global optimization algorithms.The Journal of Machine Learning Research, 12:2879–2904, 2011

    Bull, Adam. Convergence rates of efficient global optimization algorithms.The Journal of Machine Learning Research, 12:2879–2904, 2011

  5. [5]

    Bandit Algorithms for Tree Search

    Coquelin, Pierre-Arnaud and Munos, R´ emi. Bandit Algorithms for Tree Search. InUncertainty in Arti- ficial Intelligence, pp. 67–74, 2007

  6. [6]

    The grand challenge of computer Go: Monte Carlo tree search and extensions.Com- mun

    Teytaud, Olivier. The grand challenge of computer Go: Monte Carlo tree search and extensions.Com- mun. ACM, 55(3):106–113, March 2012

  7. [7]

    Pure and Applied Mathematics Series

    Hansen, Eldon and Walster, William.Global Opti- mization Using Interval Analysis: Revised and Ex- panded. Pure and Applied Mathematics Series. Mar- cel Dekker, 2004

  8. [8]

    Optimistic Planning of Deterministic Systems

    Hren, Jean-Francois and Munos, R´ emi. Optimistic Planning of Deterministic Systems. InEuropean Workshop on Reinforcement Learning, pp. 151–164, 2008

  9. [9]

    Lipschitzian optimization without the Lipschitz con- stant.Journal of Optimization Theory and Applica- tions, 79(1):157–181, 1993

    Jones, David, Perttunen, Cary, and Stuckman, Bruce. Lipschitzian optimization without the Lipschitz con- stant.Journal of Optimization Theory and Applica- tions, 79(1):157–181, 1993

  10. [10]

    Nonconvex Optimization and Its Ap- plications

    Kearfott, R Baker.Rigorous Global Search: Continu- ous Problems. Nonconvex Optimization and Its Ap- plications. Springer, 1996

  11. [11]

    Multi-armed bandit problems in metric spaces

    Kleinberg, Robert, Slivkins, Alexander, and Upfal, Eli. Multi-armed bandit problems in metric spaces. In Proceedings of the 40th ACM symposium on Theory Of Computing, pp. 681–690, 2008

  12. [12]

    Bandit based Monte-Carlo Planning

    Kocsis, Levente and Szepesv´ ari, Csaba. Bandit based Monte-Carlo Planning. InProceedings of the 15th European Conference on Machine Learning, pp. 282–293. Springer, 2006

  13. [13]

    Optimistic Optimization of Determinis- tic Functions without the Knowledge of its Smooth- ness

    Munos, R´ emi. Optimistic Optimization of Determinis- tic Functions without the Knowledge of its Smooth- ness. InNeural Information Processing Systems (NeurIPS), pp. 783–791, 2011

  14. [14]

    Encyclopedia of Mathematics and its Applications

    Neumaier, Arnold.Interval Methods for Systems of Equations. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2008

  15. [15]

    PhD thesis, University of Oxford, 2010

    Osborne, Michael.Bayesian Gaussian processes for sequential prediction, optimisation and quadrature. PhD thesis, University of Oxford, 2010. Pint´ er, J´ anos.Global Optimization in Action: Contin- uous and Lipschitz Optimization: Algorithms, Im- plementations and Applications. Nonconvex Opti- mization and Its Applications. Springer, 1995

  16. [16]

    Multi-armed bandits on implicit metric spaces

    Slivkins, Aleksandrs. Multi-armed bandits on implicit metric spaces. InNeural Information Processing Systems (NeurIPS), pp. 1602–1610. 2011

  17. [17]

    Gaussian Process Optimiza- tion in the Bandit Setting: No Regret and Exper- imental Design.International Conference on Ma- chine Learning (ICML), pp

    Srinivas, Niranjan, Krause, Andreas, Kakade, Sham, and Seeger, Matthias. Gaussian Process Optimiza- tion in the Bandit Setting: No Regret and Exper- imental Design.International Conference on Ma- chine Learning (ICML), pp. 1015–1022, 2010

  18. [18]

    Nonconvex Optimization and Its Applications

    Strongin, Roman and Sergeyev, Yaroslav.Global Opti- mization with Non-Convex Constraints: Sequential and Parallel Algorithms. Nonconvex Optimization and Its Applications. Springer, 2000