Stochastic simultaneous optimistic optimization
Pith reviewed 2026-05-08 04:03 UTC · model grok-4.3
The pith
StoSOO maximizes noisy functions nearly as well as tuned algorithms without knowing the local smoothness.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
StoSOO follows an optimistic strategy to iteratively construct upper confidence bounds over the hierarchical partitions of the function domain to decide which point to sample next. A finite-time analysis of StoSOO shows that it performs almost as well as the best specifically-tuned algorithms even though the local smoothness of the function is not known.
What carries the argument
StoSOO, which uses upper confidence bounds on hierarchical partitions to select samples without knowledge of the underlying semi-metric.
If this is right
- Optimization problems in general spaces can proceed without separate tuning of smoothness parameters.
- The same hierarchical optimistic strategy applies to other noisy global search tasks that admit suitable partitions.
- Regret bounds remain controlled even when the semi-metric is discovered on the fly through sampling.
- The approach extends earlier bandit algorithms to settings where the metric is not supplied in advance.
Where Pith is reading between the lines
- Hierarchical partitioning can serve as a substitute for explicit metric knowledge in a wider range of stochastic search problems.
- The method may be combined with other partitioning schemes to handle functions that are only locally smooth near multiple maxima.
- Similar optimistic bounding ideas could yield parameter-free algorithms for related sequential decision problems.
Load-bearing premise
The function is locally smooth around one of its global maxima with respect to some unknown semi-metric, and the domain allows a hierarchical partitioning that respects this semi-metric.
What would settle it
An empirical run on a test function known to be locally smooth around its maximum, where StoSOO requires substantially more samples than the best algorithm tuned with the true semi-metric to reach the same optimization accuracy.
Figures
read the original abstract
We study the problem of global maximization of a function f given a finite number of evaluations perturbed by noise. We consider a very weak assumption on the function, namely that it is locally smooth (in some precise sense) with respect to some semi-metric, around one of its global maxima. Compared to previous works on bandits in general spaces (Kleinberg et al., 2008; Bubeck et al., 2011a) our algorithm does not require the knowledge of this semi-metric. Our algorithm, StoSOO, follows an optimistic strategy to iteratively construct upper confidence bounds over the hierarchical partitions of the function domain to decide which point to sample next. A finite-time analysis of StoSOO shows that it performs almost as well as the best specifically-tuned algorithms even though the local smoothness of the function is not known.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces StoSOO, an algorithm for noisy global maximization of a function that is locally smooth with respect to an unknown semi-metric around one of its global maxima. The algorithm follows an optimistic strategy by constructing upper confidence bounds over a given hierarchical partitioning of the domain and selects the next evaluation point accordingly. A finite-time analysis is provided showing that StoSOO achieves regret bounds comparable (up to small factors) to those of algorithms that require explicit knowledge of the semi-metric.
Significance. If the finite-time analysis holds, the result is significant because it removes the need for the algorithm to know the semi-metric, addressing a practical limitation of prior work on bandits in general spaces (e.g., Kleinberg et al. 2008, Bubeck et al. 2011). The analysis treats the hierarchical partitioning as given (standard in the SOO family) and uses the semi-metric only to quantify the resulting bound, which is consistent with the central claim.
minor comments (3)
- [Abstract] Abstract: the phrase 'almost as well' is informal; the introduction or analysis section should explicitly state the multiplicative factors appearing in the regret bound relative to the tuned baselines.
- [Analysis section] The finite-time analysis (presumably Theorem 1 or the main regret theorem) should include a short discussion of how the noise variance enters the concentration inequalities used for the upper confidence bounds, as this is load-bearing for the stochastic setting.
- [Section 2] Related work: the comparison to Bubeck et al. (2011a) could be expanded to clarify the precise improvement in the dependence on the unknown smoothness parameters.
Simulated Author's Rebuttal
We thank the referee for the positive summary and recommendation of minor revision. We are pleased that the significance of StoSOO, which achieves regret bounds comparable to tuned algorithms without knowledge of the semi-metric, is recognized as addressing a practical limitation in prior work on bandits in general spaces.
Circularity Check
No significant circularity
full rationale
The derivation chain consists of a standard optimistic hierarchical partitioning strategy whose finite-time regret bound is obtained by applying concentration inequalities to the local smoothness assumption with respect to an unknown semi-metric. The algorithm itself never uses the semi-metric; the metric appears only in the analysis to quantify the resulting rate. No parameter is fitted to the paper's own data and then renamed as a prediction, no self-citation supplies a load-bearing uniqueness theorem, and no ansatz is smuggled in. The central claim therefore remains independent of its own outputs and is self-contained under the stated external assumptions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The function is locally smooth with respect to some semi-metric around one of its global maxima
Reference graph
Works this paper leans on
-
[1]
Finite-time Analysis of the Multiarmed Bandit Problem.Machine Learning, 47(2-3):235–256, 2002
Auer, Peter, Cesa-Bianchi, Nicol` o, and Fischer, Paul. Finite-time Analysis of the Multiarmed Bandit Problem.Machine Learning, 47(2-3):235–256, 2002
work page 2002
-
[2]
Online Optimization of X-armed Bandits
Bubeck, S´ ebastien, Munos, R´ emi, Stoltz, Gilles, and Szepesv´ ari, Csaba. Online Optimization of X-armed Bandits. InNeural Information Processing Systems (NeurIPS), pp. 201–208, 2008
work page 2008
-
[3]
Pure Exploration in Multi-armed Bandits Problems
Bubeck, S´ ebastien, Munos, R´ emi, and Stoltz, Gilles. Pure Exploration in Multi-armed Bandits Problems. Algorithmic Learning Theory, pp. 23–37, 2009
work page 2009
-
[4]
Bull, Adam. Convergence rates of efficient global optimization algorithms.The Journal of Machine Learning Research, 12:2879–2904, 2011
work page 2011
-
[5]
Bandit Algorithms for Tree Search
Coquelin, Pierre-Arnaud and Munos, R´ emi. Bandit Algorithms for Tree Search. InUncertainty in Arti- ficial Intelligence, pp. 67–74, 2007
work page 2007
-
[6]
The grand challenge of computer Go: Monte Carlo tree search and extensions.Com- mun
Teytaud, Olivier. The grand challenge of computer Go: Monte Carlo tree search and extensions.Com- mun. ACM, 55(3):106–113, March 2012
work page 2012
-
[7]
Pure and Applied Mathematics Series
Hansen, Eldon and Walster, William.Global Opti- mization Using Interval Analysis: Revised and Ex- panded. Pure and Applied Mathematics Series. Mar- cel Dekker, 2004
work page 2004
-
[8]
Optimistic Planning of Deterministic Systems
Hren, Jean-Francois and Munos, R´ emi. Optimistic Planning of Deterministic Systems. InEuropean Workshop on Reinforcement Learning, pp. 151–164, 2008
work page 2008
-
[9]
Jones, David, Perttunen, Cary, and Stuckman, Bruce. Lipschitzian optimization without the Lipschitz con- stant.Journal of Optimization Theory and Applica- tions, 79(1):157–181, 1993
work page 1993
-
[10]
Nonconvex Optimization and Its Ap- plications
Kearfott, R Baker.Rigorous Global Search: Continu- ous Problems. Nonconvex Optimization and Its Ap- plications. Springer, 1996
work page 1996
-
[11]
Multi-armed bandit problems in metric spaces
Kleinberg, Robert, Slivkins, Alexander, and Upfal, Eli. Multi-armed bandit problems in metric spaces. In Proceedings of the 40th ACM symposium on Theory Of Computing, pp. 681–690, 2008
work page 2008
-
[12]
Bandit based Monte-Carlo Planning
Kocsis, Levente and Szepesv´ ari, Csaba. Bandit based Monte-Carlo Planning. InProceedings of the 15th European Conference on Machine Learning, pp. 282–293. Springer, 2006
work page 2006
-
[13]
Optimistic Optimization of Determinis- tic Functions without the Knowledge of its Smooth- ness
Munos, R´ emi. Optimistic Optimization of Determinis- tic Functions without the Knowledge of its Smooth- ness. InNeural Information Processing Systems (NeurIPS), pp. 783–791, 2011
work page 2011
-
[14]
Encyclopedia of Mathematics and its Applications
Neumaier, Arnold.Interval Methods for Systems of Equations. Encyclopedia of Mathematics and its Applications. Cambridge University Press, 2008
work page 2008
-
[15]
PhD thesis, University of Oxford, 2010
Osborne, Michael.Bayesian Gaussian processes for sequential prediction, optimisation and quadrature. PhD thesis, University of Oxford, 2010. Pint´ er, J´ anos.Global Optimization in Action: Contin- uous and Lipschitz Optimization: Algorithms, Im- plementations and Applications. Nonconvex Opti- mization and Its Applications. Springer, 1995
work page 2010
-
[16]
Multi-armed bandits on implicit metric spaces
Slivkins, Aleksandrs. Multi-armed bandits on implicit metric spaces. InNeural Information Processing Systems (NeurIPS), pp. 1602–1610. 2011
work page 2011
-
[17]
Srinivas, Niranjan, Krause, Andreas, Kakade, Sham, and Seeger, Matthias. Gaussian Process Optimiza- tion in the Bandit Setting: No Regret and Exper- imental Design.International Conference on Ma- chine Learning (ICML), pp. 1015–1022, 2010
work page 2010
-
[18]
Nonconvex Optimization and Its Applications
Strongin, Roman and Sergeyev, Yaroslav.Global Opti- mization with Non-Convex Constraints: Sequential and Parallel Algorithms. Nonconvex Optimization and Its Applications. Springer, 2000
work page 2000
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.