pith. sign in

arxiv: 2502.08208 · v1 · pith:UGJJQVVGnew · submitted 2025-02-12 · 💻 cs.LG

Exploring Exploration in Bayesian Optimization

Pith reviewed 2026-05-23 03:25 UTC · model grok-4.3

classification 💻 cs.LG
keywords Bayesian optimizationacquisition functionsexplorationobservation entropytraveling salesman distanceblack-box optimizationexploration-exploitation
0
0 comments X

The pith

Acquisition functions can have their exploration quantified using only the points they select via observation traveling salesman distance and observation entropy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces two measures, observation traveling salesman distance and observation entropy, that assess how much exploration an acquisition function performs by examining only the points it chooses during optimization. These are applied to standard acquisition functions on multiple black-box problems to compare their exploration behavior and connect it to observed performance. A sympathetic reader would care because better quantification of the exploration-exploitation balance could improve how acquisition functions are chosen or designed for practical optimization tasks.

Core claim

Observation traveling salesman distance and observation entropy quantify the exploration characteristics of acquisition functions based solely on their selected observations. Applying these to several well-known acquisition functions across diverse black-box problems reveals links between exploration and empirical performance as well as new relationships among the functions themselves.

What carries the argument

Observation traveling salesman distance and observation entropy, which compute exploration directly from the sequence or set of points chosen by the acquisition function.

If this is right

  • Acquisition functions can be compared and ranked by exploration level using only their chosen observations.
  • Empirical performance differences among acquisition functions can be partly explained by differences in their measured exploration.
  • New relationships and groupings among existing acquisition functions become visible through these exploration metrics.
  • The measures supply a systematic basis for designing new acquisition functions with targeted exploration properties.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • These metrics could be used to automatically select or switch acquisition functions mid-optimization based on desired exploration levels.
  • If the metrics prove stable across problem classes, they might reduce the need for expensive full benchmarking when introducing new acquisition functions.
  • Extending the metrics to track exploration over time rather than total might expose phases where different functions shift from exploration to exploitation.

Load-bearing premise

Exploration characteristics of an acquisition function can be isolated and quantified solely from the set of points it selects, without reference to the surrogate model or the acquisition function's internal formulation.

What would settle it

Finding two acquisition functions that select identical point sequences on a problem yet produce measurably different exploration behavior under these metrics, or showing that the metrics do not predict performance differences on new test problems.

Figures

Figures reproduced from arXiv: 2502.08208 by Leonard Papenmeier, Luigi Nardi, Nuojin Cheng, Stephen Becker.

Figure 1
Figure 1. Figure 1: Observations of UCB with various β values (1, 10, and 100) on a two-dimensional GP-prior sample reveal the explorative behavior for different β. The black crosses are initial points, the orange plus signs are the observations of the BO phase, and the red star is the optimal location. In short, a successful AF should exhibit a good exploration￾exploitation trade-off (EETO) that balances these desiderata. It… view at source ↗
Figure 2
Figure 2. Figure 2: The exploration quantities OTSD and OE of RS, UCB (β = 0.1), EI, PI, MES, and deterministic selection (DM) on the 6-dimensional Hartmann function. From left to right, these plots show OTSD, normalized OTSD, and OE, respectively. DM values for OE (around −134) are hidden for better visualization. The shaded areas show the standard error of the mean. Proposition 3.2 (Upper and Lower Bounds for the True OTSD)… view at source ↗
Figure 5
Figure 5. Figure 5: Normalized OTSD, average ranks for OE and optimization performance for EI and its variations on the synthetic benchmarks. Both RAASP and TRs get a similarly low OE rank, indicat￾ing that they are similarly unexplorative. However, while RAASP is the highest performing variant (orange), TRs reduce exploration on a similar level as RAASP, but it is one of the worst variants, as shown in the right panel of [P… view at source ↗
Figure 4
Figure 4. Figure 4: Normalized OTSD, average rank for OE and opti￾mization performance on the synthetic benchmarks. 4.3 SYNTHETIC BENCHMARKS [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 6
Figure 6. Figure 6: Empirical AF (and their variants) exploration taxonomy based on the quantitative exploration methods OTSD and OE. It represents a general understanding and may differ in specific problems. more explorative less explorative worse better [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Normalized OTSD and average optimization per￾formance rank on the real-world benchmarks. explorative in high-dimensional benchmarks; second, while EI is more explorative than other methods in low dimen￾sions, it becomes less so in high dimensions. One potential explanation is that in high-dimensional spaces, the GP may struggle to accurately learn the objective function, further en￾couraging explorative sa… view at source ↗
Figure 8
Figure 8. Figure 8: Normalized OTSD of Random Search for varying problem dimensions. [PITH_FULL_IMAGE:figures/full_fig_p015_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Normalized OTSD and mean ranks of the empirical performance for [PITH_FULL_IMAGE:figures/full_fig_p015_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Normalized OTSD and mean ranks of the empirical performance for [PITH_FULL_IMAGE:figures/full_fig_p015_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Normalized OTSD and mean ranks of the empirical performance for [PITH_FULL_IMAGE:figures/full_fig_p016_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Normalized OTSD and OE rank averaged across all the synthetic benchmarks. A lower rank means lower exploration. We do not show the initial design of experiments phase. 10 100 200 iteration 0.05 0.10 OTSDnorm 10 100 200 iteration 1 2 3 4 5 6 7 8 OE (relative rank) CMA-ES-0.5 EI KG MES PI TS UCB-1.0 RS 0 100 200 iteration 1 2 3 4 5 6 7 Performance (relative rank) Synthetic Benchmarks [PITH_FULL_IMAGE:figur… view at source ↗
Figure 13
Figure 13. Figure 13: Normalized OTSD, average rank for OE and optimization performance on the synthetic benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p016_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Normalized OTSD, average ranks for OE and optimization performance for EI and its variations on the synthetic benchmarks. 10 500 1000 iteration 0.025 0.050 0.075 OTSDnorm 0 500 1000 iteration 1 2 3 4 5 Performance (relative rank) CMA-ES-0.5 EI MES PI TS UCB-1.0 RS Real World Benchmarks [PITH_FULL_IMAGE:figures/full_fig_p017_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: Normalized OTSD and average optimization performance rank on the real-world benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p017_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Normalized OTSD and average optimization performance rank on the real-world benchmarks for EI and its variations. TRs and RAASP sampling promote exploitation [PITH_FULL_IMAGE:figures/full_fig_p018_16.png] view at source ↗
Figure 17
Figure 17. Figure 17: Normalized OTSD and optimization performance ranks on the low-dimensional problems. [PITH_FULL_IMAGE:figures/full_fig_p019_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: Normalized OTSD and optimization performance rank on the high-dimensional problems. [PITH_FULL_IMAGE:figures/full_fig_p019_18.png] view at source ↗
Figure 19
Figure 19. Figure 19: Effect of TRs and RAASP sampling on PI in the context of synthetic benchmarks: Both methods reduce the level of exploration. 19 [PITH_FULL_IMAGE:figures/full_fig_p019_19.png] view at source ↗
Figure 20
Figure 20. Figure 20: Effect of TRs and RAASP sampling on PI in the context of real-world benchmarks: Both methods reduce the level of exploration. D.5.2 Upper Confidence Bounds 10 100 200 iteration 0.025 0.050 0.075 0.100 OTSDnorm 10 100 200 iteration 2 3 4 5 6 OE (relative rank) UCB-1.0 UCB-1.0 + RAASP UCB-1.0 + TR UCB-1.0 + q=32 UCB-1.0 + q=8 RS 0 100 200 iteration 1 2 3 4 5 Performance (relative rank) Synthetic Benchmarks … view at source ↗
Figure 21
Figure 21. Figure 21: Effect of TRs, RAASP sampling, and batching on UCB-1 in the context of synthetic benchmarks:RAASP sampling and TRs reduce exploration, batching increases it. 10 500 1000 iteration 0.02 0.04 0.06 OTSDnorm 0 500 1000 iteration 2 3 4 Performance (relative rank) UCB-1.0 UCB-1.0 + RAASP UCB-1.0 + TR UCB-1.0 + q=32 UCB-1.0 + q=8 RS Real World Benchmarks (a) Without error bars. 10 500 1000 iteration 0.02 0.04 0.… view at source ↗
Figure 22
Figure 22. Figure 22: Effect of TRs, RAASP sampling, and batching on UCB-1 in the context of real-world benchmarks:RAASP sampling and TRs reduce exploration, batching increases it. 20 [PITH_FULL_IMAGE:figures/full_fig_p020_22.png] view at source ↗
Figure 23
Figure 23. Figure 23: Effect of TRs, RAASP sampling, and batching on TS in the context of synthetic benchmarks:RAASP sampling and TRs reduce exploration, batching increases it. RAASP sampling also improves optimization performance. 10 500 1000 iteration 0.04 0.06 0.08 OTSDnorm 0 500 1000 iteration 2 3 4 Performance (relative rank) TS TS + RAASP TS + TR TS + q=32 TS + q=8 RS Real World Benchmarks (a) Without error bars. 10 500 … view at source ↗
Figure 24
Figure 24. Figure 24: Effect of TRs, RAASP sampling, and batching on TS in the context of real-world benchmarks:RAASP sampling and TRs reduce exploration, batching increases it. RAASP sampling also improves optimization performance. D.5.4 Max-Value Entropy Search For MES, we only study the effect of RAASP sampling as we observed model-fitting errors for the TRs. Furthermore, batching is not implemented for MES in BoTorch. 10 1… view at source ↗
Figure 25
Figure 25. Figure 25: Effect of RAASP sampling on MES in the context of synthetic benchmarks: RAASP sampling reduces the level of exploration. 21 [PITH_FULL_IMAGE:figures/full_fig_p021_25.png] view at source ↗
Figure 26
Figure 26. Figure 26: Effect of RAASP sampling on MES in the context of real-world benchmarks: RAASP sampling reduces the level of exploration. D.5.5 Knowledge Gradient We only ran KG for low-dimensional synthetic benchmarks due to its high computational cost in high dimensions. 10 100 200 iteration 0.06 0.08 OTSDnorm 10 100 200 iteration 2 3 4 5 6 OE (relative rank) KG KG + RAASP KG + TR KG + q=32 KG + q=8 RS 0 100 200 iterat… view at source ↗
Figure 27
Figure 27. Figure 27: Effect of TRs, RAASP sampling, and batching on KG in the context of synthetic benchmarks:RAASP sampling and TRs reduce exploration, batching increases it. D.6 OPTIMIZATION PERFORMANCE We present the optimization performance of various acquisition functions (AFs) and their variants for each individual benchmark. These results form the basis for the performance ranking plots in Section 4. For synthetic benc… view at source ↗
Figure 28
Figure 28. Figure 28: Optimization performance of the basic optimizer configuration on the different benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p023_28.png] view at source ↗
Figure 29
Figure 29. Figure 29: Optimization performance of the different EI variations on the benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p024_29.png] view at source ↗
Figure 30
Figure 30. Figure 30: Optimization performance of the different KG variations on the benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p024_30.png] view at source ↗
Figure 31
Figure 31. Figure 31: Optimization performance of the different MES variations on the benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p025_31.png] view at source ↗
Figure 32
Figure 32. Figure 32: Optimization performance of the different TS variations on the benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p026_32.png] view at source ↗
Figure 33
Figure 33. Figure 33: Optimization performance of the different UCB variations on the benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p027_33.png] view at source ↗
Figure 34
Figure 34. Figure 34: Runtimes for OTSD and OE in different dimensions. Empirically, OTSD scales linearly with the number of dimensions. OE is costlier. 28 [PITH_FULL_IMAGE:figures/full_fig_p028_34.png] view at source ↗
read the original abstract

A well-balanced exploration-exploitation trade-off is crucial for successful acquisition functions in Bayesian optimization. However, there is a lack of quantitative measures for exploration, making it difficult to analyze and compare different acquisition functions. This work introduces two novel approaches - observation traveling salesman distance and observation entropy - to quantify the exploration characteristics of acquisition functions based on their selected observations. Using these measures, we examine the explorative nature of several well-known acquisition functions across a diverse set of black-box problems, uncover links between exploration and empirical performance, and reveal new relationships among existing acquisition functions. Beyond enabling a deeper understanding of acquisition functions, these measures also provide a foundation for guiding their design in a more principled and systematic manner.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces two novel measures—observation traveling salesman distance and observation entropy—to quantify the exploration characteristics of acquisition functions in Bayesian optimization based on their selected observation sets. It applies these to several well-known acquisition functions across a diverse set of black-box problems, examines links between exploration and empirical performance, and reveals new relationships among existing acquisition functions.

Significance. If the measures validly attribute differences in selected points to acquisition-function-specific exploration-exploitation logic (rather than surrogate updates or problem geometry), the work would supply a useful quantitative lens for comparing and designing acquisition functions. The empirical scope across multiple problems is a strength, but the central attribution claim requires verification that is not indicated in the provided description.

major comments (2)
  1. [Abstract] Abstract: the claim that the two measures 'quantify the exploration characteristics of acquisition functions' is load-bearing for the entire contribution, yet the abstract provides no indication that the metrics control for Gaussian-process surrogate dynamics, sequential decision effects, or black-box landscape geometry; differences in TSP distance or entropy could arise identically from these confounders regardless of the acquisition function.
  2. [Approach (definitions)] Approach section (definitions of observation traveling salesman distance and observation entropy): these quantities are computed solely from the final observation sets without reference to the surrogate model or the acquisition function's internal formulation; this directly risks the circularity and confounding concern that the measures do not isolate AF-specific exploration behavior.
minor comments (1)
  1. [Abstract] The abstract would benefit from explicitly naming the acquisition functions and benchmark problems examined so readers can immediately assess the scope of the empirical study.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed feedback. The two major comments both concern whether the proposed measures can validly attribute exploration differences to acquisition-function logic rather than surrogate dynamics or problem geometry. We address each point below and indicate where revisions will be made to clarify scope and limitations.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the claim that the two measures 'quantify the exploration characteristics of acquisition functions' is load-bearing for the entire contribution, yet the abstract provides no indication that the metrics control for Gaussian-process surrogate dynamics, sequential decision effects, or black-box landscape geometry; differences in TSP distance or entropy could arise identically from these confounders regardless of the acquisition function.

    Authors: We agree that the abstract claim is strong and does not qualify the measures' dependence on the selected observation sets alone. The measures are intentionally post-hoc descriptors of the point sets produced by each acquisition function; they do not attempt to partial out surrogate updates or landscape effects. Our empirical results show consistent patterns across multiple problems and random seeds, but we accept that this does not constitute a controlled isolation. We will revise the abstract to state that the measures characterize the spatial properties of the observation sets generated by each acquisition function, without claiming full isolation from confounding factors. revision: yes

  2. Referee: [Approach (definitions)] Approach section (definitions of observation traveling salesman distance and observation entropy): these quantities are computed solely from the final observation sets without reference to the surrogate model or the acquisition function's internal formulation; this directly risks the circularity and confounding concern that the measures do not isolate AF-specific exploration behavior.

    Authors: The definitions are deliberately formulated on the observation sets only, because the exploration we wish to quantify is precisely the spatial distribution of points that the acquisition function ultimately selects. Including surrogate or internal AF terms would change the measures into something other than direct descriptors of exploration realized in the data. We recognize that this design choice leaves open the possibility that observed differences partly reflect surrogate dynamics or geometry rather than acquisition-function logic alone. We will add an explicit limitations paragraph in the Approach section acknowledging this and noting that future work could design controlled experiments that hold the surrogate and problem fixed while varying only the acquisition function. revision: yes

Circularity Check

0 steps flagged

No circularity: new metrics defined from observation sets and applied externally to existing AFs

full rationale

The paper introduces observation TSP distance and observation entropy as direct functions of the final selected point sets. These definitions are then used to measure and compare existing acquisition functions on black-box problems. No step reduces a claimed result to its own inputs by construction, renames a fit as a prediction, or relies on a load-bearing self-citation chain. The derivation chain consists of independent metric definitions followed by empirical application; the central claims therefore remain self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The central contribution consists of two newly defined measures whose definitions rest on standard mathematical concepts; no free parameters or additional invented physical entities are described.

axioms (1)
  • standard math Standard mathematical definitions of the traveling salesman problem distance and of entropy can be applied directly to finite sets of selected observation points.
    These definitions are invoked to construct the two new measures.
invented entities (2)
  • observation traveling salesman distance no independent evidence
    purpose: Quantify exploration by treating selected points as a tour and measuring its length
    Newly proposed construct with no independent evidence supplied in the abstract.
  • observation entropy no independent evidence
    purpose: Quantify exploration by computing entropy over the distribution of selected points
    Newly proposed construct with no independent evidence supplied in the abstract.

pith-pipeline@v0.9.0 · 5640 in / 1333 out tokens · 38323 ms · 2026-05-23T03:25:15.980716+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Understanding High-Dimensional Bayesian Optimization

    cs.LG 2025-02 unverdicted novelty 5.0

    Vanishing gradients from GP initialization explain HDBO failures; MLE of length scales suffices for SOTA performance, enabling the MSR variant on real-world tasks.

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages · cited by 1 Pith paper · 4 internal anchors

  1. [1]

    PI is back! Switching Acquisition Functions in Bayesian Optimization

    Carolin Benjamins, Elena Raponi, Anja Jankovic, Koen van der Blom, Maria Laura Santoni, Marius Lindauer, and Carola Doerr. PI is back! Switching Acquisition Functions in Bayesian Optimization. In 2022 NeurIPS Workshop on Gaussian Processes, Spatiotemporal Mod- eling, and Decision-making Systems,

  2. [2]

    A sequential Monte Carlo approach to Thompson sampling for Bayesian optimization

    Hildo Bijl, Thomas B Schön, Jan-Willem van Wingerden, and Michel Verhaegen. A sequential Monte Carlo ap- proach to Thompson sampling for Bayesian optimization. arXiv preprint arXiv:1604.00169,

  3. [3]

    Advanced population diversity measures in genetic programming

    Edmund Burke, Steven Gustafson, Graham Kendall, and Na- talio Krasnogor. Advanced population diversity measures in genetic programming. In Parallel Problem Solving from Nature—PPSN VII: 7th International Conference Granada, Spain, September 7–11, 2002 Proceedings 7, pages 341–350. Springer,

  4. [4]

    Cambridge online dictionary

    Cambridge. Cambridge online dictionary. https://dictionary.cambridge.org/us/ dictionary/english/exploration. Accessed: 2025-02-08. Nachol Chaiyaratana, Theera Piroonratana, and Nuntapon Sangkawelert. Effects of diversity control in single- objective and multi-objective genetic algorithms. Journal of Heuristics, 13:1–34,

  5. [5]

    A Unified Framework for Entropy Search and Expected Improvement in Bayesian Optimization

    Nuojin Cheng, Leonard Papenmeier, Stephen Becker, and Luigi Nardi. A unified framework for entropy search and expected improvement in Bayesian optimization. arXiv preprint arXiv:2501.18756,

  6. [6]

    Accessed: 2025-02-06

    URL https: //botorch.org/docs/tutorials/turbo_1/. Accessed: 2025-02-06. David Eriksson and Martin Jankowiak. High-Dimensional Bayesian Optimization with Sparse Axis-Aligned Sub- spaces. In Uncertainty in Artificial Intelligence, pages 493–503. PMLR,

  7. [7]

    Towards Gaus- sian process-based optimization with finite time horizon

    David Ginsbourger and Rodolphe Le Riche. Towards Gaus- sian process-based optimization with finite time horizon. In mODa 9–Advances in Model-Oriented Design and Analysis: Proceedings of the 9th International Workshop in Model-Oriented Design and Analysis held in Bertinoro, Italy, June 14-18, 2010, pages 89–96. Springer,

  8. [8]

    The CMA Evolution Strategy: A Tutorial

    Nikolaus Hansen. The CMA evolution strategy: A tutorial. arXiv preprint arXiv:1604.00772,

  9. [9]

    CMA-ES/pycma on Github

    Nikolaus Hansen, Youhei Akimoto, and Petr Baudis. CMA-ES/pycma on Github. Zenodo, DOI:10.5281/zenodo.2559634, February

  10. [10]

    URL https://doi.org/10.5281/zenodo. 2559634. Philipp Hennig and Christian J Schuler. Entropy Search for Information-Efficient Global Optimization. Journal of Machine Learning Research, 13(6),

  11. [11]

    Analyzing exploration exploita- tion trade-off by means of PI similarity index and dic- tyostelium based genetic algorithm

    Kazuyuki Inoue, Taku Hasegawa, Naoki Mori, and Keinosuke Matsumoto. Analyzing exploration exploita- tion trade-off by means of PI similarity index and dic- tyostelium based genetic algorithm. In 2015 IEEE Congress on Evolutionary Computation (CEC) , pages 2548–2555. IEEE,

  12. [12]

    Algorithms for multi-armed bandit problems

    V olodymyr Kuleshov and Doina Precup. Algorithms for multi-armed bandit problems. arXiv preprint arXiv:1402.6028,

  13. [13]

    Approximate algorithms for the traveling sales- person problem

    Daniel J Rosenkrantz, Richard Edwin Stearns, and Philip M Lewis. Approximate algorithms for the traveling sales- person problem. In 15th Annual Symposium on Switching and Automata Theory (swat 1974), pages 33–42. IEEE,

  14. [14]

    Plan- ning to be surprised: Optimal Bayesian exploration in dynamic environments

    Yi Sun, Faustino Gomez, and Jürgen Schmidhuber. Plan- ning to be surprised: Optimal Bayesian exploration in dynamic environments. In Artificial General Intelligence: 4th International Conference, AGI 2011, Mountain View, CA, USA, August 3-6,

  15. [15]

    Virtual library of simulation experiments: Test functions and datasets, optimization test problems

    Sonja Surjanovic and Derek Bingham. Virtual library of simulation experiments: Test functions and datasets, optimization test problems. https://www.sfu. ca/~ssurjano/optimization.html. Accessed: 2024-09-01. Richard S Sutton. Reinforcement learning: An introduction. A Bradford Book,

  16. [16]

    Our proof follows from Balogh et al

    11 Appendix (Supplementary Material) Leonard Papenmeier*1 Nuojin Cheng*2 Stephen Becker2 Luigi Nardi1, 3 1Department of Computer Science, Lund University, Lund, Sweden 2Department of Applied Mathematics, University of Colorado Boulder, Boulder, Colorado, USA 3DBtune A PROOFS A.1 OTSD UPPER BOUND Proof. Our proof follows from Balogh et al. [2024, Theorem 1...

  17. [17]

    A more recent study [Devroye and Györfi, 2021] presents a consistency result for the KL estimator even when the density function is not smooth

    extended these results to non-compactly supported densities, providing an upper bound for the bias of order O(t−2/d). A more recent study [Devroye and Györfi, 2021] presents a consistency result for the KL estimator even when the density function is not smooth. However, in our study the sample points in Xt are generally not independent and identically dis...

  18. [18]

    E RUNTIME ANALYSIS We analyze the runtimes for OTSD and OE

    Figure 33: Optimization performance of the different UCB variations on the benchmarks. E RUNTIME ANALYSIS We analyze the runtimes for OTSD and OE. We measure OTSD and OE on random sequences of length 1000 in different dimensionalities, ranging from 10 – 1000 for OTSD and from 2 – 20 for OE. We repeat each experiment 20 times and observe the runtimes for c...