A Statistical Analysis for Per-Instance Evaluation of Stochastic Optimizers: Avoiding Unreliable Conclusions

Elisabetta Valiante; Ignacio Rozada; Masoud Mohseni; Moslem Noori; Thomas Van Vaerenbergh

arxiv: 2503.16589 · v2 · pith:2Z7IPK72new · submitted 2025-03-20 · 💻 cs.LG · cs.ET· math.ST· stat.TH

A Statistical Analysis for Per-Instance Evaluation of Stochastic Optimizers: Avoiding Unreliable Conclusions

Moslem Noori , Elisabetta Valiante , Thomas Van Vaerenbergh , Masoud Mohseni , Ignacio Rozada This is my paper

Pith reviewed 2026-05-22 22:54 UTC · model grok-4.3

classification 💻 cs.LG cs.ETmath.STstat.TH

keywords stochastic optimizersperformance evaluationconfidence intervalssample sizeadaptive algorithmsbenchmarkingstatistical analysisreliable conclusions

0 comments

The pith

A derived lower bound sets the minimum repeats needed to guarantee accuracy in stochastic optimizer metrics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that performance metrics for stochastic optimizers vary across independent runs and that their estimation accuracy must be controlled through statistical methods. It derives an explicit lower bound on the number of repeats required to meet a target accuracy level using standard confidence-interval relations. This bound then supports an adaptive algorithm that increases repeats only as needed until the accuracy guarantee is met. A sympathetic reader would care because benchmarking, hyperparameter tuning, and optimizer comparisons currently risk unreliable conclusions when the repeat count is chosen without such guarantees.

Core claim

The central claim is that a lower bound on the number of repeats, obtained from the relationship between sample size and confidence-interval width for common metrics, guarantees a prescribed accuracy; an adaptive procedure based on this bound then determines the minimal number of runs that achieves the target accuracy for any given optimizer and problem instance.

What carries the argument

The lower bound on the number of repeats derived from confidence-interval formulas for the performance metrics, which directly determines the minimal sample size guaranteeing the desired accuracy.

If this is right

Benchmarking studies can report metrics only after the adaptive procedure confirms the target accuracy has been reached.
Hyperparameter searches for stochastic methods become reliable once the repeat count satisfies the derived bound.
Direct comparisons between optimizers avoid premature conclusions by enforcing the same accuracy guarantee on each.
Experiment design shifts from fixed repeat counts to data-driven stopping rules that stop only when accuracy is assured.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same bound and adaptive rule could be applied to other stochastic procedures whose outputs are evaluated by mean or quantile metrics.
Papers that currently use a fixed small number of repeats would need to increase that number or adopt the adaptive method to meet the accuracy standard.
If the underlying metric distribution deviates strongly from the assumed form, the bound may be either conservative or insufficient, suggesting a need for distribution-robust alternatives.

Load-bearing premise

The performance metrics obtained from independent runs follow a distribution for which standard confidence-interval formulas and sample-size bounds apply.

What would settle it

Run a large-scale simulation in which the true metric distribution is known; compute the adaptive procedure's repeat count for a target accuracy; then check whether the empirical coverage probability of the resulting intervals falls below the nominal level.

Figures

Figures reproduced from arXiv: 2503.16589 by Elisabetta Valiante, Ignacio Rozada, Masoud Mohseni, Moslem Noori, Thomas Van Vaerenbergh.

**Figure 2.** Figure 2: FIG. 2: Relative width of the confidence interval [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: FIG. 3: Effect of [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗

**Figure 4.** Figure 4: FIG. 4: Actual (a) estimate and confidence interval of [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗

**Figure 5.** Figure 5: FIG. 5: Relative estimate error for different values of [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

**Figure 6.** Figure 6: FIG. 6: Relative estimate error for different values of [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗

read the original abstract

A key trait of stochastic optimizers is that multiple runs of the same optimizer in attempting to solve the same problem can produce different results. As a result, their performance is evaluated over several repeats, or runs, on the problem. However, the accuracy of the estimated performance metrics depends on the number of runs and should be studied using statistical tools. We present a statistical analysis of the common metrics, and develop guidelines for experiment design to measure the optimizer's performance using these metrics to a high level of confidence and accuracy. To this end, we first discuss the confidence interval of the metrics and how they are related to the number of runs of an experiment. We then derive a lower bound on the number of repeats in order to guarantee achieving a given accuracy in the metrics. Using this bound, we propose an algorithm to adaptively adjust the number of repeats needed to ensure the accuracy of the evaluated metric. Our simulation results demonstrate the utility of our analysis and how it allows us to conduct reliable benchmarking as well as hyperparameter tuning and prevent us from drawing premature conclusions regarding the performance of stochastic optimizers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper supplies a concrete lower bound on repeats plus an adaptive algorithm for per-instance optimizer metrics, but the derivation leans on untested i.i.d. and normality assumptions that often fail for real stochastic runs.

read the letter

This paper derives a lower bound on the number of repeats needed to hit a target accuracy for per-instance metrics and gives an adaptive algorithm that adds runs until the confidence interval is tight enough. It applies standard sample-size formulas to things like mean loss or success rate and validates the approach in simulations. That is the actual new piece: a practical, per-instance rule rather than a generic power analysis. It does address a real problem—many optimizer papers still run three or five seeds and treat the numbers as settled—which makes the adaptive procedure worth having around for benchmarking or tuning loops. The simulations appear to show it avoids both under-sampling and wasted repeats in the cases they tested. The main soft spot is the distributional assumption. The bound and the stopping rule rest on the metrics behaving like i.i.d. draws from something normal or binomial. Optimizer outputs are frequently heavy-tailed or correlated across seeds because the random seed interacts with the loss landscape. The abstract and the stress-test note give no indication that the authors checked robustness under those conditions or supplied a diagnostic. If the bound is simply conservative when the assumption is violated, that is acceptable, but it should be stated. The work is aimed at people who run repeated stochastic optimizer trials and want defensible error bars without burning extra compute. A reader focused on experimental methodology will find the algorithm and bound directly usable. It is worth sending to peer review because the core tool is simple to implement and targets a genuine weakness in current practice, even if the assumptions need more scrutiny in revision.

Referee Report

2 major / 1 minor

Summary. The paper claims that performance metrics of stochastic optimizers, obtained from multiple independent runs, can be analyzed using standard statistical tools to derive confidence intervals, a lower bound on the required number of repeats for a target accuracy, and an adaptive algorithm that adjusts the number of repeats on the fly. Simulation experiments are presented to illustrate that the approach supports reliable benchmarking, hyperparameter tuning, and avoidance of premature conclusions.

Significance. If the central derivation and adaptive procedure remain valid for the metric distributions encountered in practice, the work supplies a concrete, actionable guideline that could reduce under-powered evaluations across the stochastic optimization literature. The explicit lower-bound formula and the adaptive stopping rule constitute a practical methodological contribution that directly targets a widespread source of unreliable empirical claims.

major comments (2)

[Abstract / derivation of lower bound] Abstract and the derivation of the lower bound: the bound is obtained from standard sample-size formulas for confidence intervals; these formulas presuppose either approximate normality (for means) or exact binomial/exact models (for success rates) together with i.i.d. runs. No section demonstrates that the bound remains valid, or supplies a diagnostic, when the metric (final loss, wall-clock time, success indicator) is heavy-tailed or multimodal, both of which are common for stochastic optimizers.
[Adaptive algorithm] Adaptive algorithm section: the procedure inherits the same distributional assumptions as the bound; without a robustness argument or a fallback when the empirical distribution deviates from the assumed family, the guarantee that the target accuracy is achieved cannot be assured for the metrics the paper intends to cover.

minor comments (1)

[Introduction] The abstract refers to 'the metrics' without an early, explicit enumeration of which performance measures (mean loss, median, success rate, etc.) are treated; adding a short table or list in the introduction would improve readability.

Circularity Check

0 steps flagged

No circularity; derivation applies standard CI/sample-size formulas to metrics

full rationale

The paper's central steps consist of (1) recalling the standard confidence-interval formula for a metric, (2) algebraically rearranging that formula to obtain a lower bound on the number of repeats n needed for a target accuracy, and (3) using the bound to define a simple adaptive loop that stops when the observed CI width meets the target. None of these steps defines a quantity in terms of itself, renames a fitted parameter as a prediction, or relies on a self-citation whose content is unverified. The derivation is therefore self-contained once the usual i.i.d. and distributional assumptions of classical statistics are granted; those assumptions are external to the paper and not generated by its own equations.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The analysis rests on standard statistical assumptions for constructing confidence intervals and sample-size bounds; no free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption Independent runs of the optimizer produce i.i.d. observations of the performance metric
Required for the validity of the confidence-interval formulas and the derived lower bound on the number of repeats

pith-pipeline@v0.9.0 · 5749 in / 1212 out tokens · 88890 ms · 2026-05-22T22:54:57.370714+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Accelerating Hybrid XOR$-$CNF Boolean Satisfiability Problems Natively with In-Memory Computing
cs.ET 2025-04 unverdicted novelty 7.0

A memristor-based in-memory computing accelerator natively embeds and solves hybrid XOR-CNF SAT problems, with simulations showing roughly 10x gains in speed, energy, and area over CNF translation methods and 1000x en...

Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · cited by 1 Pith paper

[1]

The results for (p1, p2) ∈ { (0.25, 0.2), (0.5, 0.45), (0.75, 0.7), (0.99, 0.94)} are presented in Table I

To investigate how frequently this hap- pens, we run the experiment 1000 times to collect sta- tistical data for several pairs of ( p1, p2). The results for (p1, p2) ∈ { (0.25, 0.2), (0.5, 0.45), (0.75, 0.7), (0.99, 0.94)} are presented in Table I. The table shows the chance of correctly finding ˆR1 99 < ˆR2 99 as well as the chance of having no overlap b...

work page
[2]

We now evaluate the utility of Algorithm 1 in choosing the number of repeats

For n = 10,000, the better solver can always be identified with a very high level of confidence. We now evaluate the utility of Algorithm 1 in choosing the number of repeats. For this purpose, we generate syn- thetic data for different values of a true optimizer’s suc- cess probability p. We again use a Bernoulli distribution B(p) to model the outcome of ...

work page
[3]

Bianchi, M

L. Bianchi, M. Dorigo, L. M. Gambardella, and W. J. Gutjahr, A survey on metaheuristics for stochastic combinatorial optimization, Natural Computing 8, 239 (2009)

work page 2009
[4]

Collet and J.-P

P. Collet and J.-P. Rennard, Stochastic optimization al- gorithms, in Intelligent information technologies: Con- cepts, methodologies, tools, and applications (IGI Global,

work page
[5]

H. Goto, K. Endo, M. Suzuki, Y. Sakai, T. Kanao, Y. Hamakawa, R. Hidaka, M. Yamasaki, and K. Tat- sumura, High-performance combinatorial optimization based on classical mechanics, Science Advances 7, eabe7953 (2021)

work page 2021
[6]

Tsukamoto, M

S. Tsukamoto, M. Takatsu, S. Matsubara, and H. Tamura, An accelerator architecture for combinatorial optimization problems, Fujitsu Sci. Tech. J 53, 8 (2017)

work page 2017
[7]

H. Goto, K. Tatsumura, and A. R. Dixon, Combina- torial optimization by simulating adiabatic bifurcations in nonlinear Hamiltonian systems, Science advances 5, eaav2372 (2019)

work page 2019
[8]

Morita and H

S. Morita and H. Nishimori, Mathematical foundation of quantum annealing, Journal of Mathematical Physics 49 (2008)

work page 2008
[9]

Hizzani, A

M. Hizzani, A. Heittmann, G. Hutchinson, D. Dobrynin, T. Van Vaerenbergh, T. Bhattacharya, A. Renaudineau, D. Strukov, and J. P. Strachan, Memristor-based hard- ware and algorithms for higher-order Hopfield optimiza- 11 tion solver outperforming quadratic Ising machines, in 2024 IEEE International Symposium on Circuits and Systems (ISCAS) (IEEE, 2024) pp. 1–5

work page 2024
[10]

W. Moy, I. Ahmed, P.-w. Chiu, J. Moy, S. S. Sapatnekar, and C. H. Kim, A 1,968-node coupled ring oscillator cir- cuit for combinatorial optimization problem solving, Na- ture Electronics 5, 310 (2022)

work page 2022
[11]

Bhattacharya, G

T. Bhattacharya, G. H. Hutchinson, G. Pedretti, X. Sheng, J. Ignowski, T. Van Vaerenbergh, R. Beau- soleil, J. P. Strachan, and D. B. Strukov, Comput- ing high-degree polynomial gradients in memory, Nature Communications 15, 8211 (2024)

work page 2024
[12]

Zhang, U

Y. Zhang, U. K. R. Vengalam, A. Sharma, M. Huang, and Z. Ignjatovic, Qubrim: A CMOS compatible resistively- coupled Ising machine with quantized nodal interactions, in Proceedings of the 41st IEEE/ACM International Con- ference on Computer-Aided Design (2022) pp. 1–8

work page 2022
[13]

Cılasun, Z

H. Cılasun, Z. Zeng, A. Kumar, H. Lo, W. Cho, W. Moy, C. H. Kim, U. R. Karpuzcu, and S. S. Sapatnekar, 3SAT on an all-to-all-connected CMOS Ising solver chip, Sci- entific reports 14, 10757 (2024)

work page 2024
[14]

J. Si, S. Yang, Y. Cen, J. Chen, Y. Huang, Z. Yao, D.-J. Kim, K. Cai, J. Yoo, X. Fong, et al. , Energy- efficient superparamagnetic Ising machine and its appli- cation to traveling salesman problems, Nature Commu- nications 15, 3457 (2024)

work page 2024
[15]

Patel, P

S. Patel, P. Canoza, and S. Salahuddin, Logically syn- thesized and hardware-accelerated restricted Boltzmann machines for combinatorial optimization and integer fac- torization, Nature Electronics 5, 92 (2022)

work page 2022
[16]

J. R. Koza, Genetic programming II: automatic discovery of reusable programs (MIT Press, 1994)

work page 1994
[17]

T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer, Defining and detecting quantum speedup, science 345, 420 (2014)

work page 2014
[18]

Kowalsky, T

M. Kowalsky, T. Albash, I. Hen, and D. A. Lidar, 3- regular three-XORSAT planted solutions benchmark of classical and quantum heuristic optimizers, Quantum Science and Technology 7, 025008 (2022)

work page 2022
[19]

Pedretti, F

G. Pedretti, F. B¨ ohm, T. Bhattacharya, A. Heittman, X. Zhang, M. Hizzani, G. Hutchinson, D. Kwon, J. Moon, E. Valiante, et al. , Solving Boolean satisfiability prob- lems with resistive content addressable memories, arXiv preprint arXiv:2501.07733 (2025)

work page arXiv 2025
[20]

P. J. Angeline, An investigation into the sensitivity of genetic programming to the frequency of leaf selection during subtree crossover, in Proceedings of the 1st annual conference on genetic programming (1996) pp. 21–29

work page 1996
[21]

Christensen and F

S. Christensen and F. Oppacher, An analysis of Koza’s computational effort statistic for genetic programming, in Proceedings of 5th European Conference on Genetic Programming (EuroGP) (Springer, 2002) pp. 182–191

work page 2002
[22]

Keijzer, V

M. Keijzer, V. Babovic, C. Ryan, M. O’Neill, and M. Cat- tolico, Adaptive logic programming, in Proceedings of the 3rd Annual Conference on Genetic and Evolutionary Computation (2001) pp. 42–49

work page 2001
[23]

D. F. Barrero, B. Castano, M. D. R-Moreno, and D. Ca- macho, A study on Koza’s performance measures, Ge- netic Programming and Evolvable Machines 16, 327 (2015)

work page 2015
[24]

Horowitz and E

M. Horowitz and E. Grumbling, Quantum computing: progress and prospects(National Academies Press, 2019)

work page 2019
[25]

Verma, H

N. Verma, H. Jia, H. Valavi, Y. Tang, M. Ozatay, L.-Y. Chen, B. Zhang, and P. Deaville, In-memory computing: Advances and prospects, IEEE Solid-State Circuits Mag- azine 11, 43 (2019)

work page 2019
[26]

C. D. Schuman, S. R. Kulkarni, M. Parsa, J. P. Mitchell, P. Date, and B. Kay, Opportunities for neuromorphic computing algorithms and applications, Nature Compu- tational Science 2, 10 (2022)

work page 2022
[27]

S. Cai, K. Su, and C. Luo, Improving WalkSAT for ran- dom k-satisfiability problem with k > 3, in Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 27 (2013) pp. 145–151

work page 2013
[28]

Beeson and B

P. Beeson and B. Ames, Trac-ik: An open-source library for improved solving of generic inverse kinematics, in 2015 IEEE-RAS 15th International Conference on Hu- manoid Robots (Humanoids) (IEEE, 2015) pp. 928–935

work page 2015
[29]

Chmiela, E

A. Chmiela, E. Khalil, A. Gleixner, A. Lodi, and S. Pokutta, Learning to schedule heuristics in branch and bound, Advances in Neural Information Processing Sys- tems 34, 24235 (2021)

work page 2021
[30]

McGeoch, How not to fool the masses when giving per- formance results for quantum computers, arXiv preprint arXiv:2411.08860 (2024)

C. McGeoch, How not to fool the masses when giving per- formance results for quantum computers, arXiv preprint arXiv:2411.08860 (2024)

work page arXiv 2024
[31]

Albash and D

T. Albash and D. A. Lidar, Demonstration of a scaling advantage for a quantum annealer over simulated anneal- ing, Physical Review X 8, 031016 (2018)

work page 2018
[32]

Perdomo-Ortiz, A

A. Perdomo-Ortiz, A. Feldman, A. Ozaeta, S. V. Isakov, Z. Zhu, B. O’Gorman, H. G. Katzgraber, A. Diedrich, H. Neven, J. de Kleer, et al., Readiness of quantum op- timization machines for industrial applications, Physical Review Applied 12, 014004 (2019)

work page 2019
[33]

L. D. Brown, T. T. Cai, and A. DasGupta, Interval esti- mation for a binomial proportion, Statistical science 16, 101 (2001)

work page 2001
[34]

L. D. Brown, T. T. Cai, and A. DasGupta, Confidence intervals for a binomial proportion and asymptotic ex- pansions, The Annals of Statistics 30, 160 (2002)

work page 2002
[35]

P. G. Andersson, The Wald confidence interval for a bino- mial p as an illuminating “bad” example, The American Statistician 77, 443 (2023)

work page 2023
[36]

Gon¸ calves, M

L. Gon¸ calves, M. R. de Oliveira, C. Pascoal, and A. Pires, Sample size for estimating a binomial proportion: com- parison of different methods, Journal of Applied Statis- tics 39, 2453 (2012)

work page 2012
[37]

Krishnamoorthy and J

K. Krishnamoorthy and J. Peng, Some properties of the exact and score methods for binomial proportion and sample size calculation, Communications in Statis- tics—Simulation and Computation 36, 1171 (2007)

work page 2007
[38]

H. H. Hoos et al., An adaptive noise mechanism for Walk- SAT, in AAAI/IAAI (2002) pp. 655–660

work page 2002

[1] [1]

The results for (p1, p2) ∈ { (0.25, 0.2), (0.5, 0.45), (0.75, 0.7), (0.99, 0.94)} are presented in Table I

To investigate how frequently this hap- pens, we run the experiment 1000 times to collect sta- tistical data for several pairs of ( p1, p2). The results for (p1, p2) ∈ { (0.25, 0.2), (0.5, 0.45), (0.75, 0.7), (0.99, 0.94)} are presented in Table I. The table shows the chance of correctly finding ˆR1 99 < ˆR2 99 as well as the chance of having no overlap b...

work page

[2] [2]

We now evaluate the utility of Algorithm 1 in choosing the number of repeats

For n = 10,000, the better solver can always be identified with a very high level of confidence. We now evaluate the utility of Algorithm 1 in choosing the number of repeats. For this purpose, we generate syn- thetic data for different values of a true optimizer’s suc- cess probability p. We again use a Bernoulli distribution B(p) to model the outcome of ...

work page

[3] [3]

Bianchi, M

L. Bianchi, M. Dorigo, L. M. Gambardella, and W. J. Gutjahr, A survey on metaheuristics for stochastic combinatorial optimization, Natural Computing 8, 239 (2009)

work page 2009

[4] [4]

Collet and J.-P

P. Collet and J.-P. Rennard, Stochastic optimization al- gorithms, in Intelligent information technologies: Con- cepts, methodologies, tools, and applications (IGI Global,

work page

[5] [5]

H. Goto, K. Endo, M. Suzuki, Y. Sakai, T. Kanao, Y. Hamakawa, R. Hidaka, M. Yamasaki, and K. Tat- sumura, High-performance combinatorial optimization based on classical mechanics, Science Advances 7, eabe7953 (2021)

work page 2021

[6] [6]

Tsukamoto, M

S. Tsukamoto, M. Takatsu, S. Matsubara, and H. Tamura, An accelerator architecture for combinatorial optimization problems, Fujitsu Sci. Tech. J 53, 8 (2017)

work page 2017

[7] [7]

H. Goto, K. Tatsumura, and A. R. Dixon, Combina- torial optimization by simulating adiabatic bifurcations in nonlinear Hamiltonian systems, Science advances 5, eaav2372 (2019)

work page 2019

[8] [8]

Morita and H

S. Morita and H. Nishimori, Mathematical foundation of quantum annealing, Journal of Mathematical Physics 49 (2008)

work page 2008

[9] [9]

Hizzani, A

M. Hizzani, A. Heittmann, G. Hutchinson, D. Dobrynin, T. Van Vaerenbergh, T. Bhattacharya, A. Renaudineau, D. Strukov, and J. P. Strachan, Memristor-based hard- ware and algorithms for higher-order Hopfield optimiza- 11 tion solver outperforming quadratic Ising machines, in 2024 IEEE International Symposium on Circuits and Systems (ISCAS) (IEEE, 2024) pp. 1–5

work page 2024

[10] [10]

W. Moy, I. Ahmed, P.-w. Chiu, J. Moy, S. S. Sapatnekar, and C. H. Kim, A 1,968-node coupled ring oscillator cir- cuit for combinatorial optimization problem solving, Na- ture Electronics 5, 310 (2022)

work page 2022

[11] [11]

Bhattacharya, G

T. Bhattacharya, G. H. Hutchinson, G. Pedretti, X. Sheng, J. Ignowski, T. Van Vaerenbergh, R. Beau- soleil, J. P. Strachan, and D. B. Strukov, Comput- ing high-degree polynomial gradients in memory, Nature Communications 15, 8211 (2024)

work page 2024

[12] [12]

Zhang, U

Y. Zhang, U. K. R. Vengalam, A. Sharma, M. Huang, and Z. Ignjatovic, Qubrim: A CMOS compatible resistively- coupled Ising machine with quantized nodal interactions, in Proceedings of the 41st IEEE/ACM International Con- ference on Computer-Aided Design (2022) pp. 1–8

work page 2022

[13] [13]

Cılasun, Z

H. Cılasun, Z. Zeng, A. Kumar, H. Lo, W. Cho, W. Moy, C. H. Kim, U. R. Karpuzcu, and S. S. Sapatnekar, 3SAT on an all-to-all-connected CMOS Ising solver chip, Sci- entific reports 14, 10757 (2024)

work page 2024

[14] [14]

J. Si, S. Yang, Y. Cen, J. Chen, Y. Huang, Z. Yao, D.-J. Kim, K. Cai, J. Yoo, X. Fong, et al. , Energy- efficient superparamagnetic Ising machine and its appli- cation to traveling salesman problems, Nature Commu- nications 15, 3457 (2024)

work page 2024

[15] [15]

Patel, P

S. Patel, P. Canoza, and S. Salahuddin, Logically syn- thesized and hardware-accelerated restricted Boltzmann machines for combinatorial optimization and integer fac- torization, Nature Electronics 5, 92 (2022)

work page 2022

[16] [16]

J. R. Koza, Genetic programming II: automatic discovery of reusable programs (MIT Press, 1994)

work page 1994

[17] [17]

T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov, D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer, Defining and detecting quantum speedup, science 345, 420 (2014)

work page 2014

[18] [18]

Kowalsky, T

M. Kowalsky, T. Albash, I. Hen, and D. A. Lidar, 3- regular three-XORSAT planted solutions benchmark of classical and quantum heuristic optimizers, Quantum Science and Technology 7, 025008 (2022)

work page 2022

[19] [19]

Pedretti, F

G. Pedretti, F. B¨ ohm, T. Bhattacharya, A. Heittman, X. Zhang, M. Hizzani, G. Hutchinson, D. Kwon, J. Moon, E. Valiante, et al. , Solving Boolean satisfiability prob- lems with resistive content addressable memories, arXiv preprint arXiv:2501.07733 (2025)

work page arXiv 2025

[20] [20]

P. J. Angeline, An investigation into the sensitivity of genetic programming to the frequency of leaf selection during subtree crossover, in Proceedings of the 1st annual conference on genetic programming (1996) pp. 21–29

work page 1996

[21] [21]

Christensen and F

S. Christensen and F. Oppacher, An analysis of Koza’s computational effort statistic for genetic programming, in Proceedings of 5th European Conference on Genetic Programming (EuroGP) (Springer, 2002) pp. 182–191

work page 2002

[22] [22]

Keijzer, V

M. Keijzer, V. Babovic, C. Ryan, M. O’Neill, and M. Cat- tolico, Adaptive logic programming, in Proceedings of the 3rd Annual Conference on Genetic and Evolutionary Computation (2001) pp. 42–49

work page 2001

[23] [23]

D. F. Barrero, B. Castano, M. D. R-Moreno, and D. Ca- macho, A study on Koza’s performance measures, Ge- netic Programming and Evolvable Machines 16, 327 (2015)

work page 2015

[24] [24]

Horowitz and E

M. Horowitz and E. Grumbling, Quantum computing: progress and prospects(National Academies Press, 2019)

work page 2019

[25] [25]

Verma, H

N. Verma, H. Jia, H. Valavi, Y. Tang, M. Ozatay, L.-Y. Chen, B. Zhang, and P. Deaville, In-memory computing: Advances and prospects, IEEE Solid-State Circuits Mag- azine 11, 43 (2019)

work page 2019

[26] [26]

C. D. Schuman, S. R. Kulkarni, M. Parsa, J. P. Mitchell, P. Date, and B. Kay, Opportunities for neuromorphic computing algorithms and applications, Nature Compu- tational Science 2, 10 (2022)

work page 2022

[27] [27]

S. Cai, K. Su, and C. Luo, Improving WalkSAT for ran- dom k-satisfiability problem with k > 3, in Proceedings of the AAAI Conference on Artificial Intelligence , Vol. 27 (2013) pp. 145–151

work page 2013

[28] [28]

Beeson and B

P. Beeson and B. Ames, Trac-ik: An open-source library for improved solving of generic inverse kinematics, in 2015 IEEE-RAS 15th International Conference on Hu- manoid Robots (Humanoids) (IEEE, 2015) pp. 928–935

work page 2015

[29] [29]

Chmiela, E

A. Chmiela, E. Khalil, A. Gleixner, A. Lodi, and S. Pokutta, Learning to schedule heuristics in branch and bound, Advances in Neural Information Processing Sys- tems 34, 24235 (2021)

work page 2021

[30] [30]

McGeoch, How not to fool the masses when giving per- formance results for quantum computers, arXiv preprint arXiv:2411.08860 (2024)

C. McGeoch, How not to fool the masses when giving per- formance results for quantum computers, arXiv preprint arXiv:2411.08860 (2024)

work page arXiv 2024

[31] [31]

Albash and D

T. Albash and D. A. Lidar, Demonstration of a scaling advantage for a quantum annealer over simulated anneal- ing, Physical Review X 8, 031016 (2018)

work page 2018

[32] [32]

Perdomo-Ortiz, A

A. Perdomo-Ortiz, A. Feldman, A. Ozaeta, S. V. Isakov, Z. Zhu, B. O’Gorman, H. G. Katzgraber, A. Diedrich, H. Neven, J. de Kleer, et al., Readiness of quantum op- timization machines for industrial applications, Physical Review Applied 12, 014004 (2019)

work page 2019

[33] [33]

L. D. Brown, T. T. Cai, and A. DasGupta, Interval esti- mation for a binomial proportion, Statistical science 16, 101 (2001)

work page 2001

[34] [34]

L. D. Brown, T. T. Cai, and A. DasGupta, Confidence intervals for a binomial proportion and asymptotic ex- pansions, The Annals of Statistics 30, 160 (2002)

work page 2002

[35] [35]

P. G. Andersson, The Wald confidence interval for a bino- mial p as an illuminating “bad” example, The American Statistician 77, 443 (2023)

work page 2023

[36] [36]

Gon¸ calves, M

L. Gon¸ calves, M. R. de Oliveira, C. Pascoal, and A. Pires, Sample size for estimating a binomial proportion: com- parison of different methods, Journal of Applied Statis- tics 39, 2453 (2012)

work page 2012

[37] [37]

Krishnamoorthy and J

K. Krishnamoorthy and J. Peng, Some properties of the exact and score methods for binomial proportion and sample size calculation, Communications in Statis- tics—Simulation and Computation 36, 1171 (2007)

work page 2007

[38] [38]

H. H. Hoos et al., An adaptive noise mechanism for Walk- SAT, in AAAI/IAAI (2002) pp. 655–660

work page 2002