Benchmarking Optimization Algorithms for Automated Calibration of Quantum Devices
Pith reviewed 2026-05-18 17:57 UTC · model grok-4.3
The pith
In simulated tests designed to mimic real experimental conditions, CMA-ES outperforms the other benchmarked optimizers for quantum device calibration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors benchmark optimization algorithms for calibrating quantum devices inside a simulated setting that reproduces real experimental challenges. The comparison covers standard methods including Nelder-Mead and the CMA-ES algorithm, applied to low-dimensional cases that match current simple control pulses and to high-dimensional cases that match complex pulses with many parameters. The results indicate that CMA-ES delivers superior performance across all scenarios, which leads directly to the recommendation that it be used for automated bring-up, tune-up, and system identification.
What carries the argument
Benchmark comparison of optimization algorithms, with CMA-ES adapting its covariance matrix to search parameter spaces for quantum pulse calibration.
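To make the covariance-adaptation idea concrete, here is a deliberately simplified sketch. It is not the full CMA-ES algorithm and not the authors' code: it adapts only the diagonal of the covariance matrix and omits evolution paths and step-size control. The cost function `true_cost`, the noise level, and every numeric setting are illustrative assumptions.

```python
import math
import random


def true_cost(x):
    """Hypothetical noise-free cost: an ill-conditioned quadratic bowl."""
    return sum((10.0 ** i) * xi * xi for i, xi in enumerate(x))


def diagonal_es(x0, sigma0=0.3, popsize=12, generations=80, noise=1e-3, seed=0):
    """Minimise a noisy reading of true_cost with a diagonal-covariance ES."""
    rng = random.Random(seed)
    dim = len(x0)
    mean = list(x0)
    var = [sigma0 ** 2] * dim          # diagonal of the covariance matrix
    mu = popsize // 2                  # number of parents kept per generation
    raw = [math.log(mu + 0.5) - math.log(i + 1) for i in range(mu)]
    total = sum(raw)
    weights = [w / total for w in raw]  # better ranks get larger weights
    rate = 0.3                          # covariance learning rate (assumed value)

    for _ in range(generations):
        scored = []
        for _ in range(popsize):
            cand = [m + math.sqrt(v) * rng.gauss(0.0, 1.0)
                    for m, v in zip(mean, var)]
            measured = true_cost(cand) + rng.gauss(0.0, noise)  # noisy readout
            scored.append((measured, cand))
        scored.sort(key=lambda pair: pair[0])
        parents = [cand for _, cand in scored[:mu]]

        old_mean = mean
        mean = [sum(w * p[d] for w, p in zip(weights, parents))
                for d in range(dim)]
        for d in range(dim):
            # Weighted second moment of the selected steps around the old mean.
            est = sum(w * (p[d] - old_mean[d]) ** 2
                      for w, p in zip(weights, parents))
            var[d] = (1.0 - rate) * var[d] + rate * est
    return mean


start = [1.0, 1.0, 1.0]
result = diagonal_es(start)
print(true_cost(start), true_cost(result))
```

Measuring selected steps around the old mean lets the variance along a direction grow while the search is still travelling in that direction and shrink once it settles. A maintained CMA-ES implementation would be the sensible choice for real calibration work.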
If this is right
- Automated calibration procedures can adopt CMA-ES to reach target performance with fewer iterations in both simple and complex pulse designs.
- System identification tasks become more practical when the optimizer handles high-dimensional parameter spaces reliably.
- Bringing up new quantum devices requires less manual adjustment once CMA-ES is integrated into the tuning workflow.
- Current optimal control protocols gain efficiency by switching to the recommended algorithm for parameter search.
Where Pith is reading between the lines
- If the simulation matches hardware closely enough, direct tests on real devices would likely confirm the same performance ordering.
- The same benchmarking method could be applied to other quantum tasks such as variational circuit optimization or error mitigation parameter tuning.
- Extending the study to include additional noise models or hardware-specific constraints would strengthen the case for using CMA-ES in larger systems.
Load-bearing premise
The simulation accurately reproduces the noise sources, constraints, and failure modes that appear during real quantum device calibration.
What would settle it
Executing the same calibration tasks on physical quantum hardware and observing that CMA-ES no longer outperforms the other algorithms or that the ranking changes.
Original abstract
We present the results of a comprehensive study of optimization algorithms for the calibration of quantum devices. As part of our ongoing efforts to automate bring-up, tune-up, and system identification procedures, we investigate a broad range of optimizers within a simulated environment designed to closely mimic the challenges of real-world experimental conditions. Our benchmark includes widely used algorithms such as Nelder-Mead and the state-of-the-art Covariance Matrix Adaptation Evolution Strategy (CMA-ES). We evaluate performance in both low-dimensional settings, representing simple pulse shapes used in current optimal control protocols with a limited number of parameters, and high-dimensional regimes, which reflect the demands of complex control pulses with many parameters. Based on our findings, we recommend the CMA-ES algorithm and provide empirical evidence for its superior performance across all tested scenarios.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript reports a benchmark of optimization algorithms for automated calibration of quantum devices. It evaluates a range of methods, prominently including Nelder-Mead and CMA-ES, inside a simulated environment constructed to reproduce experimental challenges. Performance is compared in low-dimensional regimes (simple pulse shapes with few parameters) and high-dimensional regimes (complex control pulses with many parameters). The central conclusion is that CMA-ES exhibits superior performance across all tested scenarios and is therefore recommended for practical use in quantum-device bring-up and tune-up.
Significance. If the simulated cost landscapes and noise models faithfully capture the dominant experimental difficulties, the empirical ranking supplies actionable guidance for selecting optimizers in automated quantum calibration workflows. The explicit comparison across dimensionality regimes is a useful contribution to the growing literature on quantum control automation.
major comments (2)
- [Abstract and §3] Abstract and §3 (simulation setup): the recommendation of CMA-ES rests on the premise that the simulated environment 'closely mimic[s] the challenges of real-world experimental conditions.' No quantitative validation—such as side-by-side comparison of optimizer rankings on hardware versus simulation, or explicit matching of decoherence spectra, readout noise statistics, or parameter drift—is presented. This assumption is load-bearing for the transferability of the empirical ranking to real devices.
- [§4] §4 (results and statistical analysis): the abstract asserts 'empirical evidence for its superior performance,' yet the manuscript provides neither the number of independent runs, confidence intervals on the reported metrics, nor statistical tests comparing CMA-ES against Nelder-Mead. Without these, it is impossible to judge whether observed differences are robust or could arise from random variation in the simulated landscapes.
minor comments (2)
- [Abstract] The abstract states that 'a broad range of optimizers' was investigated but names only Nelder-Mead and CMA-ES. Listing the complete set of algorithms and their hyper-parameter choices in a table would improve reproducibility and clarity.
- [Figures] Figure captions and axis labels should explicitly state the cost-function definition and the precise noise model used in each panel to allow readers to assess the simulation fidelity without returning to the main text.
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. We respond point-by-point to the major comments and indicate planned revisions.
-
Referee: [Abstract and §3] Abstract and §3 (simulation setup): the recommendation of CMA-ES rests on the premise that the simulated environment 'closely mimic[s] the challenges of real-world experimental conditions.' No quantitative validation—such as side-by-side comparison of optimizer rankings on hardware versus simulation, or explicit matching of decoherence spectra, readout noise statistics, or parameter drift—is presented. This assumption is load-bearing for the transferability of the empirical ranking to real devices.
Authors: We agree that the absence of quantitative hardware validation limits strong claims about transferability. The simulation incorporates standard models of decoherence, readout noise, and control imperfections drawn from the quantum control literature, but we do not claim or demonstrate exact matching of experimental spectra or drift statistics. In revision we will add an explicit limitations paragraph in §3, qualify the recommendation as applying to the modeled environments, and remove any implication of direct real-device equivalence without further validation.
Revision: partial
-
Referee: [§4] §4 (results and statistical analysis): the abstract asserts 'empirical evidence for its superior performance,' yet the manuscript provides neither the number of independent runs, confidence intervals on the reported metrics, nor statistical tests comparing CMA-ES against Nelder-Mead. Without these, it is impossible to judge whether observed differences are robust or could arise from random variation in the simulated landscapes.
Authors: We accept this criticism. The revised §4 will report the exact number of independent runs per configuration, include confidence intervals or standard errors on all performance metrics, and apply non-parametric statistical tests (e.g., Wilcoxon signed-rank) to assess whether CMA-ES differences versus Nelder-Mead are significant. These additions will allow readers to evaluate robustness directly.
Revision: yes
- Not addressed in the planned revision: direct quantitative validation of the simulation against real hardware, including side-by-side optimizer rankings and explicit matching of decoherence spectra or parameter drift. This requires new experimental campaigns on physical quantum devices, which are outside the scope of the present simulation study.
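The paired comparison mentioned in the second response could look roughly like the sketch below. It implements a two-sided Wilcoxon signed-rank test with the plain normal approximation (no tie or continuity correction), which is a simplification. The input numbers are synthetic placeholders, not results from the paper, and the names `method_a` and `method_b` are hypothetical.

```python
import math


def wilcoxon_signed_rank(a, b):
    """Two-sided Wilcoxon signed-rank test on paired samples.

    Returns (w_plus, w_minus, p_value). Zero differences are dropped and
    tied absolute differences share their average rank. The p-value uses
    the normal approximation without corrections (a simplification).
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1      # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided tail probability
    return w_plus, w_minus, p_value


# Synthetic placeholder data: one final cost per paired run, lower is better.
method_a = [float(i) for i in range(1, 11)]
method_b = [x + k for k, x in enumerate(method_a, start=1)]
w_plus, w_minus, p_value = wilcoxon_signed_rank(method_a, method_b)
print(w_plus, w_minus, p_value)
```

Pairing runs by shared seed or shared simulated landscape is what justifies a signed-rank test here. For small numbers of runs an exact test from an established statistics library would be preferable to the normal approximation.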
Circularity Check
No circularity: empirical benchmark of standard optimizers
The paper is a straightforward empirical comparison of off-the-shelf optimization algorithms (Nelder-Mead, CMA-ES, etc.) inside a simulated calibration environment. No mathematical derivation chain exists that could reduce a claimed result to its own inputs by construction. Performance rankings are reported from direct runs on the simulator; the recommendation of CMA-ES follows from those observed outcomes rather than from any fitted parameter, self-definition, or self-citation that is load-bearing for the central claim. The simulation-fidelity assumption is an external-validity concern, not an internal circularity.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear.
l(x) = ln(1 - 1/N ∑ M(ŝ_n(x))) with M the ground-state population after an ORBIT sequence of Clifford gates
-
IndisputableMonolith/Foundation/DimensionForcing.lean · alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear.
Benchmarking … in both low-dimensional settings (DRAG pulse) and high-dimensional regimes (PWC pulse with 82 parameters)
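The logarithmic cost quoted in the first passage above, l(x) = ln(1 - 1/N ∑ M(ŝ_n(x))), can be evaluated as in this minimal sketch. It assumes the populations M are already measured and lie in [0, 1]. The function name `orbit_log_cost` and the choice to raise an error when the mean population reaches 1 are assumptions for illustration, not taken from the paper.

```python
import math


def orbit_log_cost(populations):
    """Return ln(1 - mean(M)) for measured ground-state populations.

    populations: one value M(s_n(x)) per random Clifford sequence, each
    assumed to lie in [0, 1].
    """
    if not populations:
        raise ValueError("need at least one sequence result")
    mean_population = sum(populations) / len(populations)
    gap = 1.0 - mean_population
    if gap <= 0.0:
        # Assumed handling: the logarithm is undefined at perfect population.
        raise ValueError("mean population must stay below 1")
    return math.log(gap)


# Hypothetical readings from two sequences.
example_cost = orbit_log_cost([0.9, 0.9])
print(example_cost)
```

Taking the logarithm stretches the scale near perfect population, so small improvements at high fidelity still change the cost noticeably.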
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Derivative Removal by Adiabatic Gate (DRAG) Pulse. The input signal of the DRAG pulse is given by the following equation, derived from the derivative removal by adiabatic gate (DRAG) method: $\varepsilon(t) = A\,\Omega_{\mathrm{Gauss}}(t)\cos(\omega_d t + \phi_{xy}) + \frac{1}{\delta} A\,\dot{\Omega}_{\mathrm{Gauss}}(t)\sin(\omega_d t + \phi_{xy})$ (5). Here, $A$ is the amplitude of the pulse, $\Omega_{\mathrm{Gauss}}(t)$ is a Gaussian envelope, $\omega_d$ is the drive frequen...
-
[2]
Piecewise Constant (PWC) Pulse. For the Piecewise Constant Pulse (PWC), the previous envelope $\Omega_{\mathrm{Gauss}}$ is replaced by a piecewise constant envelope $\Omega_{\mathrm{PWC}}$. In this setup, to provide a working initial guess, the shape of the step function is chosen as a discretization of $\Omega_{\mathrm{Gauss}}$. The optimization task for the PWC pulse lies in the fine-tuning of each indiv...
-
[3]
Realistic Starting Position. To make the simulation as realistic as possible, the initial detuning of the pulse parameters from their fine-tuned values must be defined. We chose a worst-case scenario in which each parameter is initially detuned by 5% from its optimal value. This detuning is intended to mimic the state of the system after a rough calibra...
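The three excerpts above (DRAG signal, PWC discretization, 5% detuned start) can be tied together in a short sketch. Several points are assumptions rather than details from the paper: the Gaussian is centred on the pulse midpoint, the second prefactor in the DRAG expression is read as A/δ, the PWC steps sample the Gaussian at slice midpoints, and every parameter is detuned upward unless `signs` says otherwise. All function names and numeric values are hypothetical.

```python
import math


def gauss(t, t_final, sigma):
    """Gaussian envelope centred on the pulse midpoint (assumed form)."""
    return math.exp(-((t - t_final / 2) ** 2) / (2 * sigma ** 2))


def gauss_dot(t, t_final, sigma):
    """Time derivative of gauss()."""
    return -(t - t_final / 2) / sigma ** 2 * gauss(t, t_final, sigma)


def drag_signal(t, amp, delta, omega_d, phi_xy, t_final, sigma):
    """In-phase Gaussian term plus derivative term scaled by 1/delta."""
    phase = omega_d * t + phi_xy
    in_phase = amp * gauss(t, t_final, sigma) * math.cos(phase)
    quadrature = amp / delta * gauss_dot(t, t_final, sigma) * math.sin(phase)
    return in_phase + quadrature


def pwc_envelope(n_steps, t_final, sigma):
    """Initial PWC guess: the Gaussian sampled at each slice midpoint."""
    dt = t_final / n_steps
    return [gauss((k + 0.5) * dt, t_final, sigma) for k in range(n_steps)]


def detuned_start(optimum, fraction=0.05, signs=None):
    """Shift every parameter by the given fraction of its optimal value."""
    if signs is None:
        signs = [1] * len(optimum)      # assumed: all parameters detuned upward
    return [p * (1.0 + s * fraction) for p, s in zip(optimum, signs)]


# Hypothetical values in arbitrary units.
mid_value = drag_signal(0.5, amp=0.5, delta=-2.0, omega_d=0.0, phi_xy=0.0,
                        t_final=1.0, sigma=0.2)
steps = pwc_envelope(8, t_final=1.0, sigma=0.2)
x_start = detuned_start([1.0, -2.0])
print(mid_value, steps, x_start)
```

In the PWC case each entry of `steps` becomes its own optimization parameter, which is how the parameter count grows well beyond the handful needed for the DRAG pulse.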