arxiv: 2605.23138 · v1 · pith:PYRJ6LYVnew · submitted 2026-05-22 · 🪐 quant-ph · cs.AI · cs.ET · cs.LG

Classical State Preparation for Variational Quantum Algorithms via Reinforcement Learning

Pith reviewed 2026-05-25 04:58 UTC · model grok-4.3

classification 🪐 quant-ph · cs.AI · cs.ET · cs.LG
keywords variational quantum algorithms · QAOA · VQE · Clifford circuits · reinforcement learning · state preparation · Monte Carlo Tree Search

The pith

A reinforcement learning agent selects Clifford gates to prepare initial states for variational quantum algorithms, improving average energy accuracy on QAOA benchmarks by a mean of 3.17× over existing Clifford initialization methods.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that framing Clifford prefix selection as a sequential decision problem, solved by a Transformer-guided Monte Carlo Tree Search agent trained through self-play and curriculum learning, produces initial states that accelerate VQA convergence. A sympathetic reader would care because this approach runs entirely in classical polynomial time via stabilizer simulation and leaves the original parameterized circuit unchanged. The method is tested on QAOA instances with up to 22 qubits and 1,370 parameters, where it beats prior Clifford initialization heuristics by a mean of 3.17× in average energy accuracy. The paper also reports robustness when the same framework is applied to VQE tasks.

Core claim

CRiSP formulates discrete prefix selection as a sequential decision-making problem. CRiSP utilizes Neural-Guided Monte Carlo Tree Search, driven by a Transformer-based policy trained via self-play, to insert learned Clifford gates before fixed parameterized rotations. This enables the construction of high-quality initial states entirely through polynomial-time classical stabilizer simulation without altering the underlying circuit architecture. By integrating a curriculum learning strategy that progressively expands the search horizon, the agent efficiently scales to deep circuits.

What carries the argument

Neural-Guided Monte Carlo Tree Search with a Transformer-based policy trained via self-play and curriculum learning to select Clifford gate prefixes.
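To make the search machinery concrete, here is a minimal sketch of how a policy prior can steer child selection in a tree search over gate choices. It assumes the AlphaZero-style PUCT rule, which the text above does not confirm as the paper's exact selection rule. The names `Node`, `select_child`, `backup`, and the constant `c_puct` are hypothetical, and the prior would come from the Transformer policy rather than being set by hand.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node; `prior` is the policy probability of the move leading here."""
    prior: float
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)  # gate label -> Node

    def q(self) -> float:
        # Mean backed-up value; 0.0 for unvisited nodes (an assumed default).
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node: Node, c_puct: float = 1.5):
    """Return the (gate label, child) pair that maximizes the PUCT score."""
    total = sum(child.visits for child in node.children.values())

    def score(item):
        _, child = item
        # Exploration bonus: large for high-prior, rarely visited children.
        bonus = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.q() + bonus

    return max(node.children.items(), key=score)


def backup(path, value: float) -> None:
    """Add one simulation result to every node on the visited path.

    `value` is assumed to be "higher is better", e.g. a negated energy.
    """
    for node in path:
        node.visits += 1
        node.value_sum += value
```

With no visits recorded, the child with the largest prior is chosen; once a child accumulates poor values, the bonus term shifts selection toward less explored alternatives.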

If this is right

  • On QAOA benchmarks up to 22 qubits, CRiSP delivers a mean 3.17× gain in average energy accuracy and 2.44× gain in best-achieved energy accuracy over existing Clifford methods.
  • The largest observed improvements reach 45.02× in average accuracy and 16.01× in best accuracy.
  • The same preparation procedure improves performance on VQE tasks without modification to the core method.
  • All state preparation occurs classically in polynomial time and does not change the structure of the subsequent parameterized quantum circuit.
  • Curriculum expansion of the search horizon enables scaling to circuits containing more than a thousand variational parameters.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the policy generalizes beyond the training distribution, the approach could lower the cost of repeated hyperparameter searches when deploying VQAs to new problem families.
  • Pairing CRiSP with hardware-specific error mitigation might amplify its benefit on noisy intermediate-scale devices.
  • The same sequential-decision framing could be reused for other discrete classical choices inside quantum circuit compilation pipelines.
  • Direct measurement of wall-clock classical preparation time versus observed quantum optimization speedup would quantify the net resource savings.

Load-bearing premise

The quality advantage of states found by the learned policy transfers from classical simulation to execution on actual quantum hardware and to problem instances outside the training distribution.

What would settle it

Executing QAOA or VQE optimizations on a physical quantum processor with CRiSP-prepared states versus standard Clifford initial states and observing no statistically significant improvement in final energy values or number of iterations required.
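One way to phrase "no statistically significant improvement" for paired runs is an exact sign-flip permutation test on per-instance energy differences. The sketch below is an illustration under that assumption, not a protocol taken from the paper; the function name `paired_sign_flip_pvalue` is hypothetical, and it enumerates all sign patterns, so it is only practical for a modest number of paired instances.

```python
import itertools


def paired_sign_flip_pvalue(baseline, candidate):
    """One-sided exact p-value for "candidate reaches lower energy than baseline".

    Both arguments are equal-length sequences of final energies measured on the
    same problem instances. Under the null hypothesis the sign of each paired
    difference is arbitrary, so every sign pattern is equally likely.
    """
    if len(baseline) != len(candidate):
        raise ValueError("inputs must be paired and of equal length")
    # Positive difference means the candidate reached a lower (better) energy.
    diffs = [b - c for b, c in zip(baseline, candidate)]
    observed = sum(diffs)
    at_least_as_extreme = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / total
```

A p-value that stays large across hardware runs would support the falsifying outcome described above; any energies passed in are the experimenter's own measurements, and none are supplied here.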

Figures

Figures reproduced from arXiv: 2605.23138 by Dhanvi Bharadwaj, Gino Kwun, Gokul Subramanian Ravi.

Figure 1: Classical initialization for VQA. Among simulation-compatible gate sets, the Clifford group provides a theoretically principled discrete search space. Since Clifford circuits can be simulated in polynomial time classically, they enable exact computation of the initial energy expectation value without requiring quantum hardware. Existing Clifford initialization methods exploit this simulability through…

Figure 2: Overview of our proposed neural-guided framework (CRiSP) for state preparation in VQAs.

Figure 3: Classical initialization accuracy for QAOA benchmarks. Asterisks (*) indicate tasks using…

Figure 4: (a) Training history of MaxCut_12 benchmark.
Original abstract

Variational Quantum Algorithms (VQAs) potentially offer a pathway to practical quantum advantage, but their optimization is heavily hindered by barren plateaus and numerous local minima. While classically simulable Clifford circuits can warm-start VQAs to accelerate convergence, existing heuristic-based initialization methods struggle to scale within vast combinatorial search spaces. To overcome this bottleneck, we propose CRiSP (a Clifford Reinforcement Learning agent for State Preparation), a framework that formulates discrete prefix selection as a sequential decision-making problem. CRiSP utilizes Neural-Guided Monte Carlo Tree Search, driven by a Transformer-based policy trained via self-play, to insert learned Clifford gates before fixed parameterized rotations. This enables the construction of high-quality initial states entirely through polynomial-time classical stabilizer simulation without altering the underlying circuit architecture. By integrating a curriculum learning strategy that progressively expands the search horizon, the agent efficiently scales to deep circuits. Evaluated on QAOA benchmarks of up to 22 qubits and 1,370 parameters, CRiSP outperforms state-of-the-art Clifford initialization methods by a mean of 3.17× (max 45.02×) in average energy accuracy and 2.44× (max 16.01×) in best-achieved energy accuracy. Assessments on VQE tasks further demonstrate the framework's robustness and generalizability.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript introduces CRiSP, a reinforcement learning framework that uses a Transformer-based policy trained via self-play, Neural-Guided Monte Carlo Tree Search, and curriculum learning to construct Clifford initial states for VQAs. These states are inserted before fixed parameterized rotations and are classically simulable in polynomial time via stabilizer methods. The central claim is that CRiSP outperforms existing Clifford initialization heuristics by a mean factor of 3.17× (maximum 45.02×) in average energy accuracy and 2.44× (maximum 16.01×) in best-achieved energy accuracy on QAOA benchmarks with up to 22 qubits and 1,370 parameters; additional robustness is claimed on VQE tasks.

Significance. If the performance advantage is shown to arise from genuine generalization, the work would offer a concrete, scalable classical preprocessing technique that mitigates barren-plateau and local-minimum issues in VQAs without changing the ansatz architecture. The combination of RL-driven discrete gate selection with efficient stabilizer simulation is technically coherent and could seed follow-on work on learned circuit initializers. The absence of hardware-noise or out-of-distribution results limits immediate claims of practical impact.

major comments (1)
  1. [Abstract] Abstract (and the Evaluation section): the headline performance numbers (3.17× mean, up to 45.02×) are reported on QAOA benchmark instances, yet the manuscript supplies no information on the instance-generation procedure, train/test splits, or confirmation that the 22-qubit graphs and depths used for evaluation were held out from the curriculum-expansion phase. This is load-bearing for the central claim; without explicit evidence that the benchmarks lie outside the training distribution, the reported gains cannot be unambiguously attributed to the superiority of the learned policy rather than in-distribution fitting.
minor comments (1)
  1. [Abstract] Abstract: the quantitative claims are presented without reference to the number of random seeds, hyperparameter sensitivity, or whether the reported energy accuracies include error bars or statistical significance tests.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their detailed and constructive review. The concern about benchmark transparency is well-taken and directly affects the strength of the generalization claim. We address it below and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract] Abstract (and the Evaluation section): the headline performance numbers (3.17× mean, up to 45.02×) are reported on QAOA benchmark instances, yet the manuscript supplies no information on the instance-generation procedure, train/test splits, or confirmation that the 22-qubit graphs and depths used for evaluation were held out from the curriculum-expansion phase. This is load-bearing for the central claim; without explicit evidence that the benchmarks lie outside the training distribution, the reported gains cannot be unambiguously attributed to the superiority of the learned policy rather than in-distribution fitting.

    Authors: We agree that the manuscript currently omits the necessary details on instance generation, train/test splits, and held-out status, which weakens the ability to attribute gains to generalization. In the revised version we will add a new subsection (Evaluation: Benchmark Construction and Data Splits) that explicitly describes: (i) the precise procedure used to generate the QAOA MaxCut instances (including graph ensemble, edge-weight distribution, and depth selection), (ii) the composition of the curriculum used during policy training, and (iii) verification that all 22-qubit evaluation graphs and depths were excluded from both the self-play training set and the curriculum-expansion schedule. We will also report the exact number of training instances and the random seeds employed to enable independent reproduction. These additions will allow readers to confirm that the reported 3.17× mean improvement reflects out-of-distribution performance. revision: yes

Circularity Check

0 steps flagged

No circularity; empirical gains measured against external baselines on fixed benchmarks

Full rationale

The paper's central claims consist of empirical performance numbers (mean 3.17× improvement etc.) obtained by running the trained CRiSP policy on QAOA benchmark instances and comparing energy accuracy directly to published state-of-the-art Clifford initializers. These numbers are not obtained by re-expressing any fitted parameter or self-citation as a prediction; the evaluation protocol is external to the training loop and does not reduce to the paper's own equations by construction. No load-bearing uniqueness theorem, ansatz smuggling, or renaming of known results appears in the provided text. The derivation chain (RL policy + curriculum + stabilizer simulation) therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The abstract relies on standard assumptions of efficient classical Clifford simulation and standard RL training procedures; no new free parameters, axioms, or invented entities are explicitly introduced beyond those implicit in any RL application.

axioms (1)
  • [standard math] Clifford circuits admit efficient classical simulation via the stabilizer formalism
    Invoked to justify that state preparation and evaluation occur in polynomial time without quantum hardware.
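The axiom can be illustrated with a small sketch that evaluates a Pauli expectation value on a Clifford circuit by propagating the observable backwards through the gates (the Heisenberg picture), at a cost linear in the number of gates per Pauli term. This is an illustration of the general stabilizer idea, not the paper's simulator; the function name `pauli_expectation` and the gate labels `"H"`, `"S"`, `"X"`, `"CX"` are hypothetical choices for this example.

```python
def pauli_expectation(n, pauli, circuit):
    """Return <0...0| U^dagger P U |0...0> for a Clifford circuit U.

    n       : number of qubits
    pauli   : dict mapping qubit index -> 'X', 'Y' or 'Z' (identity elsewhere)
    circuit : gates in circuit order, e.g. ('H', q), ('S', q), ('X', q), ('CX', c, t)
    """
    # Bit encoding per qubit: X=(1,0), Z=(0,1), Y=(1,1), I=(0,0), plus a global sign.
    x = [0] * n
    z = [0] * n
    for q, p in pauli.items():
        x[q] = 1 if p in "XY" else 0
        z[q] = 1 if p in "YZ" else 0
    sign = 1
    # Conjugate P by each gate, last gate first: P -> G^dagger P G.
    for gate in reversed(circuit):
        name = gate[0]
        if name == "H":
            q = gate[1]
            if x[q] and z[q]:          # H Y H = -Y
                sign = -sign
            x[q], z[q] = z[q], x[q]    # X <-> Z
        elif name == "S":
            q = gate[1]
            if x[q] and not z[q]:      # S^dagger X S = -Y
                sign = -sign
            z[q] ^= x[q]               # X <-> Y, Z unchanged
        elif name == "X":
            q = gate[1]
            if z[q]:                   # X Z X = -Z, X Y X = -Y
                sign = -sign
        elif name == "CX":
            c, t = gate[1], gate[2]
            if x[c] and z[t] and (x[t] ^ z[c] ^ 1):
                sign = -sign
            x[t] ^= x[c]               # X on control spreads to target
            z[c] ^= z[t]               # Z on target spreads to control
        else:
            raise ValueError(f"unsupported gate: {name}")
    # On |0...0>, any remaining X or Y factor gives 0; pure Z strings give +1.
    return 0 if any(x) else sign
```

For a cost Hamiltonian written as a sum of Pauli terms, such as the sum of Z_i Z_j over the edges of a MaxCut graph, the initial energy is the sum of these per-term values, so no state vector of size 2^n is ever built.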

pith-pipeline@v0.9.0 · 5781 in / 1345 out tokens · 43630 ms · 2026-05-25T04:58:58.708431+00:00 · methodology
