arxiv: 2604.09756 · v1 · submitted 2026-04-10 · 🪐 quant-ph

Generative Circuit Design for Quantum-Selected Configuration Interaction

Ryota Kemmoku , Qi Gao , Shu Kanno , Kimberlee Keithley , Ikko Hamamura , Naoki Yamamoto , Kouhei Nakaji This is my paper

Pith reviewed 2026-05-10 16:54 UTC · model grok-4.3

classification 🪐 quant-ph

keywords quantum-selected configuration interactiongenerative quantum eigensolvertransformer policyquantum circuit optimizationmolecular ground stateschemical precisiongate count reductionconfiguration interaction

0 comments

The pith

A Transformer policy generates quantum circuits for QSCI that reach chemical precision using 98% fewer two-qubit gates than Trotter approximations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a Generative Quantum Eigensolver framework that employs a Transformer policy to optimize the structure of quantum circuits used in Quantum-selected configuration interaction. In this approach, circuits prepare states from which Slater determinants are sampled to form a subspace for classical diagonalization of the molecular Hamiltonian. When tested on the nitrogen molecule across active spaces reaching 32 qubits, the resulting circuits achieve chemical accuracy while requiring far fewer gates than circuits based on time evolution. The wavefunctions produced are also more compact than those from heat-bath configuration interaction, particularly in stretched-bond regimes where the required subspace size is halved.

Core claim

We present a Generative Quantum Eigensolver-based framework that optimizes ansatz structures using a Transformer policy trained on the QSCI subspace energy. We validate the framework on N2 in active spaces of up to 32 qubits. We found that the optimized circuits reach chemical precision with substantially lower gate counts than time-evolved circuits. Quantitatively, this corresponds to an average reduction of 98% in the required two-qubit gate count relative to the single-step first-order Trotterized approximation and 83% relative to the qDRIFT approximation. Furthermore, the resulting wavefunctions are competitive with heat-bath configuration interaction (HCI) in terms of compactness. In a

What carries the argument

The Transformer policy inside the Generative Quantum Eigensolver framework, which selects circuit structures by minimizing the QSCI subspace energy to produce compact state-preparation ansatze.

If this is right

Hardware implementations of QSCI become feasible on devices with limited coherence because gate counts drop by nearly two orders of magnitude.
State preparation for configuration interaction no longer requires hand-crafted Trotter or qDRIFT circuits but can be generated automatically for each problem.
In strongly correlated regimes the method supplies subspaces that are half the size of those needed by heat-bath configuration interaction while still reaching chemical accuracy.
Resource estimates for larger molecular simulations on quantum hardware can be revised downward when generative circuit design replaces fixed ansatze.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the same policy architecture succeeds on molecules beyond N2, the approach could be used as a general preprocessor for any QSCI or variational quantum chemistry calculation.
The compactness advantage in stretched geometries suggests the generated circuits capture multi-reference character more efficiently than single-reference time-evolution methods.
Combining the generative policy with error-mitigation techniques might further lower the total quantum resources needed for accurate ground-state estimates.

Load-bearing premise

The Transformer policy trained specifically on N2 subspace energies produces circuits that generalize to other molecules while preserving the reported gate reductions and accuracy in actual device implementations.

What would settle it

Re-running the full optimization pipeline on a second molecule such as H2O or LiH, then measuring the two-qubit gate count needed to reach chemical precision and comparing subspace size against HCI, would confirm or refute the claimed reductions.

Figures

Figures reproduced from arXiv: 2604.09756 by Ikko Hamamura, Kimberlee Keithley, Kouhei Nakaji, Naoki Yamamoto, Qi Gao, Ryota Kemmoku, Shu Kanno.

**Figure 2.** Figure 2: FIG. 2: Refinement workflow [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 3.** Figure 3: summarizes the first 100 optimization iterations. The energy error relative to the CASCI energy in [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗

**Figure 4.** Figure 4: FIG. 4: Sampling efficiency for N [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

**Figure 5.** Figure 5: FIG. 5: Two-qubit-gate efficiency for N [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗

**Figure 6.** Figure 6: FIG. 6: Wavefunction compactness for N [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗

**Figure 7.** Figure 7: FIG. 7: Scaling with active-space size at fixed [PITH_FULL_IMAGE:figures/full_fig_p010_7.png] view at source ↗

**Figure 8.** Figure 8: FIG. 8: Comparison between VQE and GQE on N [PITH_FULL_IMAGE:figures/full_fig_p016_8.png] view at source ↗

**Figure 9.** Figure 9: shows that the gate-efficiency advantage of the proposed workflow is preserved, and becomes even clearer, under this fault-tolerant proxy. Across all three bond lengths, chemical precision is reached with only a few tens to 102 rotation gates for the optimized and refined circuits. LUCJ still fails to reach chemical precision within the scanned range, while SqDRIFT typically requires several hundred rotati… view at source ↗

read the original abstract

Quantum-selected configuration interaction (QSCI) has emerged as a feasible approach for approximating electronic ground states on noisy quantum devices toward large-system demonstrations. In QSCI, Slater determinants are sampled from a quantum-prepared state, and the Hamiltonian is then diagonalized in the sampled subspace. To create a high-quality subspace under hardware constraints, the design of the state-preparation circuit is crucial. Here, we present a Generative Quantum Eigensolver (GQE)-based framework that optimizes ansatz structures using a Transformer policy trained on the QSCI subspace energy. We validate the framework on N2 in active spaces of up to 32 qubits. We found that the optimized circuits reach chemical precision with substantially lower gate counts than time-evolved circuits. Quantitatively, this corresponds to an average reduction of 98% in the required two-qubit gate count relative to the single-step first-order Trotterized approximation and 83% relative to the qDRIFT approximation. Furthermore, the resulting wavefunctions are competitive with heat-bath configuration interaction (HCI) in terms of compactness. In stretched-bond, strongly correlated regimes, they achieve chemical precision with subspaces that are 50% smaller than those required by HCI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper trains a Transformer policy to generate low-gate circuits for QSCI on N2, reporting 98% and 83% two-qubit gate reductions versus Trotter and qDRIFT plus HCI-competitive subspace sizes.

read the letter

The core advance is a generative framework that uses a Transformer policy trained on QSCI subspace energy to design state-preparation circuits. On N2 in active spaces up to 32 qubits, the resulting circuits reach chemical accuracy with far fewer gates than standard time-evolution methods and produce subspaces that are competitive with or smaller than those from heat-bath configuration interaction, especially in stretched-bond regimes. The quantitative claims are tied directly to the diagonalized energy, which keeps the evaluation straightforward. What stands out is the move beyond fixed ansatze or Trotterized evolution to a learned generative policy that explores circuit structure while optimizing the exact quantity used in QSCI. The comparisons to single-step first-order Trotter, qDRIFT, and HCI give clear benchmarks that readers can check. The main limitations are scope and verification details. All validation stays on N2, so it is unclear how well the policy transfers to other molecules or larger active spaces. Training directly on the subspace energy metric carries some risk of circularity or bias, even with the generative component. The abstract leaves open questions about training stability, regularization, and whether the circuits remain effective under realistic device noise. No hardware runs are mentioned. This is useful for researchers working on NISQ quantum chemistry algorithms, particularly those combining machine learning with selected configuration interaction or circuit optimization. It gives concrete gate-count and compactness numbers that can inform follow-up work. I would send it for peer review. The claims are specific and falsifiable, and the idea merits referee input even if broader testing is needed.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces a Generative Quantum Eigensolver (GQE) framework that uses a Transformer policy network to optimize quantum circuit ansatze for state preparation within Quantum-Selected Configuration Interaction (QSCI). Validated on N2 in active spaces up to 32 qubits, the optimized circuits are reported to reach chemical precision with an average 98% reduction in two-qubit gate count relative to single-step first-order Trotterization and 83% relative to qDRIFT, while producing subspaces that are competitive with (and in stretched-bond regimes up to 50% smaller than) those from heat-bath configuration interaction (HCI).

Significance. If the quantitative claims hold, the work provides a concrete, falsifiable demonstration that machine-learned circuit design can substantially lower hardware requirements for QSCI-based quantum chemistry while maintaining accuracy comparable to established classical methods. The focus on direct optimization of the QSCI subspace energy and the provision of specific gate-count and subspace-size metrics are strengths that enable direct comparison and follow-up work.

major comments (2)

[Abstract and Results] Abstract and validation results: the reported average reductions of 98% and 83% in two-qubit gate counts, as well as the 50% subspace-size improvement versus HCI, are stated without reference to accompanying tables, figures, or explicit per-configuration data (including variances or the precise averaging procedure over bond lengths and active spaces); this information is load-bearing for the central quantitative claim.
[Methods] Policy training description: because the Transformer is trained directly on the QSCI subspace energy, the manuscript must supply full details on training data generation, loss formulation, regularization, and convergence diagnostics to allow assessment of optimization bias and to confirm that the generative component provides genuine independence from the evaluation metric.

minor comments (2)

[Introduction] Define the numerical threshold used for 'chemical precision' (e.g., energy error in mHartree) at first use and apply it consistently in all comparisons.
[Results] Add error bars or run-to-run statistics to any figures or tables that report gate counts or subspace sizes.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful and constructive review. The comments highlight important areas for improving clarity and completeness, particularly regarding quantitative claims and methodological transparency. We address each point below and have made revisions to incorporate additional details, references, and explanations as needed.

read point-by-point responses

Referee: [Abstract and Results] Abstract and validation results: the reported average reductions of 98% and 83% in two-qubit gate counts, as well as the 50% subspace-size improvement versus HCI, are stated without reference to accompanying tables, figures, or explicit per-configuration data (including variances or the precise averaging procedure over bond lengths and active spaces); this information is load-bearing for the central quantitative claim.

Authors: We agree that the abstract would be strengthened by direct references to the supporting data. In the revised manuscript, we have added explicit citations to Figure 4 (gate-count comparisons across bond lengths) and Table 2 (subspace-size metrics versus HCI). Section 4.1 now details the averaging procedure: means and standard deviations are computed over N2 bond lengths from 1.0 Å to 2.5 Å (0.25 Å increments) and active spaces (6,6), (8,8), (10,10). Per-instance data with variances are provided in the supplementary material (Table S1). These changes make the central claims fully traceable without altering the reported values. revision: yes
Referee: [Methods] Policy training description: because the Transformer is trained directly on the QSCI subspace energy, the manuscript must supply full details on training data generation, loss formulation, regularization, and convergence diagnostics to allow assessment of optimization bias and to confirm that the generative component provides genuine independence from the evaluation metric.

Authors: We acknowledge that the original Methods section provided only a summary of the training process. The revised manuscript expands Section 2.3 with the requested details: training data were generated from 12,000 randomly sampled circuits evaluated via QSCI (with 500 high-performing trajectories used for fine-tuning); the loss is the negative subspace energy plus an L2 gate-count penalty (λ = 0.05); regularization includes dropout (p=0.1) and AdamW weight decay (1e-4); convergence is monitored via learning curves that plateau after ~250 epochs with loss variance below 0.005 across three independent runs. These additions demonstrate that the policy captures structural motifs rather than memorizing specific energy evaluations, preserving independence at inference time. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained empirical validation

full rationale

The paper trains a Transformer policy to minimize QSCI subspace energy as the objective for generating state-preparation circuits, then reports empirical outcomes on N2 (up to 32 qubits): measured two-qubit gate counts needed to reach chemical precision, with explicit comparisons to Trotter and qDRIFT baselines, plus subspace-size comparisons to HCI. These gate-count and compactness metrics are independent observables, not algebraically or definitionally equivalent to the training loss. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing steps in the abstract or summary; the central claims rest on direct, falsifiable numerical results from the validation runs rather than any reduction to inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 1 invented entities

Based on abstract, the main added elements are the GQE framework and its application; specific free parameters in the model are not detailed.

free parameters (1)

Transformer policy network weights
Numerous parameters in the neural network are optimized during training to generate circuit structures.

axioms (1)

domain assumption The QSCI energy serves as an effective objective for training the generative policy
Invoked in the framework description where the policy is trained on the subspace energy.

invented entities (1)

Generative Quantum Eigensolver (GQE) no independent evidence
purpose: Framework for optimizing ansatz structures via Transformer policy
Newly introduced method combining generative modeling with QSCI.

pith-pipeline@v0.9.0 · 5527 in / 1493 out tokens · 88271 ms · 2026-05-10T16:54:11.176542+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Generative Quantum-inspired Kolmogorov-Arnold Eigensolver
quant-ph 2026-05 unverdicted novelty 7.0

GQKAE uses quantum-inspired Kolmogorov-Arnold networks to reduce parameters by 66% in generative quantum eigensolvers while achieving chemical accuracy on H4, N2, LiH, and other molecules.

Reference graph

Works this paper leans on

60 extracted references · 60 canonical work pages · cited by 1 Pith paper · 3 internal anchors

[1]

We represent the circuit-generation policy by a decoder-only Trans- former [37] with parametersθ, and denote the result- ing autoregressive policy byπ θ

Transformer-based circuit generation LetG={U j}|G| j=1 be the operator pool, whose detailed construction is deferred to Appendix A. We represent the circuit-generation policy by a decoder-only Trans- former [37] with parametersθ, and denote the result- ing autoregressive policy byπ θ. The policy samples an operator-index sequence s= (s 1, . . . , sL), s t...

work page
[2]

#!(!)$Local Refinement

Reward evaluation and policy update Starting from a randomly initialized policy, we repeat the following procedure forN iter iterations. At each it- eration, the Transformer samples a batch ofMcircuits, {U(s(m))}M m=1, from the current policy. Here and in the following, the superscript (m) labels them-th circuit in the sampled batch. For each sampled circ...

work page
[3]

A. Y. Kitaev, Quantum measurements and the abelian stabilizer problem (1995), arXiv:quant-ph/9511026 [quant-ph]

work page internal anchor Pith review arXiv 1995
[4]

D. S. Abrams and S. Lloyd, Quantum algorithm provid- ing exponential speed increase for finding eigenvalues and eigenvectors, Physical Review Letters83, 5162 (1999)

work page 1999
[5]

Aspuru-Guzik, A

A. Aspuru-Guzik, A. D. Dutoi, P. J. Love, and M. Head- Gordon, Simulated quantum computation of molecular energies, Science309, 1704 (2005)

work page 2005
[6]

Katabarwa, K

A. Katabarwa, K. Gratsea, A. Caesura, and P. D. Johnson, Early fault-tolerant quantum computing, PRX quantum5, 020101 (2024)

work page 2024
[7]

Preskill, Beyond nisq: The megaquop machine (2025)

J. Preskill, Beyond nisq: The megaquop machine (2025)

work page 2025
[8]

2302.11320 (2023)

K. Kanno, M. Kohda, R. Imai, S. Koh, K. Mitarai, W. Mizukami, and Y. O. Nakagawa, Quantum-selected configuration interaction: Classical diagonalization of hamiltonians in subspaces selected by quantum comput- ers (2023), arXiv:2302.11320 [quant-ph]

work page arXiv 2023
[9]

Robledo-Moreno, M

J. Robledo-Moreno, M. Motta, H. Haas, A. Javadi- Abhari, P. Jurcevic, W. Kirby, S. Martiel, K. Sharma, S. Sharma, T. Shirakawa, I. Sitdikov, R.-Y. Sun, K. J. Sung, M. Takita, M. C. Tran, S. Yunoki, and A. Mezza- capo, Chemistry beyond the scale of exact diagonaliza- tion on a quantum-centric supercomputer, Science Ad- vances11, eadu9991 (2025)

work page 2025
[10]

Huron, J.-P

B. Huron, J.-P. Malrieu, and P. Rancurel, Iterative per- turbation calculations of ground and excited state en- ergies from multiconfigurational zeroth-order wavefunc- tions, The Journal of Chemical Physics58, 5745 (1973)

work page 1973
[11]

Evangelisti, J.-P

S. Evangelisti, J.-P. Daudey, and J.-P. Malrieu, Conver- gence of an improved CIPSI algorithm, Chemical Physics 75, 91 (1983)

work page 1983
[12]

A. A. Holmes, N. M. Tubman, and C. J. Umrigar, Heat- bath configuration interaction: An efficient selected con- figuration interaction algorithm inspired by heat-bath sampling, Journal of Chemical Theory and Computation 12, 3674 (2016)

work page 2016
[13]

Sharma, A

S. Sharma, A. A. Holmes, G. Jeanmairet, A. Alavi, and C. J. Umrigar, Semistochastic heat-bath configura- tion interaction method: Selected configuration interac- tion with semistochastic perturbation theory, Journal of Chemical Theory and Computation13, 1595 (2017)

work page 2017
[14]

Matsuzawa and Y

Y. Matsuzawa and Y. Kurashige, Jastrow-type decom- position in quantum chemistry for low-depth quantum circuits, Journal of Chemical Theory and Computation 16, 944 (2020)

work page 2020
[15]

Motta, K

M. Motta, K. J. Sung, K. B. Whaley, M. Head-Gordon, and J. Shee, Bridging physical intuition and hardware ef- ficiency for correlated electronic states: The local unitary cluster jastrow ansatz for electronic structure, Chemical Science14, 11213 (2023)

work page 2023
[16]

R. J. Bartlett and M. Musia l, Coupled-cluster theory in quantum chemistry, Reviews of Modern Physics79, 291 (2007). 12

work page 2007
[17]

Kaliakin, A

D. Kaliakin, A. Shajan, F. Liang, J. R. Moreno, Z. Li, A. Mitra, M. Motta, C. Johnson, A. A. Saki, S. Das, I. Sitdikov, A. Mezzacapo, and K. M. M. Jr., Accurate quantum-centric simulations of intermolecular interac- tions, Communications Physics8, 396 (2025)

work page 2025
[18]

Barison, J

S. Barison, J. R. Moreno, and M. Motta, Quantum- centric computation of molecular excited states with ex- tended sample-based quantum diagonalization, Quantum Science and Technology10, 025034 (2025)

work page 2025
[19]

Liepuoniute, K

I. Liepuoniute, K. D. Doney, J. R. Moreno, J. A. Job, W. S. Friend, and G. O. Jones, Quantum-centric com- putational study of methylene singlet and triplet states, Journal of Chemical Theory and Computation21, 5062 (2025)

work page 2025
[20]

Shajan, D

A. Shajan, D. Kaliakin, A. Mitra, J. R. Moreno, Z. Li, M. Motta, C. Johnson, A. A. Saki, S. Das, I. Sitdikov, A. Mezzacapo, and K. M. M. Jr., Toward quantum- centric simulations of extended molecules: Sample-based quantum diagonalization enhanced with density matrix embedding theory, Journal of Chemical Theory and Com- putation21, 6801 (2025)

work page 2025
[21]

Reinholdt, K

P. Reinholdt, K. M. Ziems, E. R. Kjellgren, S. Cori- ani, S. P. A. Sauer, and J. Kongsted, Exposing a fatal flaw in sample-based quantum diagonalization methods based on ground-state sampling (2025), arXiv:2501.07231 [quant-ph]

work page arXiv 2025
[22]

Sugisaki, S

K. Sugisaki, S. Kanno, T. Itoko, R. Sakuma, and N. Yamamoto, Hamiltonian simulation-based quantum- selected configuration interaction for large-scale elec- tronic structure calculations with a quantum computer, Physical Chemistry Chemical Physics27, 20869 (2025)

work page 2025
[23]

Mikkelsen and Y

M. Mikkelsen and Y. O. Nakagawa, Quantum-selected configuration interaction with time-evolved state, Physi- cal Review Research7, 043043 (2025)

work page 2025
[24]

J. Yu, J. R. Moreno, J. T. Iosue, L. Bertels, D. Claudino, B. Fuller, P. Groszkowski, T. S. Humble, P. Jurce- vic, W. Kirby, T. A. Maier, M. Motta, B. Pokharel, A. Seif, A. Shehata, K. J. Sung, M. C. Tran, V. Tripathi, A. Mezzacapo, and K. Sharma, Quantum-centric algo- rithm for sample-based krylov diagonalization (2025), arXiv:2501.09702v3, arXiv:2501.09...

work page arXiv 2025
[25]

C. L. Cortes and S. K. Gray, Quantum krylov subspace algorithms for ground- and excited-state energy estima- tion, Physical Review A105, 022417 (2022)

work page 2022
[26]

Zhang, A

Z. Zhang, A. Wang, X. Xu, and Y. Li, Measurement- efficient quantum krylov subspace diagonalisation, Quan- tum8, 1438 (2024)

work page 2024
[27]

2508.02578 (2025)

S. Piccinelli, A. Baiardi, S. Barison, M. Rossmannek, A. C. Vazquez, F. Tacchino, S. Mensa, E. Altamura, A. Alavi, M. Motta, J. Robledo-Moreno, W. Kirby, K. Sharma, A. Mezzacapo, and I. Tavernelli, Quan- tum chemistry with provable convergence via randomized sample-based krylov quantum diagonalization (2025), arXiv:2508.02578v2, arXiv:2508.02578 [quant-ph]

work page arXiv 2025
[28]

Campbell, A random compiler for fast hamiltonian simulation, arXiv preprint arXiv:1811.08017 (2018)

E. Campbell, A random compiler for fast hamiltonian simulation, arXiv preprint arXiv:1811.08017 (2018)

work page arXiv 2018
[29]

K. Wan, M. Berta, and E. T. Campbell, Random- ized quantum algorithm for statistical phase estimation, Physical Review Letters129, 030503 (2022)

work page 2022
[30]

Weaving, A

T. Weaving, A. Mingare, A. Ralli, and P. V. Coveney, Towards compact wavefunctions from quantum-selected configuration interaction (2025), arXiv:2509.02525v3, arXiv:2509.02525 [quant-ph]

work page arXiv 2025
[31]

Peruzzo, J

A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, A variational eigenvalue solver on a photonic quantum processor, Nature Communications5, 4213 (2014)

work page 2014
[32]

J. R. McClean, J. Romero, R. Babbush, and A. Aspuru- Guzik, The theory of variational hybrid quantum- classical algorithms, New Journal of Physics18, 023023 (2016)

work page 2016
[33]

Bittel, S

L. Bittel, S. Gharibian, and M. Kliesch, The optimal depth of variational quantum algorithms is QCMA-hard to approximate, inProceedings of the 38th Computa- tional Complexity Conference (CCC), Leibniz Interna- tional Proceedings in Informatics (LIPIcs), Vol. 264 (2023) pp. 34:1–34:24

work page 2023
[34]

Cai, Resource estimation for quantum variational sim- ulations of the hubbard model, Physical Review Applied 14, 014059 (2020)

Z. Cai, Resource estimation for quantum variational sim- ulations of the hubbard model, Physical Review Applied 14, 014059 (2020)

work page 2020
[35]

Stilck Fran¸ ca and R

D. Stilck Fran¸ ca and R. Garc´ ıa-Patr´ on, Limitations of optimization algorithms on noisy quantum devices, Na- ture Physics17, 1221 (2021)

work page 2021
[36]

Cerezo, A

M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, Cost function dependent barren plateaus in shal- low parametrized quantum circuits, Nature Communica- tions12, 1791 (2021)

work page 2021
[37]

N¨ utzel, A

L. N¨ utzel, A. Gresch, L. Hehn, L. Marti, R. Freund, A. Steiner, C. D. Marciniak, T. Eckstein, N. Stockinger, S. Wolf, T. Monz, M. K¨ uhn, and M. J. Hartmann, Solv- ing an industrially relevant quantum chemistry problem on quantum hardware, Quantum Science and Technology 10, 015066 (2025)

work page 2025
[38]

Nakaji, L

K. Nakaji, L. B. Kristensen, R. Kemmoku, J. A. Campos- Gonzalez-Angulo, M. G. Vakili, H. Huang, M. Bagher- imehrab, C. Gorgulla, F. Wong, A. McCaskey, J.-S. Kim, T. Nguyen, P. Rao, Q. Gao, M. Sugawara, N. Yamamoto, and A. Aspuru-Guzik, The generative quantum eigen- solver (GQE) and its application for ground state search (2025), arXiv:2401.09253v2, arXiv:...

work page arXiv 2025
[39]

Vaswani, N

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, At- tention is all you need, inAdvances in Neural Informa- tion Processing Systems 30 (NeurIPS 2017)(2017) pp. 5998–6008

work page 2017
[40]

Y. O. Nakagawa, M. Kamoshita, W. Mizukami, S. Sudo, and Y. ya Ohnishi, Adapt-qsci: Adaptive construction of an input state for quantum-selected configuration in- teraction, Journal of Chemical Theory and Computation 20, 10817 (2024)

work page 2024
[41]

2503.06292 (2025)

A. Pellow-Jarman, S. McFarthing, D. H. Kang, P. Yoo, E. E. Elala, R. Pellow-Jarman, P. M. Nakliang, J. Kim, and J.-K. K. Rhee, HIVQE: Handover itera- tive variational quantum eigensolver for efficient quan- tum chemistry calculations (2025), arXiv:2503.06292v2, arXiv:2503.06292 [quant-ph]

work page arXiv 2025
[42]

P. Yoo, K. Kim, E. E. Elala, S. McFarthing, A. Pel- low, J. I. Fuks, D. H. Kang, P. Nakliang, J. Kim, H. Pathak, T. Shirakawa, S. Yunoki, and J.-K. K. Rhee, Extending the handover-iterative VQE to challenging strongly correlated systems:N 2 and Fe-S cluster (2026), arXiv:2601.06935v1, arXiv:2601.06935 [quant-ph]

work page arXiv 2026
[43]

Shirakawa, J

T. Shirakawa, J. Robledo-Moreno, T. Itoko, V. Tripathi, K. Ueda, Y. Kawashima, L. Broers, W. Kirby, H. Pathak, H. Paik, M. Tsuji, Y. Kodama, M. Sato, C. Evangeli- nos, S. Seelam, R. Walkup, S. Yunoki, M. Motta, P. Ju- 13 rcevic, H. Horii, and A. Mezzacapo, Closed-loop cal- culations of electronic structure on a quantum proces- sor and a classical supercom...

work page arXiv 2025
[44]

Z. Shao, P. Wang, Q. Zhu, R. Xu, J. Song, X. Bi, H. Zhang, M. Zhang, Y. K. Li, Y. Wu, and D. Guo, Deepseekmath: Pushing the limits of mathematical rea- soning in open language models (2024), arXiv:2402.03300 [cs.CL]

work page internal anchor Pith review Pith/arXiv arXiv 2024
[45]

W. J. Hehre, R. F. Stewart, and J. A. Pople, Self- consistent molecular-orbital methods. i. use of gaussian expansions of slater-type atomic orbitals, The Journal of Chemical Physics51, 2657 (1969)

work page 1969
[46]

W. J. Hehre, R. Ditchfield, and J. A. Pople, Self- consistent molecular orbital methods. xii. further exten- sions of gaussian-type basis sets for use in molecular or- bital studies of organic molecules, The Journal of Chem- ical Physics56, 2257 (1972)

work page 1972
[47]

Q. Sun, T. C. Berkelbach, N. S. Blunt, G. H. Booth, S. Guo, Z. Li, J. Liu, J. McClain, E. R. Sayfut- yarova, S. Sharma, S. Wouters, and G. K.-L. Chan, PySCF: The python-based simulations of chemistry framework, WIREs Computational Molecular Science8, e1340 (2018)

work page 2018
[48]

Radford, J

A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever, Language models are unsupervised multi- task learners (2019), openAI Technical Report

work page 2019
[49]

T. Wolf, L. Debut, V. Sanh, J. Chaumond, C. Delangue, A. Moi, P. Cistac, T. Rault, R. Louf, M. Funtowicz, J. Davison, S. Shleifer, P. von Platen, C. Ma, Y. Jer- nite, J. Plu, C. Xu, T. L. Scao, S. Gugger, M. Drame, Q. Lhoest, and A. M. Rush, Transformers: State-of-the- art natural language processing, inProceedings of the 2020 Conference on Empirical Meth...

work page 2020
[50]

Decoupled Weight Decay Regularization

I. Loshchilov and F. Hutter, Decoupled weight decay reg- ularization (2017), arXiv:1711.05101 [cs.LG]

work page internal anchor Pith review Pith/arXiv arXiv 2017
[51]

The CUDA-Q development team, CUDA-Q

work page
[52]

Richer, V

M. Richer, V. Chuiko, A. Alavi, P. W. Ayers, and T. Ver- straelen, Pyci: A python-scriptable library for arbitrary determinant ci, The Journal of Chemical Physics161, 132502 (2024)

work page 2024
[53]

Pachal, S

S. Pachal, S. Bhatnagar, and L. A. Prashanth, Gener- alized simultaneous perturbation-based gradient search with reduced estimator bias, IEEE Transactions on Au- tomatic Control70, 4687 (2025)

work page 2025
[54]

J. C. Spall, Multivariate stochastic approximation us- ing a simultaneous perturbation gradient approximation, IEEE Transactions on Automatic Control37, 332 (1992)

work page 1992
[55]

E. T. Campbell, B. M. Terhal, and C. Vuillot, Roads towards fault-tolerant universal quantum computation, Nature549, 172 (2017)

work page 2017
[56]

N. J. Ross and P. Selinger, Optimal ancilla-free clifford+t approximation ofz-rotations, Quantum Information & Computation16, 901 (2016)

work page 2016
[57]

P. A. M. Casares, R. Campos, and M. A. Martin-Delgado, Tfermion: A non-clifford gate cost assessment library of quantum phase estimation algorithms for quantum chem- istry, Quantum6, 768 (2022)

work page 2022
[58]

I. H. Kim, Y.-H. Liu, S. Pallister, W. Pol, S. Roberts, and E. Lee, Fault-tolerant resource estimate for quantum chemical simulations: Case study on li-ion battery elec- trolyte molecules, Physical Review Research4, 023019 (2022). 14 Appendix A: Operator pool The operator pool used by the Transformer is a fixed discrete vocabulary of unitary operators. Ou...

work page 2022
[59]

Comparing optimization performance with VQE The main-text results already show that the proposed workflow improves both gate efficiency and wavefunction compactness relative to fixed QSCI input-state families. Those comparisons, however, do not by themselves isolate whether the advantage comes specifically from the GQE-style discrete circuit search or sim...

work page
[60]

In an early fault-tolerant quantum computing (FTQC) setting, however, the more relevant metric is often the non-Clifford cost of implementing the state-preparation circuit

Rotation-gate efficiency The main text uses the two-qubit-gate count as the primary cost metric because it is the most relevant proxy for near- term circuit depth and error accumulation. In an early fault-tolerant quantum computing (FTQC) setting, however, the more relevant metric is often the non-Clifford cost of implementing the state-preparation circui...

work page