Learning to learn with quantum neural networks via classical neural networks

Guillaume Verdon; Hartmut Neven; Jarrod R. McClean; Kevin J. Sung; Masoud Mohseni; Michael Broughton; Ryan Babbush; Zhang Jiang

arxiv: 1907.05415 · v1 · pith:74W7H2KTnew · submitted 2019-07-11 · 🪐 quant-ph · cs.LG

Pith reviewed 2026-05-24 22:58 UTC · model grok-4.3

classification 🪐 quant-ph cs.LG

keywords quantum neural networks · meta-learning · variational algorithms · QAOA · VQE · parameter initialization · recurrent neural networks

The pith

Classical neural networks can be trained to suggest initial parameters that reduce the iterations needed for quantum variational algorithms to converge.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to train a classical recurrent neural network on small examples of QAOA and VQE problems so that it outputs parameter values close to good local minima. Starting a standard optimizer from those suggested values cuts the total number of cost-function evaluations required to reach a target accuracy. The same trained network works across different problem sizes, which means strategies learned from easy instances can be applied to harder ones. This hybrid setup lowers the number of quantum queries needed during optimization.
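The warm-start idea can be illustrated with a minimal sketch in pure Python. Everything here is an assumption made for illustration: the ring-graph MaxCut instances, the helper names (`ring`, `qaoa_cost`, `compass_search`), the simple compass search used as the downstream optimizer, and above all the stand-in for the trained network, which is replaced by parameters optimized on a small instance. It is not the paper's training procedure; it only shows how suggested parameters can be compared against random starts by counting cost-function queries.

```python
import cmath
import math
import random


def ring(n):
    """Edges of an n-node ring graph (hypothetical toy instance family)."""
    return [(i, (i + 1) % n) for i in range(n)]


def qaoa_cost(params, n, edges):
    """Negative expected cut size of depth-1 QAOA, by exact state-vector simulation."""
    gamma, beta = params
    dim = 2 ** n
    cuts = [sum(((z >> u) & 1) != ((z >> v) & 1) for u, v in edges) for z in range(dim)]
    # Uniform superposition followed by the diagonal cost unitary exp(-i * gamma * C).
    state = [cmath.exp(-1j * gamma * cut) / math.sqrt(dim) for cut in cuts]
    # Mixer exp(-i * beta * X) applied to every qubit in turn.
    cos_b, sin_b = math.cos(beta), -1j * math.sin(beta)
    for q in range(n):
        bit = 1 << q
        for z in range(dim):
            if not z & bit:
                a0, a1 = state[z], state[z | bit]
                state[z], state[z | bit] = cos_b * a0 + sin_b * a1, sin_b * a0 + cos_b * a1
    return -sum(abs(a) ** 2 * cut for a, cut in zip(state, cuts))


def compass_search(cost, x0, target=float("-inf"), step=0.4, min_step=1e-3):
    """Derivative-free coordinate search; returns (params, value, number of cost queries)."""
    x, fx, queries = list(x0), cost(x0), 1
    while fx > target and step > min_step:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                f_trial = cost(trial)
                queries += 1
                if f_trial < fx - 1e-12:
                    x, fx, improved = trial, f_trial, True
                    break
        if not improved:
            step /= 2  # refine the search once no axis move helps
    return x, fx, queries


small_n, large_n = 4, 6
# Stand-in for the trained network (an assumption): parameters tuned on the small instance.
suggested, _, _ = compass_search(lambda p: qaoa_cost(p, small_n, ring(small_n)), [0.1, 0.1])


def large_cost(p):
    return qaoa_cost(p, large_n, ring(large_n))


# An n-node ring has n edges; depth-1 QAOA on rings cuts at most 0.75 of them in expectation.
target = -0.74 * large_n

_, _, warm_queries = compass_search(large_cost, suggested, target)
rng = random.Random(0)
random_counts = []
for _ in range(20):
    start = [rng.uniform(-math.pi / 2, math.pi / 2) for _ in range(2)]
    random_counts.append(compass_search(large_cost, start, target)[2])
mean_random_queries = sum(random_counts) / len(random_counts)
print(warm_queries, mean_random_queries)
```

Ring graphs make the transfer unusually clean, because the depth-1 landscape per edge does not depend on the ring size; harder instance families would not be expected to transfer this exactly.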

Core claim

Training classical recurrent neural networks on small instances of QAOA for MaxCut, QAOA for the Sherrington-Kirkpatrick model, and VQE for the Hubbard model produces parameter guesses that, when used to initialize other optimizers, measurably decrease the iteration count to a given accuracy; the learned mapping generalizes across a range of instance sizes.

What carries the argument

A classical recurrent neural network that acts as a learned optimizer for the quantum circuit's parameters: it proposes parameter values over a small number of cost-function queries, and its final suggestion serves as a learned initializer for variational optimization.
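One way to picture a recurrent network in this role is as a rollout loop. The sketch below assumes the network is queried iteratively, receiving the previous parameters, the previous cost estimate, and its own hidden state, and emitting the next parameters. The cell is a tiny Elman-style layer with random, untrained weights, and the names `make_rnn`, `rnn_step`, `rollout`, and `toy_cost` are hypothetical. It shows only the data flow, not the paper's architecture or its meta-training loss.

```python
import math
import random


def make_rnn(n_params, hidden=4, seed=0):
    """Random, untrained weights for a small recurrent cell (illustrative only)."""
    rng = random.Random(seed)
    n_in = n_params + 1 + hidden  # previous params, previous cost, hidden state
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n_params)]
    return w_h, w_out


def rnn_step(weights, theta, cost_value, h):
    """One recurrent update: (params, cost, hidden) -> (next params, next hidden)."""
    w_h, w_out = weights
    x = list(theta) + [cost_value] + list(h)
    h_new = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in w_h]
    theta_new = [sum(w * v for w, v in zip(row, h_new)) for row in w_out]
    return theta_new, h_new


def rollout(weights, cost, n_params, steps=5):
    """Query the cost function `steps` times, letting the cell propose each point."""
    w_h, _ = weights
    theta, h, y = [0.0] * n_params, [0.0] * len(w_h), 0.0
    history = []
    for _ in range(steps):
        theta, h = rnn_step(weights, theta, y, h)
        y = cost(theta)  # on hardware this would be an estimate from circuit runs
        history.append((theta, y))
    return history


def toy_cost(t):
    """Placeholder for a variational cost function."""
    return (t[0] - 0.3) ** 2 + (t[1] + 0.2) ** 2


history = rollout(make_rnn(n_params=2), toy_cost, n_params=2)
# The best point seen during the rollout would be handed to a standard optimizer.
initial_guess, initial_cost = min(history, key=lambda pair: pair[1])
print(initial_guess, initial_cost)
```

With trained weights the rollout would be expected to move toward low-cost regions within a few queries; with random weights, as here, it only demonstrates the interface.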

If this is right

Fewer total optimization steps are needed for the tested QAOA and VQE problems when started from the network's suggestions.
The learned initialization strategy transfers from small to larger problem instances without retraining.
Training can be performed entirely on classically simulatable sizes and then deployed on quantum hardware for bigger instances.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The method could be tested on other variational algorithms whose landscapes share similar structure.
One could measure how the quality of the network's guesses scales with the size gap between training and target instances.
If the network also outputs a suggested learning rate or optimizer choice, further reductions in quantum queries might be possible.

Load-bearing premise

The parameter guesses produced by the network trained on small instances remain useful when applied to larger instances that cannot be simulated classically.

What would settle it

Train the network on small instances, then measure on a fresh set of larger instances whether the initialized optimizer still requires fewer iterations than random or default initialization to reach the same accuracy threshold.
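The comparison described above reduces to a queries-to-threshold metric. The sketch below assumes each optimization run is logged as a list of cost values, one per query, and that a reference optimum is available for each instance. The function names are hypothetical and the traces are made-up illustrative numbers, not results from the paper.

```python
def queries_to_threshold(trace, optimum, rel_tol):
    """1-based index of the first query within rel_tol relative error, else None."""
    for k, value in enumerate(trace, start=1):
        if abs(value - optimum) <= rel_tol * abs(optimum):
            return k
    return None


def mean_queries(traces, optima, rel_tol, budget):
    """Average queries to threshold; runs that never reach it are charged the full budget."""
    counts = []
    for trace, optimum in zip(traces, optima):
        k = queries_to_threshold(trace, optimum, rel_tol)
        counts.append(budget if k is None else k)
    return sum(counts) / len(counts)


# Made-up cost traces for two held-out instances (illustrative only).
optima = [-3.0, -4.5]
learned_init_traces = [[-2.9, -2.98, -3.0], [-4.3, -4.46, -4.5]]
random_init_traces = [[-2.0, -2.5, -2.8, -2.99], [-2.2, -3.0, -3.9, -4.2]]

learned_mean = mean_queries(learned_init_traces, optima, rel_tol=0.01, budget=4)
random_mean = mean_queries(random_init_traces, optima, rel_tol=0.01, budget=4)
print(learned_mean, random_mean)
```

Charging the full budget to runs that never reach the threshold is one possible convention; reporting the failure rate separately would be an equally reasonable choice.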

Figures

Figures reproduced from arXiv: 1907.05415 by Guillaume Verdon, Hartmut Neven, Jarrod R. McClean, Kevin J. Sung, Masoud Mohseni, Michael Broughton, Ryan Babbush, Zhang Jiang.

**Figure 1.** This hybrid computational graph can be con…

**Figure 3.** The specific class of VQE problems we chose to…

**Figure 3.** Displayed above are the average relative errors with respect to the number of objective function queries during the…

**Figure 4.** The above histograms represent the parameter…

Original abstract

Quantum Neural Networks (QNNs) are a promising variational learning paradigm with applications to near-term quantum processors, however they still face some significant challenges. One such challenge is finding good parameter initialization heuristics that ensure rapid and consistent convergence to local minima of the parameterized quantum circuit landscape. In this work, we train classical neural networks to assist in the quantum learning process, also know as meta-learning, to rapidly find approximate optima in the parameter landscape for several classes of quantum variational algorithms. Specifically, we train classical recurrent neural networks to find approximately optimal parameters within a small number of queries of the cost function for the Quantum Approximate Optimization Algorithm (QAOA) for MaxCut, QAOA for Sherrington-Kirkpatrick Ising model, and for a Variational Quantum Eigensolver for the Hubbard model. By initializing other optimizers at parameter values suggested by the classical neural network, we demonstrate a significant improvement in the total number of optimization iterations required to reach a given accuracy. We further demonstrate that the optimization strategies learned by the neural network generalize well across a range of problem instance sizes. This opens up the possibility of training on small, classically simulatable problem instances, in order to initialize larger, classically intractably simulatable problem instances on quantum devices, thereby significantly reducing the number of required quantum-classical optimization iterations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes training classical recurrent neural networks (RNNs) to meta-learn approximate optimal parameters for variational quantum algorithms, specifically QAOA applied to MaxCut and the Sherrington-Kirkpatrick Ising model as well as VQE for the Hubbard model. By using the RNN-suggested parameters to initialize standard classical optimizers, the authors report a reduction in the total number of optimization iterations needed to reach a target accuracy. They further claim that the learned initialization strategies generalize across a range of problem instance sizes, which would allow training on small classically simulatable instances for deployment on larger classically intractable instances.

Significance. If the reported iteration reductions and cross-size generalization are robust, the method could reduce the number of quantum circuit evaluations required for variational algorithms on near-term hardware, by shifting part of the optimization burden to classical pre-training on small instances.

major comments (2)

[Abstract] The central utility claim rests on the statement that 'the optimization strategies learned by the neural network generalize well across a range of problem instance sizes' and thereby open the possibility of training on small simulatable instances for larger intractable ones; however, the manuscript supplies no quantitative transfer results (e.g., iteration savings measured on instance sizes or graph ensembles outside the training distribution) that would substantiate this extrapolation once classical simulation becomes impossible.
[Experimental sections] The abstract and manuscript provide no description of the RNN training procedure, data splits, baseline optimizers, number of independent trials, or statistical significance tests for the claimed reduction in optimization iterations; without these details the empirical improvements cannot be assessed for robustness or reproducibility.

minor comments (1)

[Abstract] 'also know as' should read 'also known as'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. We address the major points below and will revise the manuscript to improve reproducibility and strengthen the presentation of generalization results.

Referee: [Abstract] The central utility claim rests on the statement that 'the optimization strategies learned by the neural network generalize well across a range of problem instance sizes' and thereby open the possibility of training on small simulatable instances for larger intractable ones; however, the manuscript supplies no quantitative transfer results (e.g., iteration savings measured on instance sizes or graph ensembles outside the training distribution) that would substantiate this extrapolation once classical simulation becomes impossible.

Authors: The manuscript reports quantitative iteration reductions when applying the RNN-initialized parameters to problem instances larger than those used during RNN training, for both QAOA (MaxCut and SK model) and VQE (Hubbard), with the savings shown across a range of sizes in the experimental figures. These results provide evidence of generalization within the simulatable regime. We agree that more explicit out-of-distribution transfer metrics would better support the extrapolation claim and will add a dedicated analysis with held-out size ranges and ensemble variations in the revised experimental section, along with updated abstract wording if appropriate.

revision: partial

Referee: [Experimental sections] The abstract and manuscript provide no description of the RNN training procedure, data splits, baseline optimizers, number of independent trials, or statistical significance tests for the claimed reduction in optimization iterations; without these details the empirical improvements cannot be assessed for robustness or reproducibility.

Authors: We acknowledge that these methodological details require expansion for full reproducibility. We will revise the experimental sections to include: (i) the complete RNN training procedure (architecture, meta-learning objective, and optimizer for the RNN itself); (ii) data generation and splits used for meta-training; (iii) the specific classical baseline optimizers compared; (iv) the number of independent trials per experiment; and (v) the statistical tests applied to the iteration reductions. Hyperparameter tables and pseudocode will also be added.

revision: yes

Circularity Check

0 steps flagged

No circularity: empirical meta-learning study with no derivation chain

The paper describes training classical RNNs on small QAOA/VQE instances to suggest parameter initializations, then empirically measures reduced optimizer iterations on held-out instances and reports generalization across sizes via direct testing. No equations, derivations, uniqueness theorems, or ansatzes are presented that could reduce to fitted inputs or self-citations. All claims rest on experimental results against external simulation benchmarks rather than any self-referential construction, making the work self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The central claim rests on the unexamined assumption that meta-learned initializations transfer across instance sizes.

axioms (1)

domain assumption: Meta-learned initialization strategies from small instances transfer to larger instances of the same problem class.
Required for the claim that small-instance training can initialize classically intractable problems.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Graph-Conditioned Meta-Optimizer for QAOA Parameter Generation on Multiple Problem Classes
quant-ph · 2026-04 · unverdicted · novelty 7.0

A graph-conditioned meta-optimizer learns QAOA parameter trajectories from one problem class and transfers them to others, yielding better initializations than standard methods in an empirical study of 64 settings.

[1] [1]

We consider an eﬃcient optimizer to be one which ﬁnds suﬃciently optimal approximate local minima of cost functions in as few function queries as possible

Meta-Training & Loss functions The objective of quantum-classical meta-learning is to train our RNN to learn an eﬃcient parameter update scheme for a family of cost functions of interest, i.e., to discover an optimizer which eﬃciently optimizes a cer- tain distribution of optimizees, on average. We consider an eﬃcient optimizer to be one which ﬁnds suﬃcie...

work page

[2] [2]

(Hubbard VQE). We provide a brief introduction to each of these three classes, as well as describe the dis- tribution of instances from these classes from which we sampled to generate training and testing instances. A. Quantum Approximate Optimization Algorithms Let us ﬁrst introduce a general QAOA ansatz before we specialize to applications to MaxCut pro...

work page

[3] [3]

Let us ﬁrst provide a brief introduc- tion to the MaxCut problem

MaxCut QAOA The problem for which the QAOA was ﬁrst explored was for MaxCut [3]. Let us ﬁrst provide a brief introduc- tion to the MaxCut problem. Suppose we have a graph G ={V,E} whereE are the edges and V the vertices. Given a partition of these vertices into a subset P0 and its complement P1 =V\P 0, the corresponding cut set C⊆E is the subset of edges ...

work page

[4] [4]

Many problems in combinatorial optimization can be mapped to these models [65] (for example, training Boltz- mann machine neural networks [24, 66])

Ising QAOA Another domain of application where we tested quantum-classical meta-learning was with the QAOA for ﬁnding low energy states of a type of Ising spin glass model known as the Sherrington-Kirkpatrick (SK) model. Many problems in combinatorial optimization can be mapped to these models [65] (for example, training Boltz- mann machine neural network...

work page

[5] [5]

Hubbard Model VQE Here we describe the variational quantum eigensolver (VQE) ansatze that were used to generate the results in Fig. 3. The speciﬁc class of VQE problems we chose to consider were for variational preparation of ground states of Hubbard model lattices [39]. The Hubbard model is an idealized model of fermions interacting on a lattice. The 2D ...

work page

[6] [6]

For the testing instances used to generate the results presented in Figure 3, following a standard prescription

work page

[7] [7]

for the number of repetitions required to guarantee an upper bound to the variance of 0.05, the number of rep- etitions (QNN inference runs) should be of (7 ± 4)× 105 repetitions for MaxCut QAOA, (2 .3± 0.6)× 104 repe- titions for Ising QAOA, and (3 ± 2)× 105 repetitions for Hubbard VQE. In terms of wall clock time, assuming that the QPU can execute 10000...

work page

[8] [8]

Quantum Computing in the NISQ era and beyond

J. Preskill, arXiv preprint arXiv:1801.00862 (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018

[9] [9]

Classification with Quantum Neural Networks on Near Term Processors

E. Farhi and H. Neven, arXiv preprint arXiv:1802.06002 (2018)

work page internal anchor Pith review Pith/arXiv arXiv 2018

[10] [10]

A Quantum Approximate Optimization Algorithm

E. Farhi, J. Goldstone, and S. Gutmann, arXiv preprint arXiv:1411.4028 (2014)

work page internal anchor Pith review Pith/arXiv arXiv 2014

[11] [11]

Peruzzo, J

A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. Obrien, Nature communications 5, 4213 (2014)

work page 2014

[12] [12]

Killoran, T

N. Killoran, T. R. Bromley, J. M. Arrazola, M. Schuld, N. Quesada, and S. Lloyd, arXiv preprint arXiv:1806.06871 (2018)

work page arXiv 2018

[13] [13]

Wecker, M

D. Wecker, M. B. Hastings, and M. Troyer, Phys. Rev. A 92, 042303 (2015)

work page 2015

[14] [14]

Biamonte, P

J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Nature 549, 195 (2017)

work page 2017

[15] [15]

Zhou, S.-T

L. Zhou, S.-T. Wang, S. Choi, H. Pichler, and M. D. Lukin, arXiv preprint arXiv:1812.01041 (2018)

work page arXiv 2018

[16] [16]

J. R. McClean, J. Romero, R. Babbush, and A. Aspuru- Guzik, New Journal of Physics 18, 023023 (2016)

work page 2016

[17] [17]

From the Quantum Approximate Optimization Algorithm to a Quantum Alternating Operator Ansatz

S. Hadﬁeld, Z. Wang, B. O’Gorman, E. G. Rief- fel, D. Venturelli, and R. Biswas, arXiv preprint arXiv:1709.03489 (2017)

work page internal anchor Pith review Pith/arXiv arXiv 2017

[18] [18]

Grant, M

E. Grant, M. Benedetti, S. Cao, A. Hallam, J. Lockhart, V. Stojevic, A. G. Green, and S. Severini, npj Quantum Information 4, 65 (2018)

work page 2018

[19] S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sornborger, and P. J. Coles, Quantum 3, 140 (2019).

[20] M. Schuld and N. Killoran, Physical Review Letters 122, 040504 (2019).

[21] S. McArdle, T. Jones, S. Endo, Y. Li, S. Benjamin, and X. Yuan, arXiv preprint arXiv:1804.03023 (2018).

[22] M. Benedetti, E. Grant, L. Wossnig, and S. Severini, New Journal of Physics 21, 043023 (2019).

[23] B. Nash, V. Gheorghiu, and M. Mosca, arXiv preprint arXiv:1904.01972 (2019).

[24] Z. Jiang, J. McClean, R. Babbush, and H. Neven, arXiv preprint arXiv:1812.08190 (2018).

[25] G. R. Steinbrecher, J. P. Olson, D. Englund, and J. Carolan, arXiv preprint arXiv:1808.10047 (2018).

[26] M. Fingerhuth, T. Babej, et al., "A quantum alternating operator ansatz with hard and soft constraints for lattice protein folding," arXiv preprint arXiv:1810.13411 (2018).

[27] R. LaRose, A. Tikku, É. O’Neel-Judy, L. Cincio, and P. J. Coles, "Variational Quantum State Diagonalization," arXiv preprint arXiv:1810.10506 (2018).

[28] L. Cincio, Y. Subaşı, A. T. Sornborger, and P. J. Coles, New Journal of Physics 20, 113022 (2018).

[29] H. Situ, Z. Huang, X. Zou, and S. Zheng, Quantum Information Processing 18, 230 (2019).

[30] H. Chen, L. Wossnig, S. Severini, H. Neven, and M. Mohseni, arXiv preprint arXiv:1805.08654 (2018).

[31] G. Verdon, M. Broughton, and J. Biamonte, arXiv preprint arXiv:1712.05304 (2017).

[32] Y. LeCun, Y. Bengio, and G. Hinton, Nature 521, 436 (2015).

[33] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, Vol. 1 (MIT Press, Cambridge, 2016).

[34] J. Schmidhuber, Neural Networks 61, 85 (2015).

[35] G. Verdon, J. Pye, and M. Broughton, "A Universal Training Algorithm for Quantum Deep Learning," arXiv preprint arXiv:1806.09729 (2018).

[36] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Nature Communications 9 (2018), doi:10.1038/s41467-018-07090-4.

[37] M. Schuld, V. Bergholm, C. Gogolin, J. Izaac, and N. Killoran, "Evaluating analytic gradients on quantum hardware," arXiv preprint arXiv:1811.11184 (2018).

[38] A. Harrow and J. Napp, arXiv preprint arXiv:1901.05374 (2019).

[39] Z.-C. Yang, A. Rahmani, A. Shabani, H. Neven, and C. Chamon, Physical Review X 7, 021027 (2017).

[40] E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, arXiv preprint arXiv:1903.05076 (2019).

[41] Y. Chen, M. W. Hoffman, S. G. Colmenarejo, M. Denil, T. P. Lillicrap, M. Botvinick, and N. de Freitas, arXiv preprint arXiv:1611.03824 (2016).

[42] M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, B. Shillingford, and N. De Freitas, in Advances in Neural Information Processing Systems (2016) pp. 3981–3989.

[43] B. Zoph and Q. V. Le, "Neural Architecture Search with Reinforcement Learning," arXiv preprint arXiv:1611.01578 (2016).

[44] C. Finn, P. Abbeel, and S. Levine, arXiv preprint arXiv:1703.03400 (2017).

[45] M. Long, Y. Cao, J. Wang, and M. I. Jordan, arXiv preprint arXiv:1502.02791 (2015).

[46] I. D. Kivlichan, J. McClean, N. Wiebe, C. Gidney, A. Aspuru-Guzik, G. K.-L. Chan, and R. Babbush, Phys. Rev. Lett. 120, 110501 (2018).

[47] Z. Jiang, K. J. Sung, K. Kechedzhi, V. N. Smelyanskiy, and S. Boixo, Physical Review Applied 9, 044036 (2018).

[48] J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright, SIAM Journal on Optimization 9, 112 (1998).

[49] G. Nannicini, Physical Review E 99, 013304 (2019).

[50] G. G. Guerreschi and M. Smelyanskiy, arXiv preprint arXiv:1701.01450 (2017).

[51] J. C. Spall, IEEE Transactions on Aerospace and Electronic Systems 34, 817 (1998).

[52] M. J. Powell, Cambridge NA Report NA2009/06, University of Cambridge, Cambridge, 26 (2009).

[53] G. Nannicini, "Performance of hybrid quantum/classical variational heuristics for combinatorial optimization," arXiv preprint arXiv:1805.12037 (2018).

[54] J. McClean, J. Romero, R. Babbush, and A. Aspuru-Guzik, New Journal of Physics 18, 023023 (2016).

[55] In general, one may relay the raw measurement results to the classical processing unit, which can then compute the expectation value of the cost function. For the purposes of this paper, we assumed the classical optimizer only has access to (noisy) estimates of the expectation value of the cost Hamiltonian.

[56] A. Y. Kitaev, A. Shen, and M. N. Vyalyi, Classical and quantum computation, 47 (American Mathematical Soc., 2002).

[57] N. C. Rubin, R. Babbush, and J. McClean, New Journal of Physics 20, 053020 (2018).

[58] Y. LeCun, D. Touretzky, G. Hinton, and T. Sejnowski, in Proceedings of the 1988 connectionist models summer school, Vol. 1 (CMU, Pittsburgh, Pa: Morgan Kaufmann,

[59] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Nature 323, 533 (1986).

[60] J. Romero and A. Aspuru-Guzik, "Variational quantum generators: Generative adversarial quantum machine learning for continuous distributions," arXiv preprint arXiv:1901.00848 (2019).

[61] A. Nichol and J. Schulman, "On First-Order Meta-Learning Algorithms," arXiv preprint arXiv:1803.02999 (2018).

[62] A. A. Rusu, D. Rao, J. Sygnowski, O. Vinyals, R. Pascanu, S. Osindero, and R. Hadsell, arXiv preprint arXiv:1807.05960 (2018).

[63] C. Audet and M. Kokkolaras, "Blackbox and derivative-free optimization: theory, algorithms and applications," (2016).

[64] Z. C. Lipton, J. Berkowitz, and C. Elkan, arXiv preprint arXiv:1506.00019 (2015).

[65] R. Pascanu, T. Mikolov, and Y. Bengio, in International Conference on Machine Learning (2013) pp. 1310–1318.

[66] S. Hochreiter and J. Schmidhuber, Neural Computation 9, 1735 (1997).

[67] A. Bapat and S. Jordan, arXiv preprint arXiv:1812.02746 (2018).

[68] G. Verdon, J. M. Arrazola, K. Brádler, and N. Killoran, "A Quantum Approximate Optimization Algorithm for continuous problems," arXiv preprint arXiv:1902.00409 (2019).

[69] C. E. Rasmussen, in Advanced lectures on machine learning (Springer, 2004) pp. 63–71.

[70] F. G. Brandao, M. Broughton, E. Farhi, S. Gutmann, and H. Neven, arXiv preprint arXiv:1812.04170 (2018).

[71] M. Suzuki, Physics Letters A 146, 319 (1990).

[72] T. Kadowaki and H. Nishimori, Physical Review E 58, 5355 (1998).

[73] M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, and R. Melko, Physical Review X 8, 021050 (2018).

[74] J. R. McClean, I. D. Kivlichan, D. S. Steiger, Y. Cao, E. S. Fried, C. Gidney, T. Häner, V. Havlíček, Z. Jiang, M. Neeley, et al., arXiv preprint arXiv:1710.07629 (2017).

[75] L. Prechelt, in Neural Networks: Tricks of the trade (Springer, 1998) pp. 55–69.

[76] M. LLC, "Cirq: A python framework for creating, editing, and invoking noisy intermediate scale quantum circuits," (2018).

[77] M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al.

[78] S. Ioffe and C. Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," arXiv preprint arXiv:1502.03167 (2015).

[79] D. J. Wales and J. P. Doye, The Journal of Physical Chemistry A 101, 5111 (1997).

[80] P. O'Malley, R. Babbush, I. D. Kivlichan, J. Romero, J. R. McClean, R. Barends, J. Kelly, P. Roushan, A. Tranter, N. Ding, et al., Physical Review X 6, 031007 (2016).