pith. sign in

arxiv: 1907.05415 · v1 · pith:74W7H2KTnew · submitted 2019-07-11 · 🪐 quant-ph · cs.LG

Learning to learn with quantum neural networks via classical neural networks

Pith reviewed 2026-05-24 22:58 UTC · model grok-4.3

classification 🪐 quant-ph cs.LG
keywords quantum neural networksmeta-learningvariational algorithmsQAOAVQEparameter initializationrecurrent neural networks
0
0 comments X

The pith

Classical neural networks can be trained to suggest initial parameters that reduce the iterations needed for quantum variational algorithms to converge.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to train a classical recurrent neural network on small examples of QAOA and VQE problems so that it outputs parameter values close to good local minima. Starting a standard optimizer from those suggested values cuts the total number of cost-function evaluations required to reach a target accuracy. The same trained network works across different problem sizes, which means strategies learned from easy instances can be applied to harder ones. This hybrid setup lowers the number of quantum queries needed during optimization.

Core claim

Training classical recurrent neural networks on small instances of QAOA for MaxCut, QAOA for the Sherrington-Kirkpatrick model, and VQE for the Hubbard model produces parameter guesses that, when used to initialize other optimizers, measurably decrease the iteration count to a given accuracy; the learned mapping generalizes across a range of instance sizes.

What carries the argument

A classical recurrent neural network that takes a problem instance description and outputs approximate optimal parameters for the corresponding quantum circuit, acting as a learned initializer for variational optimization.

If this is right

  • Fewer total optimization steps are needed for the tested QAOA and VQE problems when started from the network's suggestions.
  • The learned initialization strategy transfers from small to larger problem instances without retraining.
  • Training can be performed entirely on classically simulatable sizes and then deployed on quantum hardware for bigger instances.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The method could be tested on other variational algorithms whose landscapes share similar structure.
  • One could measure how the quality of the network's guesses scales with the size gap between training and target instances.
  • If the network also outputs a suggested learning rate or optimizer choice, further reductions in quantum queries might be possible.

Load-bearing premise

The parameter guesses produced by the network trained on small instances remain useful when applied to larger instances that cannot be simulated classically.

What would settle it

Train the network on small instances, then measure on a fresh set of larger instances whether the initialized optimizer still requires fewer iterations than random or default initialization to reach the same accuracy threshold.

Figures

Figures reproduced from arXiv: 1907.05415 by Guillaume Verdon, Hartmut Neven, Jarrod R. McClean, Kevin J. Sung, Masoud Mohseni, Michael Broughton, Ryan Babbush, Zhang Jiang.

Figure 1
Figure 1. Figure 1: Unrolling the temporal quantum-classical hybrid [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 1
Figure 1. Figure 1: This hybrid computational graph can be con [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 3
Figure 3. Figure 3: The specific class of VQE problems we chose to [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 3
Figure 3. Figure 3: Displayed above are the average relative errors with respect to the number of objective function queries during the [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The above histograms represent the parameter [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
read the original abstract

Quantum Neural Networks (QNNs) are a promising variational learning paradigm with applications to near-term quantum processors, however they still face some significant challenges. One such challenge is finding good parameter initialization heuristics that ensure rapid and consistent convergence to local minima of the parameterized quantum circuit landscape. In this work, we train classical neural networks to assist in the quantum learning process, also know as meta-learning, to rapidly find approximate optima in the parameter landscape for several classes of quantum variational algorithms. Specifically, we train classical recurrent neural networks to find approximately optimal parameters within a small number of queries of the cost function for the Quantum Approximate Optimization Algorithm (QAOA) for MaxCut, QAOA for Sherrington-Kirkpatrick Ising model, and for a Variational Quantum Eigensolver for the Hubbard model. By initializing other optimizers at parameter values suggested by the classical neural network, we demonstrate a significant improvement in the total number of optimization iterations required to reach a given accuracy. We further demonstrate that the optimization strategies learned by the neural network generalize well across a range of problem instance sizes. This opens up the possibility of training on small, classically simulatable problem instances, in order to initialize larger, classically intractably simulatable problem instances on quantum devices, thereby significantly reducing the number of required quantum-classical optimization iterations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes training classical recurrent neural networks (RNNs) to meta-learn approximate optimal parameters for variational quantum algorithms, specifically QAOA applied to MaxCut and the Sherrington-Kirkpatrick Ising model as well as VQE for the Hubbard model. By using the RNN-suggested parameters to initialize standard classical optimizers, the authors report a reduction in the total number of optimization iterations needed to reach a target accuracy. They further claim that the learned initialization strategies generalize across a range of problem instance sizes, which would allow training on small classically simulatable instances for deployment on larger classically intractable instances.

Significance. If the reported iteration reductions and cross-size generalization are robust, the method could reduce the number of quantum circuit evaluations required for variational algorithms on near-term hardware, by shifting part of the optimization burden to classical pre-training on small instances.

major comments (2)
  1. [Abstract] Abstract: the central utility claim rests on the statement that 'the optimization strategies learned by the neural network generalize well across a range of problem instance sizes' and thereby open the possibility of training on small simulatable instances for larger intractable ones; however, the manuscript supplies no quantitative transfer results (e.g., iteration savings measured on instance sizes or graph ensembles outside the training distribution) that would substantiate this extrapolation once classical simulation becomes impossible.
  2. [Experimental sections] Experimental sections: the abstract and manuscript provide no description of the RNN training procedure, data splits, baseline optimizers, number of independent trials, or statistical significance tests for the claimed reduction in optimization iterations; without these details the empirical improvements cannot be assessed for robustness or reproducibility.
minor comments (1)
  1. [Abstract] Abstract: 'also know as' should read 'also known as'.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and constructive comments. We address the major points below and will revise the manuscript to improve reproducibility and strengthen the presentation of generalization results.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the central utility claim rests on the statement that 'the optimization strategies learned by the neural network generalize well across a range of problem instance sizes' and thereby open the possibility of training on small simulatable instances for larger intractable ones; however, the manuscript supplies no quantitative transfer results (e.g., iteration savings measured on instance sizes or graph ensembles outside the training distribution) that would substantiate this extrapolation once classical simulation becomes impossible.

    Authors: The manuscript reports quantitative iteration reductions when applying the RNN-initialized parameters to problem instances larger than those used during RNN training, for both QAOA (MaxCut and SK model) and VQE (Hubbard), with the savings shown across a range of sizes in the experimental figures. These results provide evidence of generalization within the simulatable regime. We agree that more explicit out-of-distribution transfer metrics would better support the extrapolation claim and will add a dedicated analysis with held-out size ranges and ensemble variations in the revised experimental section, along with updated abstract wording if appropriate. revision: partial

  2. Referee: [Experimental sections] Experimental sections: the abstract and manuscript provide no description of the RNN training procedure, data splits, baseline optimizers, number of independent trials, or statistical significance tests for the claimed reduction in optimization iterations; without these details the empirical improvements cannot be assessed for robustness or reproducibility.

    Authors: We acknowledge that these methodological details require expansion for full reproducibility. We will revise the experimental sections to include: (i) the complete RNN training procedure (architecture, meta-learning objective, and optimizer for the RNN itself); (ii) data generation and splits used for meta-training; (iii) the specific classical baseline optimizers compared; (iv) the number of independent trials per experiment; and (v) the statistical tests applied to the iteration reductions. Hyperparameter tables and pseudocode will also be added. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical meta-learning study with no derivation chain

full rationale

The paper describes training classical RNNs on small QAOA/VQE instances to suggest parameter initializations, then empirically measures reduced optimizer iterations on held-out instances and reports generalization across sizes via direct testing. No equations, derivations, uniqueness theorems, or ansatzes are presented that could reduce to fitted inputs or self-citations. All claims rest on experimental results against external simulation benchmarks rather than any self-referential construction, making the work self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated. The central claim rests on the unexamined assumption that meta-learned initializations transfer across instance sizes.

axioms (1)
  • domain assumption Meta-learned initialization strategies from small instances transfer to larger instances of the same problem class.
    Required for the claim that small-instance training can initialize classically intractable problems.

pith-pipeline@v0.9.0 · 5789 in / 1203 out tokens · 43527 ms · 2026-05-24T22:58:06.968107+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Graph-Conditioned Meta-Optimizer for QAOA Parameter Generation on Multiple Problem Classes

    quant-ph 2026-04 unverdicted novelty 7.0

    A graph-conditioned meta-optimizer learns QAOA parameter trajectories from one problem class and transfers them to others, yielding better initializations than standard methods in an empirical study of 64 settings.

Reference graph

Works this paper leans on

81 extracted references · 81 canonical work pages · cited by 1 Pith paper · 23 internal anchors

  1. [1]

    We consider an efficient optimizer to be one which finds sufficiently optimal approximate local minima of cost functions in as few function queries as possible

    Meta-Training & Loss functions The objective of quantum-classical meta-learning is to train our RNN to learn an efficient parameter update scheme for a family of cost functions of interest, i.e., to discover an optimizer which efficiently optimizes a cer- tain distribution of optimizees, on average. We consider an efficient optimizer to be one which finds sufficie...

  2. [2]

    (Hubbard VQE). We provide a brief introduction to each of these three classes, as well as describe the dis- tribution of instances from these classes from which we sampled to generate training and testing instances. A. Quantum Approximate Optimization Algorithms Let us first introduce a general QAOA ansatz before we specialize to applications to MaxCut pro...

  3. [3]

    Let us first provide a brief introduc- tion to the MaxCut problem

    MaxCut QAOA The problem for which the QAOA was first explored was for MaxCut [3]. Let us first provide a brief introduc- tion to the MaxCut problem. Suppose we have a graph G ={V,E} whereE are the edges and V the vertices. Given a partition of these vertices into a subset P0 and its complement P1 =V\P 0, the corresponding cut set C⊆E is the subset of edges ...

  4. [4]

    Many problems in combinatorial optimization can be mapped to these models [65] (for example, training Boltz- mann machine neural networks [24, 66])

    Ising QAOA Another domain of application where we tested quantum-classical meta-learning was with the QAOA for finding low energy states of a type of Ising spin glass model known as the Sherrington-Kirkpatrick (SK) model. Many problems in combinatorial optimization can be mapped to these models [65] (for example, training Boltz- mann machine neural network...

  5. [5]

    Hubbard Model VQE Here we describe the variational quantum eigensolver (VQE) ansatze that were used to generate the results in Fig. 3. The specific class of VQE problems we chose to consider were for variational preparation of ground states of Hubbard model lattices [39]. The Hubbard model is an idealized model of fermions interacting on a lattice. The 2D ...

  6. [6]

    For the testing instances used to generate the results presented in Figure 3, following a standard prescription

  7. [7]

    for the number of repetitions required to guarantee an upper bound to the variance of 0.05, the number of rep- etitions (QNN inference runs) should be of (7 ± 4)× 105 repetitions for MaxCut QAOA, (2 .3± 0.6)× 104 repe- titions for Ising QAOA, and (3 ± 2)× 105 repetitions for Hubbard VQE. In terms of wall clock time, assuming that the QPU can execute 10000...

  8. [8]

    Quantum Computing in the NISQ era and beyond

    J. Preskill, arXiv preprint arXiv:1801.00862 (2018)

  9. [9]

    Classification with Quantum Neural Networks on Near Term Processors

    E. Farhi and H. Neven, arXiv preprint arXiv:1802.06002 (2018)

  10. [10]

    A Quantum Approximate Optimization Algorithm

    E. Farhi, J. Goldstone, and S. Gutmann, arXiv preprint arXiv:1411.4028 (2014)

  11. [11]

    Peruzzo, J

    A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. Obrien, Nature communications 5, 4213 (2014)

  12. [12]

    Killoran, T

    N. Killoran, T. R. Bromley, J. M. Arrazola, M. Schuld, N. Quesada, and S. Lloyd, arXiv preprint arXiv:1806.06871 (2018)

  13. [13]

    Wecker, M

    D. Wecker, M. B. Hastings, and M. Troyer, Phys. Rev. A 92, 042303 (2015)

  14. [14]

    Biamonte, P

    J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and S. Lloyd, Nature 549, 195 (2017)

  15. [15]

    Zhou, S.-T

    L. Zhou, S.-T. Wang, S. Choi, H. Pichler, and M. D. Lukin, arXiv preprint arXiv:1812.01041 (2018)

  16. [16]

    J. R. McClean, J. Romero, R. Babbush, and A. Aspuru- Guzik, New Journal of Physics 18, 023023 (2016)

  17. [17]

    From the Quantum Approximate Optimization Algorithm to a Quantum Alternating Operator Ansatz

    S. Hadfield, Z. Wang, B. O’Gorman, E. G. Rief- fel, D. Venturelli, and R. Biswas, arXiv preprint arXiv:1709.03489 (2017)

  18. [18]

    Grant, M

    E. Grant, M. Benedetti, S. Cao, A. Hallam, J. Lockhart, V. Stojevic, A. G. Green, and S. Severini, npj Quantum Information 4, 65 (2018)

  19. [19]

    Khatri, R

    S. Khatri, R. LaRose, A. Poremba, L. Cincio, A. T. Sorn- borger, and P. J. Coles, Quantum 3, 140 (2019)

  20. [20]

    Schuld and N

    M. Schuld and N. Killoran, Physical review letters 122, 040504 (2019)

  21. [21]

    McArdle, T

    S. McArdle, T. Jones, S. Endo, Y. Li, S. Benjamin, and X. Yuan, arXiv preprint arXiv:1804.03023 (2018)

  22. [22]

    Benedetti, E

    M. Benedetti, E. Grant, L. Wossnig, and S. Severini, New Journal of Physics 21, 043023 (2019)

  23. [23]

    B. Nash, V. Gheorghiu, and M. Mosca, arXiv preprint arXiv:1904.01972 (2019)

  24. [24]

    Jiang, J

    Z. Jiang, J. McClean, R. Babbush, and H. Neven, arXiv preprint arXiv:1812.08190 (2018)

  25. [25]

    G. R. Steinbrecher, J. P. Olson, D. Englund, and J. Car- olan, arXiv preprint arXiv:1808.10047 (2018)

  26. [26]

    A quantum alternating operator ansatz with hard and soft constraints for lattice protein folding

    M. Fingerhuth, T. Babej, et al. , arXiv preprint arXiv:1810.13411 (2018)

  27. [27]

    Variational Quantum State Diagonalization

    R. LaRose, A. Tikku, ´E. O’Neel-Judy, L. Cincio, and P. J. Coles, arXiv preprint arXiv:1810.10506 (2018)

  28. [28]

    Cincio, Y

    L. Cincio, Y. Suba¸ sı, A. T. Sornborger, and P. J. Coles, New Journal of Physics 20, 113022 (2018)

  29. [29]

    H. Situ, Z. Huang, X. Zou, and S. Zheng, Quantum Information Processing 18, 230 (2019)

  30. [30]

    H. Chen, L. Wossnig, S. Severini, H. Neven, and M. Mohseni, arXiv preprint arXiv:1805.08654 (2018)

  31. [31]

    Verdon, M

    G. Verdon, M. Broughton, and J. Biamonte, arXiv preprint arXiv:1712.05304 (2017)

  32. [32]

    LeCun, Y

    Y. LeCun, Y. Bengio, and G. Hinton, nature 521, 436 (2015)

  33. [33]

    Goodfellow, Y

    I. Goodfellow, Y. Bengio, A. Courville, and Y. Bengio, Deep learning, Vol. 1 (MIT press Cambridge, 2016)

  34. [34]

    Schmidhuber, Neural networks 61, 85 (2015)

    J. Schmidhuber, Neural networks 61, 85 (2015)

  35. [35]

    A Universal Training Algorithm for Quantum Deep Learning

    G. Verdon, J. Pye, and M. Broughton, arXiv preprint arXiv:1806.09729 (2018)

  36. [36]

    J. R. Mcclean, S. Boixo, V. N. Smelyanskiy, R. Bab- bush, and H. Neven, Nature Communications 9 (2018), 10.1038/s41467-018-07090-4

  37. [37]

    Evaluating analytic gradients on quantum hardware

    M. Schuld, V. Bergholm, C. Gogolin, J. Izaac, and N. Killoran, arXiv preprint arXiv:1811.11184 (2018)

  38. [38]

    Harrow and J

    A. Harrow and J. Napp, arXiv preprint arXiv:1901.05374 (2019)

  39. [39]

    Z.-C. Yang, A. Rahmani, A. Shabani, H. Neven, and C. Chamon, Physical Review X 7, 021027 (2017). 12

  40. [40]

    Grant, L

    E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, arXiv preprint arXiv:1903.05076 (2019)

  41. [41]

    Y. Chen, M. W. Hoffman, S. G. Colmenarejo, M. Denil, T. P. Lillicrap, M. Botvinick, and N. de Freitas, arXiv preprint arXiv:1611.03824 (2016)

  42. [42]

    Andrychowicz, M

    M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, B. Shillingford, and N. De Freitas, in Advances in Neural Information Processing Systems (2016) pp. 3981–3989

  43. [43]

    Neural Architecture Search with Reinforcement Learning

    B. Zoph and Q. V. Le, arXiv preprint arXiv:1611.01578 (2016)

  44. [44]

    C. Finn, P. Abbeel, and S. Levine, arXiv preprint arXiv:1703.03400 (2017)

  45. [45]

    M. Long, Y. Cao, J. Wang, and M. I. Jordan, arXiv preprint arXiv:1502.02791 (2015)

  46. [46]

    I. D. Kivlichan, J. McClean, N. Wiebe, C. Gidney, A. Aspuru-Guzik, G. K.-L. Chan, and R. Babbush, Phys. Rev. Lett. 120, 110501 (2018)

  47. [47]

    Jiang, K

    Z. Jiang, K. J. Sung, K. Kechedzhi, V. N. Smelyanskiy, and S. Boixo, Physical Review Applied 9, 044036 (2018)

  48. [48]

    J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright, SIAM Journal on optimization 9, 112 (1998)

  49. [49]

    Nannicini, Physical Review E 99, 013304 (2019)

    G. Nannicini, Physical Review E 99, 013304 (2019)

  50. [50]

    G. G. Guerreschi and M. Smelyanskiy, arXiv preprint arXiv:1701.01450 (2017)

  51. [51]

    J. C. Spall, IEEE Transactions on aerospace and elec- tronic systems 34, 817 (1998)

  52. [52]

    M. J. Powell, Cambridge NA Report NA2009/06, Uni- versity of Cambridge, Cambridge , 26 (2009)

  53. [53]
  54. [54]

    McClean, J

    J. McClean, J. Romero, R. Babbush, and A. Aspuru- Guzik, New Journal of Physics 18, 023023 (2016)

  55. [55]

    For the purposes of this paper, we assumed the classical optimizer only has access to (noisy) estimates of the expectation value of the cost Hamiltonian

    In general, one may relay the raw measurement results to the classical processing unit, which can then compute the expectation value of the cost function. For the purposes of this paper, we assumed the classical optimizer only has access to (noisy) estimates of the expectation value of the cost Hamiltonian

  56. [56]

    A. Y. Kitaev, A. Shen, M. N. Vyalyi, and M. N. Vya- lyi, Classical and quantum computation , 47 (American Mathematical Soc., 2002)

  57. [57]

    N. C. Rubin, R. Babbush, and J. McClean, New Journal of Physics 20, 053020 (2018)

  58. [58]

    LeCun, D

    Y. LeCun, D. Touresky, G. Hinton, and T. Sejnowski, in Proceedings of the 1988 connectionist models summer school, Vol. 1 (CMU, Pittsburgh, Pa: Morgan Kaufmann,

  59. [59]

    D. E. Rumelhart, G. E. Hinton, and R. J. Williams, nature 323, 533 (1986)

  60. [60]

    Variational quantum generators: Generative adversarial quantum machine learning for continuous distributions

    J. Romero and A. Aspuru-Guzik, arXiv preprint arXiv:1901.00848 (2019)

  61. [61]

    On First-Order Meta-Learning Algorithms

    A. Nichol and J. Schulman, arXiv preprint arXiv:1803.02999 (2018)

  62. [62]

    A. A. Rusu, D. Rao, J. Sygnowski, O. Vinyals, R. Pas- canu, S. Osindero, and R. Hadsell, arXiv preprint arXiv:1807.05960 (2018)

  63. [63]

    Blackbox and derivative- free optimization: theory, algorithms and applications,

    C. Audet and M. Kokkolaras, “Blackbox and derivative- free optimization: theory, algorithms and applications,” (2016)

  64. [64]

    Z. C. Lipton, J. Berkowitz, and C. Elkan, arXiv preprint arXiv:1506.00019 (2015)

  65. [65]

    Pascanu, T

    R. Pascanu, T. Mikolov, and Y. Bengio, in International Conference on Machine Learning (2013) pp. 1310–1318

  66. [66]

    Hochreiter and J

    S. Hochreiter and J. Schmidhuber, Neural computation 9, 1735 (1997)

  67. [67]

    Bapat and S

    A. Bapat and S. Jordan, arXiv preprint arXiv:1812.02746 (2018)

  68. [68]

    A Quantum Approximate Optimization Algorithm for continuous problems

    G. Verdon, J. M. Arrazola, K. Br´ adler, and N. Killoran, arXiv preprint arXiv:1902.00409 (2019)

  69. [69]

    C. E. Rasmussen, in Advanced lectures on machine learn- ing (Springer, 2004) pp. 63–71

  70. [70]

    F. G. Brandao, M. Broughton, E. Farhi, S. Gutmann, and H. Neven, arXiv preprint arXiv:1812.04170 (2018)

  71. [71]

    Suzuki, Physics Letters A 146, 319 (1990)

    M. Suzuki, Physics Letters A 146, 319 (1990)

  72. [72]

    Kadowaki and H

    T. Kadowaki and H. Nishimori, Physical Review E 58, 5355 (1998)

  73. [73]

    M. H. Amin, E. Andriyash, J. Rolfe, B. Kulchytskyy, and R. Melko, Physical Review X 8, 021050 (2018)

  74. [74]

    J. R. McClean, I. D. Kivlichan, D. S. Steiger, Y. Cao, E. S. Fried, C. Gidney, T. H¨ aner, V. Havl´ ıˇ cek, Z. Jiang, M. Neeley, et al. , arXiv preprint arXiv:1710.07629 (2017)

  75. [75]

    Prechelt, in Neural Networks: Tricks of the trade (Springer, 1998) pp

    L. Prechelt, in Neural Networks: Tricks of the trade (Springer, 1998) pp. 55–69

  76. [76]

    Cirq: A python framework for creating, edit- ing, and invoking noisy intermediate scale quantum cir- cuits,

    M. LLC, “Cirq: A python framework for creating, edit- ing, and invoking noisy intermediate scale quantum cir- cuits,” (2018)

  77. [77]

    Abadi, P

    M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, M. Isard, et al

  78. [78]

    Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

    S. Ioffe and C. Szegedy, arXiv preprint arXiv:1502.03167 (2015)

  79. [79]

    D. J. Wales and J. P. Doye, The Journal of Physical Chemistry A 101, 5111 (1997)

  80. [80]

    Omalley, R

    P. Omalley, R. Babbush, I. D. Kivlichan, J. Romero, J. R. McClean, R. Barends, J. Kelly, P. Roushan, A. Tranter, N. Ding, et al. , Physical Review X 6, 031007 (2016)

Showing first 80 references.