Introduction to the artificial neural network-based variational Monte Carlo method

William Freitas

arxiv: 2603.15460 · v2 · pith:RS27PI6Qnew · submitted 2026-03-16 · ⚛️ physics.comp-ph

Pith reviewed 2026-05-21 11:07 UTC · model grok-4.3

classification ⚛️ physics.comp-ph

keywords: variational Monte Carlo · neural networks · trial wave functions · unsupervised learning · quantum ground states · Yukawa potential · hydrogen molecule · variational optimization

The pith

Neural networks can represent quantum wave functions so that variational Monte Carlo optimization behaves as a stable unsupervised learning process.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to construct trial wave functions for quantum systems by using artificial neural networks inside the variational Monte Carlo framework. It lays out the mathematical mapping from network parameters to quantum states and explains why this representation brings advantages for optimization and insight. A central point is that the variational energy minimization can be understood as unsupervised learning, where the existence of many local minima helps rather than hinders the search for good approximations. Feature extraction from the trained network then supplies physical understanding of the system. The ideas are demonstrated on the Yukawa potential and the hydrogen molecule.

Core claim

The variational method with neural-network trial states functions as an unsupervised learning algorithm. The landscape of multiple minima in the variational energy is treated as an asset that produces stable optimization rather than trapping the procedure. The network parameters encode the quantum state, and the learned features inside the network allow extraction of physical information from the optimized trial function.

What carries the argument

Representation of a quantum many-body wave function as the output of an artificial neural network whose parameters are variationally optimized to minimize the energy expectation value.
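
As a minimal sketch of this representation (not the paper's implementation), the Python below writes log ψ as a one-hidden-layer tanh network plus a Gaussian envelope, samples |ψ|² with Metropolis moves, and averages the local energy. The 1D harmonic oscillator, the envelope term, the finite-difference derivatives, and every function name are assumptions chosen for illustration; the paper itself treats the Yukawa potential and the hydrogen molecule.

```python
import math
import random


def log_psi(x, params):
    """log psi(x) = -a*x^2 + sum_j c_j*tanh(w_j*x + b_j) (hypothetical ansatz)."""
    a, hidden = params
    net = sum(c * math.tanh(w * x + b) for (w, b, c) in hidden)
    return -a * x * x + net


def local_energy(x, params, potential, h=1e-3):
    """E_L = -(1/2) psi''/psi + V, using psi''/psi = f'' + (f')^2 with f = log psi.

    Derivatives are taken by central finite differences (units hbar = m = 1).
    """
    f0 = log_psi(x, params)
    fp = log_psi(x + h, params)
    fm = log_psi(x - h, params)
    d1 = (fp - fm) / (2.0 * h)
    d2 = (fp - 2.0 * f0 + fm) / (h * h)
    return -0.5 * (d2 + d1 * d1) + potential(x)


def metropolis(params, n_samples, step=1.0, burn_in=500, seed=0):
    """Draw correlated samples from |psi|^2 with symmetric uniform proposals."""
    rng = random.Random(seed)
    x = 0.0
    lp = log_psi(x, params)
    samples = []
    for i in range(burn_in + n_samples):
        x_new = x + rng.uniform(-step, step)
        lp_new = log_psi(x_new, params)
        # Accept with probability min(1, |psi(x_new)|^2 / |psi(x)|^2).
        if rng.random() < math.exp(min(0.0, 2.0 * (lp_new - lp))):
            x, lp = x_new, lp_new
        if i >= burn_in:
            samples.append(x)
    return samples


def vmc_energy(params, potential, n_samples=5000, seed=0):
    """Monte Carlo estimate of the variational energy: mean of the local energy."""
    xs = metropolis(params, n_samples, seed=seed)
    return sum(local_energy(x, params, potential) for x in xs) / len(xs)


def harmonic(x):
    return 0.5 * x * x


if __name__ == "__main__":
    # Envelope coefficient plus two hidden units (w, b, c); values are arbitrary.
    params = (0.5, [(1.0, 0.0, 0.1), (0.5, 0.3, -0.05)])
    print(vmc_energy(params, harmonic))
```

With the network switched off (an empty hidden layer) and a = 0.5, the trial state is the exact harmonic-oscillator ground state, so the local energy equals 0.5 at every point. That makes a convenient sanity check before any optimization is attempted.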

If this is right

Optimization remains stable even when the energy surface contains many local minima.
Internal network features can be examined to extract physical properties of the modeled system.
The same construction applies to model problems such as the Yukawa potential and to simple molecules such as H2.
Standard machine-learning training procedures integrate directly with variational quantum calculations.
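
The last point can be illustrated with a toy training loop. This is a sketch under strong simplifying assumptions, not the paper's setup: a one-parameter Gaussian trial state ψ_α(x) = exp(-αx²) for the 1D harmonic oscillator, exact Gaussian sampling in place of Metropolis, and plain gradient descent in place of an optimizer such as Adam. The gradient uses the standard VMC estimator for real trial states, ∂E/∂α = 2(⟨E_L O⟩ − ⟨E_L⟩⟨O⟩) with O = ∂ log ψ/∂α. All names are hypothetical.

```python
import math
import random


def energy_and_gradient(alpha, n_samples, rng):
    """Estimate E(alpha) and dE/dalpha from samples of |psi_alpha|^2."""
    # |psi|^2 is proportional to exp(-2*alpha*x^2): a Gaussian with variance 1/(4*alpha).
    sigma = math.sqrt(1.0 / (4.0 * alpha))
    e_sum = o_sum = eo_sum = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, sigma)
        e_loc = alpha + x * x * (0.5 - 2.0 * alpha * alpha)  # local energy
        o = -x * x                                           # d log(psi) / d alpha
        e_sum += e_loc
        o_sum += o
        eo_sum += e_loc * o
    e_mean = e_sum / n_samples
    grad = 2.0 * (eo_sum / n_samples - e_mean * o_sum / n_samples)
    return e_mean, grad


def train(alpha=1.0, lr=0.2, n_steps=60, n_samples=1000, seed=0):
    """Plain gradient descent on the sampled variational energy."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        _, grad = energy_and_gradient(alpha, n_samples, rng)
        # Keep alpha positive so the trial state stays normalizable.
        alpha = max(alpha - lr * grad, 1e-3)
    energy, _ = energy_and_gradient(alpha, n_samples, rng)
    return alpha, energy


if __name__ == "__main__":
    print(train())
```

No labeled data enters the loop: the only training signal is the energy evaluated on samples drawn from the current state, which is the sense in which the procedure resembles unsupervised learning. For this toy the analytic energy is E(α) = α/2 + 1/(8α), minimized at α = 1/2.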

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method may extend to larger systems where conventional trial functions become impractical.
Similar network representations could be tested on time-dependent or excited-state problems.
Connections to other neural ansatze in quantum simulation might allow hybrid constructions.

Load-bearing premise

A neural network can capture the essential features of the true ground-state wave function without introducing uncontrolled bias into the computed variational energy.

What would settle it

For the hydrogen molecule, compare the neural-network variational energy against the known exact ground-state energy; a statistically significant deviation beyond Monte Carlo sampling error would falsify faithful representation.
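
A check of this kind reduces to expressing the gap between the variational energy and the reference in units of the Monte Carlo standard error. The helper below is a hypothetical sketch; the function names and the 3σ threshold are assumptions, and the numbers in the usage lines are placeholders rather than results from the paper.

```python
def deviation_in_sigma(e_vmc, stderr, e_ref):
    """Signed gap between a VMC energy and a reference, in standard errors."""
    if stderr <= 0.0:
        raise ValueError("stderr must be positive")
    return (e_vmc - e_ref) / stderr


def is_consistent(e_vmc, stderr, e_ref, n_sigma=3.0):
    """True when the gap is within n_sigma standard errors of the reference."""
    return abs(deviation_in_sigma(e_vmc, stderr, e_ref)) <= n_sigma


if __name__ == "__main__":
    # Placeholder values in arbitrary energy units, for illustration only.
    z = deviation_in_sigma(e_vmc=-0.998, stderr=0.001, e_ref=-1.000)
    print(z, is_consistent(-0.998, 0.001, -1.000))
```

The sign carries information. Because the variational principle bounds the exact energy from above, a significantly negative deviation against an exact reference points to a bug or an underestimated error bar, whereas a significantly positive one measures how much of the ground state the ansatz has failed to capture.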

Figures

Figures reproduced from arXiv: 2603.15460 by William Freitas.

**Figure 2.** Analogies between the variational method and ma…

**Figure 4.** Minimization process for the Morse oscillator. The…

**Figure 6.** ANN-based trial state description of the Yukawa po…

**Figure 8.** The minimization process for the H…

Original abstract

The construction of trial wave functions based on neural networks combined with the variational Monte Carlo method is discussed. The mathematical formulation for representing quantum states as artificial neural networks is introduced. The advantages of employing such trial states and how machine learning works are discussed. It is shown that the variational method is a kind of unsupervised learning algorithm, where the multiple minima landscape is used as an asset that leads to a stable optimization procedure. The feature representation plays an important role in interpretability and in extracting physical insights from nontrivial trial wave functions. The algorithm is illustrated for the Yukawa potential and the hydrogen molecule.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This is a basic tutorial on neural-network VMC that states the multiple-minima landscape aids stable optimization but supplies no evidence or analysis for it.

The paper introduces the use of artificial neural networks as trial wave functions inside variational Monte Carlo. It sets out the mathematical representation of quantum states via networks, notes the advantages over hand-crafted forms, and frames the variational procedure as a kind of unsupervised learning that treats the multiple-minima energy landscape as helpful for stable optimization. The method is shown on the Yukawa potential and the hydrogen molecule, with some discussion of how feature choices affect interpretability and physical insight. That framing is the main thing a reader will take away.

What the paper does reasonably is present the standard formulation in accessible steps and point out the unsupervised-learning angle without overclaiming new theorems. The illustrations are on familiar systems, which keeps the focus on the method rather than on exotic physics.

The soft spot is the central assertion about the multiple-minima landscape. The text states that multiplicity is an asset for stability, yet the examples contain no landscape plots, no ensemble of runs from varied starting points, and no comparison against regularized or single-minimum baselines. Without that, the claim stays at the level of an observation rather than a demonstrated result. Numerical details on network depth, activation choices, and sampling parameters are also thin, which limits how far a reader can check the reported behavior.

This piece is mainly for graduate students or researchers who want a first exposure to neural-network VMC before diving into the primary literature. It does not contain new results or scaling tests, so it does not need a full referee process in a research journal. It could stay on arXiv as a short methods note or be expanded for a pedagogical venue if the authors add the missing checks on the optimization landscape.

Referee Report

2 major / 2 minor

Summary. The paper introduces the use of artificial neural networks to represent trial wave functions in the variational Monte Carlo (VMC) method for quantum systems. It presents the mathematical formulation for ANN-based quantum states, frames the variational optimization as an unsupervised learning procedure that treats the multiple-minima energy landscape as an asset for stable convergence, discusses advantages and the role of feature representation for interpretability and physical insight, and illustrates the approach on the Yukawa potential and the hydrogen molecule.

Significance. If the central claims are substantiated, the manuscript provides a pedagogical entry point to neural-network VMC that could help researchers in computational physics integrate machine-learning representations into variational calculations. The emphasis on unsupervised learning and interpretability through features offers a distinct perspective that may facilitate extraction of physical insights from nontrivial trial states, though the overall impact depends on whether the stability argument is demonstrated beyond the standard VMC setup.

major comments (2)

[illustrations for the Yukawa potential and hydrogen molecule] The central claim that the variational method functions as an unsupervised learning algorithm where the multiple-minima landscape serves as an asset for stable optimization is not supported by the illustrations. The examples for the Yukawa potential and hydrogen molecule supply no energy-landscape visualizations, no ensemble statistics from varied initial conditions, and no comparisons to single-minimum or regularized baselines that would show multiplicity conferring stability rather than variability or trapping.
[mathematical formulation and advantages of ANN trial states] The assumption that the chosen neural-network architectures faithfully capture essential features of the ground-state wave function without introducing uncontrolled variational bias is stated but not quantified. No systematic checks (e.g., comparison of variational energies against known exact or high-accuracy benchmarks, or variation of network depth/width) are reported to bound the representation error for the two example systems.

minor comments (2)

[mathematical formulation] Notation for the neural-network parameters and the Monte Carlo sampling procedure should be introduced with explicit definitions and consistent symbols across the text and any equations.
[illustrations] The manuscript would benefit from a short table summarizing the network architectures, hyperparameters, and obtained variational energies for the two illustrated systems to improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We value the assessment that the manuscript offers a pedagogical entry point to neural-network VMC and the emphasis on unsupervised learning and interpretability. We address the two major comments below, indicating the revisions we will make.

Referee: [illustrations for the Yukawa potential and hydrogen molecule] The central claim that the variational method functions as an unsupervised learning algorithm where the multiple-minima landscape serves as an asset for stable optimization is not supported by the illustrations. The examples for the Yukawa potential and hydrogen molecule supply no energy-landscape visualizations, no ensemble statistics from varied initial conditions, and no comparisons to single-minimum or regularized baselines that would show multiplicity conferring stability rather than variability or trapping.

Authors: We agree that the current illustrations do not contain energy-landscape visualizations, ensemble statistics over varied initial conditions, or explicit comparisons against single-minimum or regularized baselines. The manuscript presents the multiple-minima landscape as an asset for stable convergence as a conceptual framing of the variational principle, drawing on the general structure of the energy functional rather than on new empirical demonstrations within the two examples. The Yukawa and hydrogen-molecule calculations are intended to illustrate the implementation and basic convergence behavior. We will revise the relevant sections to make this distinction explicit, add a concise discussion of why multiple random starts are commonly employed in VMC to reduce the risk of local-minimum trapping, and, if space allows, include a brief qualitative remark on the optimization trajectories observed in the reported runs. revision: partial
Referee: [mathematical formulation and advantages of ANN trial states] The assumption that the chosen neural-network architectures faithfully capture essential features of the ground-state wave function without introducing uncontrolled variational bias is stated but not quantified. No systematic checks (e.g., comparison of variational energies against known exact or high-accuracy benchmarks, or variation of network depth/width) are reported to bound the representation error for the two example systems.

Authors: We acknowledge that the manuscript does not provide systematic comparisons of the obtained variational energies to exact or high-accuracy reference values, nor does it explore variations in network depth or width to quantify representation error. Because the work is framed as an introduction, the emphasis lies on the mathematical formulation and the conceptual advantages of ANN trial states rather than on exhaustive benchmarking. For the hydrogen molecule the exact ground-state energy is known, and for the Yukawa potential accurate numerical references exist. We will revise the manuscript to report the variational energies obtained for both systems alongside the corresponding literature or exact values, and we will add a short paragraph discussing the architectural choices and the expected magnitude of the representation bias for these simple cases. revision: yes
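
The ensemble check discussed in the first exchange can be sketched as follows. Everything here is hypothetical: the double-well function stands in for a variational energy landscape with two minima and is not derived from the paper, and an evenly spaced grid of starting points is used so the run is reproducible (random starts would serve the same purpose).

```python
import math


def toy_energy(theta):
    """Hypothetical one-parameter landscape with two minima of unequal depth."""
    return (theta * theta - 1.0) ** 2 + 0.3 * theta


def toy_gradient(theta):
    return 4.0 * theta * (theta * theta - 1.0) + 0.3


def descend(theta, lr=0.05, n_steps=300):
    """Plain gradient descent from a given starting point."""
    for _ in range(n_steps):
        theta -= lr * toy_gradient(theta)
    return theta


def ensemble(n_starts=20, lo=-2.0, hi=2.0):
    """Run the optimizer from a grid of starts and summarize the final energies."""
    starts = [lo + (hi - lo) * i / (n_starts - 1) for i in range(n_starts)]
    finals = [toy_energy(descend(t)) for t in starts]
    mean = sum(finals) / len(finals)
    spread = math.sqrt(sum((e - mean) ** 2 for e in finals) / (len(finals) - 1))
    return finals, mean, spread


if __name__ == "__main__":
    finals, mean, spread = ensemble()
    print(sorted(set(round(e, 6) for e in finals)), mean, spread)
```

On this toy landscape the runs split between the two basins, so the spread of final energies is far larger than any optimizer noise. Reporting that spread for the real calculations is the kind of evidence that would show whether multiple minima stabilize the search or merely trap it.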

Circularity Check

0 steps flagged

No circularity: variational principle invoked externally; reframing is interpretive

The paper introduces ANN representations for trial wave functions in VMC and illustrates them on the Yukawa potential and H2. The central statement that the variational method acts as unsupervised learning with multiple minima as an asset is presented as a perspective on the standard variational principle from quantum mechanics, which is cited as an independent external benchmark rather than derived from the paper's own fits or definitions. No equations reduce to self-definition, no fitted parameters are relabeled as predictions, and no load-bearing uniqueness theorem or ansatz is imported via self-citation. The examples function as demonstrations without claiming to derive the optimization stability from the multiplicity by construction. The derivation chain therefore remains self-contained against external quantum-mechanical foundations.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper relies on standard quantum mechanics and machine-learning concepts without introducing new fitted parameters, axioms beyond established variational principles, or invented entities.

axioms (2)

standard math: The variational principle provides an upper bound to the ground-state energy. Invoked as the foundation for optimizing trial wave functions.
domain assumption: Neural networks are universal function approximators capable of representing quantum states. Central premise for using ANNs as trial wave functions.
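
For reference, the first axiom and its Monte Carlo form can be written out as below. The notation is an assumption introduced here, not taken from the paper: θ denotes the network parameters, R the particle coordinates, E_0 the exact ground-state energy, and R_i one of M configurations sampled from |ψ_θ|².

```latex
E[\psi_\theta]
  = \frac{\langle \psi_\theta | \hat{H} | \psi_\theta \rangle}
         {\langle \psi_\theta | \psi_\theta \rangle}
  \;\ge\; E_0 ,
\qquad
E_L(R) = \frac{\hat{H}\,\psi_\theta(R)}{\psi_\theta(R)} ,
\qquad
E[\psi_\theta]
  = \int \frac{|\psi_\theta(R)|^2}{\int |\psi_\theta(R')|^2 \, dR'} \, E_L(R) \, dR
  \;\approx\; \frac{1}{M} \sum_{i=1}^{M} E_L(R_i) .
```

The inequality holds for any normalizable trial state, which is why lowering the sampled energy can serve as a training objective without reference data.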

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

It is shown that the variational method is a kind of unsupervised learning algorithm, where the multiple minima landscape is used as an asset that leads to a stable optimization procedure.
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

The ANN based trial states... are quite limited. Due to the simplicity and lack of physical insights of the Ansatz...

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

62 extracted references · 62 canonical work pages · 3 internal anchors

[1] Introduction to the artificial neural network-based variational Monte Carlo method, arXiv (2026). Extracted text: "and particle physics [2] up to many-body quantum systems [3], quantum chemistry [4, 5], statistical mechanics [6], and materials [7]. The absorption of artificial neural networks (ANN) and other artificial intelligence (AI) tools in the physics research is natural in some respects. The reason is that those tools are built with the purpose of recogni..."
[2] M. Ntampaka, H. Trac, D. J. Sutherland, N. Battaglia, B. Póczos, and J. Schneider, The Astrophysical Journal 803, 50 (2015)
[3] D. Guest, K. Cranmer, and D. Whiteson, Annual Review of Nuclear and Particle Science 68, 161 (2018)
[4] D.-L. Deng, X. Li, and S. Das Sarma, Phys. Rev. X 7, 021021 (2017)
[5] D. Pfau, J. S. Spencer, A. G. D. G. Matthews, and W. M. C. Foulkes, Phys. Rev. Res. 2, 033429 (2020)
[6] J. S. Spencer, D. Pfau, A. Botev, and W. M. C. Foulkes, “Better, faster fermionic neural networks,” (2020), arXiv:2011.07125 [physics.comp-ph]
[7] D. Wu, L. Wang, and P. Zhang, Phys. Rev. Lett. 122, 080602 (2019)
[8] T. Wen, L. Zhang, H. Wang, W. E, and D. J. Srolovitz, Materials Futures 1, 022601 (2022)
[9] J. Bardeen, L. N. Cooper, and J. R. Schrieffer, Phys. Rev. 108, 1175 (1957)
[10] P. Anderson, Materials Research Bulletin 8, 153 (1973)
[11] R. B. Laughlin, Phys. Rev. Lett. 50, 1395 (1983)
[12] G. Cybenko, Mathematics of Control, Signals and Systems 2, 303 (1989)
[13] A. Nagy and V. Savona, Physical Review Letters 122, 250501 (2019)
[14] C. Roth, A. Szabó, and A. H. MacDonald, Physical Review B 108, 054410 (2023)
[15] L. Yang, Z. Leng, G. Yu, A. Patel, W.-J. Hu, and H. Pu, Physical Review Research 2, 012039 (2020)
[16] Y. Qian, W. Fu, W. Ren, and J. Chen, The Journal of Chemical Physics 157, 164104 (2022)
[17] D. Pfau, S. Axelrod, H. Sutterud, I. von Glehn, and J. S. Spencer, Science 385, eadn0137 (2024)
[18] G. Pescia, J. Nys, J. Kim, A. Lovato, and G. Carleo, Phys. Rev. B 110, 035108 (2024)
[19] G. Cassella, H. Sutterud, S. Azadi, N. Drummond, D. Pfau, J. S. Spencer, and W. Foulkes, Phys. Rev. Lett. 130, 036401 (2023)
[20] W. T. Lou, H. Sutterud, G. Cassella, W. Foulkes, J. Knolle, D. Pfau, and J. S. Spencer, Phys. Rev. X 14, 021030 (2024)
[21] W. Freitas and S. A. Vitiello, Quantum 7, 1209 (2023)
[22] W. Freitas, B. Abreu, and S. A. Vitiello, Journal of Low Temperature Physics (2024), 10.1007/s10909-024-03061-w
[23] G. Pescia, J. Han, A. Lovato, J. Lu, and G. Carleo, Phys. Rev. Research 4, 023138 (2022)
[24] H. H. Goldstine, The computer from Pascal to von Neumann (Princeton University Press, 1993)
[25] A. A. Lovelace, Taylor’s Scientific Memoirs 3, 666 (1842)
[26] G. Welchman, The Hut Six Story: Breaking the Enigma Codes (McGraw-Hill, 1982)
[27] B. Randell, in A History of Computing in the Twentieth Century, edited by N. Metropolis, J. Howlett, and G.-C. Rota (Academic Press, San Diego, 1980) pp. 47–92
[28] H. H. Goldstine and A. Goldstine, in The Origins of Digital Computers: Selected Papers (Springer, 1946) pp. 359–373
[29] J. Von Neumann, IEEE Annals of the History of Computing 15, 27 (1993)
[30] A. M. Turing, Mind LIX, 433 (1950), https://academic.oup.com/mind/article-pdf/LIX/236/433/30123314/lix-236-433.pdf
[31] S. Bubeck, C. Coester, R. Eldan, T. Gowers, Y. T. Lee, A. Lupsasca, M. Sawhney, R. Scherrer, M. Sellke, B. K. Spears, D. Unutmaz, K. Weil, S. Yin, and N. Zhivotovskiy, “Early science acceleration experiments with gpt-5,” (2025), arXiv:2511.16072 [cs.CL]
[32] DeepSeek-AI, “Deepseek-v3 technical report,” (2025), arXiv:2412.19437 [cs.CL]
[33] A. Uchendu, Z. Ma, T. Le, R. Zhang, and D. Lee, CoRR abs/2109.13296 (2021)
[34] S. Dargan, S. Bansal, M. Kumar, A. Mittal, and K. Kumar, Archives of Computational Methods in Engineering 30, 1057 (2023)
[35] Z. Wang, J. Zhan, C. Duan, X. Guan, P. Lu, and K. Yang, IEEE Transactions on Neural Networks and Learning Systems 34, 3811 (2023)
[36] U. Raucci, A. Valentini, E. Pieri, H. Weir, S. Seritan, and T. J. Martinez, Nature Computational Science 1, 42 (2021)
[37] L. Li, B. Lei, and C. Mao, Journal of Industrial Information Integration 26 (2022)
[38] G. Moore, Proceedings of the IEEE 86, 82 (1998)
[39] M. Roser, H. Ritchie, and E. Mathieu, Our World in Data (2023), https://ourworldindata.org/moores-law
[40] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning (MIT press, 2016)
[41] W. S. McCulloch and W. Pitts, The bulletin of mathematical biophysics 5, 115 (1943)
[42] F. Rosenblatt, Psychological review 65, 386 (1958)
[43] K. Fukushima, Biological Cybernetics 20, 121 (1975)
[44] D. E. Rumelhart, J. L. McClelland, and P. R. Group, Parallel Distributed Processing, Volume 1: Explorations in the Microstructure of Cognition: Foundations (The MIT Press, 1986)
[45] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, Nature 323, 533 (1986)
[46] G. E. Hinton, Progress in brain research 165, 535 (2007)
[47] Y. LeCun, Y. Bengio, et al., Large-scale kernel machines 5, 127 (2007)
[48] J. Carrasquilla and R. G. Melko, Nature Physics 13, 431 (2017)
[49] G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, and L. Zdeborová, Rev. Mod. Phys. 91, 045002 (2019)
[50] G. Györgyi, Phys. Rev. A 41, 7097 (1990)
[51] J. Albert and R. H. Swendsen, Physics Procedia 57, 99 (2014), proceedings of the 27th Workshop on Computer Simulation Studies in Condensed Matter Physics (CSP2014)
[52] C. Cohen-Tannoudji, B. Diu, F. Laloe, and B. Dui, Quantum Mechanics (2 vol. set) (Wiley-Interscience, 2006)
[53] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, The Journal of Chemical Physics 21, 1087 (1953)
[54] M. Mascagni and A. Srinivasan, ACM Trans. Math. Softw. 26, 436–461 (2000)
[55] J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang, “JAX: composable transformations of Python+NumPy programs,” (2018)
[56] T. Mitchell, Publisher: McGraw Hill (1997)
[57] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” (2017), arXiv:1412.6980 [cs.LG]
[58] W. Freitas, “Artificial Neural Network-based Variational Monte Carlo Introduction,” (2025)
[59] M. Napsuciale and S. Rodríguez, Physics Letters B 816, 136218 (2021)
[60] I. Kosztin, B. Faber, and K. Schulten, American Journal of Physics 64, 633 (1996)
[61] M. Ruggeri, S. Moroni, and M. Holzmann, Phys. Rev. Lett. 120, 205302 (2018)
[62] W. Freitas, B. Abreu, and S. A. Vitiello, Phys. Rev. B 112, 165109 (2025)

Appendix A: Blocking method and standard error estimation

Since the Monte Carlo integration is based on a Markov chain, successive samples are correlated. Therefore, the effective number of samples is smaller than the actual number of samples M. Consequently, the usual standard erro...
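
The Appendix A excerpt above breaks off before describing the procedure, so the following is only a generic sketch of the blocking idea it names, not the paper's own formulation. The samples are grouped into contiguous blocks, and the standard error is computed from the block means, which are far less correlated than the raw samples once the blocks are longer than the correlation time. The function names, the block size, and the AR(1) test series are assumptions made for illustration.

```python
import math
import random


def blocking_stderr(samples, block_size):
    """Standard error of the mean estimated from contiguous block averages.

    With block_size = 1 this reduces to the usual uncorrelated estimate.
    Trailing samples that do not fill a complete block are discarded.
    """
    n_blocks = len(samples) // block_size
    if n_blocks < 2:
        raise ValueError("need at least two complete blocks")
    means = [
        sum(samples[i * block_size:(i + 1) * block_size]) / block_size
        for i in range(n_blocks)
    ]
    grand = sum(means) / n_blocks
    var = sum((m - grand) ** 2 for m in means) / (n_blocks - 1)
    return math.sqrt(var / n_blocks)


def ar1_series(n, rho, seed=0):
    """Correlated toy data: x_t = rho * x_{t-1} + Gaussian noise."""
    rng = random.Random(seed)
    x = 0.0
    out = []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    series = ar1_series(20000, rho=0.9)
    print("naive:  ", blocking_stderr(series, 1))
    print("blocked:", blocking_stderr(series, 200))
```

In practice the block size is increased until the estimate stops growing; the plateau value is then taken as the error bar. On strongly correlated data the naive estimate is noticeably too small, which is the effect the appendix warns about.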
