Basis-free neural-network geminal and Jastrow factors for variational Monte Carlo
Pith reviewed 2026-05-20 15:06 UTC · model grok-4.3
The pith
Neural-network replacements for basis sets in geminal and Jastrow factors separate nodal definition from dynamical correlation in variational Monte Carlo.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors show that replacing conventional basis-set expansions with feed-forward neural networks in both the geminal and Jastrow constructions yields a compact wave function that reaches sub-millihartree accuracy for the hydrogen molecule and the rectangular hydrogen tetramer whenever the AGP nodes are adequate. Near the large-radius square geometry of the tetramer, the same ansatz exposes a residual nodal limitation.
What carries the argument
The basis-free Jastrow-AGP ansatz in which an antisymmetrized geminal power determinant defines the nodal surface and a neural-network Jastrow factor recovers dynamical correlation at fixed nodes.
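For orientation, a generic singlet Jastrow-AGP wave function from the geminal literature can be written as below. This is a sketch under stated assumptions, not the authors' equation: the symbols $J_\theta$ (Jastrow network), $f_\omega$ (geminal network), and the symmetrization by a simple sum are hypothetical choices, since this review does not reproduce the paper's exact network-to-geminal mapping.

```latex
\Psi(\mathbf{R})
  = \exp\!\big[J_\theta(\mathbf{R})\big]\,
    \det\!\Big[\phi_\omega\big(\mathbf{r}_i^{\uparrow},\mathbf{r}_j^{\downarrow}\big)\Big]_{i,j=1}^{N/2},
\qquad
\phi_\omega(\mathbf{r},\mathbf{r}')
  = f_\omega(\mathbf{r},\mathbf{r}') + f_\omega(\mathbf{r}',\mathbf{r}).
```

Here $\mathbf{R}$ collects all $N$ electron coordinates, with $N/2$ spin-up and $N/2$ spin-down electrons. The determinant changes sign when two same-spin electrons are exchanged, while the exponential prefactor is strictly positive.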
Where Pith is reading between the lines
- Extending this separation to larger systems could help identify when additional nodal optimization is needed beyond neural Jastrow factors.
- Similar neural replacements might apply to other antisymmetric wave function forms to reduce basis dependence.
- Testing on systems with known exact nodes would confirm the isolation of dynamical correlation improvements.
Load-bearing premise
The AGP determinant supplies an adequate nodal surface so the neural-network Jastrow factor can recover dynamical correlation at fixed nodes without adjusting the nodes.
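The premise rests on a simple property: a Jastrow factor of the form exp(J) is strictly positive, so it cannot move the zeros of the determinant it multiplies. The sketch below illustrates this for four electrons (two up, two down). Everything in it is a hypothetical stand-in: `geminal` is a toy symmetric pairing function rather than a neural network, and `jastrow` uses an arbitrary pair term rather than either of the paper's Jastrow constructions.

```python
import math

def geminal(r_up, r_dn):
    """Toy symmetric pairing function phi(r, r'); stands in for a neural network."""
    def orb(r, cx):
        # Exponential orbital centred at (cx, 0, 0); illustrative only.
        return math.exp(-math.dist(r, (cx, 0.0, 0.0)))
    return (orb(r_up, -1.0) * orb(r_dn, -1.0)
            + 0.5 * orb(r_up, 1.0) * orb(r_dn, 1.0))

def agp_det(ups, dns):
    """2x2 determinant of the geminal matrix G[i][j] = phi(up_i, dn_j)."""
    g = [[geminal(u, d) for d in dns] for u in ups]
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]

def jastrow(electrons):
    """Strictly positive factor exp(J) built from a toy pair term u(r) = r / (2(1 + r))."""
    j = 0.0
    for i in range(len(electrons)):
        for k in range(i + 1, len(electrons)):
            r = math.dist(electrons[i], electrons[k])
            j += r / (2.0 * (1.0 + r))
    return math.exp(j)

def psi(ups, dns):
    """Jastrow-AGP value: positive prefactor times the nodal determinant."""
    return jastrow(list(ups) + list(dns)) * agp_det(ups, dns)

# Arbitrary electron positions chosen for illustration.
ups = [(0.3, 0.1, 0.0), (-0.7, 0.4, 0.2)]
dns = [(1.1, -0.2, 0.5), (-0.4, 0.0, -0.3)]

same_sign = (psi(ups, dns) > 0) == (agp_det(ups, dns) > 0)
flips_on_exchange = math.isclose(psi(ups[::-1], dns), -psi(ups, dns), rel_tol=1e-12)
```

Because the sign of `psi` always matches the sign of `agp_det`, optimizing only the Jastrow parameters changes the amplitude of the wave function but leaves its nodal surface where the determinant put it.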
What would settle it
A variational Monte Carlo calculation on the square hydrogen tetramer at large separation that fails to reach sub-millihartree accuracy even after full optimization of the neural-network parameters would indicate that the nodal limitation cannot be overcome by the Jastrow factor alone.
Original abstract
Neural-network quantum states offer a flexible route to compact many-electron wave functions, but their practical accuracy depends strongly on how fermionic antisymmetry, electron correlation, and optimization noise are treated. Here we combine an antisymmetrized geminal power (AGP) determinant with feed-forward neural networks that replace conventional basis-set expansions in the geminal and in two Jastrow-factor constructions. The resulting basis-free Jastrow-AGP ansatz is optimized by variational Monte Carlo and is designed to separate two tasks: the AGP part defines the nodal surface, while the neural-network Jastrow factor recovers dynamical correlation at fixed nodes. This separation makes it possible to distinguish errors associated with dynamical correlation from those caused by static, multireference correlation. Applications to the hydrogen molecule and the rectangular hydrogen tetramer show sub-millihartree accuracy when the AGP nodes are adequate, and expose the residual nodal limitation near the large-radius square geometry of the hydrogen tetramer. These results clarify where neural-network building blocks can improve a compact geminal ansatz and where additional nodal flexibility is required.
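To make the variational Monte Carlo step concrete, here is a minimal sketch on a deliberately simple stand-in problem: a single hydrogen atom with the trial function psi = exp(-alpha * r), in atomic units. It is not the paper's ansatz, systems, or optimizer; the function names, step size, and sample counts are assumptions chosen for illustration. For this trial function the local energy is E_L = -alpha^2/2 + (alpha - 1)/r, and the exact variational energy is alpha^2/2 - alpha.

```python
import math
import random

def local_energy(pos, alpha):
    """Local energy (H psi)/psi for psi = exp(-alpha * r), hydrogen atom, atomic units."""
    r = math.sqrt(sum(x * x for x in pos))
    return -0.5 * alpha * alpha + (alpha - 1.0) / r

def vmc_energy(alpha, n_steps=20000, n_burn=2000, step=1.0, seed=0):
    """Metropolis estimate of the energy expectation value for a given alpha."""
    rng = random.Random(seed)
    pos = [1.0, 0.0, 0.0]
    r = 1.0
    total = 0.0
    for it in range(n_burn + n_steps):
        # Propose a uniform random displacement of the electron.
        trial = [x + rng.uniform(-step, step) for x in pos]
        r_trial = math.sqrt(sum(x * x for x in trial))
        # Acceptance ratio |psi(trial)|^2 / |psi(pos)|^2.
        if rng.random() < math.exp(-2.0 * alpha * (r_trial - r)):
            pos, r = trial, r_trial
        # Accumulate the local energy after the burn-in phase.
        if it >= n_burn:
            total += local_energy(pos, alpha)
    return total / n_steps

e_exact_form = vmc_energy(1.0)   # trial function equals the exact ground state
e_detuned = vmc_energy(0.8)      # analytic expectation: 0.8**2 / 2 - 0.8 = -0.48
```

At alpha = 1 the local energy is constant, so the estimate carries no statistical noise; away from the optimum the estimate fluctuates and lies above the exact energy on average. The same two signals, energy and its variance, are what an optimizer would drive down when the trial function is a neural network.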
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a basis-free variational Monte Carlo ansatz that replaces conventional basis expansions with feed-forward neural networks for both the antisymmetrized geminal power (AGP) geminal and two Jastrow-factor constructions. The central design choice fixes the nodal surface via the AGP determinant while using the neural-network Jastrow solely to recover dynamical correlation at those fixed nodes. Numerical results on H2 and the rectangular H4 geometry reach sub-millihartree accuracy relative to reference values when the AGP nodes are adequate; the same framework is used to identify a residual error at the large-radius square H4 geometry that is attributed to nodal limitations.
Significance. If the separation of nodal and dynamical-correlation errors can be rigorously validated, the work supplies a practical diagnostic for deciding when additional nodal flexibility (e.g., beyond a single AGP) is required in neural-network quantum states. The concrete accuracy numbers on two small systems and the explicit identification of a nodal bottleneck constitute a useful benchmark contribution for the VMC community.
major comments (1)
- [§4] §4 (H4 results) and the abstract: the claim that the residual error near the large-radius square geometry is caused by nodal limitations presupposes that the neural-network Jastrow has already saturated the dynamical correlation recoverable at the fixed AGP nodes. No scaling study with Jastrow network depth/width, no comparison against a converged conventional Jastrow at the same nodes, and no independent error decomposition are reported; without such evidence the attribution remains inconclusive.
minor comments (2)
- [§2] The precise functional form of the neural-network geminal (how the AGP coefficients are generated from the network output) should be written explicitly, preferably with an equation.
- [Figures] Figure captions for the H4 energy curves should state the reference method and basis used for the comparison values.
Simulated Author's Rebuttal
We thank the referee for the careful reading of our manuscript and for the constructive feedback. The major comment raises a valid point about the strength of evidence supporting our attribution of the residual error to nodal limitations. We address this concern directly below and outline revisions that will be made to the manuscript.
Point-by-point responses
- Referee: [§4] §4 (H4 results) and the abstract: the claim that the residual error near the large-radius square geometry is caused by nodal limitations presupposes that the neural-network Jastrow has already saturated the dynamical correlation recoverable at the fixed AGP nodes. No scaling study with Jastrow network depth/width, no comparison against a converged conventional Jastrow at the same nodes, and no independent error decomposition are reported; without such evidence the attribution remains inconclusive.
  Authors: We agree with the referee that a more rigorous demonstration that the neural-network Jastrow has saturated the dynamical correlation at the fixed AGP nodes would make the attribution of the residual error to nodal limitations more conclusive. The current results show that two distinct neural-network Jastrow constructions yield essentially the same residual error at the large-radius square H4 geometry (while both recover sub-millihartree accuracy at the rectangular geometry), which we interpret as evidence that further dynamical-correlation recovery is not possible at those nodes. However, we acknowledge that this interpretation would be strengthened by the additional checks the referee suggests. We will therefore add a scaling study with respect to Jastrow network depth and width, include a comparison against a converged conventional Jastrow factor at the same AGP nodes, and provide a clearer error decomposition in the revised manuscript. The language in the abstract and §4 will be adjusted to reflect that the nodal limitation is inferred from the saturation behavior observed with the present flexible Jastrow forms.
  Revision: yes
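One way to read the proposed scaling study is as a two-part test: do the energies stop changing as the Jastrow network grows, and does a gap to the reference remain once they do? The sketch below encodes that logic. It is a hypothetical illustration, not the authors' procedure: the function names, the 2-sigma criterion, and the 1 mHa threshold are assumptions, and the numbers in the usage lines are synthetic placeholders that do not come from the paper.

```python
import math

def is_saturated(energies, errors, n_sigma=2.0):
    """True if the two largest networks agree within n_sigma combined standard errors."""
    diff = abs(energies[-1] - energies[-2])
    combined = math.hypot(errors[-1], errors[-2])
    return diff <= n_sigma * combined

def residual_is_significant(energy, error, reference, threshold=1e-3, n_sigma=2.0):
    """True if the energy lies above the reference by more than `threshold` hartree,
    even after subtracting n_sigma standard errors of statistical noise."""
    return (energy - reference) - n_sigma * error > threshold

# Synthetic placeholder values (hartree), ordered by increasing Jastrow network size.
energies = [-1.0000, -1.0030, -1.0031]
errors = [1e-4, 1e-4, 1e-4]
reference = -1.0060

nodal_limit_suspected = (
    is_saturated(energies, errors)
    and residual_is_significant(energies[-1], errors[-1], reference)
)
```

A saturated energy with a significant residual is consistent with a nodal limitation but does not prove it on its own; the comparison against a converged conventional Jastrow at the same nodes, which the authors also promise, would be an independent check.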
Circularity Check
No circularity: variational optimization and external benchmarks keep results independent of input definitions
full rationale
The paper defines a variational Monte Carlo procedure in which neural-network parameters for the AGP geminal and Jastrow factors are optimized to minimize the energy expectation value. Reported sub-millihartree accuracies are obtained by direct comparison of these variational energies against known external reference values for the H2 and rectangular H4 systems. The separation of tasks (AGP supplying fixed nodes, Jastrow recovering dynamical correlation) is presented explicitly as a design choice that enables diagnostic attribution of residual errors, not as a definitional identity that forces the numerical outcomes. No equation reduces a reported energy or accuracy figure to a quantity defined by the same fitted parameters, and the provided text contains no load-bearing self-citations or imported uniqueness theorems. The derivation chain therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- [standard math] Fermionic wave functions must be antisymmetric under particle exchange.
- [domain assumption] The nodal surface can be held fixed while dynamical correlation is added by the Jastrow factor.