Recognition: unknown
Momentum Stability and Adaptive Control in Stochastic Reconfiguration
Pith reviewed 2026-05-10 04:17 UTC · model grok-4.3
The pith
A momentum parameter below 1 guarantees convergence in stochastic reconfiguration, while a value of 1 can cause divergence along kernel directions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish convergence guarantees for the momentum parameter satisfying 0 ≤ μ < 1 under mild assumptions, and they construct explicit counterexamples in which μ = 1 produces divergence through uncontrolled growth along kernel-related directions whenever the step-size sequence is not summable. Motivated by the gap between these two regimes, they introduce PRIME-SR, a momentum-adaptive variant of stochastic reconfiguration that estimates effective spectral dimension and subspace overlap from sampled data to adjust μ automatically, thereby attaining accuracy comparable to optimally tuned SPRING while improving robustness across variational Monte Carlo tasks.
What carries the argument
The momentum parameter μ inside the update rule of subsampled projected-increment natural gradient descent (SPRING), which interpolates between successive preconditioned directions and whose value relative to 1 determines whether the iteration remains bounded or grows along the kernel of the Fisher-like matrix.
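To make the mechanism concrete, here is a minimal sketch of a SPRING-style direction update, assuming the standard Kaczmarz-inspired form in which the previous direction survives only through its component in the kernel of the sampled Jacobian; the manuscript's exact formulation, regularization, and notation may differ.

```python
import numpy as np

def spring_direction(O, eps, d_prev, mu):
    """One SPRING-style direction update (illustrative sketch, not the paper's exact rule).

    O      : (n_samples, n_params) sampled Jacobian of the log-wavefunction
    eps    : (n_samples,) residual vector (local-energy fluctuations)
    d_prev : previous search direction
    mu     : momentum parameter, 0 <= mu <= 1
    """
    O_pinv = np.linalg.pinv(O)                     # minimum-norm least-squares solve
    d_new = O_pinv @ eps                           # fresh information, lies in range(O^T)
    kernel_part = d_prev - O_pinv @ (O @ d_prev)   # component of d_prev in ker(O)
    return d_new + mu * kernel_part                # old direction persists only along ker(O)
```

In this form the sampled data never acts on the kernel component, so at mu = 1 that component is carried forward undamped, which is exactly the channel the counterexamples exploit; for mu < 1 it decays geometrically.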
Load-bearing premise
The convergence claims rest on unspecified mild assumptions about the landscape and sampling, while the adaptive method assumes that effective spectral dimension and subspace overlap can be estimated reliably from finite batches without introducing fresh instabilities.
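The manuscript's precise estimators are not reproduced in this summary. As a rough illustration of the kind of quantities involved, the following assumes a trace-ratio definition of effective spectral dimension for the sampled metric S = OᵀO / n and a principal-subspace definition of overlap; both definitions and the helper names are assumptions, not the paper's.

```python
import numpy as np

def effective_spectral_dimension(O):
    """Trace-ratio proxy for the effective rank of S = O^T O / n (illustrative definition)."""
    S = O.T @ O / O.shape[0]
    w = np.linalg.eigvalsh(S)                     # eigenvalues of the sampled metric
    return w.sum() / max(w.max(), 1e-12)

def subspace_overlap(O, d_prev, k):
    """Fraction of the previous direction lying in the top-k principal range of O
    (illustrative definition via right singular vectors)."""
    _, _, Vt = np.linalg.svd(O, full_matrices=False)
    Vk = Vt[:k].T                                 # (n_params, k) principal directions
    proj = Vk @ (Vk.T @ d_prev)
    return np.linalg.norm(proj) / max(np.linalg.norm(d_prev), 1e-12)
```

The load-bearing question flagged above is whether estimates of this kind remain stable when O comes from a small, noisy Monte Carlo batch and the sampled metric is severely rank-deficient.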
What would settle it
A controlled linear test problem containing a nontrivial kernel in which μ = 1 with non-summable steps produces visible divergence, or a standard VMC benchmark in which PRIME-SR fails to reach the accuracy of the best manually tuned SPRING run.
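A minimal version of the first test, using the same hedged SPRING-style update sketched above: O has a one-dimensional kernel, the step size is constant (hence non-summable), and the drift of the iterate along the kernel direction is compared for μ = 0.9 and μ = 1. This illustrates the claimed mechanism; it is not the paper's counterexample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear test problem with a nontrivial kernel: O is 3 x 4, so ker(O) is one-dimensional.
O = rng.standard_normal((3, 4))
b = rng.standard_normal(3)
kernel_dir = np.linalg.svd(O)[2][-1]          # unit vector spanning ker(O)
O_pinv = np.linalg.pinv(O)

def kernel_drift(mu, eta=0.05, steps=2000):
    theta = np.zeros(4)
    d = kernel_dir.copy()                     # seed the direction with a kernel component
    for _ in range(steps):
        eps = b - O @ theta                   # residual of the linear model
        d = O_pinv @ eps + mu * (d - O_pinv @ (O @ d))   # SPRING-style update (sketch)
        theta = theta + eta * d               # constant, hence non-summable, step size
    return abs(kernel_dir @ theta)            # displacement along the kernel direction

print("mu = 0.9:", kernel_drift(0.9))         # bounded, roughly eta * mu / (1 - mu)
print("mu = 1.0:", kernel_drift(1.0))         # grows like eta * steps, i.e. diverges
```

The range-space residual converges in both runs; only the kernel component separates the two regimes, matching the divergence mechanism described in the abstract.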
original abstract
Variational Monte Carlo (VMC) combined with expressive neural network wavefunctions has become a powerful route to high-accuracy ground-state calculations, yet its practical success hinges on efficient and stable wavefunction optimization. While stochastic reconfiguration (SR) provides a geometry-aware preconditioner motivated by imaginary-time evolution, its Kaczmarz-inspired variant, subsampled projected-increment natural gradient descent (SPRING), achieves state-of-the-art empirical performance. However, the effectiveness of SPRING is highly sensitive to the choice of a momentum-like parameter $\mu$. The origins of this sensitivity and of the instability observed at $\mu=1$ have remained unclear. In this work, we clarify the distinct mechanisms governing the regimes $\mu<1$ and $\mu=1$. We establish convergence guarantees for $0\le\mu<1$ under mild assumptions, and construct counterexamples showing that $\mu=1$ can induce divergence via uncontrolled growth along kernel-related directions when the step-size is not summable. Motivated by these theoretical insights and numerical observations, we further propose \textit{Principal Range Informed MomEntum SR} (PRIME-SR), a tuning-free momentum-adaptive SR method based on effective spectral dimension and subspace overlap. PRIME-SR achieves performance comparable to optimally tuned SPRING while significantly improving robustness in VMC optimization.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes momentum stability in stochastic reconfiguration (SR) and its subsampled projected-increment variant (SPRING) for neural-network variational Monte Carlo (VMC) optimization. It establishes convergence guarantees for the momentum parameter in the regime 0 ≤ μ < 1 under mild assumptions, constructs counterexamples demonstrating divergence at μ = 1 via uncontrolled growth along kernel-related directions when the step-size is not summable, and proposes the tuning-free adaptive method PRIME-SR that estimates effective spectral dimension and subspace overlap to achieve performance comparable to optimally tuned SPRING while improving robustness.
Significance. If the convergence guarantees hold under the stated mild assumptions and PRIME-SR proves stable on realistic noisy, rank-deficient Fisher metrics, the work would provide both theoretical clarification of a known practical instability and a practical adaptive algorithm for VMC. The explicit counterexamples for the μ = 1 case constitute a clear strength, as they isolate the mechanism of divergence without relying on fitted quantities.
major comments (3)
- [Abstract and §3] Abstract and §3 (convergence analysis): the convergence guarantees for 0 ≤ μ < 1 are asserted under 'mild assumptions' on the SR metric, stochastic gradient noise, and step-size summability, yet the precise statements of these assumptions, any proof sketches, and verification that they bound variance or exclude amplification along near-zero eigenvalues of the sampled Fisher matrix are absent; without these the transfer to finite-sample neural VMC cannot be assessed.
- [§4 and §5] §4 (counterexamples) and §5 (PRIME-SR): the μ = 1 divergence counterexamples are constructed in the deterministic setting, but the adaptive PRIME-SR estimator of effective spectral dimension and subspace overlap is applied to the same noisy Monte Carlo samples; no analysis is given of how estimation error or rank deficiency in the sampled metric could introduce new instabilities not covered by the existing guarantees.
- [§5.2] §5.2 (numerical experiments): the claim that PRIME-SR 'significantly improves robustness' is supported only by performance comparable to optimally tuned SPRING; without ablations on the sensitivity of the spectral-dimension estimator to sample size or noise level, the robustness advantage remains unverified.
minor comments (2)
- [§5.1] Notation for the effective spectral dimension and subspace overlap in the PRIME-SR rule should be defined explicitly before its use in the algorithm box.
- [Figure 2] Figure captions for the divergence trajectories should state the precise step-size schedule and matrix conditioning used in the counterexamples.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The comments highlight important points for clarification and additional verification. We address each major comment below and indicate the revisions we will make.
point-by-point responses
Referee: [Abstract and §3] Abstract and §3 (convergence analysis): the convergence guarantees for 0 ≤ μ < 1 are asserted under 'mild assumptions' on the SR metric, stochastic gradient noise, and step-size summability, yet the precise statements of these assumptions, any proof sketches, and verification that they bound variance or exclude amplification along near-zero eigenvalues of the sampled Fisher matrix are absent; without these the transfer to finite-sample neural VMC cannot be assessed.
Authors: We agree that the assumptions require explicit statement. In the revised manuscript we will add a dedicated subsection in §3 that lists the precise assumptions on the SR metric (positive-definiteness bounds away from zero eigenvalues), the stochastic gradient noise (bounded variance), and step-size summability. We will also include a proof sketch that shows how these conditions control the variance term and preclude amplification along near-zero eigenvalues of the sampled Fisher matrix, thereby clarifying applicability to finite-sample neural VMC (one illustrative formalization of conditions of this kind is sketched after these responses). revision: yes
Referee: [§4 and §5] §4 (counterexamples) and §5 (PRIME-SR): the μ = 1 divergence counterexamples are constructed in the deterministic setting, but the adaptive PRIME-SR estimator of effective spectral dimension and subspace overlap is applied to the same noisy Monte Carlo samples; no analysis is given of how estimation error or rank deficiency in the sampled metric could introduce new instabilities not covered by the existing guarantees.
Authors: The deterministic counterexamples in §4 are deliberately constructed to isolate the divergence mechanism at μ=1 that arises from uncontrolled growth along kernel directions when the step-size is not summable. PRIME-SR adapts μ using estimates computed from the same noisy samples; while this is consistent with practical VMC usage, we acknowledge that a rigorous propagation analysis of estimation error under rank deficiency is not supplied. In the revision we will add a discussion paragraph in §5 that addresses possible new instabilities and notes that our existing numerical experiments showed no such instabilities. A complete theoretical treatment of the estimator error bounds lies beyond the present scope. revision: partial
Referee: [§5.2] §5.2 (numerical experiments): the claim that PRIME-SR 'significantly improves robustness' is supported only by performance comparable to optimally tuned SPRING; without ablations on the sensitivity of the spectral-dimension estimator to sample size or noise level, the robustness advantage remains unverified.
Authors: We agree that targeted ablations would strengthen the robustness claim. In the revised §5.2 we will include additional experiments that systematically vary Monte Carlo sample size and noise level while monitoring the spectral-dimension estimator. These results will demonstrate that the estimator remains stable and that PRIME-SR retains its performance advantage, thereby verifying the robustness improvement beyond mere comparability to tuned SPRING. revision: yes
Deferred to future work: a rigorous analysis of how estimation error or rank deficiency in the sampled metric could introduce new instabilities in PRIME-SR.
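The manuscript's Assumption 3.1 is not reproduced in this review. As context for the first exchange above, the block below sketches one standard way such conditions are formalized in stochastic-approximation analyses, with $S_t$ the sampled SR metric, $\xi_t$ the stochastic gradient noise, $\eta_t$ the step size, and $\mathcal{F}_t$ the sampling filtration; the actual assumptions in the paper may be stated differently.

```latex
% Illustrative formalization only; not the manuscript's Assumption 3.1.
\begin{align*}
 &\text{(metric)} && \lambda_{\min}\!\left( S_t \big|_{\operatorname{ran} S_t} \right) \ \ge\ \underline{\lambda} \ >\ 0
     \quad \text{uniformly in } t,\\
 &\text{(noise)}  && \mathbb{E}\!\left[ \lVert \xi_t \rVert^2 \,\middle|\, \mathcal{F}_t \right] \ \le\ \sigma^2,\\
 &\text{(steps)}  && \text{e.g.\ the Robbins--Monro pair } \sum\nolimits_t \eta_t = \infty, \quad \sum\nolimits_t \eta_t^2 < \infty .
\end{align*}
```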
Circularity Check
No significant circularity: convergence claims and adaptive method rest on independent analysis
full rationale
The paper presents convergence guarantees for μ<1 under mild assumptions and constructs explicit counterexamples for μ=1, neither of which reduces to a fitted parameter or self-referential definition. PRIME-SR is motivated by these results plus numerical observations and introduces a new adaptive rule based on estimated spectral dimension; no equation or claim is shown to be equivalent to its inputs by construction. Self-citations to prior SPRING work are present but not load-bearing for the new stability theorems or the adaptive estimator. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: mild assumptions sufficient for convergence when 0 ≤ μ < 1
invented entities (1)
- PRIME-SR adaptive rule (no independent evidence)
discussion (0)