Viability of perturbative expansion for quantum field theories on neurons
Pith reviewed 2026-05-22 00:06 UTC · model grok-4.3
The pith
Renormalized 1/N corrections for neural quantum field theories depend on the ultraviolet cutoff
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The single-layer NN architecture reproduces local QFT results exactly in the infinite-neuron limit. For finite N, the renormalized O(1/N) corrections to the two- and four-point correlators in φ⁴ theory yield perturbative series that are sensitive to the ultraviolet cutoff and therefore converge only weakly.
What carries the argument
The O(1/N) expansion of the renormalized two- and four-point functions in the broken-independence neural network model for φ⁴ theory.
If this is right
- Modifications to the neural network architecture can be used to improve the convergence of the perturbative series.
- Appropriate constraints on the parameters and the scaling of N with the cutoff allow accurate field theory results to be extracted.
- The approach requires careful management of ultraviolet sensitivities to achieve reliable perturbative calculations at finite N.
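The constraint on how N must scale with the cutoff can be made concrete using the estimate quoted in the paper excerpts under the reference graph, where the non-Gaussian finite-width corrections scale as Λ^d ξ^d / N and are described as small when N ≫ Λ^d ξ^d. The sketch below is a minimal illustration under that assumption only. The function names and the `tolerance` threshold are hypothetical, and ξ is simply the length scale that appears in that excerpt (it is not defined on this page).

```python
import math


def finite_width_ratio(cutoff: float, xi: float, d: int, n_neurons: int) -> float:
    """Return (cutoff * xi)**d / N, the quoted size of the non-Gaussian corrections."""
    if n_neurons <= 0:
        raise ValueError("n_neurons must be positive")
    return (cutoff * xi) ** d / n_neurons


def min_neurons(cutoff: float, xi: float, d: int, tolerance: float) -> int:
    """Smallest integer N for which (cutoff * xi)**d / N <= tolerance.

    `tolerance` is a hypothetical user-chosen bound, not a value from the paper.
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    return math.ceil((cutoff * xi) ** d / tolerance)


if __name__ == "__main__":
    # Illustrative numbers only: cutoff and xi in the same (inverse) units.
    for d in (2, 3, 4):
        print(d, min_neurons(cutoff=10.0, xi=1.0, d=d, tolerance=0.01))
```

The required N grows as the d-th power of Λξ, which is why the estimate becomes demanding quickly in higher dimensions.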
Where Pith is reading between the lines
- Cutoff dependence may be a general feature of finite-size neural network approximations to field theories.
- Similar perturbative expansions could be explored in other theories or with deeper networks to see if convergence improves.
- Numerical tests of the proposed modification for small but finite N would provide concrete evidence of better performance.
Load-bearing premise
That the neural network architecture exactly reproduces local quantum field theory results when the number of neurons becomes infinite.
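As a rough orientation on what the infinite-neuron limit removes, the toy model below estimates the non-Gaussianity of a finite-width single-layer network. It uses statistically independent parameters, so it is not the broken-independence architecture studied in the paper. For the toy choice φ = N^(-1/2) Σ_i a_i √2 cos(c_i), with a_i standard normal and c_i uniform on [0, 2π), the excess kurtosis of φ is 1.5/N analytically, which illustrates the generic 1/N suppression of non-Gaussian corrections. The function name, the activation, and the parameter distributions are all assumptions made for illustration.

```python
import math
import random


def excess_kurtosis(n_neurons: int, n_samples: int = 50_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the excess kurtosis of a toy network output.

    Toy model (an assumption, not the paper's architecture):
        phi = (1 / sqrt(N)) * sum_i a_i * sqrt(2) * cos(c_i)
    with independent a_i ~ Normal(0, 1) and c_i ~ Uniform(0, 2*pi).
    """
    rng = random.Random(seed)
    second_moment = 0.0
    fourth_moment = 0.0
    for _ in range(n_samples):
        phi = 0.0
        for _ in range(n_neurons):
            a = rng.gauss(0.0, 1.0)
            c = rng.uniform(0.0, 2.0 * math.pi)
            phi += a * math.sqrt(2.0) * math.cos(c)
        phi /= math.sqrt(n_neurons)
        second_moment += phi ** 2
        fourth_moment += phi ** 4
    second_moment /= n_samples
    fourth_moment /= n_samples
    # The mean of phi is zero by symmetry, so raw moments are used directly.
    return fourth_moment / second_moment ** 2 - 3.0


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, excess_kurtosis(n), 1.5 / n)
```

The Monte Carlo estimates fluctuate around 1.5/N; the comparison is meant only to show the trend with N, not to reproduce any result from the paper.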
What would settle it
Finding that the renormalized O(1/N) correction to the four-point function in φ⁴ theory is independent of the ultraviolet cutoff after proper renormalization.
Original abstract
Neural Network (NN) architectures that break statistical independence of parameters have been proposed as a new approach for simulating local quantum field theories (QFTs). In the infinite neuron number limit, single-layer NNs can exactly reproduce QFT results. This paper examines the viability of this architecture for perturbative calculations of local QFTs for finite neuron number $N$ using scalar $\phi^4$ theory in $d$ Euclidean dimensions as an example. We find that the renormalized $O(1/N)$ corrections to two- and four-point correlators yield perturbative series which are sensitive to the ultraviolet cut-off and therefore have a weak convergence. We propose a modification to the architecture to improve this convergence and discuss constraints on the parameters of the theory and the scaling of N which allow us to extract accurate field theory results.
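For readers who want the target theory written out, the block below gives the conventional Euclidean action of scalar $\phi^4$ theory and the definition of its correlators. This is the textbook form, not a formula copied from the paper: the symbols $m$ (mass) and $\lambda$ (quartic coupling) are the usual conventions, and the $\lambda/4!$ normalization is assumed to match the one visible in the paper excerpts under the reference graph.

```latex
S_E[\phi] = \int d^d x \left[ \frac{1}{2}\,\partial_\mu \phi\, \partial_\mu \phi
          + \frac{1}{2}\, m^2 \phi^2 + \frac{\lambda}{4!}\, \phi^4 \right],
\qquad
\langle \phi(x_1) \cdots \phi(x_n) \rangle
  = \frac{1}{Z} \int \mathcal{D}\phi \; \phi(x_1) \cdots \phi(x_n)\, e^{-S_E[\phi]},
\qquad
Z = \int \mathcal{D}\phi \; e^{-S_E[\phi]} .
```

The two- and four-point correlators discussed in the abstract are the $n = 2$ and $n = 4$ cases of this definition.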
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript examines the use of single-layer neural networks with broken statistical independence of parameters to simulate local quantum field theories, taking scalar φ⁴ theory in d Euclidean dimensions as an example. It asserts that the architecture exactly reproduces local QFT results in the infinite-neuron (N→∞) limit. For finite N, the authors compute O(1/N) corrections to two- and four-point correlators, renormalize them, and conclude that the resulting perturbative series remain sensitive to the ultraviolet cutoff, implying weak convergence. A modification to the NN architecture is proposed to improve convergence, together with constraints on theory parameters and the required scaling of N.
Significance. If the infinite-N reproduction of local QFT correlators holds and the 1/N expansion can be controlled, the reported UV sensitivity of the renormalized corrections would constitute a concrete limitation on the perturbative utility of this NN discretization, while the proposed architectural modification could offer a practical route to improved accuracy. The work therefore addresses a relevant question at the interface of neural-network discretizations and perturbative QFT, provided the foundational assumption is verified.
major comments (2)
- [Introduction and §2 (infinite-N limit)] The central premise that the single-layer NN with broken parameter independence exactly reproduces local QFT results in the N→∞ limit is stated in the abstract and introduction but is not supported by an explicit check (e.g., matching of the quadratic action or the two-point propagator to the standard continuum φ⁴ theory). This verification is load-bearing for interpreting the computed O(1/N) terms as corrections to the target QFT rather than to an effective theory with residual non-localities.
- [§4 (renormalization and correlators)] The headline claim that renormalized O(1/N) corrections to the two- and four-point functions are UV-cutoff sensitive (abstract and §4) is presented without explicit derivation steps, cutoff regularization details, error estimates, or direct comparison against known perturbative results in φ⁴ theory. This absence prevents assessment of whether the observed sensitivity is an artifact of the NN discretization or a genuine feature of the 1/N expansion around the local QFT.
minor comments (1)
- Notation for the NN weight correlations and the precise definition of the 1/N expansion parameter could be made more explicit to aid readers outside the immediate subfield.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments on the manuscript. We address the major comments point by point below, providing the strongest honest responses based on the content and derivations in the paper. Revisions have been made where they strengthen the presentation without altering the core results.
-
Referee: [Introduction and §2 (infinite-N limit)] The central premise that the single-layer NN with broken parameter independence exactly reproduces local QFT results in the N→∞ limit is stated in the abstract and introduction but is not supported by an explicit check (e.g., matching of the quadratic action or the two-point propagator to the standard continuum φ⁴ theory). This verification is load-bearing for interpreting the computed O(1/N) terms as corrections to the target QFT rather than to an effective theory with residual non-localities.
Authors: We agree that an explicit verification of the N→∞ limit is important for clarity. The manuscript establishes this limit by showing that the effective action of the neural network, obtained by averaging over the broken-independence parameters, reduces exactly to the local φ⁴ action as N diverges, with all non-local contributions vanishing. To make this more transparent, we have added an explicit calculation in the revised §2 demonstrating that the quadratic term yields the standard continuum kinetic operator and that the two-point propagator matches the known result for the free theory in the infinite-N limit. This confirms that the O(1/N) corrections computed later are perturbations around the target local QFT.
Revision: yes
-
Referee: [§4 (renormalization and correlators)] The headline claim that renormalized O(1/N) corrections to the two- and four-point functions are UV-cutoff sensitive (abstract and §4) is presented without explicit derivation steps, cutoff regularization details, error estimates, or direct comparison against known perturbative results in φ⁴ theory. This absence prevents assessment of whether the observed sensitivity is an artifact of the NN discretization or a genuine feature of the 1/N expansion around the local QFT.
Authors: We thank the referee for this suggestion. The O(1/N) corrections to the correlators and their renormalization are derived in §4 using a hard UV cutoff Λ to regulate the integrals that arise from the finite-N parameter averaging. In the revised manuscript we have expanded the presentation with the intermediate algebraic steps for both the two-point and four-point functions, specified the cutoff scheme in detail, and added error estimates associated with truncating the 1/N expansion. A direct comparison to the standard perturbative expansion of φ⁴ theory is now included, showing agreement at leading order while the 1/N terms retain cutoff dependence due to the architecture-induced non-localities at finite N. This establishes the sensitivity as a genuine feature of the expansion rather than an artifact.
Revision: yes
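To illustrate what a hard UV cutoff does to a loop integral, the sketch below evaluates the standard one-loop tadpole of continuum φ⁴ theory in d = 4, ∫ d⁴k/(2π)⁴ 1/(k² + m²) restricted to |k| ≤ Λ, both numerically and from its closed form (1/16π²)[Λ² − m² ln(1 + Λ²/m²)]. This is a textbook integral chosen for illustration, not the paper's finite-N computation, and the function names are hypothetical.

```python
import math


def tadpole_closed_form(cutoff: float, mass: float) -> float:
    """Closed form of the d=4 tadpole with a hard cutoff on |k|."""
    ratio = cutoff ** 2 / mass ** 2
    return (cutoff ** 2 - mass ** 2 * math.log(1.0 + ratio)) / (16.0 * math.pi ** 2)


def tadpole_numeric(cutoff: float, mass: float, steps: int = 20_000) -> float:
    """Midpoint-rule evaluation of the same integral in radial form.

    After the angular integration (surface of the unit 3-sphere is 2*pi**2):
        (1 / (8 * pi**2)) * integral_0^cutoff dk  k**3 / (k**2 + mass**2)
    """
    h = cutoff / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * h
        total += k ** 3 / (k ** 2 + mass ** 2)
    return total * h / (8.0 * math.pi ** 2)


if __name__ == "__main__":
    m = 1.0
    for lam in (5.0, 10.0, 20.0):
        print(lam, tadpole_numeric(lam, m), tadpole_closed_form(lam, m))
```

The result grows roughly like Λ² at large cutoff. In ordinary φ⁴ theory this dependence is absorbed into the bare mass; the referee's question is whether the analogous 1/N-suppressed terms can be absorbed in the same way.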
Circularity Check
No significant circularity; derivation is self-contained
The paper states as premise that single-layer NNs with broken parameter independence exactly reproduce local QFT results at infinite neuron number N, then performs an explicit O(1/N) expansion of the NN correlators for finite N in scalar φ⁴ theory. The renormalized two- and four-point functions are computed directly from this expansion, and their UV-cutoff sensitivity is reported as a result of that calculation. This does not reduce to the infinite-N premise by construction, nor does any equation equate the sensitivity finding to a fitted parameter or a self-citation chain. Renormalization follows the standard QFT procedure as described, and the proposed architecture modification is an additional suggestion motivated by the computed sensitivity rather than a redefinition that forces the outcome. The derivation therefore remains independent of its inputs and yields a non-tautological claim about perturbative convergence.
Axiom & Free-Parameter Ledger
free parameters (2)
- neuron number N
- ultraviolet cutoff
axioms (1)
- domain assumption: Single-layer NNs with broken parameter independence exactly reproduce local QFT correlators when neuron number goes to infinity.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
We find that the renormalized O(1/N) corrections to two- and four-point correlators yield perturbative series which are sensitive to the ultraviolet cut-off and therefore have a weak convergence.
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean, absolute_floor_iff_bare_distinguishability (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
the finite width corrections of NNFT cannot be used for describing such local interactions. It was proposed that desired local interactions can be incorporated within the NNFT framework by breaking statistical independence of network parameters
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 3 Pith papers
-
Anomalies in Neural Network Field Theory
Derives Schwinger-Dyson equations and Ward identities in NN-FT to study anomalies in QFTs via a conserved parameter-space current, yielding a new perspective on symmetries.
-
Topological Effects in Neural Network Field Theory
Neural network field theory extended with discrete topological labels recovers the BKT transition and bosonic string T-duality.
-
Optimal Architecture and Fundamental Bounds in Neural Network Field Theory
α=0 architecture in NNFT minimizes finite-width variance, removes IR corrections, and sets a fundamental SNR bound for correlation functions in scalar field theory.
Reference graph
Works this paper leans on
-
[1]
Can techniques from statistical mechanics and the path integral formulation of quantum field theory (QFT) help us build a theoretical understanding of how neural networks learn?
-
[2]
$\left[ \int_{V_d} \frac{d^d b_i}{(2\pi)^d}\, e^{i b_i \cdot w_i} \right] \delta^d(b_1 + b_2 + b_3 + b_4)$
Can neural networks be used to facilitate computations in quantum field theory? These two questions are deeply interrelated, and will motivate the questions we explore in this work. The second question itself splits naturally into two subcategories: (a) applied machine learning for physics problems, and (b) the theoretical interplay between machine learni...
-
[3]
57 is a contribution proportional to the field theory prediction
The first term in the expression for $Q(k_1, k_2, k_3, k_4)$ in Eq. 57 is a contribution proportional to the field theory prediction. This is what we expect; the finite N corrections are suppressed by O(1/N) compared to the true result
-
[4]
The second term in Eq. 57 is disturbing since it is a remnant of the one loop (1PI) correction to the two point correlator of external legs (Fig.2(b)); normally one would expect that for any observable in a renormalizable theory these types of 1PI divergences would be fully absorbed into the bare mass. This signals to us the non-renormalizability of this ...
-
[5]
Finally we have corrections that scale as $\Lambda^d \xi^d / N$ from the non-Gaussian correlations induced by the finite width NN. This scaling holds at any order in perturbation theory which suggest that as long as $N \gg \Lambda^d \xi^d$, these corrections are small and under control. We conclude that the uncanceled $1/N$ suppressed UV divergent 1PI diagrams are main obstacle to the viab...
-
[6]
Powers of 1PI diagrams which will be sensitive to the UV cut-off. 3. Non-Gaussian corrections proportional to powers of $\lambda \Lambda^d \xi^d / N$. The next question we want to address whether this conclusion holds for higher point correlators. We will show that apart from the three types of corrections discussed in the previous paragraph, the higher point correlators will ...
-
[7]
$\left[ \int da_i\, \frac{\sqrt{N}}{\sqrt{2\pi}\,\sigma_a}\, e^{-\frac{N}{2\sigma_a^2} a_i a_i} \right]$
Next we consider the third term in Eq. B1. We usually ignore this term while computing Feynman diagrams in field theory since they only lead to disconnected contributions which cancel out with the corresponding diagrams obtained from the second term of Eq. B1. However we expect that this will not hold for the O(1/N) corrections ⟨ϕ(w1)ϕ(w2)⟩f λ 4! Z ddx⟨ϕ4...
-
[8]
Neural network field theories: non-Gaussianity, actions, and locality,
Mehmet Demirtas, James Halverson, Anindita Maiti, Matthew D. Schwartz, and Keegan Stoner, “Neural network field theories: non-Gaussianity, actions, and locality,” Mach. Learn. Sci. Tech. 5, 015002 (2024), arXiv:2307.03223 [hep-th]
-
[9]
Flavour tagging with graph neural networks with the ATLAS detector,
Arnaud Duperrin (ATLAS), “Flavour tagging with graph neural networks with the ATLAS detector,” in 30th International Workshop on Deep-Inelastic Scattering and Related Subjects (2023), arXiv:2306.04415 [hep-ex]
-
[10]
Transformer Neural Networks for Identifying Boosted Higgs Bosons decaying into $b\bar{b}$ and $c\bar{c}$ in ATLAS,
“Transformer Neural Networks for Identifying Boosted Higgs Bosons decaying into $b\bar{b}$ and $c\bar{c}$ in ATLAS,” (2023)
-
[11]
Georges Aad et al. (ATLAS), “Search for New Phenomena in Two-Body Invariant Mass Distributions Using Unsupervised Machine Learning for Anomaly Detection at $\sqrt{s}=13$ TeV with the ATLAS Detector,” Phys. Rev. Lett. 132, 081801 (2024), arXiv:2307.01612 [hep-ex]
-
[12]
Learning Uncertainties the Frequentist Way: Calibration and Correlation in High Energy Physics,
Rikab Gambhir, Benjamin Nachman, and Jesse Thaler, “Learning Uncertainties the Frequentist Way: Calibration and Correlation in High Energy Physics,” Phys. Rev. Lett. 129, 082001 (2022), arXiv:2205.03413 [hep-ph]
-
[13]
Bias and priors in machine learning calibrations for high energy physics,
Rikab Gambhir, Benjamin Nachman, and Jesse Thaler, “Bias and priors in machine learning calibrations for high energy physics,” Phys. Rev. D 106, 036011 (2022), arXiv:2205.05084 [hep-ph]
-
[14]
Scaling deep learning for materials discovery,
Amil Merchant, Simon Batzner, Samuel S. Schoenholz, and Ekin Dogus Cubuk, “Scaling deep learning for materials discovery,” Nature 624, 80–85 (2023)
-
[15]
AG Kusne, TR Gao, A Mehta, LQ Ke, MC Nguyen, KM Ho, V Antropov, CZ Wang, MJ Kramer, C Long, et al., “On-the-fly machine-learning for high-throughput experiments: search for rare-earth-free permanent magnets,” Scientific Reports 4 (2014), 10.1038/srep06367
-
[16]
High-throughput discovery of high curie point two-dimensional ferromagnetic materials,
Arnab Kabiraj, Mayank Kumar, and Santanu Mahapatra, “High-throughput discovery of high curie point two-dimensional ferromagnetic materials,” npj Computational Materials 6 (2020), 10.1038/s41524-020-0300-2
-
[17]
Machine learning driven new material discovery,
Jiazhen Cai, Xuan Chu, Kun Xu, Hongbo Li, and Jing Wei, “Machine learning driven new material discovery,” Nanoscale Advances 2 (2020), 10.1039/D0NA00388C
-
[18]
Machine learning in magnetic materials,
Georgios Katsikas, Charalampos Sarafidis, and Joseph Kioseoglou, “Machine learning in magnetic materials,” physica status solidi (b) 258 (2021), 10.1002/pssb.202000600
-
[19]
Machine learning accelerated prediction of ce-based ternary compounds involving antagonistic pairs,
Weiyi Xia, Wei-Shen Tee, Paul C. Canfield, Fernando Assis Garcia, Raquel A Ribeiro, Yongbin Lee, Liqin Ke, Rebecca Flint, and Cai-Zhuang Wang, “Machine learning accelerated prediction of ce-based ternary compounds involving antagonistic pairs,” Phys. Rev. Mater. 9, 053803 (2025)
-
[20]
Flow-based generative models for markov chain monte carlo in lattice field theory,
M. S. Albergo, G. Kanwar, and P. E. Shanahan, “Flow-based generative models for markov chain monte carlo in lattice field theory,” Phys. Rev. D 100, 034515 (2019)
-
[21]
Gauge-equivariant flow models for sampling in lattice field theories with pseudofermions,
Ryan Abbott, Michael S. Albergo, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Sébastien Racanière, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Betsy Tian, and Julian M. Urban, “Gauge-equivariant flow models for sampling in lattice field theories with pseudofermions,” Phys. Rev. D 106, 074506 (2022)
-
[22]
Path integral contour deformations for observables in SU(N) gauge theory,
William Detmold, Gurtej Kanwar, Henry Lamm, Michael L. Wagman, and Neill C. Warrington, “Path integral contour deformations for observables in SU(N) gauge theory,” Phys. Rev. D 103, 094517 (2021), arXiv:2101.12668 [hep-lat]
-
[23]
Complex paths around the sign problem,
Andrei Alexandru, Gökçe Başar, Paulo F. Bedaque, and Neill C. Warrington, “Complex paths around the sign problem,” Rev. Mod. Phys. 94, 015006 (2022)
-
[24]
Deep learning beyond lefschetz thimbles,
Andrei Alexandru, Paulo F. Bedaque, Henry Lamm, and Scott Lawrence, “Deep learning beyond lefschetz thimbles,” Physical Review D 96 (2017), 10.1103/physrevd.96.094505
-
[25]
Learning lattice quantum field theories with equivariant continuous flows,
Mathis Gerdes, Pim de Haan, Corrado Rainone, Roberto Bondesan, and Miranda C. N. Cheng, “Learning lattice quantum field theories with equivariant continuous flows,” SciPost Physics 15 (2023), 10.21468/scipostphys.15.6.238
-
[26]
Gaussian Process Behaviour in Wide Deep Neural Networks
Alexander G. de G. Matthews, Mark Rowland, Jiri Hron, Richard E. Turner, and Zoubin Ghahramani, “Gaussian process behaviour in wide deep neural networks,” (2018), arXiv:1804.11271 [stat.ML]
-
[27]
Bayesian deep convolutional networks with many channels are gaussian processes,
Roman Novak, Lechao Xiao, Jaehoon Lee, Yasaman Bahri, Greg Yang, Jiri Hron, Daniel A. Abolafia, Jeffrey Pennington, and Jascha Sohl-Dickstein, “Bayesian deep convolutional networks with many channels are gaussian processes,” (2020), arXiv:1810.05148 [stat.ML]
-
[28]
Deep Convolutional Networks as shallow Gaussian Processes
Adrià Garriga-Alonso, Carl Edward Rasmussen, and Laurence Aitchison, “Deep convolutional networks as shallow gaussian processes,” (2019), arXiv:1808.05587 [stat.ML]
-
[29]
Greg Yang, “Scaling limits of wide neural networks with weight sharing: Gaussian process behavior, gradient independence, and neural tangent kernel derivation,” (2020), arXiv:1902.04760 [cs.NE]
-
[30]
Greg Yang, “Tensor programs i: Wide feedforward or recurrent neural networks of any architecture are gaussian processes,” (2021), arXiv:1910.12478 [cs.NE]
-
[31]
Tensor programs ii: Neural tangent kernel for any architecture,
Greg Yang, “Tensor programs ii: Neural tangent kernel for any architecture,” (2020), arXiv:2006.14548 [stat.ML]
-
[32]
Building Quantum Field Theories Out of Neurons,
James Halverson, “Building Quantum Field Theories Out of Neurons,” (2021), arXiv:2112.04527 [hep-th]
-
[33]
Quantum Mechanics and Neural Networks,
Christian Ferko and James Halverson, “Quantum Mechanics and Neural Networks,” (2025), arXiv:2504.05462 [hep-th]
-
[34]
Conformal Fields from Neural Networks,
James Halverson, Joydeep Naskar, and Jiahua Tian, “Conformal Fields from Neural Networks,” (2024), arXiv:2409.12222 [hep-th]
-
[35]
Bayesian RG flow in neural network field theories,
Jessica N. Howard, Marc S. Klinger, Anindita Maiti, and Alexander G. Stapleton, “Bayesian RG flow in neural network field theories,” SciPost Phys. Core 8, 027 (2025), arXiv:2405.17538 [hep-th]
-
[36]
Nuclear matrix elements from lattice QCD for electroweak and beyond-Standard-Model processes,
Zohreh Davoudi, William Detmold, Phiala Shanahan, Kostas Orginos, Assumpta Parreño, Martin J. Savage, and Michael L. Wagman, “Nuclear matrix elements from lattice QCD for electroweak and beyond-Standard-Model processes,” Phys. Rept. 900, 1–74 (2021), arXiv:2008.11160 [hep-lat]
-
[37]
Hadron Spectroscopy with Lattice QCD,
John Bulava et al., “Hadron Spectroscopy with Lattice QCD,” in Snowmass 2021 (2022), arXiv:2203.03230 [hep-lat]
-
[38]
Review of lattice results concerning low-energy particle physics
S. Aoki et al., “Review of lattice results concerning low-energy particle physics,” Eur. Phys. J. C 77, 112 (2017), arXiv:1607.00299 [hep-lat]
-
[39]
Zohreh Davoudi, Alexander F. Shaw, and Jesse R. Stryker, “General quantum algorithms for Hamiltonian simulation with applications to a non-Abelian lattice gauge theory,” Quantum 7, 1213 (2023), arXiv:2212.14030 [hep-lat]
-
[40]
Gauss’s law, duality, and the Hamiltonian formulation of U(1) lattice gauge theory,
David B. Kaplan and Jesse R. Stryker, “Gauss’s law, duality, and the Hamiltonian formulation of U(1) lattice gauge theory,” Phys. Rev. D 102, 094515 (2020), arXiv:1806.08797 [hep-lat]
-
[41]
Loop, string, and hadron dynamics in SU(2) Hamiltonian lattice gauge theories,
Indrakshi Raychowdhury and Jesse R. Stryker, “Loop, string, and hadron dynamics in SU(2) Hamiltonian lattice gauge theories,” Phys. Rev. D 101, 114502 (2020), arXiv:1912.06133 [hep-lat]
-
[42]
Daniel A. Roberts, Sho Yaida, and Boris Hanin, The Principles of Deep Learning Theory (Cambridge University Press,
- [43]
-
[44]
Structures of neural network effective theories,
Ian Banta, Tianji Cai, Nathaniel Craig, and Zhengkang Zhang, “Structures of neural network effective theories,” Phys. Rev. D 109, 105007 (2024), arXiv:2305.02334 [hep-th]
-
[45]
Non-perturbative renormalization for the neural network-QFT correspondence,
Harold Erbin, Vincent Lahoche, and Dine Ousmane Samary, “Non-perturbative renormalization for the neural network-QFT correspondence,” Mach. Learn. Sci. Tech. 3, 015027 (2022), arXiv:2108.01403 [hep-th]
-
[46]
The Neural Networks with Tensor Weights and the Corresponding Fermionic Quantum Field Theory,
Guojun Huang and Kai Zhou, “The Neural Networks with Tensor Weights and the Corresponding Fermionic Quantum Field Theory,” (2025), arXiv:2507.05303 [hep-th]