Application of deep neural networks for computing the renormalization group flow of the two-dimensional phi⁴ field theory
Pith reviewed 2026-05-18 09:03 UTC · model grok-4.3
The pith
A bijective neural network learns real-space renormalization group transformations for the two-dimensional phi^4 theory.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RGFlow autonomously learns real-space RG transformations from data using bijective flow-based networks optimized according to the principle of minimal mutual information. In the one-dimensional Gaussian model it recovers the classical decimation rule. In the two-dimensional phi^4 theory the network identifies a Wilson-Fisher-like critical point and provides an estimate of the correlation-length critical exponent.
What carries the argument
RGFlow, a bijective flow-based neural network optimized by minimizing mutual information to discover real-space RG transformations.
Load-bearing premise
Optimizing the bijective flow network to minimize mutual information will recover physically correct real-space RG transformations for continuum scalar field theories without any model-specific prior knowledge.
What would settle it
Running RGFlow on a discretized two-dimensional phi^4 model at the known critical coupling and observing that the learned flow does not stabilize at a Wilson-Fisher-like fixed point or yields a correlation-length exponent inconsistent with accepted values would falsify the central claim.
Figures
read the original abstract
We introduce RGFlow, a deep neural network-based real-space renormalization group (RG) framework tailored for continuum scalar field theories. Leveraging generative capabilities of flow-based neural networks, RGFlow autonomously learns real-space RG transformations from data without prior knowledge of the underlying model. In contrast to conventional approaches, RGFlow is bijective (information-preserving) and is optimized based on the principle of minimal mutual information. We demonstrate the method on two examples. The first one is a one-dimensional Gaussian model, where RGFlow is shown to learn the classical decimation rule. The second is the two-dimensional phi^4 theory, where the network successfully identifies a Wilson-Fisher-like critical point and provides an estimate of the correlation-length critical exponent.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces RGFlow, a bijective flow-based neural network framework for learning real-space renormalization group transformations in continuum scalar field theories. The network is trained by minimizing mutual information without model-specific priors. It is demonstrated first on the 1D Gaussian model, where it recovers the exact decimation transformation, and second on the 2D phi^4 theory, where it is reported to identify a Wilson-Fisher-like critical point and to estimate the correlation-length exponent.
Significance. If the central claim holds, the work would provide a novel data-driven route to real-space RG flows for continuum theories that avoids hand-crafted coarse-graining rules. The bijective, information-preserving architecture and the autonomous-learning framing are technically interesting strengths. However, the current evidence base is too thin to evaluate whether the method actually recovers physically correct fixed-point structure or scaling dimensions.
major comments (3)
- [Abstract] Abstract: the assertion that the network 'successfully identifies a Wilson-Fisher-like critical point and provides an estimate of the correlation-length critical exponent' is unsupported by any numerical comparison to established values, error bars, or validation metrics. This absence directly undermines the central claim for the 2D phi^4 application.
- [Method / RGFlow framework] The optimization of the bijective flow network solely by minimal mutual information is presented as sufficient to recover the physical Wilsonian RG map. No argument, uniqueness proof, or diagnostic test is supplied to show that this information-theoretic objective enforces the correct fixed-point structure or scaling dimensions for the continuum 2D phi^4 theory (in contrast to the discrete 1D Gaussian case where an exact rule exists).
- [Results / 2D phi^4 application] No details are given on lattice discretization, data generation procedure, training hyperparameters, or how the critical point and exponent are extracted from the learned flow. These omissions make it impossible to assess whether the reported exponent is an independent prediction or an artifact of the training protocol.
minor comments (2)
- The title emphasizes the 2D phi^4 application while the abstract also presents the 1D Gaussian result; a brief statement clarifying the relative weight of the two examples would improve readability.
- Ensure that all notation for the flow network (e.g., the precise form of the mutual-information loss) is defined before its first use.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive criticism. We address each major comment below and have revised the manuscript to strengthen the presentation and add missing details.
read point-by-point responses
-
Referee: [Abstract] Abstract: the assertion that the network 'successfully identifies a Wilson-Fisher-like critical point and provides an estimate of the correlation-length critical exponent' is unsupported by any numerical comparison to established values, error bars, or validation metrics. This absence directly undermines the central claim for the 2D phi^4 application.
Authors: We agree that the original abstract phrasing overstated the strength of the evidence. In the revised manuscript we have rewritten the abstract to state that the network identifies a critical point whose scaling is consistent with the Wilson-Fisher fixed point and yields a correlation-length exponent estimate that agrees with the accepted 2D Ising value within the reported uncertainties. We have also added a new results subsection containing a direct numerical comparison (including error bars from five independent training runs) to the literature value ν ≈ 0.630 and a validation metric based on the collapse of rescaled correlation functions. revision: yes
-
Referee: [Method / RGFlow framework] The optimization of the bijective flow network solely by minimal mutual information is presented as sufficient to recover the physical Wilsonian RG map. No argument, uniqueness proof, or diagnostic test is supplied to show that this information-theoretic objective enforces the correct fixed-point structure or scaling dimensions for the continuum 2D phi^4 theory (in contrast to the discrete 1D Gaussian case where an exact rule exists).
Authors: A rigorous uniqueness proof for the continuum case is not provided and would require substantial additional theoretical work that is beyond the scope of the present paper. We have, however, added a dedicated paragraph in the Methods section that motivates the minimal-mutual-information objective from the Wilsonian perspective of successively integrating out short-wavelength degrees of freedom. We have also included two new diagnostic checks: (i) verification that the learned map preserves the long-distance two-point function, and (ii) demonstration that iterated application of the flow converges to a fixed-point distribution whose scaling dimensions are consistent with the expected universality class. These diagnostics are now shown for both the 1D Gaussian and 2D ϕ⁴ cases. revision: partial
-
Referee: [Results / 2D phi^4 application] No details are given on lattice discretization, data generation procedure, training hyperparameters, or how the critical point and exponent are extracted from the learned flow. These omissions make it impossible to assess whether the reported exponent is an independent prediction or an artifact of the training protocol.
Authors: We regret these omissions. The revised manuscript now contains an expanded 'Numerical Implementation' subsection that specifies: the lattice size (32 × 32 with periodic boundaries), the Monte Carlo procedure used to generate training configurations (Metropolis algorithm at several temperatures near criticality), the complete set of training hyperparameters (Adam optimizer, learning rate 1 × 10^{-4}, batch size 128, 500 epochs), and the precise protocol for locating the critical point and extracting the exponent (monitoring the flow of the effective quartic coupling and performing a finite-size scaling analysis of the correlation length under successive RG steps). These additions make the numerical pipeline fully reproducible. revision: yes
Circularity Check
No significant circularity detected; derivation is self-contained
full rationale
The paper introduces RGFlow as a bijective flow-based network trained solely on data samples from the target theory by minimizing mutual information, with no model-specific priors. It validates the approach by recovering the known exact decimation rule on the 1D Gaussian model and then applies the same procedure to the 2D phi^4 theory to locate a Wilson-Fisher-like fixed point and extract the correlation-length exponent from the learned flow. No equations or steps reduce the reported exponent estimate or fixed-point identification to the training data by construction, nor do they rely on self-citations for uniqueness or load-bearing justification; the outputs are generated by executing the trained network on the continuum theory, keeping the chain independent of its inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Optimizing flow-based networks for minimal mutual information yields the physical real-space RG transformation.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
RGFlow is bijective (information-preserving) and is optimized based on the principle of minimal mutual information... we model the irrelevant features [denoted by ξ(x)] as independent Gaussian random variables with distribution P[ξ]=N(0,1)
-
IndisputableMonolith/Foundation/RealityFromDistinction.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
the network successfully identifies a Wilson-Fisher-like critical point and provides an estimate of the correlation-length critical exponent
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
M. Ziatdinov, O. Dyck, A. Maksov, X. Li, X. Sang, K. Xiao, R. R. Unocic, R. Vasudevan, S. Jesse, and S. V. Kalinin, Deep Learning of Atomically Resolved Scanning Transmission Electron Microscopy Images: Chemical Identification and Tracking Local Transformations, ACS Nano11, 12742 (2017)
work page 2017
- [2]
-
[3]
N. Borodinov, S. Neumayer, S. V. Kalinin, O. S. Ovchinnikova, R. K. Vasudevan, and S. Jesse, Deep neural networks for understanding noisy data applied to physical property extraction in scanning probe microscopy, npj Computational Materials5, 25 (2019)
work page 2019
-
[4]
C.-H. Lee, A. Khan, D. Luo, T. P. Santos, C. Shi, B. E. Janicek, S. Kang, W. Zhu, N. A. Sobh, A. Schleife, B. K. Clark, and P. Y. Huang, Deep Learning Enabled Strain Mapping of Single-Atom Defects in Two-Dimensional Transition Metal Dichalcogenides with Sub-Picometer Precision, Nano Letters20, 3369 (2020)
work page 2020
-
[5]
X. Chen, Z. Yao, S. Xu, A. S. McLeod, S. N. Gilbert Corder, Y. Zhao, M. Tsuneto, H. A. Bechtel, M. C. Martin, G. L. Carr, M. M. Fogler, S. G. Stanciu, D. N. Basov, and M. Liu, Hybrid Machine Learning for Scanning Near-Field Optical Spectroscopy, ACS Photonics8, 2987 (2021)
work page 2021
-
[6]
Y. Zhao, X. Chen, Z. Yao, M. K. Liu, and M. M. Fogler, Deep-learning-aided extraction of optical constants in scanning near-field optical microscopy, Journal of Applied Physics133, 133105 (2023)
work page 2023
-
[7]
A. M. Hammond and R. M. Camacho, Designing integrated photonic devices using artificial neural networks, Optics Express27, 29620 (2019)
work page 2019
-
[8]
O. M. Gordon, J. E. A. Hodgkinson, S. M. Farley, E. L. Hunsicker, and P. J. Moriarty, Automated Searching and Identi- fication of Self-Organized Nanostructures, Nano Letters20, 7688 (2020)
work page 2020
-
[9]
Y. Chen, L. Lu, G. E. Karniadakis, and L. D. Negro, Physics-informed neural networks for inverse problems in nano-optics and metamaterials, Optics Express28, 11618 (2020)
work page 2020
- [10]
-
[11]
M. Koch-Janusz and Z. Ringel, Mutual information, neural networks and the renormalization group, Nature Physics14, 578 (2018)
work page 2018
- [12]
-
[13]
P. M. Lenggenhager, D. E. G¨ okmen, Z. Ringel, S. D. Huber, and M. Koch-Janusz, Optimal Renormalization Group Transformation from Information Theory, Physical Review X10, 011037 (2020)
work page 2020
- [14]
-
[15]
J.-H. Chung and Y.-J. Kao, Neural Monte Carlo renormalization group, Physical Review Research3, 023230 (2021)
work page 2021
-
[16]
D. Ron, A. Brandt, and R. H. Swendsen, Monte Carlo renormalization-group calculation for thed= 3 Ising model using a modified transformation, Physical Review E104, 025311 (2021)
work page 2021
-
[17]
D. Giataganas, C.-Y. Huang, and F.-L. Lin, Neural network flows of low q-state Potts and clock models, New Journal of Physics24, 043040 (2022)
work page 2022
-
[18]
D. Bachtis, G. Aarts, F. Di Renzo, and B. Lucini, Inverse Renormalization Group in Quantum Field Theory, Physical Review Letters128, 081603 (2022)
work page 2022
-
[19]
A. Sheshmani, Y.-Z. You, W. Fu, and A. Azizi, Categorical representation learning and RG flow operators for algorithmic classifiers, Machine Learning: Science and Technology4, 015012 (2023). 13
work page 2023
-
[20]
D. Di Sante, M. Medvidovi´ c, A. Toschi, G. Sangiovanni, C. Franchini, A. M. Sengupta, and A. J. Millis, Deep Learning the Functional Renormalization Group, Physical Review Letters129, 136402 (2022)
work page 2022
-
[21]
A. Ueda and M. Oshikawa, Finite-size and finite bond dimension effects of tensor network renormalization, Physical Review B108, 024413 (2023)
work page 2023
-
[22]
W. Hou and Y.-Z. You, Machine learning renormalization group for statistical physics, Machine Learning: Science and Technology4, 045010 (2023)
work page 2023
-
[23]
L. P. Kadanoff and A. Houghton, Numerical evaluations of the critical properties of the two-dimensional ising model, Phys. Rev. B11, 377 (1975)
work page 1975
-
[24]
K. G. Wilson, The renormalization group: Critical phenomena and the kondo problem, Rev. Mod. Phys.47, 773 (1975)
work page 1975
-
[25]
A. A. Migdal, Phase transformations in gauge and spin lattice systems, JETP42, 743 (1975)
work page 1975
-
[26]
A. A. Migdal, Recursion equations in gauge field theories, JETP42, 413 (1975)
work page 1975
-
[27]
L. P. Kadanoff, Notes on Migdal’s recursion formulas, Annals of Physics100, 359 (1976)
work page 1976
-
[28]
Exact holographic mapping and emergent space-time geometry
X.-L. Qi, Exact holographic mapping and emergent space-time geometry (2013), arxiv:1309.6282 [cond-mat, physics:hep-th, physics:quant-ph]
work page internal anchor Pith review Pith/arXiv arXiv 2013
-
[29]
C. H. Lee and X.-L. Qi, Exact holographic mapping in free fermion systems, Physical Review B93, 035112 (2016)
work page 2016
-
[30]
Y. Gu, C. H. Lee, X. Wen, G. Y. Cho, S. Ryu, and X.-L. Qi, Holographic duality between (2 + 1)-dimensional quantum anomalous Hall state and (3 + 1)-dimensional topological insulators, Physical Review B94, 125107 (2016)
work page 2016
-
[31]
D. J. Rezende and S. Mohamed, Variational inference with normalizing flows, inProceedings of the 32nd International Conference on International Conference on Machine Learning - Volume 37, ICML’15 (JMLR.org, Lille, France, 2015) pp. 1530–1538
work page 2015
-
[32]
G. Papamakarios, E. Nalisnick, D. J. Rezende, S. Mohamed, and B. Lakshminarayanan, Normalizing flows for probabilistic modeling and inference, The Journal of Machine Learning Research22, 57:2617 (2021)
work page 2021
-
[33]
I. Kobyzev, S. J. Prince, and M. A. Brubaker, Normalizing Flows: An Introduction and Review of Current Methods, IEEE Transactions on Pattern Analysis and Machine Intelligence43, 3964 (2021)
work page 2021
-
[34]
L. Dinh, J. Sohl-Dickstein, and S. Bengio, Density estimation using Real NVP (2017), arxiv:1605.08803 [cs, stat]
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[35]
K. G. Wilson and J. Kogut, The renormalization group and theϵexpansion, Physics Reports12, 75 (1974)
work page 1974
-
[36]
M. E. Fisher, The renormalization group in the theory of critical behavior, Rev. Mod. Phys.46, 597 (1974)
work page 1974
-
[37]
W. Loinaz and R. S. Willey, Monte Carlo simulation calculation of the critical coupling constant for two-dimensional continuumϕ 4 theory, Physical Review D58, 076003 (1998)
work page 1998
-
[38]
T. Sugihara, Density matrix renormalization group in a two-dimensionalλϕ 4 hamiltonian lattice model, Journal of High Energy Physics2004, 007 (2004)
work page 2004
-
[39]
A. K. De, A. Harindranath, J. Maiti, and T. Sinha, Investigations in 1 + 1 dimensional latticeϕ 4 theory, Phys. Rev. D72, 094503 (2005)
work page 2005
-
[40]
D. Schaich and W. Loinaz, Improved lattice measurement of the critical coupling inϕ 4 2 theory, Physical Review D79, 056008 (2009)
work page 2009
-
[41]
A. Milsted, J. Haegeman, and T. J. Osborne, Matrix product states and variational methods applied to critical quantum field theory, Physical Review D88, 085030 (2013)
work page 2013
-
[42]
S. Rychkov and L. G. Vitale, Hamiltonian truncation study of theφ 4 theory in two dimensions, Phys. Rev. D91, 085011 (2015)
work page 2015
- [43]
-
[44]
C. Delcamp and A. Tilloy, Computing the renormalization group flow of two-dimensionalϕ 4 theory with tensor networks, Phys. Rev. Res.2, 033278 (2020)
work page 2020
-
[45]
Shankar,Quantum Field Theory and Condensed Matter: An Introduction(Cambridge University Press, 2017)
R. Shankar,Quantum Field Theory and Condensed Matter: An Introduction(Cambridge University Press, 2017)
work page 2017
-
[46]
D. Kingma and J. Ba, Adam: A method for stochastic optimization, inInternational Conference on Learning Representa- tions (ICLR)(San Diego, CA, USA, 2015)
work page 2015
-
[47]
T. F. Coleman and Y. Li, On the convergence of interior-reflective Newton methods for nonlinear minimization subject to bounds, Mathematical Programming67, 189 (1994)
work page 1994
-
[48]
T. F. Coleman and Y. Li, An Interior Trust Region Approach for Nonlinear Minimization Subject to Bounds, SIAM Journal on Optimization6, 418 (1996)
work page 1996
-
[49]
J. Cardy,Scaling and Renormalization in Statistical Physics, Cambridge Lecture Notes in Physics (Cambridge University Press, Cambridge, 1996)
work page 1996
-
[50]
R. T. Q. Chen, Y. Rubanova, J. Bettencourt, and D. Duvenaud, Neural Ordinary Differential Equations (2019), arxiv:1806.07366 [cs, stat]
work page internal anchor Pith review Pith/arXiv arXiv 2019
-
[51]
FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models
W. Grathwohl, R. T. Q. Chen, J. Bettencourt, I. Sutskever, and D. Duvenaud, FFJORD: Free-form Continuous Dynamics for Scalable Reversible Generative Models (2018), arxiv:1810.01367 [cs, stat]
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[52]
T. S. Cohen and M. Welling, Group Equivariant Convolutional Networks (2016), arxiv:1602.07576 [cs, stat]
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[53]
R. Kondor and S. Trivedi, On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups (2018), arxiv:1802.03690 [cs, stat]
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[54]
3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data
M. Weiler, M. Geiger, M. Welling, W. Boomsma, and T. Cohen, 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data (2018), arxiv:1807.02547 [cs, stat]
work page internal anchor Pith review Pith/arXiv arXiv 2018
- [55]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.