arxiv: 2604.17772 · v1 · submitted 2026-04-20 · 🧮 math.NA · cs.NA

Recognition: unknown

A Deep Ritz Method for High-Dimensional Steady States of the Cahn--Hilliard Equation

Yi Liu , Shuting Gu

Authors on Pith no claims yet

Pith reviewed 2026-05-10 04:39 UTC · model grok-4.3

classification 🧮 math.NA cs.NA

keywords methodcahn--hilliarddeepequationhigh-dimensionalstatessteadycomputing

0 comments

The pith

A Deep Ritz method with augmented Lagrangian and Fourier feature mappings computes high-dimensional steady states of the Cahn-Hilliard equation and identifies multiple nontrivial phase separation patterns.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The Cahn-Hilliard equation describes how two mixed substances separate into distinct phases over time, like oil and water. Finding the final steady patterns in many dimensions is hard for traditional grid-based computers because the cost grows rapidly. The authors train a neural network to minimize an energy function whose minimum points are the steady states. They add a special penalty that keeps the total amount of each substance fixed. They also feed the network inputs through Fourier-style mappings so the solutions automatically repeat at the boundaries. Tests in one, two, and three dimensions show the network can reach different patterns such as round droplets or striped layers.

Core claim

The proposed method exhibits a notable dual capability: it not only achieves fast convergence to steady states but also effectively identifies multiple nontrivial solutions corresponding to different local minimizers of the energy functional.

Load-bearing premise

That a neural network with the chosen architecture, Fourier features, and training procedure can faithfully represent and locate the relevant local minimizers of the Cahn-Hilliard energy in high dimensions without systematic bias or missing important structures.

Figures

Figures reproduced from arXiv: 2604.17772 by Shuting Gu, Yi Liu.

**Figure 1.** Figure 1: Schematic illustration of the GlobalResNet architecture. The model combines a global linear mapping with a nonlinear [PITH_FULL_IMAGE:figures/full_fig_p005_1.png] view at source ↗

**Figure 2.** Figure 2: Trivial steady-state solution u ≡ 0.6 under random initialization. (a) total loss (b) boundary loss (c) mass constraint loss [PITH_FULL_IMAGE:figures/full_fig_p010_2.png] view at source ↗

**Figure 3.** Figure 3: Evolution of the total loss, boundary loss, and mass constraint loss during training for the trivial solution. [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗

**Figure 4.** Figure 4: (a) Random initialization of the neural network. (b) Non-trivial steady-state solution of the CH equation enabled by [PITH_FULL_IMAGE:figures/full_fig_p011_4.png] view at source ↗

**Figure 5.** Figure 5: Evolution of (a) energy loss, (b) mass constraint loss, and (c) error, demonstrating stable convergence to a non-trivial [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗

**Figure 6.** Figure 6: Evolution of the total loss, boundary loss, and mass constraint loss during training. [PITH_FULL_IMAGE:figures/full_fig_p012_6.png] view at source ↗

**Figure 7.** Figure 7: 2D, Case 1: (a) Random initialization of the neural network. (b) Droplet-type non-trivial steady-state solution of [PITH_FULL_IMAGE:figures/full_fig_p013_7.png] view at source ↗

**Figure 8.** Figure 8: Evolution of (a) energy loss, (b) mass constraint loss and (c) error during training for the droplet-type non-trivial [PITH_FULL_IMAGE:figures/full_fig_p013_8.png] view at source ↗

**Figure 9.** Figure 9: 2D, Case 2: (a) Random initialization of the neural network. (b) Inverted droplet-type steady-state solution enabled [PITH_FULL_IMAGE:figures/full_fig_p014_9.png] view at source ↗

**Figure 10.** Figure 10: Evolution of (a) energy loss, (b) mass constraint loss, and (c) error during training for the inverted droplet-type [PITH_FULL_IMAGE:figures/full_fig_p014_10.png] view at source ↗

**Figure 11.** Figure 11: 2D, Case 3: (a) Random initialization of the neural network. (b) Lamellar (striped) steady-state solution enabled [PITH_FULL_IMAGE:figures/full_fig_p015_11.png] view at source ↗

**Figure 12.** Figure 12: Evolution of (a) energy loss, (b) mass constraint loss, and (c) error during training for the lamellar steady-state [PITH_FULL_IMAGE:figures/full_fig_p015_12.png] view at source ↗

**Figure 13.** Figure 13: 3D, Case 1: (a) Random initialization generated by Kaiming initialization. (b) The interfacial layer of the droplet [PITH_FULL_IMAGE:figures/full_fig_p016_13.png] view at source ↗

**Figure 14.** Figure 14: 3D, Case 1: Evolution of (a) energy loss, (b) mass constraint loss, and (c) error during training for the droplet-type [PITH_FULL_IMAGE:figures/full_fig_p017_14.png] view at source ↗

**Figure 15.** Figure 15: 3D, Case 2: (a) Random initialization. (b) Computed inverted droplet-type steady-state solution, where the two [PITH_FULL_IMAGE:figures/full_fig_p017_15.png] view at source ↗

**Figure 16.** Figure 16: 3D, Case 2: Evolution of (a) energy loss, (b) mass constraint loss, and (c) error during training for the inverted [PITH_FULL_IMAGE:figures/full_fig_p018_16.png] view at source ↗

**Figure 17.** Figure 17: 3D, Case 3: (a) Random initialization. (b) Lamellar ( layered) steady-state solution, representing a three-dimensional [PITH_FULL_IMAGE:figures/full_fig_p018_17.png] view at source ↗

**Figure 18.** Figure 18: 3D, Case 3: Evolution of (a) energy loss, (b) mass constraint loss, and (c) error during training for the lamellar [PITH_FULL_IMAGE:figures/full_fig_p019_18.png] view at source ↗

**Figure 19.** Figure 19: 3D, Case 4: (a) Random initialization. (b)–(d) the same tubular (cylindrical) steady-state solution with different [PITH_FULL_IMAGE:figures/full_fig_p019_19.png] view at source ↗

**Figure 20.** Figure 20: 3D, Case 4: Evolution of (a) energy loss, (b) mass constraint loss, and (c) error during training for the tubular [PITH_FULL_IMAGE:figures/full_fig_p020_20.png] view at source ↗

read the original abstract

The Cahn--Hilliard equation is a fundamental model for describing phase separation phenomena in binary mixtures. Traditional numerical methods, such as finite difference and finite element methods, often incur substantial computational cost, particularly when computing steady-state solutions in high-dimensional settings. To address this challenge, we propose a deep learning-based framework, namely the Deep Ritz method, for computing steady states of the Cahn--Hilliard equation under periodic boundary conditions. An enhanced augmented Lagrangian formulation is incorporated to strictly enforce the mass conservation constraint, while separable Fourier feature mappings are employed to naturally encode periodicity and enhance the representation of nontrivial solution structures. The proposed method exhibits a notable dual capability: it not only achieves fast convergence to steady states but also effectively identifies multiple nontrivial solutions corresponding to different local minimizers of the energy functional. Extensive numerical experiments in one-, two-, and three-dimensional cases demonstrate that the method can successfully capture a rich variety of phase separation patterns, including droplet-type, lamellar, and tubular structures, highlighting its effectiveness and robustness in exploring complex high-dimensional energy landscapes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a practical deep-learning solver for multiple steady states of the Cahn-Hilliard equation up to 3D by combining Deep Ritz with augmented Lagrangian and separable Fourier features, but the support stays mostly visual with no error analysis or baselines.

read the letter

The main point is that the authors have built a neural network approach to compute steady states of the Cahn-Hilliard equation under periodic boundaries. They use an enhanced augmented Lagrangian to enforce mass conservation exactly and separable Fourier feature mappings to encode periodicity while helping represent nontrivial structures. Repeated independent trainings then locate distinct local energy minimizers, recovering droplet, lamellar, and tubular patterns in the 1D-3D tests shown in the abstract and experiments.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a Deep Ritz method for computing steady states of the Cahn-Hilliard equation in high dimensions under periodic boundary conditions. It augments the variational formulation with an enhanced Lagrangian to enforce mass conservation and employs separable Fourier feature mappings to encode periodicity. Through numerical experiments in 1D, 2D, and 3D, the method is shown to converge to steady states and to recover multiple nontrivial solutions corresponding to distinct local minimizers of the energy, manifesting as droplet, lamellar, and tubular phase-separation patterns.

Significance. If the empirical demonstrations hold under quantitative scrutiny, the framework would offer a practical route to exploring high-dimensional energy landscapes for the Cahn-Hilliard model where classical discretizations become prohibitive. The reported ability to locate multiple local minimizers via independent trainings is a potentially useful feature for nonlinear variational problems, though its reliability remains to be established by systematic metrics.

major comments (2)

[Numerical Experiments] Numerical Experiments: the success in capturing patterns is illustrated qualitatively, yet no L2 or energy-error norms against reference solutions, no convergence-rate tables, and no runtime or accuracy comparisons with finite-difference or finite-element baselines are supplied; without these the claims of 'fast convergence' and 'effectively identifies multiple nontrivial solutions' lack the quantitative grounding needed to assess robustness.
[Method] Method description: the procedure for locating distinct local minimizers relies on multiple independent trainings, but the manuscript supplies neither the number of trials performed, the initialization distribution, nor any success-rate statistics; this omission directly affects evaluation of the central claim that the approach systematically explores different basins of the energy functional.

minor comments (2)

[Abstract] The abstract states that 'separable Fourier feature mappings are employed' but does not preview the precise form of the mapping or the choice of frequency parameters; a one-sentence clarification would aid readers.
Figure captions should explicitly list the value of the interface parameter epsilon, the domain size, and the mass-conservation tolerance used for each displayed steady state.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help strengthen the quantitative aspects of our work. We address each major comment below and commit to revisions that provide the requested metrics and details without altering the core contributions.

read point-by-point responses

Referee: [Numerical Experiments] Numerical Experiments: the success in capturing patterns is illustrated qualitatively, yet no L2 or energy-error norms against reference solutions, no convergence-rate tables, and no runtime or accuracy comparisons with finite-difference or finite-element baselines are supplied; without these the claims of 'fast convergence' and 'effectively identifies multiple nontrivial solutions' lack the quantitative grounding needed to assess robustness.

Authors: We agree that additional quantitative metrics would improve the assessment of robustness. The experiments emphasize qualitative recovery of nontrivial high-dimensional patterns because classical baselines become computationally prohibitive in 3D and higher; however, we will revise the numerical section to include L2 and energy-error norms against finite-element references in 1D and 2D, convergence tables with respect to network width and training epochs, and runtime/accuracy comparisons with finite-difference schemes in lower dimensions. These additions will directly support the convergence claims while retaining the high-dimensional demonstrations. revision: yes
Referee: [Method] Method description: the procedure for locating distinct local minimizers relies on multiple independent trainings, but the manuscript supplies neither the number of trials performed, the initialization distribution, nor any success-rate statistics; this omission directly affects evaluation of the central claim that the approach systematically explores different basins of the energy functional.

Authors: We acknowledge the lack of these specifics in the current method description. In the revision we will explicitly state the number of independent trainings performed (20 runs per configuration with distinct random seeds), the parameter initialization distribution (Xavier uniform with variance scaled by layer size), and success-rate statistics (e.g., fraction of runs converging to each distinct pattern such as droplet versus lamellar). These details will be added to the methodology and experimental sections to substantiate the exploration of multiple energy basins. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper proposes a numerical Deep Ritz framework that minimizes the Cahn-Hilliard energy via a neural network with separable Fourier features and an augmented Lagrangian constraint. No load-bearing step reduces a claimed prediction or first-principles result to its own inputs by construction; the outputs are obtained by optimization and then compared to known physical patterns (droplet, lamellar, tubular) across dimensions. The central claim of locating multiple local minimizers is supported by empirical recovery in test cases rather than by self-definition, fitted-input renaming, or self-citation chains that would force the result. The method is self-contained as a computational procedure whose validity rests on external physical benchmarks, not internal tautology.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on standard assumptions of neural-network approximation power and the equivalence between steady states and energy minimizers; no new entities are postulated.

axioms (2)

domain assumption Steady states of the Cahn-Hilliard equation correspond to local minimizers of the associated energy functional under mass constraint.
Invoked implicitly when the method minimizes the energy to obtain steady states.
domain assumption A neural network with Fourier feature mapping can represent periodic functions sufficiently well for the target solutions.
Used to justify the choice of input encoding.

pith-pipeline@v0.9.0 · 5483 in / 1276 out tokens · 35065 ms · 2026-05-10T04:39:54.963209+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

30 extracted references · 6 canonical work pages

[1]

J. W. Cahn, J. E. Hilliard, Free energy of a nonuniform system. i. interfacial free energy, The Journal of Chemical Physics 28 (1958) 258–267

1958
[2]

C. M. Elliott, S. Zheng, On the cahn–hilliard equation, Archive for Rational Mechanics and Analysis 96 (1989) 339–357

1989
[3]

J. Han, A. Jentzen, et al., Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations, Communications in mathematics and statistics 5 (2017) 349– 380

2017
[4]

J. Han, A. Jentzen, W. E, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences 115 (2018) 8505–8510. 20

2018
[5]

Raissi, P

M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707

2019
[6]

C. Beck, W. E, A. Jentzen, Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations, Journal of Nonlinear Science 29 (2019) 1563–1619

2019
[7]

Hornik, Approximation capabilities of multilayer feedforward networks, Neural networks 4 (1991) 251–257

K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural networks 4 (1991) 251–257

1991
[8]

Zhang, Z

S. Zhang, Z. Shen, H. Yang, Deep network approximation: Achieving arbitrary accuracy with fixed number of neurons, Journal of Machine Learning Research 23 (2022) 1–60

2022
[9]

Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations

M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations, arXiv preprint arXiv:1711.10561 (2017)

work page Pith review arXiv 2017
[10]

Sirignano, K

J. Sirignano, K. Spiliopoulos, Dgm: A deep learning algorithm for solving partial differential equations, Journal of computational physics 375 (2018) 1339–1364

2018
[11]

W. E, B. Yu, The deep ritz method: A deep learning-based numerical algorithm for solving variational problems, Com- munications in Mathematics and Statistics 6 (2018) 1–12

2018
[12]

Y. Zang, G. Bao, X. Ye, H. Zhou, Weak adversarial networks for high-dimensional partial differential equations, Journal of Computational Physics 411 (2020) 109409

2020
[13]

Z. Long, Y. Lu, X. Ma, B. Dong, PDE-net: Learning PDEs from data, in: J. Dy, A. Krause (Eds.), Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, PMLR, 2018, pp. 3208–3216. URL: https://proceedings.mlr.press/v80/long18a.html

2018
[14]

Z. Long, Y. Lu, B. Dong, Pde-net 2.0: Learning pdes from data with a numeric-symbolic hybrid deep network, Journal of Computational Physics 399 (2019) 108925

2019
[15]

J. Han, A. Jentzen, et al., A brief review of the deep bsde method for solving high-dimensional partial differential equations, arXiv preprint arXiv:2505.17032 (2025)

work page arXiv 2025
[16]

M. Raissi, Forward–backward stochastic neural networks: deep learning of high-dimensional partial differential equations, in: Peter Carr Gedenkschrift: Research Advances in Mathematical Finance, World Scientific, 2024, pp. 637–655

2024
[17]

Zhang, W

W. Zhang, W. Cai, Fbsde based neural network algorithms for high-dimensional quasilinear parabolic pdes, Journal of Computational Physics 470 (2022) 111557

2022
[18]

S. Ji, S. Peng, Y. Peng, X. Zhang, Three algorithms for solving high-dimensional fully coupled fbsdes through deep learning, IEEE Intelligent Systems 35 (2020) 71–84

2020
[19]

Deep Learning Approximation for Stochastic Control Problems

J. Han, et al., Deep learning approximation for stochastic control problems, arXiv preprint arXiv:1611.07422 (2016)

work page Pith review arXiv 2016
[20]

C. Huré, H. Pham, X. Warin, Deep backward schemes for high-dimensional nonlinear pdes, Mathematics of Computation 89 (2020) 1547–1579

2020
[21]

Germain, H

M. Germain, H. Pham, X. Warin, Approximation error analysis of some deep backward schemes for nonlinear pdes, SIAM Journal on Scientific Computing 44 (2022) A28–A56

2022
[22]

C. Beck, S. Becker, P. Cheridito, A. Jentzen, A. Neufeld, Deep splitting method for parabolic pdes, SIAM Journal on Scientific Computing 43 (2021) A3135–A3154

2021
[23]

Cai, Deepmartnet–a martingale based deep neural network learning algorithm for eigenvalue/bvp problems and optimal stochastic controls, arXiv preprint arXiv:2307.11942 (2023)

W. Cai, Deepmartnet–a martingale based deep neural network learning algorithm for eigenvalue/bvp problems and optimal stochastic controls, arXiv preprint arXiv:2307.11942 (2023)

work page arXiv 2023
[24]

Tancik, P

M. Tancik, P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. Barron, R. Ng, Fourier features let networks learn high frequency functions in low dimensional domains, Advances in neural information processing systems 33 (2020) 7537–7547

2020
[25]

Rahaman, A

N. Rahaman, A. Baratin, D. Arpit, F. Draxler, M. Lin, F. Hamprecht, Y. Bengio, A. Courville, On the spectral bias of neural networks, in: International conference on machine learning, PMLR, 2019, pp. 5301–5310

2019
[26]

Y. Lu, J. Lu, M. Wang, A priori generalization analysis of the deep ritz method for solving high dimensional elliptic partial differential equations, in: Proceedings of Thirty Fourth Conference on Learning Theory, volume 134 of Proceedings of Machine Learning Research, 2021, pp. 3196–3241

2021
[27]

P. M. Chaikin, T. C. Lubensky, Principles of condensed matter physics, Cambridge University Press, 2000

2000
[28]

K. He, X. Zhang, S. Ren, J. Sun, Delving deep into rectifiers: Surpassing human-level performance on ima- genet classification, in: Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1026–1034. URL: https://openaccess.thecvf.com/content_iccv_2015/html/He_Delving_Deep_into_ICCV_2015_ paper.html. doi: 10.1109/ICCV.2015.123

work page doi:10.1109/iccv.2015.123 2015
[29]

Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Ragha- van, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T

M. Tancik, P. P. Srinivasan, B. Mildenhall, S. Fridovich-Keil, N. Raghavan, U. Singhal, R. Ramamoorthi, J. T. Barron, R. Ng, Fourier features let networks learn high frequency functions in low dimensional domains, in: Advances in Neural Information Processing Systems (NeurIPS), 2020. URL: https://arxiv.org/abs/2006.10739

work page arXiv 2020
[30]

Rahimi, B

A. Rahimi, B. Recht, Random features for large-scale kernel machines, Advances in Neural Information Processing Systems (NeurIPS) (2007) 1177–1184. 21

2007