arxiv: 2604.22208 · v1 · submitted 2026-04-24 · 🧮 math.NA · cs.NA

Recognition: unknown

Finite Expression Method with TranNet-based Function Learning for High-Dimensional Partial Differential Equations

Phuoc-Toan Huynh , Feng Bao , Haizhao Yang , Ahmed Zytoon

Authors on Pith no claims yet

Pith reviewed 2026-05-08 10:49 UTC · model grok-4.3

classification 🧮 math.NA cs.NA

keywords finite expression methodhigh-dimensional PDEsTransNetshallow neural networksmachine learning solverscurse of dimensionalitynumerical PDE methodsfunction approximation

0 comments

The pith

Shallow neural network operators initialized by TransNet extend the finite expression method to solve high-dimensional PDEs effectively.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends the finite expression method for approximating solutions to partial differential equations by generating its functional pool from shallow neural network operators whose parameters are set using the TransNet initialization. This targets the curse of dimensionality that limits classical numerical methods on high-dimensional problems. The approach retains the original method's reported strengths of high accuracy and polynomial memory complexity. Numerical experiments on several PDEs indicate the extension works as an effective alternative. A reader would care if the hybrid keeps practical scalability where traditional grids or bases fail.

Core claim

The finite expression method approximates PDE solutions in a space of finitely many analytic expressions and has shown high accuracy with polynomial memory use; the extension replaces or augments the expression generation step with shallow neural network operators whose parameters are initialized via TransNet, and experiments on multiple high-dimensional PDEs confirm this produces an effective solver.

What carries the argument

The finite expression method (FEX) functional pool, now generated by TransNet-initialized shallow neural network operators.

If this is right

High-dimensional PDEs become solvable with accuracy levels previously limited to low-dimensional cases.
Memory requirements stay polynomial in the problem dimension instead of exponential.
Computational costs remain favorable relative to grid-based or basis-expansion methods.
The same framework can serve as an alternative for a range of PDE problems without needing hand-crafted analytic expressions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The TransNet initialization may reduce the need for problem-specific tuning of the functional pool across different PDE types.
This learned-pool approach could be combined with other neural training schedules to handle time-dependent or nonlinear high-dimensional problems.
If the initialization reliably spans useful function spaces, similar transferable-network ideas might apply to other expression-based or basis-adaptive solvers.

Load-bearing premise

Initializing shallow neural network operators with TransNet yields a functional pool that achieves high accuracy while keeping memory complexity polynomial for high-dimensional PDEs.

What would settle it

A high-dimensional PDE test case in which the method produces only low accuracy or shows memory usage that grows exponentially with dimension would disprove the effectiveness of the extension.

Figures

Figures reproduced from arXiv: 2604.22208 by Ahmed Zytoon, Feng Bao, Haizhao Yang, Phuoc-Toan Huynh.

**Figure 1.** Figure 1: Examples of binary trees of depths 1, 2, and 3, respectively. In each binary tree, every node contains either a unary or a binary operator. For trees with depth greater than 1, the computation is carried out recursively. This figure is adapted from [35, view at source ↗

**Figure 2.** Figure 2: [60D Poisson] Heatmaps of the reference solution on two-dimensional slices, with the remaining 58 dimensions fixed at predefined values. (First) Dimensions (22, 37). (Second) Dimensions (30, 35). (Third) Dimensions (41, 18) view at source ↗

**Figure 3.** Figure 3: [60D Poisson] Heatmaps of the predicted solution with pool P1 (first row) and pool P2 (second row). (First column) Dimensions (22, 37). (Second column) Dimensions (30, 35). (Third column) Dimensions (41, 18) view at source ↗

**Figure 4.** Figure 4: [60D Poisson] Heatmaps of absolute pointwise error with pool P1 (first row) and pool P2 (second row). (First column) Dimensions (22, 37). (Second column) Dimensions (30, 35). (Third column) Dimensions (41, 18). The true solution on three selected pairs of dimensions, with resolution 200 × 200, is displayed in view at source ↗

**Figure 5.** Figure 5: [60D Reaction-diffusion] Heatmaps of the reference solution on two-dimensional slices, with the remaining 58 dimensions fixed at predefined values. (First) Dimensions (17, 33). (Second) Dimensions (21, 56). (Third) Dimensions (52, 19). only a moderately accurate approximation when this candidate is excluded. More specifically, when the pool P2 is used, FEX constructs the predicted solution from a combinat… view at source ↗

**Figure 6.** Figure 6: [60D Reaction-diffusion] Heatmaps of the predicted solution with pool P1 in (24) andwith pool P2 in (25). (First column) Dimensions (17, 33). (Second column) Dimensions (21, 56). (Third column) Dimensions (52, 19) view at source ↗

**Figure 7.** Figure 7: [60D Reaction-diffusion] Heatmaps of absolute pointwise error with pool P1 in (24) andwith pool P2 in (25). (First column) Dimensions (17, 33). (Second column) Dimensions (21, 56). (Third column) Dimensions (52, 19). We further compute the absolute pointwise errors for both approximate solutions and 14 view at source ↗

**Figure 8.** Figure 8: [55D Semilinear Elliptic] Heatmaps of the reference solution, predicted solution, and the corresponding absolute relative error on two-dimensional slices, with the remaining 58 dimensions fixed at predefined values. (First row) Dimensions (27, 32). (Second row) Dimensions (30, 35). (Third row) Dimensions (30, 35). We plot the true solution and the predicted solution along selected pairs of dimensions, wi… view at source ↗

read the original abstract

In this paper, we study a machine-learning-based solver for high-dimensional partial differential equations (PDEs). Computing accurate solutions efficiently for such problems remains challenging because of the curse of dimensionality, which severely limits the scalability of classical numerical methods. Our approach builds on the recently developed finite expression method (FEX), which approximates PDE solutions in a function space generated by finitely many analytic expressions. This framework has been shown to achieve high, and in some cases machine-level, accuracy with polynomial memory complexity and favorable computational cost. We propose an extension of FEX in which the functional pool is generated by shallow neural network operators whose parameters are initialized using the transferable neural network method TransNet. Numerical experiments suggest that the proposed extension is an effective alternative for solving several high-dimensional PDEs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper extends FEX by swapping in TransNet-initialized shallow networks for the function pool, and the experiments give initial support but lack the quantitative detail needed to judge the gains.

read the letter

The main takeaway is that the authors have modified the finite expression method by replacing its functional pool with shallow neural network operators initialized using the TransNet technique. This seems intended to make the method more flexible or easier to apply to high-dimensional PDEs without losing the polynomial memory complexity that FEX promises. On the positive side, the work builds directly on the FEX framework, which already has some track record for accurate solutions. By incorporating TransNet, they are leveraging a known method for initializing networks in a way that might transfer well across problems. The numerical experiments cover several high-dimensional PDEs and indicate that the approach can produce solutions, which at least validates the basic idea. If the full paper includes reproducible code or detailed parameter settings, that would be a plus for anyone wanting to build on it. Where it falls short is in the presentation of results. The description relies on the claim that experiments suggest effectiveness, but without specific numbers, error bars, or head-to-head comparisons against the original FEX or competing methods like physics-informed neural networks, it's tough to gauge the real improvement. The assumption that TransNet initialization leads to a capable functional pool is central, but the paper doesn't seem to test variations or show why this initialization is superior to random or other schemes. Minor issues like lack of discussion on computational costs beyond memory could also be addressed. This kind of paper is useful for specialists in numerical methods for PDEs who are exploring machine learning hybrids. A reader familiar with FEX or TransNet would appreciate the targeted extension, while someone new to the area might need more background. I would recommend sending it for peer review. The idea is coherent and the experiments provide initial support, even if they could be more rigorous. Referees could help clarify the contributions and suggest improvements to the validation.

Referee Report

2 major / 0 minor

Summary. The paper extends the finite expression method (FEX) for high-dimensional PDEs by generating the functional pool via shallow neural network operators initialized with the TransNet method. Numerical experiments are cited to suggest that this extension is an effective alternative for solving several high-dimensional PDEs while aiming to preserve high accuracy and polynomial memory complexity.

Significance. If the experimental support can be strengthened with quantitative details, the work could provide a useful bridge between analytic expression-based solvers and neural network flexibility for high-dimensional PDEs. It builds on the established FEX framework's strengths in accuracy and scaling, with the TransNet initialization as a targeted extension. The modest claim level makes the contribution potentially incremental but worthwhile in numerical analysis.

major comments (2)

Abstract: The effectiveness claim rests entirely on numerical experiments, yet the text provides no quantitative metrics, error bars, baseline comparisons, or details on data selection and setup. This renders the central claim unverifiable at the stated level of support.
Numerical experiments: No specific accuracy values, memory scaling measurements, or comparisons to standard FEX or other high-dimensional solvers (e.g., PINNs) are reported, which is load-bearing for validating that the TranNet-initialized pool delivers the promised effectiveness and complexity properties.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and valuable suggestions. We address each of the major comments below and have made revisions to the manuscript to incorporate additional quantitative details and clarifications.

read point-by-point responses

Referee: Abstract: The effectiveness claim rests entirely on numerical experiments, yet the text provides no quantitative metrics, error bars, baseline comparisons, or details on data selection and setup. This renders the central claim unverifiable at the stated level of support.

Authors: We agree with the referee that the abstract should provide more concrete support for the effectiveness claim. In the revised version, we have included specific quantitative metrics from the numerical experiments, such as achieved accuracy levels and comparisons to baseline methods, along with brief details on the setup. This makes the claim more verifiable while maintaining the abstract's conciseness. revision: yes
Referee: Numerical experiments: No specific accuracy values, memory scaling measurements, or comparisons to standard FEX or other high-dimensional solvers (e.g., PINNs) are reported, which is load-bearing for validating that the TranNet-initialized pool delivers the promised effectiveness and complexity properties.

Authors: We acknowledge that the original numerical experiments section did not include sufficient specific values or direct comparisons. We have revised this section to report detailed accuracy values (e.g., relative L2 errors), memory usage scaling with dimension, and comparisons against standard FEX and PINN solvers. Multiple runs with error bars are now presented to demonstrate robustness. These additions directly validate the benefits of the TranNet-based initialization in terms of accuracy and complexity. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper proposes an extension of the finite expression method (FEX) by replacing the functional pool with shallow neural network operators initialized via TransNet, then validates the approach through new numerical experiments on high-dimensional PDEs. The central claim is modest and empirical ('effective alternative'), resting directly on reported experiments rather than any derivation that reduces by construction to fitted parameters, self-definitions, or load-bearing self-citations. Prior FEX work is cited as background but is not invoked to force the new result; the experiments provide independent support. No steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the prior FEX framework and the TransNet method; the abstract introduces no new free parameters, invented entities, or ad-hoc axioms beyond the domain assumption that neural operators can usefully generate the required analytic expressions.

axioms (1)

domain assumption Shallow neural network operators initialized by TransNet can generate a functional pool that approximates high-dimensional PDE solutions with high accuracy and polynomial memory cost.
This assumption underpins the proposed extension of FEX.

pith-pipeline@v0.9.0 · 5438 in / 1140 out tokens · 81959 ms · 2026-05-08T10:49:17.063653+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

65 extracted references · 6 canonical work pages · 1 internal anchor

[1]

J. H. Adler , Hans De Sterck, S. MacLachlan, and L. N. Olson. Numerical Partial Differ- ential Equations. Society for Industrial and Applied Mathematics, 2024

2024
[2]

Archibald, F

R. Archibald, F . Bao, Y . Cao, and H. Sun. Numerical analysis f or convergence of a sample-wise backpropagation method for training stochast ic neural networks. SIAM Journal on Numerical Analysis , 62(2):593–621, 2024

2024
[3]

Archibald, F

R. Archibald, F . Bao, Y . Cao, and H. Zhang. A backward sde meth od for uncer- tainty quantiﬁcation in deep learning. Discrete and Continuous Dynamical Systems - S, 15(10):2807–2835, 2022

2022
[4]

Arora, Amitabh Basu, Poorya Mianjy , and Anirbit Mukherje e

R. Arora, Amitabh Basu, Poorya Mianjy , and Anirbit Mukherje e. Understanding deep neural networks with rectiﬁed linear units. Electron. Colloquium Comput. Complex. , TR17, 2016

2016
[5]

Nonlinear Programming: Analysis and Methods

Mordecai Avriel. Nonlinear Programming: Analysis and Methods . Dover Publications, 2003

2003
[6]

Bahmani, I

B. Bahmani, I. G. Kevrekidis, and M. D. Shields. Neural chaos : A spectral stochastic neural operator . Journal of Computational Physics , 539:114233, 2025

2025
[7]

Baranek and P

M. Baranek and P . Przybyłowicz. Stpinns - deep learning framework for approximation of stochastic differential equations, 2026

2026
[8]

The convergence rate of neural networks for learned functions of different frequen cies

Ronen Basri, David Jacobs, Yoni Kasten, and Shira Kritchman. The convergence rate of neural networks for learned functions of different frequen cies. In Advances in Neural Information Processing Systems , volume 32, 2019

2019
[9]

Bausback, J

R. Bausback, J. T ang, Lu Lu, Feng Bao, and P .-T . Huynh. Stochastic operator network: A stochastic maximum principle based approach to operator learning. Journal of Machine Learning, 5(1):71–96, 2026

2026
[10]

Neural Combinatorial Optimization with Reinforcement Learning

I. Bello, H. Pham, Q. V . Le, M. Norouzi, and S. Bengio. Neural c ombinatorial optimiza- tion with reinforcement learning. arXiv preprint arXiv:1611.09940 , 2016

work page Pith review arXiv 2016
[11]

Bello, B

I. Bello, B. Zoph, V . V asudevan, and Q. V . Le. Neural optimize r search with reinforce- ment learning. In Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research , pages 459–468. PMLR, 2017

2017
[12]

Bianco, R

S. Bianco, R. Cadène, L. Celona, and P . Napoletano. Benchmar k analysis of represen- tative deep neural network architectures. IEEE Access, 6:64270–64277, 2018

2018
[13]

T owards under- standing the spectral bias of deep learning

Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, and QuanquanGu. T owards under- standing the spectral bias of deep learning. In Proceedings of the Thirtieth International Joint Conference on Artiﬁcial Intelligence , pages 2205–2211. International Joint Con- ferences on Artiﬁcial Intelligence Organization, 2021

2021
[14]

Transfer learning based multi-ﬁde lity physics informed deep neural network

Souvik Lal Chakraborty . Transfer learning based multi-ﬁde lity physics informed deep neural network. J. Comput. Phys. , 426:109942, 2020

2020
[15]

F . Chen, J. Huang, C. Wang, and H. Yang. Friedrichs learning: Weak solutions of partial differential equations via deep learning. SIAM Journal of Scientiﬁc Computing , 45(3):A1271–A1299, 2023. 17

2023
[16]

J. Chen, X. Chi, W . E, and Z. Yang. Bridging traditional and ma chine learning-based algorithms for solving pdes: The random feature method. Journal of Machine Learning, pages 268–298, 2022

2022
[17]

W . C. Cheung, V . T an, and Z. Zhong. A thompson sampling algori thm for cascading bandits. In Proceedings of the Twenty-Second International Conference on Artiﬁcial In- telligence and Statistics , volume 89 of Proceedings of Machine Learning Research , pages 438–447. PMLR, 2019

2019
[18]

J. D. Co-Reyes, Y . Miao, D. Peng, E. Real, Q. V . Le, S. Levine, H . Lee, and A. Faust. Evolving reinforcement learning algorithms. In International Conference on Learning Representations, 2021

2021
[19]

The Mathematics of Diffusion

John Crank. The Mathematics of Diffusion . Oxford University Press, 2 edition, 1975

1975
[20]

Daubechies, R

I. Daubechies, R. DeV ore, S. Foucart, B. Hanin, and G. Petrov a. Nonlinear approxima- tion and (deep) relu networks. Constr . Approx., 55(1):127–172, 2022

2022
[21]

Desai, M

S. Desai, M. Mattheakis, H. Joy , P . Protopapas, and S. Roberts. One-shot transfer learn- ing of physics-informed neural networks, 2022

2022
[22]

Han, and A

E., J. Han, and A. Jentzen. Algorithms for solving high dimen sional pdes: From non- linear monte carlo to machine learning. Nonlinearity, 35(1):278–310, 2022

2022
[23]

Practical Methods of Optimization

Roger Fletcher . Practical Methods of Optimization . Wiley , 2 edition, 2000

2000
[24]

C. R. Gin, D.E. Shea, S. L. Brunton, and J. N. Kutz. Deepgreen: deep learning of green’s functions for nonlinear boundary value problems. Scientiﬁc Reports , 11(1):1–14, 2021

2021
[25]

J. Han, A. Jentzen, and E. Weinan. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences , 115(34):8505–8510, 2018

2018
[26]

Complexity of linear regions in deep networks

Boris Hanin and David Rolnick. Complexity of linear regions in deep networks. In International Conference on Machine Learning , Proceedings of Machine Learning Re- search, pages 2596–2604, 2019

2019
[27]

Jia and A

J. Jia and A. R. Benson. Neural jump stochastic differential equations. In Advances in Neural Information Processing Systems 32 , pages 9847–9858. Curran Associates, Inc., 2019

2019
[28]

Y . Jiao, Y . Lai, X. Lu, F . Wang, J. Z. Yang, and Y . Yang. Deep neu ral networks with ReLU-sine-exponential activations break curse of dimensi onality in approximation on hölder class. SIAM Journal on Mathematical Analysis , 55(4):3635–3649, 2023

2023
[29]

Y . Khoo, J. Lu, and L. Ying. Solving parametric pde problems w ith artiﬁcial neural networks. European Journal of Applied Mathematics , 32(3):421–435, 2021

2021
[30]

D. P . Kingma and J. Ba. Adam: A method for stochastic optimiza tion. arXiv preprint arXiv:1412.6980, 2014

work page internal anchor Pith review arXiv 2014
[31]

L. Kong, J. Sun, and C. Zhang. Sde-net: Equipping deep neural networks with un- certainty estimates. In Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research , pages 5405–5415. PMLR, 2020. 18

2020
[32]

I. E. Lagaris, A. Likas, and D. I. Fotiadis. Artiﬁcial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks , 9(5):987– 1000, 1998

1998
[33]

Landajuela, B

M. Landajuela, B. K. Petersen, S. Kim, C. P . Santiago, R. Glatt, N. Mundhenk, J. F . Pettit, and D. Faissol. Discovering symbolic policies with deep rei nforcement learning. In Marina Meila and T ong Zhang, editors,Proceedings of the 38th International Conference on Machine Learning , volume 139 of Proceedings of Machine Learning Research , pages 5979–59...

2021
[34]

Z. Li, N. B. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattac harya, A. Stuart, and A. Anankumar . Fourier neural operator for parametric parti al differential equations. In International Conference on Learning Representations , 2021

2021
[35]

Liang and H

S. Liang and H. Yang. Finite expression method for solving hi gh-dimensional partial differential equations. Journal of Machine Learning Research , 26(138):1–31, 2025

2025
[36]

Liao and P

Y . Liao and P . Ming. Deep nitsche method: Deep ritz method wit h essential boundary conditions. Commun. Comput. Phys. , 29(5):1365–1384, 2021

2021
[37]

H. Liu, K. Simonyan, and Y . Yang. Darts: Differentiable arch itecture search. In Inter- national Conference on Learning Representations , 2019

2019
[38]

X. Liu, T . Xiao, Si Si, Q. Cao, S. Kumar , and C.-J. Hsieh. Neura l sde: Stabilizing neural ode networks with stochastic noise, 2019. arXiv preprint ar Xiv:1906.02355

work page arXiv 2019
[39]

Y . Liu, S. G. McCalla, and H. Schaeffer . Random feature models for learning interacting dynamical systems. Proceedings of the Royal Society A: Mathematical, Physical a nd Engineering Sciences, 479(2275):20220835, 2023

2023
[40]

Z. Liu, W . Cai, and Z.-Q. J. Xu. Multi-scale deep neural netwo rk (MscaleDNN) for solving poisson-boltzmann equation in complex domains. Communications in Compu- tational Physics, 28(5):1970–2001, 2020

1970
[41]

Lu Lu, P . Jin, G. Pang, Z. Zhang, and G. E. Karniadakis. Learni ng nonlinear operators via DeepONet based on the universal approximation theorem o f operators. Nature Machine Intelligence, 3(3):218–229, 2021

2021
[42]

Mazyavkina, S

N. Mazyavkina, S. Sviridov , S. Ivanov , and E. Burnaev . Reinf orcement learning for combinatorial optimization: A survey . Computers & Operations Research , 134:105400, 2021

2021
[43]

Robert N. Miller . Primitive equation models. In Numerical Modeling of Ocean Circula- tion, pages 87–164. Cambridge University Press, 2007

2007
[44]

Murray-Smith

David J. Murray-Smith. Modelling and Simulation of Integrated Systems in Engineerin g: Issues of Methodology, Quality, Testing and Application . Elsevier , 2012

2012
[45]

M. W . M. G. Dissanayake nd N. Phan-Thien. Neural-network-ba sed approximations for solving partial differential equations. Comm. Numer . Methods. Engrg., 10:195–201, 1994

1994
[46]

B. K. Petersen, M. L. Larma, T . N. Mundhenk, C. P . Santiago, S. K. Kim, and J. T . Kim. Deep symbolic regression: Recovering mathematical ex pressions from data via risk-seeking policy gradients. In International Conference on Learning Representations , 2021. 19

2021
[47]

Raissi, P

M. Raissi, P . Perdikaris, and G.E. Karniadakis. Physics-in formed neural networks: A deep learning framework for solving forward and inverse pro blems involving non- linear partial differential equations. Journal of Computational Physics , 378:686–707, 2019

2019
[48]

Ramachandran, B

P . Ramachandran, B. Zoph, and Q. V . Le. Searching for activat ion functions. In Inter- national Conference on Learning Representations , 2018

2018
[49]

D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning re presentations by back- propagating errors. Nature, 323(6088):533–536, 1986

1986
[50]

Z. Shen, H. Yang, and S. Zhang. Deep network with approximation error being recipro- cal of width to power of square root of depth. Neural Computation, 33(4):1005–1036, 2021

2021
[51]

Sirignano and K

J. Sirignano and K. Spiliopoulos. Dgm: A deep learning algor ithm for solving partial differential equations. Journal of Computational Physics , 375:1339–1364, 2018

2018
[52]

Y . Sun, A. Gilbert, and A. T ewari. On the approximation prope rties of random relu features. arXiv preprint arXiv:1810.04374 , 2018

work page arXiv 2018
[53]

R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction . The MIT Press, 2 edition, 2018

2018
[54]

T ang, R

J. T ang, R. Bausback, F . Bao, and R. Archibald. Federated learning on stochastic neural networks. Journal of Machine Learning for Modeling and Computing , 6(4):125–150, 2025

2025
[55]

Navier-Stokes Equations: Theory and Numerical Analysis

Roger T emam. Navier-Stokes Equations: Theory and Numerical Analysis . AMS Chelsea Publishing, 2001

2001
[56]

arXiv:1905.09883 , year=

B. Tzen and M. Raginsky . Neural stochastic differential equ ations: Deep latent gaus- sian models in the diffusion limit. arXiv preprint arXiv:1905.09883 , 2019

work page arXiv 1905
[57]

Weinan, J

E. Weinan, J. Han, and A. Jentzen. Deep learning-based numer ical methods for high- dimensional parabolic partial differential equations and backward stochastic differen- tial equations. Communications in Mathematics and Statistics , 5(4):349–380, 2017

2017
[58]

Weinan and B

E. Weinan and B. Yu. The deep ritz method: A deep learning-bas ed numerical algo- rithm for solving variational problems. Commun. Math. Stat , 6:1–12, 2018

2018
[59]

Z.-Q. J. Xu, Y . Zhang, T . Luo, Y . Xiao, and Z. Ma. Frequency principle: Fourier analysis sheds light on deep neural networks. Communications in Computational Physics, 28(5), 2020

2020
[60]

Elementary superexpressive activations

Dmitry Yarotsky . Elementary superexpressive activations . arXiv preprint arXiv:2102.10911, 2021

work page arXiv 2021
[61]

Y . Zang, G. Bao, X. Y e, and H. Zhou. Weak adversarial networks for high-dimensional partial differential equations. Journal of Computational Physics , 411:109409, 2020

2020
[62]

Zhang, T

X. Zhang, T . Cheng, and L. Ju. Implicit form neural network fo r learning scalar hy- perbolic conservation laws. In J. Bruna, J. Hesthaven, and L . Zdeborova, editors, Pro- ceedings of the 2nd Mathematical and Scientiﬁc Machine Learn ing Conference , volume 145 of Proceedings of Machine Learning Research , pages 1082–1098. PMLR, 2022

2022
[63]

Zhang, F

Z. Zhang, F . Bao, L. Ju, and G. Zhang. Transferable neural net works for partial differ- ential equations. J. Sci. Comput. , 99(2), 2024. 20

2024
[64]

Zhang, F

Z. Zhang, F . Bao, and G. Zhang. Improving the expressive powe r of deep neural net- works through integral activation transform. International Journal of Numerical Anal- ysis and Modeling , 21(5):739–763, 2024

2024
[65]

Yang, and S

Z.Shen, H. Yang, and S. Zhang. Neural network approximation : Three hidden layers are enough. Neural Networks, 141:160–173, 2021. 21

2021