pith. machine review for the scientific record. sign in

arxiv: 2604.22208 · v1 · submitted 2026-04-24 · 🧮 math.NA · cs.NA

Recognition: unknown

Finite Expression Method with TranNet-based Function Learning for High-Dimensional Partial Differential Equations

Authors on Pith no claims yet

Pith reviewed 2026-05-08 10:49 UTC · model grok-4.3

classification 🧮 math.NA cs.NA
keywords finite expression methodhigh-dimensional PDEsTransNetshallow neural networksmachine learning solverscurse of dimensionalitynumerical PDE methodsfunction approximation
0
0 comments X

The pith

Shallow neural network operators initialized by TransNet extend the finite expression method to solve high-dimensional PDEs effectively.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends the finite expression method for approximating solutions to partial differential equations by generating its functional pool from shallow neural network operators whose parameters are set using the TransNet initialization. This targets the curse of dimensionality that limits classical numerical methods on high-dimensional problems. The approach retains the original method's reported strengths of high accuracy and polynomial memory complexity. Numerical experiments on several PDEs indicate the extension works as an effective alternative. A reader would care if the hybrid keeps practical scalability where traditional grids or bases fail.

Core claim

The finite expression method approximates PDE solutions in a space of finitely many analytic expressions and has shown high accuracy with polynomial memory use; the extension replaces or augments the expression generation step with shallow neural network operators whose parameters are initialized via TransNet, and experiments on multiple high-dimensional PDEs confirm this produces an effective solver.

What carries the argument

The finite expression method (FEX) functional pool, now generated by TransNet-initialized shallow neural network operators.

If this is right

  • High-dimensional PDEs become solvable with accuracy levels previously limited to low-dimensional cases.
  • Memory requirements stay polynomial in the problem dimension instead of exponential.
  • Computational costs remain favorable relative to grid-based or basis-expansion methods.
  • The same framework can serve as an alternative for a range of PDE problems without needing hand-crafted analytic expressions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The TransNet initialization may reduce the need for problem-specific tuning of the functional pool across different PDE types.
  • This learned-pool approach could be combined with other neural training schedules to handle time-dependent or nonlinear high-dimensional problems.
  • If the initialization reliably spans useful function spaces, similar transferable-network ideas might apply to other expression-based or basis-adaptive solvers.

Load-bearing premise

Initializing shallow neural network operators with TransNet yields a functional pool that achieves high accuracy while keeping memory complexity polynomial for high-dimensional PDEs.

What would settle it

A high-dimensional PDE test case in which the method produces only low accuracy or shows memory usage that grows exponentially with dimension would disprove the effectiveness of the extension.

Figures

Figures reproduced from arXiv: 2604.22208 by Ahmed Zytoon, Feng Bao, Haizhao Yang, Phuoc-Toan Huynh.

Figure 1
Figure 1. Figure 1: Examples of binary trees of depths 1, 2, and 3, respectively. In each binary tree, every node contains either a unary or a binary operator. For trees with depth greater than 1, the computation is carried out recursively. This figure is adapted from [35, view at source ↗
Figure 2
Figure 2. Figure 2: [60D Poisson] Heatmaps of the reference solution on two-dimensional slices, with the remaining 58 dimensions fixed at predefined values. (First) Dimensions (22, 37). (Second) Dimensions (30, 35). (Third) Dimensions (41, 18) view at source ↗
Figure 3
Figure 3. Figure 3: [60D Poisson] Heatmaps of the predicted solution with pool P1 (first row) and pool P2 (second row). (First column) Dimensions (22, 37). (Second column) Dimensions (30, 35). (Third column) Dimensions (41, 18) view at source ↗
Figure 4
Figure 4. Figure 4: [60D Poisson] Heatmaps of absolute pointwise error with pool P1 (first row) and pool P2 (second row). (First column) Dimensions (22, 37). (Second column) Dimensions (30, 35). (Third column) Dimensions (41, 18). The true solution on three selected pairs of dimensions, with resolution 200 × 200, is displayed in view at source ↗
Figure 5
Figure 5. Figure 5: [60D Reaction-diffusion] Heatmaps of the reference solution on two-dimensional slices, with the remaining 58 dimensions fixed at predefined values. (First) Dimensions (17, 33). (Second) Dimensions (21, 56). (Third) Dimensions (52, 19). only a moderately accurate approximation when this candidate is excluded. More specifi￾cally, when the pool P2 is used, FEX constructs the predicted solution from a combinat… view at source ↗
Figure 6
Figure 6. Figure 6: [60D Reaction-diffusion] Heatmaps of the predicted solution with pool P1 in (24) andwith pool P2 in (25). (First column) Dimensions (17, 33). (Second column) Dimensions (21, 56). (Third column) Dimensions (52, 19) view at source ↗
Figure 7
Figure 7. Figure 7: [60D Reaction-diffusion] Heatmaps of absolute pointwise error with pool P1 in (24) andwith pool P2 in (25). (First column) Dimensions (17, 33). (Second column) Dimensions (21, 56). (Third column) Dimensions (52, 19). We further compute the absolute pointwise errors for both approximate solutions and 14 view at source ↗
Figure 8
Figure 8. Figure 8: [55D Semilinear Elliptic] Heatmaps of the reference solution, predicted solution, and the corre￾sponding absolute relative error on two-dimensional slices, with the remaining 58 dimensions fixed at prede￾fined values. (First row) Dimensions (27, 32). (Second row) Dimensions (30, 35). (Third row) Dimensions (30, 35). We plot the true solution and the predicted solution along selected pairs of dimensions, wi… view at source ↗
read the original abstract

In this paper, we study a machine-learning-based solver for high-dimensional partial differential equations (PDEs). Computing accurate solutions efficiently for such problems remains challenging because of the curse of dimensionality, which severely limits the scalability of classical numerical methods. Our approach builds on the recently developed finite expression method (FEX), which approximates PDE solutions in a function space generated by finitely many analytic expressions. This framework has been shown to achieve high, and in some cases machine-level, accuracy with polynomial memory complexity and favorable computational cost. We propose an extension of FEX in which the functional pool is generated by shallow neural network operators whose parameters are initialized using the transferable neural network method TransNet. Numerical experiments suggest that the proposed extension is an effective alternative for solving several high-dimensional PDEs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper extends the finite expression method (FEX) for high-dimensional PDEs by generating the functional pool via shallow neural network operators initialized with the TransNet method. Numerical experiments are cited to suggest that this extension is an effective alternative for solving several high-dimensional PDEs while aiming to preserve high accuracy and polynomial memory complexity.

Significance. If the experimental support can be strengthened with quantitative details, the work could provide a useful bridge between analytic expression-based solvers and neural network flexibility for high-dimensional PDEs. It builds on the established FEX framework's strengths in accuracy and scaling, with the TransNet initialization as a targeted extension. The modest claim level makes the contribution potentially incremental but worthwhile in numerical analysis.

major comments (2)
  1. Abstract: The effectiveness claim rests entirely on numerical experiments, yet the text provides no quantitative metrics, error bars, baseline comparisons, or details on data selection and setup. This renders the central claim unverifiable at the stated level of support.
  2. Numerical experiments: No specific accuracy values, memory scaling measurements, or comparisons to standard FEX or other high-dimensional solvers (e.g., PINNs) are reported, which is load-bearing for validating that the TranNet-initialized pool delivers the promised effectiveness and complexity properties.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and valuable suggestions. We address each of the major comments below and have made revisions to the manuscript to incorporate additional quantitative details and clarifications.

read point-by-point responses
  1. Referee: Abstract: The effectiveness claim rests entirely on numerical experiments, yet the text provides no quantitative metrics, error bars, baseline comparisons, or details on data selection and setup. This renders the central claim unverifiable at the stated level of support.

    Authors: We agree with the referee that the abstract should provide more concrete support for the effectiveness claim. In the revised version, we have included specific quantitative metrics from the numerical experiments, such as achieved accuracy levels and comparisons to baseline methods, along with brief details on the setup. This makes the claim more verifiable while maintaining the abstract's conciseness. revision: yes

  2. Referee: Numerical experiments: No specific accuracy values, memory scaling measurements, or comparisons to standard FEX or other high-dimensional solvers (e.g., PINNs) are reported, which is load-bearing for validating that the TranNet-initialized pool delivers the promised effectiveness and complexity properties.

    Authors: We acknowledge that the original numerical experiments section did not include sufficient specific values or direct comparisons. We have revised this section to report detailed accuracy values (e.g., relative L2 errors), memory usage scaling with dimension, and comparisons against standard FEX and PINN solvers. Multiple runs with error bars are now presented to demonstrate robustness. These additions directly validate the benefits of the TranNet-based initialization in terms of accuracy and complexity. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper proposes an extension of the finite expression method (FEX) by replacing the functional pool with shallow neural network operators initialized via TransNet, then validates the approach through new numerical experiments on high-dimensional PDEs. The central claim is modest and empirical ('effective alternative'), resting directly on reported experiments rather than any derivation that reduces by construction to fitted parameters, self-definitions, or load-bearing self-citations. Prior FEX work is cited as background but is not invoked to force the new result; the experiments provide independent support. No steps match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the prior FEX framework and the TransNet method; the abstract introduces no new free parameters, invented entities, or ad-hoc axioms beyond the domain assumption that neural operators can usefully generate the required analytic expressions.

axioms (1)
  • domain assumption Shallow neural network operators initialized by TransNet can generate a functional pool that approximates high-dimensional PDE solutions with high accuracy and polynomial memory cost.
    This assumption underpins the proposed extension of FEX.

pith-pipeline@v0.9.0 · 5438 in / 1140 out tokens · 81959 ms · 2026-05-08T10:49:17.063653+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

65 extracted references · 6 canonical work pages · 1 internal anchor

  1. [1]

    J. H. Adler , Hans De Sterck, S. MacLachlan, and L. N. Olson. Numerical Partial Differ- ential Equations. Society for Industrial and Applied Mathematics, 2024

  2. [2]

    Archibald, F

    R. Archibald, F . Bao, Y . Cao, and H. Sun. Numerical analysis f or convergence of a sample-wise backpropagation method for training stochast ic neural networks. SIAM Journal on Numerical Analysis , 62(2):593–621, 2024

  3. [3]

    Archibald, F

    R. Archibald, F . Bao, Y . Cao, and H. Zhang. A backward sde meth od for uncer- tainty quantification in deep learning. Discrete and Continuous Dynamical Systems - S, 15(10):2807–2835, 2022

  4. [4]

    Arora, Amitabh Basu, Poorya Mianjy , and Anirbit Mukherje e

    R. Arora, Amitabh Basu, Poorya Mianjy , and Anirbit Mukherje e. Understanding deep neural networks with rectified linear units. Electron. Colloquium Comput. Complex. , TR17, 2016

  5. [5]

    Nonlinear Programming: Analysis and Methods

    Mordecai Avriel. Nonlinear Programming: Analysis and Methods . Dover Publications, 2003

  6. [6]

    Bahmani, I

    B. Bahmani, I. G. Kevrekidis, and M. D. Shields. Neural chaos : A spectral stochastic neural operator . Journal of Computational Physics , 539:114233, 2025

  7. [7]

    Baranek and P

    M. Baranek and P . Przybyłowicz. Stpinns - deep learning framework for approximation of stochastic differential equations, 2026

  8. [8]

    The convergence rate of neural networks for learned functions of different frequen cies

    Ronen Basri, David Jacobs, Yoni Kasten, and Shira Kritchman. The convergence rate of neural networks for learned functions of different frequen cies. In Advances in Neural Information Processing Systems , volume 32, 2019

  9. [9]

    Bausback, J

    R. Bausback, J. T ang, Lu Lu, Feng Bao, and P .-T . Huynh. Stochastic operator network: A stochastic maximum principle based approach to operator learning. Journal of Machine Learning, 5(1):71–96, 2026

  10. [10]

    Neural Combinatorial Optimization with Reinforcement Learning

    I. Bello, H. Pham, Q. V . Le, M. Norouzi, and S. Bengio. Neural c ombinatorial optimiza- tion with reinforcement learning. arXiv preprint arXiv:1611.09940 , 2016

  11. [11]

    Bello, B

    I. Bello, B. Zoph, V . V asudevan, and Q. V . Le. Neural optimize r search with reinforce- ment learning. In Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research , pages 459–468. PMLR, 2017

  12. [12]

    Bianco, R

    S. Bianco, R. Cadène, L. Celona, and P . Napoletano. Benchmar k analysis of represen- tative deep neural network architectures. IEEE Access, 6:64270–64277, 2018

  13. [13]

    T owards under- standing the spectral bias of deep learning

    Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, and QuanquanGu. T owards under- standing the spectral bias of deep learning. In Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence , pages 2205–2211. International Joint Con- ferences on Artificial Intelligence Organization, 2021

  14. [14]

    Transfer learning based multi-fide lity physics informed deep neural network

    Souvik Lal Chakraborty . Transfer learning based multi-fide lity physics informed deep neural network. J. Comput. Phys. , 426:109942, 2020

  15. [15]

    F . Chen, J. Huang, C. Wang, and H. Yang. Friedrichs learning: Weak solutions of partial differential equations via deep learning. SIAM Journal of Scientific Computing , 45(3):A1271–A1299, 2023. 17

  16. [16]

    J. Chen, X. Chi, W . E, and Z. Yang. Bridging traditional and ma chine learning-based algorithms for solving pdes: The random feature method. Journal of Machine Learning, pages 268–298, 2022

  17. [17]

    W . C. Cheung, V . T an, and Z. Zhong. A thompson sampling algori thm for cascading bandits. In Proceedings of the Twenty-Second International Conference on Artificial In- telligence and Statistics , volume 89 of Proceedings of Machine Learning Research , pages 438–447. PMLR, 2019

  18. [18]

    J. D. Co-Reyes, Y . Miao, D. Peng, E. Real, Q. V . Le, S. Levine, H . Lee, and A. Faust. Evolving reinforcement learning algorithms. In International Conference on Learning Representations, 2021

  19. [19]

    The Mathematics of Diffusion

    John Crank. The Mathematics of Diffusion . Oxford University Press, 2 edition, 1975

  20. [20]

    Daubechies, R

    I. Daubechies, R. DeV ore, S. Foucart, B. Hanin, and G. Petrov a. Nonlinear approxima- tion and (deep) relu networks. Constr . Approx., 55(1):127–172, 2022

  21. [21]

    Desai, M

    S. Desai, M. Mattheakis, H. Joy , P . Protopapas, and S. Roberts. One-shot transfer learn- ing of physics-informed neural networks, 2022

  22. [22]

    Han, and A

    E., J. Han, and A. Jentzen. Algorithms for solving high dimen sional pdes: From non- linear monte carlo to machine learning. Nonlinearity, 35(1):278–310, 2022

  23. [23]

    Practical Methods of Optimization

    Roger Fletcher . Practical Methods of Optimization . Wiley , 2 edition, 2000

  24. [24]

    C. R. Gin, D.E. Shea, S. L. Brunton, and J. N. Kutz. Deepgreen: deep learning of green’s functions for nonlinear boundary value problems. Scientific Reports , 11(1):1–14, 2021

  25. [25]

    J. Han, A. Jentzen, and E. Weinan. Solving high-dimensional partial differential equations using deep learning. Proceedings of the National Academy of Sciences , 115(34):8505–8510, 2018

  26. [26]

    Complexity of linear regions in deep networks

    Boris Hanin and David Rolnick. Complexity of linear regions in deep networks. In International Conference on Machine Learning , Proceedings of Machine Learning Re- search, pages 2596–2604, 2019

  27. [27]

    Jia and A

    J. Jia and A. R. Benson. Neural jump stochastic differential equations. In Advances in Neural Information Processing Systems 32 , pages 9847–9858. Curran Associates, Inc., 2019

  28. [28]

    Y . Jiao, Y . Lai, X. Lu, F . Wang, J. Z. Yang, and Y . Yang. Deep neu ral networks with ReLU-sine-exponential activations break curse of dimensi onality in approximation on hölder class. SIAM Journal on Mathematical Analysis , 55(4):3635–3649, 2023

  29. [29]

    Y . Khoo, J. Lu, and L. Ying. Solving parametric pde problems w ith artificial neural networks. European Journal of Applied Mathematics , 32(3):421–435, 2021

  30. [30]

    D. P . Kingma and J. Ba. Adam: A method for stochastic optimiza tion. arXiv preprint arXiv:1412.6980, 2014

  31. [31]

    L. Kong, J. Sun, and C. Zhang. Sde-net: Equipping deep neural networks with un- certainty estimates. In Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research , pages 5405–5415. PMLR, 2020. 18

  32. [32]

    I. E. Lagaris, A. Likas, and D. I. Fotiadis. Artificial neural networks for solving ordinary and partial differential equations. IEEE transactions on neural networks , 9(5):987– 1000, 1998

  33. [33]

    Landajuela, B

    M. Landajuela, B. K. Petersen, S. Kim, C. P . Santiago, R. Glatt, N. Mundhenk, J. F . Pettit, and D. Faissol. Discovering symbolic policies with deep rei nforcement learning. In Marina Meila and T ong Zhang, editors,Proceedings of the 38th International Conference on Machine Learning , volume 139 of Proceedings of Machine Learning Research , pages 5979–59...

  34. [34]

    Z. Li, N. B. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattac harya, A. Stuart, and A. Anankumar . Fourier neural operator for parametric parti al differential equations. In International Conference on Learning Representations , 2021

  35. [35]

    Liang and H

    S. Liang and H. Yang. Finite expression method for solving hi gh-dimensional partial differential equations. Journal of Machine Learning Research , 26(138):1–31, 2025

  36. [36]

    Liao and P

    Y . Liao and P . Ming. Deep nitsche method: Deep ritz method wit h essential boundary conditions. Commun. Comput. Phys. , 29(5):1365–1384, 2021

  37. [37]

    H. Liu, K. Simonyan, and Y . Yang. Darts: Differentiable arch itecture search. In Inter- national Conference on Learning Representations , 2019

  38. [38]

    X. Liu, T . Xiao, Si Si, Q. Cao, S. Kumar , and C.-J. Hsieh. Neura l sde: Stabilizing neural ode networks with stochastic noise, 2019. arXiv preprint ar Xiv:1906.02355

  39. [39]

    Y . Liu, S. G. McCalla, and H. Schaeffer . Random feature models for learning interacting dynamical systems. Proceedings of the Royal Society A: Mathematical, Physical a nd Engineering Sciences, 479(2275):20220835, 2023

  40. [40]

    Z. Liu, W . Cai, and Z.-Q. J. Xu. Multi-scale deep neural netwo rk (MscaleDNN) for solving poisson-boltzmann equation in complex domains. Communications in Compu- tational Physics, 28(5):1970–2001, 2020

  41. [41]

    Lu Lu, P . Jin, G. Pang, Z. Zhang, and G. E. Karniadakis. Learni ng nonlinear operators via DeepONet based on the universal approximation theorem o f operators. Nature Machine Intelligence, 3(3):218–229, 2021

  42. [42]

    Mazyavkina, S

    N. Mazyavkina, S. Sviridov , S. Ivanov , and E. Burnaev . Reinf orcement learning for combinatorial optimization: A survey . Computers & Operations Research , 134:105400, 2021

  43. [43]

    Robert N. Miller . Primitive equation models. In Numerical Modeling of Ocean Circula- tion, pages 87–164. Cambridge University Press, 2007

  44. [44]

    Murray-Smith

    David J. Murray-Smith. Modelling and Simulation of Integrated Systems in Engineerin g: Issues of Methodology, Quality, Testing and Application . Elsevier , 2012

  45. [45]

    M. W . M. G. Dissanayake nd N. Phan-Thien. Neural-network-ba sed approximations for solving partial differential equations. Comm. Numer . Methods. Engrg., 10:195–201, 1994

  46. [46]

    B. K. Petersen, M. L. Larma, T . N. Mundhenk, C. P . Santiago, S. K. Kim, and J. T . Kim. Deep symbolic regression: Recovering mathematical ex pressions from data via risk-seeking policy gradients. In International Conference on Learning Representations , 2021. 19

  47. [47]

    Raissi, P

    M. Raissi, P . Perdikaris, and G.E. Karniadakis. Physics-in formed neural networks: A deep learning framework for solving forward and inverse pro blems involving non- linear partial differential equations. Journal of Computational Physics , 378:686–707, 2019

  48. [48]

    Ramachandran, B

    P . Ramachandran, B. Zoph, and Q. V . Le. Searching for activat ion functions. In Inter- national Conference on Learning Representations , 2018

  49. [49]

    D. E. Rumelhart, G. E. Hinton, and R. J. Williams. Learning re presentations by back- propagating errors. Nature, 323(6088):533–536, 1986

  50. [50]

    Z. Shen, H. Yang, and S. Zhang. Deep network with approximation error being recipro- cal of width to power of square root of depth. Neural Computation, 33(4):1005–1036, 2021

  51. [51]

    Sirignano and K

    J. Sirignano and K. Spiliopoulos. Dgm: A deep learning algor ithm for solving partial differential equations. Journal of Computational Physics , 375:1339–1364, 2018

  52. [52]

    Y . Sun, A. Gilbert, and A. T ewari. On the approximation prope rties of random relu features. arXiv preprint arXiv:1810.04374 , 2018

  53. [53]

    R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction . The MIT Press, 2 edition, 2018

  54. [54]

    T ang, R

    J. T ang, R. Bausback, F . Bao, and R. Archibald. Federated learning on stochastic neural networks. Journal of Machine Learning for Modeling and Computing , 6(4):125–150, 2025

  55. [55]

    Navier-Stokes Equations: Theory and Numerical Analysis

    Roger T emam. Navier-Stokes Equations: Theory and Numerical Analysis . AMS Chelsea Publishing, 2001

  56. [56]

    arXiv:1905.09883 , year=

    B. Tzen and M. Raginsky . Neural stochastic differential equ ations: Deep latent gaus- sian models in the diffusion limit. arXiv preprint arXiv:1905.09883 , 2019

  57. [57]

    Weinan, J

    E. Weinan, J. Han, and A. Jentzen. Deep learning-based numer ical methods for high- dimensional parabolic partial differential equations and backward stochastic differen- tial equations. Communications in Mathematics and Statistics , 5(4):349–380, 2017

  58. [58]

    Weinan and B

    E. Weinan and B. Yu. The deep ritz method: A deep learning-bas ed numerical algo- rithm for solving variational problems. Commun. Math. Stat , 6:1–12, 2018

  59. [59]

    Z.-Q. J. Xu, Y . Zhang, T . Luo, Y . Xiao, and Z. Ma. Frequency principle: Fourier analysis sheds light on deep neural networks. Communications in Computational Physics, 28(5), 2020

  60. [60]

    Elementary superexpressive activations

    Dmitry Yarotsky . Elementary superexpressive activations . arXiv preprint arXiv:2102.10911, 2021

  61. [61]

    Y . Zang, G. Bao, X. Y e, and H. Zhou. Weak adversarial networks for high-dimensional partial differential equations. Journal of Computational Physics , 411:109409, 2020

  62. [62]

    Zhang, T

    X. Zhang, T . Cheng, and L. Ju. Implicit form neural network fo r learning scalar hy- perbolic conservation laws. In J. Bruna, J. Hesthaven, and L . Zdeborova, editors, Pro- ceedings of the 2nd Mathematical and Scientific Machine Learn ing Conference , volume 145 of Proceedings of Machine Learning Research , pages 1082–1098. PMLR, 2022

  63. [63]

    Zhang, F

    Z. Zhang, F . Bao, L. Ju, and G. Zhang. Transferable neural net works for partial differ- ential equations. J. Sci. Comput. , 99(2), 2024. 20

  64. [64]

    Zhang, F

    Z. Zhang, F . Bao, and G. Zhang. Improving the expressive powe r of deep neural net- works through integral activation transform. International Journal of Numerical Anal- ysis and Modeling , 21(5):739–763, 2024

  65. [65]

    Yang, and S

    Z.Shen, H. Yang, and S. Zhang. Neural network approximation : Three hidden layers are enough. Neural Networks, 141:160–173, 2021. 21