pith. machine review for the scientific record.

arxiv: 2604.08869 · v1 · submitted 2026-04-10 · 🧮 math.NA · cs.NA

Recognition: 2 theorem links · Lean Theorem

Adaptive Randomized Neural Networks with Locally Activation Function: Theory and Algorithm for Solving PDEs

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 17:56 UTC · model grok-4.3

classification 🧮 math.NA · cs.NA
keywords randomized neural networks · approximation theorem · partition of unity · adaptive method · physics-informed neural networks · PDE solving · local regularity · a posteriori error

The pith

Randomized neural networks achieve optimal approximation when the hidden-parameter sampling domain is sized to match the target function's smoothness and the number of neurons.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves an approximation theorem for randomized neural networks whose hidden-layer parameters are drawn uniformly from a single bounded domain. It shows that the domain size needed for the best possible error rates depends directly on how smooth the target function is and on how many neurons the network contains. This theoretical link is used to build an adaptive physics-informed method that refines a partition of unity according to a posteriori error indicators, allowing the network to focus on localized regions of low regularity when solving partial differential equations.
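
To make the object concrete, the following is a minimal sketch, our illustration rather than the paper's code, of a randomized neural network fit in one dimension: hidden parameters are drawn uniformly from a box of half-width R and frozen, and only the outer weights are solved by linear least squares. Reading σ(A_i, b_i) as σ(A_i x + b_i), and choosing tanh, the radius R = 5, and the sine target, are assumptions of ours.

    # A minimal sketch (our illustration, not the paper's code) of a randomized
    # neural network: hidden parameters (A_i, b_i) are sampled uniformly from
    # [-R, R] and frozen; only the outer weights W_i are fit by least squares.
    import numpy as np

    def fit_rann(x, y, n_neurons=200, R=5.0, seed=0):
        rng = np.random.default_rng(seed)
        A = rng.uniform(-R, R, n_neurons)            # hidden weights
        b = rng.uniform(-R, R, n_neurons)            # hidden biases
        Phi = np.tanh(np.outer(x, A) + b)            # frozen random features
        W, *_ = np.linalg.lstsq(Phi, y, rcond=None)  # outer weights only
        return A, b, W

    def eval_rann(x, A, b, W):
        return np.tanh(np.outer(x, A) + b) @ W

    x = np.linspace(-1.0, 1.0, 400)
    y = np.sin(3 * np.pi * x)                        # an illustrative smooth target
    A, b, W = fit_rann(x, y)
    print("max error:", np.max(np.abs(eval_rann(x, A, b, W) - y)))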

Core claim

The authors establish that for networks of the form ∑_i W_i σ(A_i, b_i) with uniform sampling of (A_i, b_i) from a prescribed bounded domain, optimal approximation rates require the domain size to scale with the smoothness of the target function and the network width. They then combine these networks with a partition of unity whose subdomains are refined adaptively by a posteriori error indicators, producing the adaptive PIRaNN scheme that solves PDEs whose solutions have limited local regularity without introducing additional consistency errors.
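
In symbols, and as a schematic reading of the construction rather than the paper's own statement (the patch index $j$ and the notation $\varphi_j$, $u_j$ are ours): the adaptive method approximates $u(x) \approx \sum_j \varphi_j(x)\, u_j(x)$, where $\{\varphi_j\}$ is the partition of unity over the current patches and each local network $u_j(x) = \sum_{i=1}^{N_j} W_i^{(j)} \sigma(A_i^{(j)} x + b_i^{(j)})$ has hidden parameters $(A_i^{(j)}, b_i^{(j)})$ drawn uniformly from a patch-specific bounded domain sized to the local smoothness.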

What carries the argument

The approximation theorem that relates the required size of the uniform sampling domain for hidden parameters in randomized neural networks to the smoothness of the target function and the number of neurons, together with a posteriori error-driven partition-of-unity refinement.

If this is right

  • The adaptive PIRaNN method captures localized low-regularity features in PDE solutions by refining the partition of unity according to a posteriori indicators (a minimal refinement sketch follows this list).
  • The method maintains consistency because the refinement strategy does not add new approximation errors beyond those already controlled by the randomized network.
  • Numerical benchmarks confirm both the theoretical dependence of domain size on smoothness and the practical performance of the adaptive scheme on standard test problems.
  • The approach extends the use of randomized networks from globally smooth to locally irregular PDE solutions while keeping the number of neurons moderate.
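
The refinement loop itself admits a compact sketch. The following assumes a Dörfler-style bulk-marking rule and bisection of marked one-dimensional patches; the error indicator, the marking fraction, and the bisection rule are illustrative choices on our part, not the paper's algorithm, and local RaNNs would be refit on the refined patches afterwards.

    # Our sketch of a posteriori error-driven refinement of 1-D patches, not the
    # paper's algorithm: patches carrying the bulk of the estimated error are
    # bisected; the rest are kept unchanged.
    import numpy as np

    def refine_patches(patches, indicator, theta=0.5, max_patches=32, tol=1e-8):
        """patches: list of (a, b) intervals; indicator(a, b): local error estimate."""
        while len(patches) < max_patches:
            eta = np.array([indicator(a, b) for a, b in patches])
            if eta.max() < tol:
                break
            # Bulk marking: take the fewest patches holding a theta fraction of
            # the total estimated error, largest indicators first.
            order = np.argsort(eta)[::-1]
            marked, acc = set(), 0.0
            for k in order:
                marked.add(k)
                acc += eta[k]
                if acc >= theta * eta.sum():
                    break
            refined = []
            for k, (a, b) in enumerate(patches):
                if k in marked:
                    m = 0.5 * (a + b)
                    refined += [(a, m), (m, b)]      # bisect marked patch
                else:
                    refined.append((a, b))
            patches = refined
        return patches

    # An indicator peaked near x = 0 mimics a localized low-regularity feature.
    indicator = lambda a, b: (b - a) * np.exp(-50.0 * (0.5 * (a + b)) ** 2)
    print(refine_patches([(-1.0, 1.0)], indicator))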

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same domain-size tuning principle could be tested on other randomized approximation schemes outside neural networks to see whether it yields similar rate improvements.
  • Applying the adaptive partition-of-unity idea to time-dependent or high-dimensional PDEs would test whether the error-driven refinement remains computationally efficient as dimension grows.
  • If the load-bearing assumption holds, one could replace the uniform sampling step with other simple distributions and still obtain the same link between domain size and smoothness.

Load-bearing premise

Uniform sampling of hidden-layer parameters from one fixed bounded domain plus error-driven partition-of-unity refinement is enough to resolve localized low-regularity features without creating new consistency errors.

What would settle it

A numerical test that measures whether the observed approximation or PDE-solution error stops improving at the predicted optimal rate once the sampling domain size is deliberately mismatched to the smoothness and neuron count, or once the adaptive refinement is removed.
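
A minimal form of that test, as we would set it up rather than as the paper reports it (the targets, the tanh activation, the width sweep, and the RMSE metric are our assumptions): fix the neuron count, sweep the sampling-domain half-width R, and compare a globally smooth target with one of limited local regularity.

    # Our sketch of the falsification test: for a fixed neuron count, sweep the
    # sampling-domain half-width R and watch where the error bottoms out for
    # targets of different smoothness. All choices here are illustrative.
    import numpy as np

    def rann_rmse(target, R, n_neurons=200, n_pts=400, seed=0):
        rng = np.random.default_rng(seed)
        x = np.linspace(-1.0, 1.0, n_pts)
        y = target(x)
        A = rng.uniform(-R, R, n_neurons)
        b = rng.uniform(-R, R, n_neurons)
        Phi = np.tanh(np.outer(x, A) + b)
        W, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        return np.sqrt(np.mean((Phi @ W - y) ** 2))

    smooth = lambda x: np.sin(2.0 * np.pi * x)   # globally smooth target
    kinked = lambda x: np.abs(x) ** 1.5          # limited regularity at x = 0
    for R in (0.5, 2.0, 8.0, 32.0):
        print(f"R={R:5.1f}  smooth={rann_rmse(smooth, R):.2e}  kinked={rann_rmse(kinked, R):.2e}")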

read the original abstract

This paper establishes an approximation theorem for randomized neural networks (RaNNs) whose hidden-layer parameters are uniformly sampled from a prescribed bounded domain. Our analysis shows that, for RaNNs of the form $\mathop{\sum}_i W_i \sigma(A_i, b_i)$, the size of the sampling domain required to achieve optimal approximation is intrinsically linked to the smoothness of the target function and the number of neurons. Motivated by this theoretical insight, we integrate a partition of unity (PoU) with RaNNs to develop an adaptive physics-informed randomized neural network (PIRaNN) method for solving partial differential equations with limited local regularity. The proposed adaptive strategy refines the PoU based on a posteriori error indicators, enabling the network to efficiently capture localized solution features. Numerical experiments validate the theoretical results and demonstrate the strong approximation capabilities of RaNNs, confirming the effectiveness of the adaptive PIRaNN method on a range of benchmark problems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript establishes an approximation theorem for randomized neural networks (RaNNs) with hidden-layer parameters uniformly sampled from a bounded domain, showing that the required sampling-domain size is linked to the target function's smoothness and the number of neurons. It then proposes an adaptive physics-informed RaNN (PIRaNN) method that integrates a partition of unity (PoU), refines patches via a posteriori error indicators, and solves PDEs with localized low regularity; numerical experiments on benchmark problems are used to support the claims.

Significance. If the approximation theorem holds and the adaptive PoU construction is shown to preserve optimal rates, the work would supply a theoretically motivated adaptive framework for neural solvers of PDEs with singularities or reduced regularity, with the domain-size/smoothness link offering practical guidance for parameter choice. The numerical validation on benchmarks is a positive indicator, but the absence of quantitative rate comparisons limits the assessed impact.

major comments (2)
  1. [§3] §3 (approximation theorem): the result is stated for a single global RaNN with fixed bounded sampling domain; the subsequent adaptive PIRaNN construction assigns independent RaNNs to a posteriori-refined patches with possibly different local sampling domains, yet no error-propagation argument is supplied showing that the sum of local approximation errors remains controlled by the same smoothness-dependent constants.
  2. [§4.2] §4.2 (adaptive PIRaNN algorithm): the claim that the method captures localized low-regularity features 'without introducing new consistency errors' rests on the assumption that PoU weighting and per-patch adaptive sampling commute with the randomization argument; a concrete global error bound or proof sketch verifying that each local RaNN satisfies the theorem hypotheses at every refinement step is required.
minor comments (2)
  1. [Abstract] Abstract: the statement that 'numerical experiments validate the theoretical results' is vague; a single sentence summarizing observed convergence rates or error magnitudes relative to theory would strengthen the claim.
  2. [Notation] Notation: the RaNN form ∑_i W_i σ(A_i, b_i) is introduced without an immediate reminder of the precise definitions of the random matrices A_i and vectors b_i; adding a short parenthetical or reference to the earlier definition would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address the major comments point by point below, agreeing that the connection between the global approximation theorem and the adaptive construction requires explicit justification. We will strengthen the manuscript accordingly.

read point-by-point responses
  1. Referee: [§3] §3 (approximation theorem): the result is stated for a single global RaNN with fixed bounded sampling domain; the subsequent adaptive PIRaNN construction assigns independent RaNNs to a posteriori-refined patches with possibly different local sampling domains, yet no error-propagation argument is supplied showing that the sum of local approximation errors remains controlled by the same smoothness-dependent constants.

    Authors: The referee is correct that Theorem 3.1 is stated for a single global RaNN. The adaptive PIRaNN employs a partition of unity to localize the approximation, with each patch using its own RaNN whose sampling domain is sized according to the local regularity. Because the PoU functions are smooth, non-negative, and sum to one, the global L2 error is bounded by a sum of the local errors (with a multiplicative constant depending only on the PoU). We will insert a new proposition after Theorem 3.1 that makes this propagation explicit, showing that the smoothness-dependent constants from the theorem carry over to each local approximant when the sampling domain is chosen adaptively per patch. This addition will confirm that the global error remains controlled without inflation. revision: yes

  2. Referee: [§4.2] §4.2 (adaptive PIRaNN algorithm): the claim that the method captures localized low-regularity features 'without introducing new consistency errors' rests on the assumption that PoU weighting and per-patch adaptive sampling commute with the randomization argument; a concrete global error bound or proof sketch verifying that each local RaNN satisfies the theorem hypotheses at every refinement step is required.

    Authors: We acknowledge that the current text does not supply a self-contained verification that the local randomization hypotheses remain satisfied after each refinement. The PoU weights are independent of the random parameters, and the adaptive choice of sampling domain is made from a posteriori indicators that estimate local smoothness; thus the local problems continue to meet the hypotheses of the theorem. We will add a short proof sketch in §4.2 that (i) confirms each local RaNN at every step satisfies the uniform-sampling assumption with a domain sized to the local regularity, and (ii) assembles the local bounds into a global a priori error estimate that contains no extra consistency terms arising from the PoU or the adaptation process. This will rigorously justify the claim. revision: yes
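
For concreteness, the propagation step promised in both responses can be sketched in one line (our reading, not the manuscript's forthcoming proposition): with $\sum_j \varphi_j = 1$, $0 \le \varphi_j \le 1$, and each $\varphi_j$ supported in its patch $\omega_j$, one has $u - \sum_j \varphi_j u_j = \sum_j \varphi_j (u - u_j)$; if every point lies in at most $M$ patches, then $\|u - \sum_j \varphi_j u_j\|_{L^2(\Omega)}^2 \le M \sum_j \|u - u_j\|_{L^2(\omega_j)}^2$, where $M$ depends only on the PoU overlap and each local error is controlled by the approximation theorem applied on $\omega_j$ with its own sampling domain.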

Circularity Check

0 steps flagged

No circularity detected; approximation theorem and adaptive construction remain independent of self-referential inputs.

full rationale

The paper first states an approximation theorem for RaNNs that links the required sampling-domain diameter to the target function's smoothness and the number of neurons; this is presented as a derived result from analysis of the form ∑ W_i σ(A_i, b_i) with uniform sampling from a bounded domain. The subsequent adaptive PIRaNN construction with a posteriori PoU refinement is motivated by that theorem but does not redefine any quantity in terms of itself, fit a parameter and relabel it a prediction, or rely on a load-bearing self-citation whose content is unverified. No equation reduces the claimed global error bound to a fitted constant or to the adaptive choice itself by construction. The derivation chain is therefore self-contained and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard approximation-theory assumptions for neural networks and on the existence of reliable a posteriori error indicators for the chosen PDE class; no new free parameters or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption: Uniform sampling of hidden-layer parameters from a bounded domain yields approximation rates governed by the smoothness of the target and the number of neurons.
    Invoked in the statement of the approximation theorem for RaNNs.
  • domain assumption: A posteriori error indicators computed from the current network solution can be used to refine the partition of unity without destroying consistency.
    Required for the adaptive strategy to converge on problems with limited local regularity.

pith-pipeline@v0.9.0 · 5463 in / 1510 out tokens · 31886 ms · 2026-05-10T17:56:46.302938+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

40 extracted references · 12 canonical work pages · 2 internal anchors
