pith. sign in

arxiv: 2605.17740 · v1 · pith:SNF5XMYFnew · submitted 2026-05-18 · 🧮 math.NA · cs.NA· math.OC

Two-scale neural networks for optimal control of linear convection-dominated equations

Pith reviewed 2026-05-20 01:21 UTC · model grok-4.3

classification 🧮 math.NA cs.NAmath.OC
keywords neural networksoptimal controlconvection-diffusion equationssingular perturbationtwo-scale methodnumerical experimentsadjoint method
0
0 comments X

The pith

Two-scale neural networks with separate state and adjoint networks and rescaled features solve optimal control problems for convection-dominated equations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a neural network approach tailored to optimal control problems where the governing equations are convection-dominated convection-diffusion-reaction equations. Standard neural networks struggle when the diffusion coefficient is small because solutions develop sharp layers in different locations for the state and adjoint variables. The method augments inputs with rescaled features that emphasize these layers and uses two separate networks, each centered to match the layer position of its variable. A successive training strategy starts with larger diffusion and reduces it gradually to the target value. This setup is tested on benchmark problems using both direct optimality conditions and a penalized formulation.

Core claim

The central claim is that augmenting spatial inputs with rescaled features and employing separate neural networks for the state and adjoint variables, with centers chosen to align with their respective layer locations, combined with successive training by decreasing the diffusion coefficient, enables effective numerical solution of optimal control problems governed by convection-dominated equations.

What carries the argument

The two-scale neural network architecture that augments the spatial input with rescaled features and uses separate networks for state and adjoint with different center points.

Load-bearing premise

The assumption that suitably chosen center points for the two networks and rescaled features will align with the actual layer locations for both state and adjoint across the range of diffusion coefficients considered.

What would settle it

If numerical tests with very small diffusion coefficients, such as 10 to the power of -8, show that the networks fail to resolve the layers even with adjusted centers, the effectiveness of the two-scale approach would be questioned.

Figures

Figures reproduced from arXiv: 2605.17740 by Marcus Sarkis, Sijing Liu, Yi Zhang, Zhongqiang Zhang.

Figure 1
Figure 1. Figure 1: Deep neural networks Given a boundary value problem (3.2) Dw = g(x) in Ω, Bw = h(x) on ∂Ω, with differential operators D and B, the PINN approximation wθ(x) := Nθ(x) is obtained by minimizing the loss (3.3) Jd(θ) = 1 Nc X Nc k=1 |g(x k r ) − Dwθ(x k r )| 2 + α Nb X Nb k=1 |h(x k b ) − Bwθ(x k b )| 2 , where {x k r } Nc k=1 ⊂ Ω are interior collocation points, {x k b } Nb k=1 ⊂ ∂Ω are boundary collocation p… view at source ↗
Figure 2
Figure 2. Figure 2: Two-scale neural network x x−xc ε ε −1 a 3 1 a 3 2 4. Double Two-scale PINNs for Optimal Control Problems This section presents the PINN methodology for (2.1). The construction is organized around two choices that are consequential for singularly perturbed optimal control systems. The first is architectural: the state and the adjoint, or the state and the control, are represented by separate two-scale netw… view at source ↗
Figure 3
Figure 3. Figure 3: Collocation points for Example 5.1 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 directly and relies on penalty weights to recover the control behavior [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Training loss and L 1 errors comparison for Example 5.1 0 0000 0000 0000 0000 0000 0000 0000 0000  0 0 00 0 0      (a) Training loss comparison 0 0000 0000 0000 0000 0000 0000 0000 0000  0 0      (b) L 1 error com￾parison for the state variable 0 0000 0000 0000 0000 0000 0000 0000 0000   0 0 0 00    !   … view at source ↗
Figure 5
Figure 5. Figure 5: NN predictions vs Exact solutions using (4.4) for Example 5.1 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0000 000 000 000 000 000 000 000 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (d) NN Solution for p 00 0 0 0 0 0 00 0 0 0 0 0… view at source ↗
Figure 6
Figure 6. Figure 6: NN predictions vs Exact solutions using (4.9) for Example 5.1 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 000 00 00 00 00 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 0 00 0 0 0 (d) NN Solution for p = −u 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0… view at source ↗
Figure 7
Figure 7. Figure 7: NN predictions vs Exact solutions using (4.4) for Example 5.1 with ε = 5×10−4 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0000 000 000 000 000 000 00 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 (d) NN Solution for p 00 0 0 0 0 0 00 0 0 0 0 0 00… view at source ↗
Figure 8
Figure 8. Figure 8: Training loss and L 1 errors comparison for Example 5.2 0 0000 0000 0000 0000 0000 0000 0000 0000  0 0 0 00 0 0 0      (a) Training loss comparison 0 0000 0000 0000 0000 0000 0000 0000 0000  0 0      (b) L 1 error com￾parison for the state variable 0 0000 0000 0000 0000 0000 0000 0000 0000   0 0 0    !  … view at source ↗
Figure 9
Figure 9. Figure 9: NN predictions vs Exact solutions using (4.4) for Example 5.2 00 0 0 0 0 0 00 0 0 0 0 0  0 0 00 0 0  (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0  0 0 00 0 0  (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0000 000 000 000 000 000 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 000 00 00 00 00 00 00 (d) NN Solution for p = −u 00 0 0 0 0 0 00 0 0 0… view at source ↗
Figure 10
Figure 10. Figure 10: NN predictions vs Exact solutions using (4.9) for Example 5.2 00 0 0 0 0 0 00 0 0 0 0 0  0 0 00 0 0  (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0  0 0 00 0 0  (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0000 000 000 000 000 000 00 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 00 00 000 00 00 0 00 (d) NN Solution for p = −u 00 0 0 0 0 0 00 0 0… view at source ↗
Figure 11
Figure 11. Figure 11: Comparison between NN predictions and EAFE solutions using (4.4) for Ex￾ample 5.3 00 0 0 0 0 0 [PITH_FULL_IMAGE:figures/full_fig_p016_11.png] view at source ↗
read the original abstract

We propose a two-scale neural network method for optimal control problems governed by convection-dominated convection-diffusion-reaction equations. Building on two-scale architectures developed for singularly perturbed forward problems, we augment the spatial input with suitably rescaled features that become increasingly important as the diffusion coefficient becomes small. The approach employs separate neural networks for the state and adjoint state variables of the optimality system, reflecting the fact that these quantities develop sharp layers in different parts of the domain due to opposite convection fields. By choosing different center points for the two networks, the architecture naturally aligns with the layer location of each variable. We present two formulations of the method, one based on the first-order optimality conditions and another using penalization of the PDE constraint, and combine them with a successive training strategy that gradually decreases the diffusion coefficient toward its target value. Numerical experiments on benchmark problems illustrate the effectiveness and behavior of the proposed approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a two-scale neural network method for optimal control problems governed by linear convection-dominated convection-diffusion-reaction equations. Separate networks are used for the state and adjoint, each augmented with rescaled features that gain importance for small diffusion; distinct center points are chosen to align with the opposing layer locations induced by the convection fields. Two formulations are given (first-order optimality conditions and PDE-constraint penalization), both paired with a successive training schedule that lowers the diffusion coefficient toward the target value. Effectiveness is illustrated by numerical experiments on standard benchmark problems.

Significance. If the reported performance is robust, the work supplies a concrete architectural adaptation of two-scale networks to optimality systems, addressing the fact that state and adjoint develop layers in different subdomains. The successive-training strategy and explicit separation of networks constitute a practical response to the known difficulties of standard PINNs with sharp interior layers. The benchmark results, if reproducible, would constitute useful evidence that the method can resolve the coupled system without requiring mesh adaptation.

major comments (2)
  1. [§3.1] §3.1 (Architecture description): the claim that different center points 'naturally align with the layer location of each variable' is load-bearing for the two-scale advantage, yet the centers appear to be fixed a priori and chosen by hand for the reported benchmarks. Because the true layer positions depend on the unknown optimal control and become increasingly sensitive as the diffusion coefficient decreases, it is unclear whether the same fixed centers remain effective when the control changes or when diffusion is lowered further in the successive-training loop.
  2. [§4] §4 (Numerical experiments): the reported error tables compare the two-scale method only against a standard single-network PINN on the same benchmark set; no ablation is shown that isolates the contribution of the rescaled features versus the choice of centers. Without such controls it is difficult to confirm that the observed improvement stems from the two-scale construction rather than from the successive-training schedule alone.
minor comments (2)
  1. [§2.2] Notation for the rescaled feature functions is introduced without an explicit formula in the main text; a compact definition (perhaps as an equation) would improve readability.
  2. [§3.3] In the penalization formulation, the weighting parameter between the PDE residual and the cost functional is stated to be fixed, but its dependence (or lack thereof) on the diffusion coefficient is not discussed; a brief remark on this choice would clarify robustness.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and valuable suggestions. We address each major comment in detail below and outline the changes we plan to make in the revised manuscript.

read point-by-point responses
  1. Referee: [§3.1] §3.1 (Architecture description): the claim that different center points 'naturally align with the layer location of each variable' is load-bearing for the two-scale advantage, yet the centers appear to be fixed a priori and chosen by hand for the reported benchmarks. Because the true layer positions depend on the unknown optimal control and become increasingly sensitive as the diffusion coefficient decreases, it is unclear whether the same fixed centers remain effective when the control changes or when diffusion is lowered further in the successive-training loop.

    Authors: The convection field is prescribed and fixed in the problem formulation, and the locations of the sharp layers in the state and adjoint variables are determined by the convection direction and the domain boundaries. The optimal control acts as a forcing term that affects the magnitude of the solution but does not shift the layer positions in this linear setting. Therefore, the a priori choice of distinct center points for the state and adjoint networks, aligned with the opposing convection directions, remains valid independently of the specific control. In the successive training procedure, the centers are held fixed as the diffusion coefficient is decreased, and the numerical experiments confirm that the approximation quality is maintained. We will revise Section 3.1 to include a more detailed explanation of this choice and its independence from the control. revision: partial

  2. Referee: [§4] §4 (Numerical experiments): the reported error tables compare the two-scale method only against a standard single-network PINN on the same benchmark set; no ablation is shown that isolates the contribution of the rescaled features versus the choice of centers. Without such controls it is difficult to confirm that the observed improvement stems from the two-scale construction rather than from the successive-training schedule alone.

    Authors: We acknowledge that the current experiments do not include an ablation study to separate the effects of the rescaled features and the distinct centers from the successive training. To address this, we will add new numerical results in the revised version of Section 4. These will include comparisons of the proposed method against ablated versions: one without rescaled features and one using identical centers for both networks, all under the same successive training schedule. This will help demonstrate the specific contributions of the two-scale architecture. revision: yes

Circularity Check

0 steps flagged

Proposed two-scale NN method validated on external benchmarks with no reduction to self-defined inputs

full rationale

The paper introduces a two-scale neural network architecture augmented with rescaled features and separate networks for state and adjoint variables, combined with successive training, to address optimal control problems for convection-dominated equations. It explicitly builds on prior two-scale methods for forward problems and demonstrates effectiveness through numerical experiments on independent benchmark problems. No load-bearing step in the method description or results reduces by construction to a fitted parameter, self-citation chain, or internal definition; the central claims rest on external validation rather than tautological equivalence to the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not enumerate explicit free parameters or axioms; the method implicitly relies on the existence of suitable center points and rescaling functions that capture layer behavior, but these are presented as design choices rather than fitted constants.

pith-pipeline@v0.9.0 · 5688 in / 1134 out tokens · 32687 ms · 2026-05-20T01:21:20.032978+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · 2 internal anchors

  1. [1]

    J. H. Adler, C. Cavanaugh, X. Hu, A. Huang, and N. Trask. A s table mimetic finite-difference method for convection- dominated diffusion equations. SIAM Journal on Scientific Computing , 45(6):A2973–A3000, 2023

  2. [2]

    Ayuso and L

    B. Ayuso and L. D. Marini. Discontinuous Galerkin method s for advection-diffusion-reaction problems. SIAM Journal on Numerical Analysis , 47(2):1391–1420, 2009

  3. [3]

    Barry-Straume, A

    J. Barry-Straume, A. Sarshar, A. A. Popov, and A. Sandu. P hysics-informed neural networks for PDE-constrained opti - mization and control. Communications on Applied Mathematics and Computation , pages 1–24, 2025

  4. [4]

    Bergounioux

    M. Bergounioux. A penalization method for optimal contr ol of elliptic problems with state constraints. SIAM Journal on Control and Optimization , 30(2):305–323, 1992

  5. [5]

    Bradbury, R

    J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, Y. Kat ariya, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. Van- derPlas, S. W anderman-Milne, and Q. Zhang. JAX: composable transformations of Python+NumPy programs, 2018

  6. [6]

    S. C. Brenner, S. Liu, and L.-Y. Sung. A p1 finite element method for a distributed elliptic optimal con trol problem with a general state equation and pointwise state constraints. Computational Methods in Applied Mathematics , 21(4):777–790, 2021

  7. [7]

    S. C. Brenner, S. Liu, and L.-Y. Sung. Multigrid methods f or an elliptic optimal control problem with pointwise state constraints. Results in Applied Mathematics , 17:100356, 2023. 16 SIJING LIU, MARCUS SARKIS, YI ZHANG, AND ZHONGQIANG ZHANG Figure 11. Comparison between NN predictions and EAFE solutions using (4.4) fo r Ex- ample 5.3 0.0 0.2 0.4 0.6 0.8...

  8. [8]

    A. N. Brooks and T. J. Hughes. Streamline upwind/petrov- galerkin formulations for convection dominated flows with p ar- ticular emphasis on the incompressible navier-stokes equa tions. Computer Methods in Applied Mechanics and Engineering , 32(1):199–259, 1982

  9. [9]

    F. Cao, F. Gao, X. Guo, and D. Yuan. Physics-informed neur al networks with parameter asymptotic strategy for learnin g singularly perturbed convection-dominated problem. Computers & Mathematics with Applications , 150:229–242, 2023

  10. [10]

    F. Cao, F. Gao, D. Yuan, and J. Liu. Multistep asymptotic pre-training strategy based on pinns for solving steep boun dary singular perturbation problems. Computer Methods in Applied Mechanics and Engineering , 431:117222, 2024

  11. [11]

    Y. Cao, C. C. So, Y. Dai, S. P. Yung, and J.-M. W ang. Advers arial physics-informed neural networks with hard constrai nts for optimal control of PDEs. Journal of Computational Physics , page 114307, 2025

  12. [12]

    Casas, M

    E. Casas, M. Mateos, and J.-P. Raymond. Error estimates for the numerical approximation of a distributed control pr oblem for the steady-state navier–stokes equations. SIAM Journal on Control and Optimization , 46(3):952–982, 2007

  13. [13]

    G. Chen, W. Hu, J. Shen, J. R. Singler, Y. Zhang, and X. Zhe ng. An HDG method for distributed control of convection diffusion PDEs. Journal of Computational and Applied Mathematics , 343:643–661, 2018

  14. [14]

    Cockburn and C.-W

    B. Cockburn and C.-W. Shu. The local discontinuous gale rkin method for time-dependent convection-diffusion syste ms. SIAM Journal on Numerical Analysis , 35(6):2440–2463, 1998

  15. [15]

    Y. Dai, B. Jin, R. C. Sau, and Z. Zhou. Solving elliptic op timal control problems via neural networks and optimality system. Advances in Computational Mathematics , 51(4):31, 2025

  16. [16]

    Dupret and D

    J.-L. Dupret and D. Hainaut. Deep learning for high-dim ensional continuous-time stochastic optimal control with out explicit solution. Operations Research, 2026

  17. [17]

    Gao and M

    Z. Gao and M. Yang. More consistent accuracy PINN via alt ernating easy-hard training. arXiv:2512.17607, 2025

  18. [18]

    Hinze, R

    M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich. Optimization with PDE constraints . Springer Science & Business Media, 2008

  19. [19]

    Houston, C

    P. Houston, C. Schwab, and E. S¨ uli. Discontinuous hp-fi nite element methods for advection-diffusion-reaction pro blems. SIAM Journal on Numerical Analysis , 39(6):2133–2163, 2002. PINNS-OCP 17

  20. [20]

    T. J. R. Hughes and A. N. Brooks. Multi-dimensional upwi nd scheme with no crosswind diffusion. 1979

  21. [21]

    Jeong, S

    S. Jeong, S. Lee, and S. Liu. A monotone finite element met hod for an elliptic distributed optimal control problem wit h a convection-dominated state equation. arXiv:2510.27167, 2025

  22. [22]

    B. Jin, R. Sau, L. Yin, and Z. Zhou. Solving elliptic opti mal control problems using physics informed neural network s. arXiv:2308.11925, 2023

  23. [23]

    D. P. Kingma and J. Ba. Adam: A method for stochastic opti mization. arXiv:1412.6980, 2017

  24. [24]

    Knobloch and G

    P. Knobloch and G. Lube. Local projection stabilizatio n for advection-diffusion-reaction problems: One-level vs . two-level approach. Applied Numerical Mathematics , 59(12):2891–2907, 2009

  25. [25]

    Leykekhman and M

    D. Leykekhman and M. Heinkenschloss. Local error analy sis of discontinuous Galerkin methods for advection-domin ated elliptic linear-quadratic optimal control problems. SIAM Journal on Numerical Analysis , 50(4):2012–2038, 2012

  26. [26]

    J. L. Lions. Optimal Control of Systems Governed by Partial Differential Equations. Springer, 1971

  27. [27]

    Liu and V

    S. Liu and V. Simoncini. Multigrid preconditioning for discontinuous Galerkin discretizations of an elliptic opt imal control problem with a convection-dominated state equation. Journal of Scientific Computing , 101(3):79, 2024

  28. [28]

    S. Liu, Z. Tan, and Y. Zhang. Discontinuous galerkin met hods for an elliptic optimal control problem with a general s tate equation and pointwise state constraints. Journal of Computational and Applied Mathematics , 437:115494, 2024

  29. [29]

    Liu and J

    S. Liu and J. Zhang. A balancing domain decomposition by constraints preconditioner for a hybridizable discontinu ous Galerkin discretization of an elliptic optimal control pro blem. arXiv:2504.02072, 2025

  30. [30]

    Liu and J

    S. Liu and J. Zhang. Convergence analysis of a balancing domain decomposition method for an elliptic optimal contro l problem with HDG discretizations. ESAIM: Mathematical Modelling and Numerical Analysis , 2026

  31. [31]

    L. Lu, X. Meng, Z. Mao, and G. E. Karniadakis. Deepxde: A d eep learning library for solving differential equations. SIAM Review, 63(1):208–228, 2021

  32. [32]

    M¨ unzer and C

    M. M¨ unzer and C. Bard. A curriculum-training-based st rategy for distributing collocation points during physics -informed neural network training. arXiv:2211.11396, 2022

  33. [33]

    J. Nitsche. ¨Uber ein variationsprinzip zur l¨ osung von dirichlet-problemen bei verwendung von teilr¨ aumen, die keinen randbe- dingungen unterworfen sind. In Abhandlungen aus dem mathematischen Seminar der Universit ¨ at Hamburg, volume 36, pages 9–15. Springer, 1971

  34. [34]

    R. D. Nzoyem Ngueguin, D. A. Barton, and T. Deakin. A comp arison of mesh-free differentiable programming and data- driven strategies for optimal control under pde constraint s. In Proceedings of the SC ’23 Workshops of the International Conference on High Performance Computing, Network, Storag e, and Analysis , SC-W’23, page 21–28, New York, NY, USA, 2023...

  35. [35]

    H.-G. Roos, M. Stynes, and L. Tobiska. Robust numerical methods for singularly perturbed differen tial equations: convection-diffusion-reaction and flow problems . Springer, 2008

  36. [36]

    Tr¨ oltzsch.Optimal Control of Partial Differential Equations: Theory, Methods, and Applications, volume 112

    F. Tr¨ oltzsch.Optimal Control of Partial Differential Equations: Theory, Methods, and Applications, volume 112. American Mathematical Soc., 2010

  37. [37]

    W ang, P

    S. W ang, P. Zhao, Q. Ma, and T. Song. General-kindred phy sics-informed neural network to the solutions of singularl y perturbed differential equations. Physics of Fluids , 36(11), 2024

  38. [38]

    W ang, P

    S. W ang, P. Zhao, and T. Song. Aspinn: An asymptotic stra tegy for solving singularly perturbed differential equatio ns. arXiv:2409.13185, 2024

  39. [39]

    W ang, Y

    X. W ang, Y. Dou, X. Yi, Y. Zhang, X. Li, B. Li, H. Peng, L. W a ng, and K. L. Teo. When optimal control meets neural network: A comprehensive survey. Archives of Computational Methods in Engineering , pages 1–56, 2026

  40. [40]

    W ang, P

    X. W ang, P. Yin, B. Zhang, and C. Yang. AONN-2: An adjoint -oriented neural network method for PDE-constrained shape optimization. Journal of Computational Physics , 513:113160, 2024

  41. [41]

    W ang, C

    Y. W ang, C. Xu, M. Yang, and J. Zhang. Less emphasis on har d regions: curriculum learning of PINNs for singularly perturbed convection-diffusion-reaction problems. East Asian Journal on Applied Mathematics , 14(1):104–123, 2024

  42. [42]

    C. W u, M. Zhu, Q. Tan, Y. Kartha, and L. Lu. A comprehensiv e study of non-adaptive and residual-based adaptive sampling for physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering , 403:115671, 2023

  43. [43]

    Xu and L

    J. Xu and L. Zikatanov. A monotone finite element scheme f or convection-diffusion equations. Mathematics of Computation , 68(228):1429–1446, 1999

  44. [44]

    P. Yin, G. Xiao, K. Tang, and C. Yang. AONN: An adjoint-or iented neural network method for all-at-once solutions of parametric optimal control problems. SIAM Journal on Scientific Computing , 46(1):C127–C153, 2024

  45. [45]

    Two-scale Neural Networks for Singularly Perturbed Dynamical Systems with Multiple Parameters

    Q. Zhuang, T. W ang, R. W anjiku, M. Bani-Yaghoub, and Z. Z hang. Two-scale neural networks for singularly perturbed dynamical systems with multiple parameters. arXiv:2605.02799, 2026

  46. [46]

    Zhuang, C

    Q. Zhuang, C. Z. Yao, Z. Zhang, and G. E. Karniadakis. Two -scale neural networks for partial differential equations w ith small parameters. Communications in Computational Physics , 38(3):603–629, 2025. 18 SIJING LIU, MARCUS SARKIS, YI ZHANG, AND ZHONGQIANG ZHANG Sijing Liu, Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institu...