Two-scale neural networks for optimal control of linear convection-dominated equations

Marcus Sarkis; Sijing Liu; Yi Zhang; Zhongqiang Zhang

arxiv: 2605.17740 · v1 · pith:SNF5XMYFnew · submitted 2026-05-18 · 🧮 math.NA · cs.NA· math.OC

Two-scale neural networks for optimal control of linear convection-dominated equations

Sijing Liu , Marcus Sarkis , Yi Zhang , Zhongqiang Zhang This is my paper

Pith reviewed 2026-05-20 01:21 UTC · model grok-4.3

classification 🧮 math.NA cs.NAmath.OC

keywords neural networksoptimal controlconvection-diffusion equationssingular perturbationtwo-scale methodnumerical experimentsadjoint method

0 comments

The pith

Two-scale neural networks with separate state and adjoint networks and rescaled features solve optimal control problems for convection-dominated equations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a neural network approach tailored to optimal control problems where the governing equations are convection-dominated convection-diffusion-reaction equations. Standard neural networks struggle when the diffusion coefficient is small because solutions develop sharp layers in different locations for the state and adjoint variables. The method augments inputs with rescaled features that emphasize these layers and uses two separate networks, each centered to match the layer position of its variable. A successive training strategy starts with larger diffusion and reduces it gradually to the target value. This setup is tested on benchmark problems using both direct optimality conditions and a penalized formulation.

Core claim

The central claim is that augmenting spatial inputs with rescaled features and employing separate neural networks for the state and adjoint variables, with centers chosen to align with their respective layer locations, combined with successive training by decreasing the diffusion coefficient, enables effective numerical solution of optimal control problems governed by convection-dominated equations.

What carries the argument

The two-scale neural network architecture that augments the spatial input with rescaled features and uses separate networks for state and adjoint with different center points.

Load-bearing premise

The assumption that suitably chosen center points for the two networks and rescaled features will align with the actual layer locations for both state and adjoint across the range of diffusion coefficients considered.

What would settle it

If numerical tests with very small diffusion coefficients, such as 10 to the power of -8, show that the networks fail to resolve the layers even with adjusted centers, the effectiveness of the two-scale approach would be questioned.

Figures

Figures reproduced from arXiv: 2605.17740 by Marcus Sarkis, Sijing Liu, Yi Zhang, Zhongqiang Zhang.

**Figure 1.** Figure 1: Deep neural networks Given a boundary value problem (3.2) Dw = g(x) in Ω, Bw = h(x) on ∂Ω, with differential operators D and B, the PINN approximation wθ(x) := Nθ(x) is obtained by minimizing the loss (3.3) Jd(θ) = 1 Nc X Nc k=1 |g(x k r ) − Dwθ(x k r )| 2 + α Nb X Nb k=1 |h(x k b ) − Bwθ(x k b )| 2 , where {x k r } Nc k=1 ⊂ Ω are interior collocation points, {x k b } Nb k=1 ⊂ ∂Ω are boundary collocation p… view at source ↗

**Figure 2.** Figure 2: Two-scale neural network x x−xc ε ε −1 a 3 1 a 3 2 4. Double Two-scale PINNs for Optimal Control Problems This section presents the PINN methodology for (2.1). The construction is organized around two choices that are consequential for singularly perturbed optimal control systems. The first is architectural: the state and the adjoint, or the state and the control, are represented by separate two-scale netw… view at source ↗

**Figure 3.** Figure 3: Collocation points for Example 5.1 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 directly and relies on penalty weights to recover the control behavior [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗

**Figure 4.** Figure 4: Training loss and L 1 errors comparison for Example 5.1 0 0000 0000 0000 0000 0000 0000 0000 0000 0 0 00 0 0 (a) Training loss comparison 0 0000 0000 0000 0000 0000 0000 0000 0000 0 0 (b) L 1 error comparison for the state variable 0 0000 0000 0000 0000 0000 0000 0000 0000 0 0 0 00 ! … view at source ↗

**Figure 5.** Figure 5: NN predictions vs Exact solutions using (4.4) for Example 5.1 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0000 000 000 000 000 000 000 000 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (d) NN Solution for p 00 0 0 0 0 0 00 0 0 0 0 0… view at source ↗

**Figure 6.** Figure 6: NN predictions vs Exact solutions using (4.9) for Example 5.1 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0 0 (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 000 00 00 00 00 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 0 00 0 0 0 (d) NN Solution for p = −u 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 0… view at source ↗

**Figure 7.** Figure 7: NN predictions vs Exact solutions using (4.4) for Example 5.1 with ε = 5×10−4 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 0 (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0000 000 000 000 000 000 00 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 00 0 0 0 0 (d) NN Solution for p 00 0 0 0 0 0 00 0 0 0 0 0 00… view at source ↗

**Figure 8.** Figure 8: Training loss and L 1 errors comparison for Example 5.2 0 0000 0000 0000 0000 0000 0000 0000 0000 0 0 0 00 0 0 0 (a) Training loss comparison 0 0000 0000 0000 0000 0000 0000 0000 0000 0 0 (b) L 1 error comparison for the state variable 0 0000 0000 0000 0000 0000 0000 0000 0000 0 0 0 ! … view at source ↗

**Figure 9.** Figure 9: NN predictions vs Exact solutions using (4.4) for Example 5.2 00 0 0 0 0 0 00 0 0 0 0 0 0 0 00 0 0 (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0 0 00 0 0 (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0000 000 000 000 000 000 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 000 00 00 00 00 00 00 (d) NN Solution for p = −u 00 0 0 0 0 0 00 0 0 0… view at source ↗

**Figure 10.** Figure 10: NN predictions vs Exact solutions using (4.9) for Example 5.2 00 0 0 0 0 0 00 0 0 0 0 0 0 0 00 0 0 (a) NN Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0 0 00 0 0 (b) Exact Solution for y 00 0 0 0 0 0 00 0 0 0 0 0 0000 000 000 000 000 000 00 (c) Absolute Error for y 00 0 0 0 0 0 00 0 0 0 0 0 00 00 000 00 00 0 00 (d) NN Solution for p = −u 00 0 0 0 0 0 00 0 0… view at source ↗

**Figure 11.** Figure 11: Comparison between NN predictions and EAFE solutions using (4.4) for Example 5.3 00 0 0 0 0 0 [PITH_FULL_IMAGE:figures/full_fig_p016_11.png] view at source ↗

read the original abstract

We propose a two-scale neural network method for optimal control problems governed by convection-dominated convection-diffusion-reaction equations. Building on two-scale architectures developed for singularly perturbed forward problems, we augment the spatial input with suitably rescaled features that become increasingly important as the diffusion coefficient becomes small. The approach employs separate neural networks for the state and adjoint state variables of the optimality system, reflecting the fact that these quantities develop sharp layers in different parts of the domain due to opposite convection fields. By choosing different center points for the two networks, the architecture naturally aligns with the layer location of each variable. We present two formulations of the method, one based on the first-order optimality conditions and another using penalization of the PDE constraint, and combine them with a successive training strategy that gradually decreases the diffusion coefficient toward its target value. Numerical experiments on benchmark problems illustrate the effectiveness and behavior of the proposed approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adapts two-scale NNs to the optimality system with separate state/adjoint networks and successive diffusion reduction, but fixed centers may limit robustness when layers shift.

read the letter

The core contribution is a two-scale neural network setup for optimal control of convection-dominated equations. It uses one network for the state and another for the adjoint, each with its own center point chosen to match the opposing boundary layers that arise from the convection directions. Rescaled features are added to the input, and training proceeds by gradually lowering the diffusion coefficient from larger to target values. Two variants are given, one enforcing the first-order conditions directly and one using penalty on the PDE constraint. Numerical tests on benchmark problems are included to show behavior as diffusion shrinks. This pairing of separate centered networks with the successive schedule is a reasonable extension of earlier two-scale work on forward singularly perturbed problems. The experiments appear to demonstrate that the method can resolve the layers in the tested cases without the usual NN difficulties at small diffusion. The main soft spot is the reliance on manually chosen, fixed center points. Layer locations depend on the unknown optimal control, and they become more sensitive as diffusion decreases further. If the chosen centers do not stay aligned under different controls or smaller diffusion values, the resolution advantage disappears and performance falls back to that of a standard network. The paper does not appear to include adaptive center selection or systematic sensitivity checks on this point. The work is aimed at people already working on numerical methods for PDE-constrained optimization with small diffusion, particularly those open to neural network approaches. Readers who need practical tools for transport-type control problems could find the architecture and training schedule useful. It is solid enough on its own terms to deserve peer review, though referees will likely press on the robustness of the fixed centers and the scope of the error behavior shown.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes a two-scale neural network method for optimal control problems governed by linear convection-dominated convection-diffusion-reaction equations. Separate networks are used for the state and adjoint, each augmented with rescaled features that gain importance for small diffusion; distinct center points are chosen to align with the opposing layer locations induced by the convection fields. Two formulations are given (first-order optimality conditions and PDE-constraint penalization), both paired with a successive training schedule that lowers the diffusion coefficient toward the target value. Effectiveness is illustrated by numerical experiments on standard benchmark problems.

Significance. If the reported performance is robust, the work supplies a concrete architectural adaptation of two-scale networks to optimality systems, addressing the fact that state and adjoint develop layers in different subdomains. The successive-training strategy and explicit separation of networks constitute a practical response to the known difficulties of standard PINNs with sharp interior layers. The benchmark results, if reproducible, would constitute useful evidence that the method can resolve the coupled system without requiring mesh adaptation.

major comments (2)

[§3.1] §3.1 (Architecture description): the claim that different center points 'naturally align with the layer location of each variable' is load-bearing for the two-scale advantage, yet the centers appear to be fixed a priori and chosen by hand for the reported benchmarks. Because the true layer positions depend on the unknown optimal control and become increasingly sensitive as the diffusion coefficient decreases, it is unclear whether the same fixed centers remain effective when the control changes or when diffusion is lowered further in the successive-training loop.
[§4] §4 (Numerical experiments): the reported error tables compare the two-scale method only against a standard single-network PINN on the same benchmark set; no ablation is shown that isolates the contribution of the rescaled features versus the choice of centers. Without such controls it is difficult to confirm that the observed improvement stems from the two-scale construction rather than from the successive-training schedule alone.

minor comments (2)

[§2.2] Notation for the rescaled feature functions is introduced without an explicit formula in the main text; a compact definition (perhaps as an equation) would improve readability.
[§3.3] In the penalization formulation, the weighting parameter between the PDE residual and the cost functional is stated to be fixed, but its dependence (or lack thereof) on the diffusion coefficient is not discussed; a brief remark on this choice would clarify robustness.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and valuable suggestions. We address each major comment in detail below and outline the changes we plan to make in the revised manuscript.

read point-by-point responses

Referee: [§3.1] §3.1 (Architecture description): the claim that different center points 'naturally align with the layer location of each variable' is load-bearing for the two-scale advantage, yet the centers appear to be fixed a priori and chosen by hand for the reported benchmarks. Because the true layer positions depend on the unknown optimal control and become increasingly sensitive as the diffusion coefficient decreases, it is unclear whether the same fixed centers remain effective when the control changes or when diffusion is lowered further in the successive-training loop.

Authors: The convection field is prescribed and fixed in the problem formulation, and the locations of the sharp layers in the state and adjoint variables are determined by the convection direction and the domain boundaries. The optimal control acts as a forcing term that affects the magnitude of the solution but does not shift the layer positions in this linear setting. Therefore, the a priori choice of distinct center points for the state and adjoint networks, aligned with the opposing convection directions, remains valid independently of the specific control. In the successive training procedure, the centers are held fixed as the diffusion coefficient is decreased, and the numerical experiments confirm that the approximation quality is maintained. We will revise Section 3.1 to include a more detailed explanation of this choice and its independence from the control. revision: partial
Referee: [§4] §4 (Numerical experiments): the reported error tables compare the two-scale method only against a standard single-network PINN on the same benchmark set; no ablation is shown that isolates the contribution of the rescaled features versus the choice of centers. Without such controls it is difficult to confirm that the observed improvement stems from the two-scale construction rather than from the successive-training schedule alone.

Authors: We acknowledge that the current experiments do not include an ablation study to separate the effects of the rescaled features and the distinct centers from the successive training. To address this, we will add new numerical results in the revised version of Section 4. These will include comparisons of the proposed method against ablated versions: one without rescaled features and one using identical centers for both networks, all under the same successive training schedule. This will help demonstrate the specific contributions of the two-scale architecture. revision: yes

Circularity Check

0 steps flagged

Proposed two-scale NN method validated on external benchmarks with no reduction to self-defined inputs

full rationale

The paper introduces a two-scale neural network architecture augmented with rescaled features and separate networks for state and adjoint variables, combined with successive training, to address optimal control problems for convection-dominated equations. It explicitly builds on prior two-scale methods for forward problems and demonstrates effectiveness through numerical experiments on independent benchmark problems. No load-bearing step in the method description or results reduces by construction to a fitted parameter, self-citation chain, or internal definition; the central claims rest on external validation rather than tautological equivalence to the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not enumerate explicit free parameters or axioms; the method implicitly relies on the existence of suitable center points and rescaling functions that capture layer behavior, but these are presented as design choices rather than fitted constants.

pith-pipeline@v0.9.0 · 5688 in / 1134 out tokens · 32687 ms · 2026-05-20T01:21:20.032978+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

The two-scale network takes the augmented vector (x, ε^γ(x−xc), ε^γ) … By choosing different center points for the two networks, the architecture naturally aligns with the layer location of each variable.
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

successive training strategy that gradually decreases the diffusion coefficient toward its target value

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · 2 internal anchors

[1]

J. H. Adler, C. Cavanaugh, X. Hu, A. Huang, and N. Trask. A s table mimetic ﬁnite-diﬀerence method for convection- dominated diﬀusion equations. SIAM Journal on Scientiﬁc Computing , 45(6):A2973–A3000, 2023

work page 2023
[2]

Ayuso and L

B. Ayuso and L. D. Marini. Discontinuous Galerkin method s for advection-diﬀusion-reaction problems. SIAM Journal on Numerical Analysis , 47(2):1391–1420, 2009

work page 2009
[3]

Barry-Straume, A

J. Barry-Straume, A. Sarshar, A. A. Popov, and A. Sandu. P hysics-informed neural networks for PDE-constrained opti - mization and control. Communications on Applied Mathematics and Computation , pages 1–24, 2025

work page 2025
[4]

Bergounioux

M. Bergounioux. A penalization method for optimal contr ol of elliptic problems with state constraints. SIAM Journal on Control and Optimization , 30(2):305–323, 1992

work page 1992
[5]

Bradbury, R

J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, Y. Kat ariya, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. Van- derPlas, S. W anderman-Milne, and Q. Zhang. JAX: composable transformations of Python+NumPy programs, 2018

work page 2018
[6]

S. C. Brenner, S. Liu, and L.-Y. Sung. A p1 ﬁnite element method for a distributed elliptic optimal con trol problem with a general state equation and pointwise state constraints. Computational Methods in Applied Mathematics , 21(4):777–790, 2021

work page 2021
[7]

S. C. Brenner, S. Liu, and L.-Y. Sung. Multigrid methods f or an elliptic optimal control problem with pointwise state constraints. Results in Applied Mathematics , 17:100356, 2023. 16 SIJING LIU, MARCUS SARKIS, YI ZHANG, AND ZHONGQIANG ZHANG Figure 11. Comparison between NN predictions and EAFE solutions using (4.4) fo r Ex- ample 5.3 0.0 0.2 0.4 0.6 0.8...

work page 2023
[8]

A. N. Brooks and T. J. Hughes. Streamline upwind/petrov- galerkin formulations for convection dominated ﬂows with p ar- ticular emphasis on the incompressible navier-stokes equa tions. Computer Methods in Applied Mechanics and Engineering , 32(1):199–259, 1982

work page 1982
[9]

F. Cao, F. Gao, X. Guo, and D. Yuan. Physics-informed neur al networks with parameter asymptotic strategy for learnin g singularly perturbed convection-dominated problem. Computers & Mathematics with Applications , 150:229–242, 2023

work page 2023
[10]

F. Cao, F. Gao, D. Yuan, and J. Liu. Multistep asymptotic pre-training strategy based on pinns for solving steep boun dary singular perturbation problems. Computer Methods in Applied Mechanics and Engineering , 431:117222, 2024

work page 2024
[11]

Y. Cao, C. C. So, Y. Dai, S. P. Yung, and J.-M. W ang. Advers arial physics-informed neural networks with hard constrai nts for optimal control of PDEs. Journal of Computational Physics , page 114307, 2025

work page 2025
[12]

Casas, M

E. Casas, M. Mateos, and J.-P. Raymond. Error estimates for the numerical approximation of a distributed control pr oblem for the steady-state navier–stokes equations. SIAM Journal on Control and Optimization , 46(3):952–982, 2007

work page 2007
[13]

G. Chen, W. Hu, J. Shen, J. R. Singler, Y. Zhang, and X. Zhe ng. An HDG method for distributed control of convection diﬀusion PDEs. Journal of Computational and Applied Mathematics , 343:643–661, 2018

work page 2018
[14]

Cockburn and C.-W

B. Cockburn and C.-W. Shu. The local discontinuous gale rkin method for time-dependent convection-diﬀusion syste ms. SIAM Journal on Numerical Analysis , 35(6):2440–2463, 1998

work page 1998
[15]

Y. Dai, B. Jin, R. C. Sau, and Z. Zhou. Solving elliptic op timal control problems via neural networks and optimality system. Advances in Computational Mathematics , 51(4):31, 2025

work page 2025
[16]

Dupret and D

J.-L. Dupret and D. Hainaut. Deep learning for high-dim ensional continuous-time stochastic optimal control with out explicit solution. Operations Research, 2026

work page 2026
[17]

Gao and M

Z. Gao and M. Yang. More consistent accuracy PINN via alt ernating easy-hard training. arXiv:2512.17607, 2025

work page arXiv 2025
[18]

Hinze, R

M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich. Optimization with PDE constraints . Springer Science & Business Media, 2008

work page 2008
[19]

Houston, C

P. Houston, C. Schwab, and E. S¨ uli. Discontinuous hp-ﬁ nite element methods for advection-diﬀusion-reaction pro blems. SIAM Journal on Numerical Analysis , 39(6):2133–2163, 2002. PINNS-OCP 17

work page 2002
[20]

T. J. R. Hughes and A. N. Brooks. Multi-dimensional upwi nd scheme with no crosswind diﬀusion. 1979

work page 1979
[21]

Jeong, S

S. Jeong, S. Lee, and S. Liu. A monotone ﬁnite element met hod for an elliptic distributed optimal control problem wit h a convection-dominated state equation. arXiv:2510.27167, 2025

work page arXiv 2025
[22]

B. Jin, R. Sau, L. Yin, and Z. Zhou. Solving elliptic opti mal control problems using physics informed neural network s. arXiv:2308.11925, 2023

work page arXiv 2023
[23]

D. P. Kingma and J. Ba. Adam: A method for stochastic opti mization. arXiv:1412.6980, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[24]

Knobloch and G

P. Knobloch and G. Lube. Local projection stabilizatio n for advection-diﬀusion-reaction problems: One-level vs . two-level approach. Applied Numerical Mathematics , 59(12):2891–2907, 2009

work page 2009
[25]

Leykekhman and M

D. Leykekhman and M. Heinkenschloss. Local error analy sis of discontinuous Galerkin methods for advection-domin ated elliptic linear-quadratic optimal control problems. SIAM Journal on Numerical Analysis , 50(4):2012–2038, 2012

work page 2012
[26]

J. L. Lions. Optimal Control of Systems Governed by Partial Diﬀerential Equations. Springer, 1971

work page 1971
[27]

Liu and V

S. Liu and V. Simoncini. Multigrid preconditioning for discontinuous Galerkin discretizations of an elliptic opt imal control problem with a convection-dominated state equation. Journal of Scientiﬁc Computing , 101(3):79, 2024

work page 2024
[28]

S. Liu, Z. Tan, and Y. Zhang. Discontinuous galerkin met hods for an elliptic optimal control problem with a general s tate equation and pointwise state constraints. Journal of Computational and Applied Mathematics , 437:115494, 2024

work page 2024
[29]

Liu and J

S. Liu and J. Zhang. A balancing domain decomposition by constraints preconditioner for a hybridizable discontinu ous Galerkin discretization of an elliptic optimal control pro blem. arXiv:2504.02072, 2025

work page arXiv 2025
[30]

Liu and J

S. Liu and J. Zhang. Convergence analysis of a balancing domain decomposition method for an elliptic optimal contro l problem with HDG discretizations. ESAIM: Mathematical Modelling and Numerical Analysis , 2026

work page 2026
[31]

L. Lu, X. Meng, Z. Mao, and G. E. Karniadakis. Deepxde: A d eep learning library for solving diﬀerential equations. SIAM Review, 63(1):208–228, 2021

work page 2021
[32]

M¨ unzer and C

M. M¨ unzer and C. Bard. A curriculum-training-based st rategy for distributing collocation points during physics -informed neural network training. arXiv:2211.11396, 2022

work page arXiv 2022
[33]

J. Nitsche. ¨Uber ein variationsprinzip zur l¨ osung von dirichlet-problemen bei verwendung von teilr¨ aumen, die keinen randbe- dingungen unterworfen sind. In Abhandlungen aus dem mathematischen Seminar der Universit ¨ at Hamburg, volume 36, pages 9–15. Springer, 1971

work page 1971
[34]

R. D. Nzoyem Ngueguin, D. A. Barton, and T. Deakin. A comp arison of mesh-free diﬀerentiable programming and data- driven strategies for optimal control under pde constraint s. In Proceedings of the SC ’23 Workshops of the International Conference on High Performance Computing, Network, Storag e, and Analysis , SC-W’23, page 21–28, New York, NY, USA, 2023...

work page 2023
[35]

H.-G. Roos, M. Stynes, and L. Tobiska. Robust numerical methods for singularly perturbed diﬀeren tial equations: convection-diﬀusion-reaction and ﬂow problems . Springer, 2008

work page 2008
[36]

Tr¨ oltzsch.Optimal Control of Partial Diﬀerential Equations: Theory, Methods, and Applications, volume 112

F. Tr¨ oltzsch.Optimal Control of Partial Diﬀerential Equations: Theory, Methods, and Applications, volume 112. American Mathematical Soc., 2010

work page 2010
[37]

W ang, P

S. W ang, P. Zhao, Q. Ma, and T. Song. General-kindred phy sics-informed neural network to the solutions of singularl y perturbed diﬀerential equations. Physics of Fluids , 36(11), 2024

work page 2024
[38]

W ang, P

S. W ang, P. Zhao, and T. Song. Aspinn: An asymptotic stra tegy for solving singularly perturbed diﬀerential equatio ns. arXiv:2409.13185, 2024

work page arXiv 2024
[39]

W ang, Y

X. W ang, Y. Dou, X. Yi, Y. Zhang, X. Li, B. Li, H. Peng, L. W a ng, and K. L. Teo. When optimal control meets neural network: A comprehensive survey. Archives of Computational Methods in Engineering , pages 1–56, 2026

work page 2026
[40]

W ang, P

X. W ang, P. Yin, B. Zhang, and C. Yang. AONN-2: An adjoint -oriented neural network method for PDE-constrained shape optimization. Journal of Computational Physics , 513:113160, 2024

work page 2024
[41]

W ang, C

Y. W ang, C. Xu, M. Yang, and J. Zhang. Less emphasis on har d regions: curriculum learning of PINNs for singularly perturbed convection-diﬀusion-reaction problems. East Asian Journal on Applied Mathematics , 14(1):104–123, 2024

work page 2024
[42]

C. W u, M. Zhu, Q. Tan, Y. Kartha, and L. Lu. A comprehensiv e study of non-adaptive and residual-based adaptive sampling for physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering , 403:115671, 2023

work page 2023
[43]

Xu and L

J. Xu and L. Zikatanov. A monotone ﬁnite element scheme f or convection-diﬀusion equations. Mathematics of Computation , 68(228):1429–1446, 1999

work page 1999
[44]

P. Yin, G. Xiao, K. Tang, and C. Yang. AONN: An adjoint-or iented neural network method for all-at-once solutions of parametric optimal control problems. SIAM Journal on Scientiﬁc Computing , 46(1):C127–C153, 2024

work page 2024
[45]

Two-scale Neural Networks for Singularly Perturbed Dynamical Systems with Multiple Parameters

Q. Zhuang, T. W ang, R. W anjiku, M. Bani-Yaghoub, and Z. Z hang. Two-scale neural networks for singularly perturbed dynamical systems with multiple parameters. arXiv:2605.02799, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[46]

Zhuang, C

Q. Zhuang, C. Z. Yao, Z. Zhang, and G. E. Karniadakis. Two -scale neural networks for partial diﬀerential equations w ith small parameters. Communications in Computational Physics , 38(3):603–629, 2025. 18 SIJING LIU, MARCUS SARKIS, YI ZHANG, AND ZHONGQIANG ZHANG Sijing Liu, Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institu...

work page 2025

[1] [1]

J. H. Adler, C. Cavanaugh, X. Hu, A. Huang, and N. Trask. A s table mimetic ﬁnite-diﬀerence method for convection- dominated diﬀusion equations. SIAM Journal on Scientiﬁc Computing , 45(6):A2973–A3000, 2023

work page 2023

[2] [2]

Ayuso and L

B. Ayuso and L. D. Marini. Discontinuous Galerkin method s for advection-diﬀusion-reaction problems. SIAM Journal on Numerical Analysis , 47(2):1391–1420, 2009

work page 2009

[3] [3]

Barry-Straume, A

J. Barry-Straume, A. Sarshar, A. A. Popov, and A. Sandu. P hysics-informed neural networks for PDE-constrained opti - mization and control. Communications on Applied Mathematics and Computation , pages 1–24, 2025

work page 2025

[4] [4]

Bergounioux

M. Bergounioux. A penalization method for optimal contr ol of elliptic problems with state constraints. SIAM Journal on Control and Optimization , 30(2):305–323, 1992

work page 1992

[5] [5]

Bradbury, R

J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, Y. Kat ariya, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. Van- derPlas, S. W anderman-Milne, and Q. Zhang. JAX: composable transformations of Python+NumPy programs, 2018

work page 2018

[6] [6]

S. C. Brenner, S. Liu, and L.-Y. Sung. A p1 ﬁnite element method for a distributed elliptic optimal con trol problem with a general state equation and pointwise state constraints. Computational Methods in Applied Mathematics , 21(4):777–790, 2021

work page 2021

[7] [7]

S. C. Brenner, S. Liu, and L.-Y. Sung. Multigrid methods f or an elliptic optimal control problem with pointwise state constraints. Results in Applied Mathematics , 17:100356, 2023. 16 SIJING LIU, MARCUS SARKIS, YI ZHANG, AND ZHONGQIANG ZHANG Figure 11. Comparison between NN predictions and EAFE solutions using (4.4) fo r Ex- ample 5.3 0.0 0.2 0.4 0.6 0.8...

work page 2023

[8] [8]

A. N. Brooks and T. J. Hughes. Streamline upwind/petrov- galerkin formulations for convection dominated ﬂows with p ar- ticular emphasis on the incompressible navier-stokes equa tions. Computer Methods in Applied Mechanics and Engineering , 32(1):199–259, 1982

work page 1982

[9] [9]

F. Cao, F. Gao, X. Guo, and D. Yuan. Physics-informed neur al networks with parameter asymptotic strategy for learnin g singularly perturbed convection-dominated problem. Computers & Mathematics with Applications , 150:229–242, 2023

work page 2023

[10] [10]

F. Cao, F. Gao, D. Yuan, and J. Liu. Multistep asymptotic pre-training strategy based on pinns for solving steep boun dary singular perturbation problems. Computer Methods in Applied Mechanics and Engineering , 431:117222, 2024

work page 2024

[11] [11]

Y. Cao, C. C. So, Y. Dai, S. P. Yung, and J.-M. W ang. Advers arial physics-informed neural networks with hard constrai nts for optimal control of PDEs. Journal of Computational Physics , page 114307, 2025

work page 2025

[12] [12]

Casas, M

E. Casas, M. Mateos, and J.-P. Raymond. Error estimates for the numerical approximation of a distributed control pr oblem for the steady-state navier–stokes equations. SIAM Journal on Control and Optimization , 46(3):952–982, 2007

work page 2007

[13] [13]

G. Chen, W. Hu, J. Shen, J. R. Singler, Y. Zhang, and X. Zhe ng. An HDG method for distributed control of convection diﬀusion PDEs. Journal of Computational and Applied Mathematics , 343:643–661, 2018

work page 2018

[14] [14]

Cockburn and C.-W

B. Cockburn and C.-W. Shu. The local discontinuous gale rkin method for time-dependent convection-diﬀusion syste ms. SIAM Journal on Numerical Analysis , 35(6):2440–2463, 1998

work page 1998

[15] [15]

Y. Dai, B. Jin, R. C. Sau, and Z. Zhou. Solving elliptic op timal control problems via neural networks and optimality system. Advances in Computational Mathematics , 51(4):31, 2025

work page 2025

[16] [16]

Dupret and D

J.-L. Dupret and D. Hainaut. Deep learning for high-dim ensional continuous-time stochastic optimal control with out explicit solution. Operations Research, 2026

work page 2026

[17] [17]

Gao and M

Z. Gao and M. Yang. More consistent accuracy PINN via alt ernating easy-hard training. arXiv:2512.17607, 2025

work page arXiv 2025

[18] [18]

Hinze, R

M. Hinze, R. Pinnau, M. Ulbrich, and S. Ulbrich. Optimization with PDE constraints . Springer Science & Business Media, 2008

work page 2008

[19] [19]

Houston, C

P. Houston, C. Schwab, and E. S¨ uli. Discontinuous hp-ﬁ nite element methods for advection-diﬀusion-reaction pro blems. SIAM Journal on Numerical Analysis , 39(6):2133–2163, 2002. PINNS-OCP 17

work page 2002

[20] [20]

T. J. R. Hughes and A. N. Brooks. Multi-dimensional upwi nd scheme with no crosswind diﬀusion. 1979

work page 1979

[21] [21]

Jeong, S

S. Jeong, S. Lee, and S. Liu. A monotone ﬁnite element met hod for an elliptic distributed optimal control problem wit h a convection-dominated state equation. arXiv:2510.27167, 2025

work page arXiv 2025

[22] [22]

B. Jin, R. Sau, L. Yin, and Z. Zhou. Solving elliptic opti mal control problems using physics informed neural network s. arXiv:2308.11925, 2023

work page arXiv 2023

[23] [23]

D. P. Kingma and J. Ba. Adam: A method for stochastic opti mization. arXiv:1412.6980, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[24] [24]

Knobloch and G

P. Knobloch and G. Lube. Local projection stabilizatio n for advection-diﬀusion-reaction problems: One-level vs . two-level approach. Applied Numerical Mathematics , 59(12):2891–2907, 2009

work page 2009

[25] [25]

Leykekhman and M

D. Leykekhman and M. Heinkenschloss. Local error analy sis of discontinuous Galerkin methods for advection-domin ated elliptic linear-quadratic optimal control problems. SIAM Journal on Numerical Analysis , 50(4):2012–2038, 2012

work page 2012

[26] [26]

J. L. Lions. Optimal Control of Systems Governed by Partial Diﬀerential Equations. Springer, 1971

work page 1971

[27] [27]

Liu and V

S. Liu and V. Simoncini. Multigrid preconditioning for discontinuous Galerkin discretizations of an elliptic opt imal control problem with a convection-dominated state equation. Journal of Scientiﬁc Computing , 101(3):79, 2024

work page 2024

[28] [28]

S. Liu, Z. Tan, and Y. Zhang. Discontinuous galerkin met hods for an elliptic optimal control problem with a general s tate equation and pointwise state constraints. Journal of Computational and Applied Mathematics , 437:115494, 2024

work page 2024

[29] [29]

Liu and J

S. Liu and J. Zhang. A balancing domain decomposition by constraints preconditioner for a hybridizable discontinu ous Galerkin discretization of an elliptic optimal control pro blem. arXiv:2504.02072, 2025

work page arXiv 2025

[30] [30]

Liu and J

S. Liu and J. Zhang. Convergence analysis of a balancing domain decomposition method for an elliptic optimal contro l problem with HDG discretizations. ESAIM: Mathematical Modelling and Numerical Analysis , 2026

work page 2026

[31] [31]

L. Lu, X. Meng, Z. Mao, and G. E. Karniadakis. Deepxde: A d eep learning library for solving diﬀerential equations. SIAM Review, 63(1):208–228, 2021

work page 2021

[32] [32]

M¨ unzer and C

M. M¨ unzer and C. Bard. A curriculum-training-based st rategy for distributing collocation points during physics -informed neural network training. arXiv:2211.11396, 2022

work page arXiv 2022

[33] [33]

J. Nitsche. ¨Uber ein variationsprinzip zur l¨ osung von dirichlet-problemen bei verwendung von teilr¨ aumen, die keinen randbe- dingungen unterworfen sind. In Abhandlungen aus dem mathematischen Seminar der Universit ¨ at Hamburg, volume 36, pages 9–15. Springer, 1971

work page 1971

[34] [34]

R. D. Nzoyem Ngueguin, D. A. Barton, and T. Deakin. A comp arison of mesh-free diﬀerentiable programming and data- driven strategies for optimal control under pde constraint s. In Proceedings of the SC ’23 Workshops of the International Conference on High Performance Computing, Network, Storag e, and Analysis , SC-W’23, page 21–28, New York, NY, USA, 2023...

work page 2023

[35] [35]

H.-G. Roos, M. Stynes, and L. Tobiska. Robust numerical methods for singularly perturbed diﬀeren tial equations: convection-diﬀusion-reaction and ﬂow problems . Springer, 2008

work page 2008

[36] [36]

Tr¨ oltzsch.Optimal Control of Partial Diﬀerential Equations: Theory, Methods, and Applications, volume 112

F. Tr¨ oltzsch.Optimal Control of Partial Diﬀerential Equations: Theory, Methods, and Applications, volume 112. American Mathematical Soc., 2010

work page 2010

[37] [37]

W ang, P

S. W ang, P. Zhao, Q. Ma, and T. Song. General-kindred phy sics-informed neural network to the solutions of singularl y perturbed diﬀerential equations. Physics of Fluids , 36(11), 2024

work page 2024

[38] [38]

W ang, P

S. W ang, P. Zhao, and T. Song. Aspinn: An asymptotic stra tegy for solving singularly perturbed diﬀerential equatio ns. arXiv:2409.13185, 2024

work page arXiv 2024

[39] [39]

W ang, Y

X. W ang, Y. Dou, X. Yi, Y. Zhang, X. Li, B. Li, H. Peng, L. W a ng, and K. L. Teo. When optimal control meets neural network: A comprehensive survey. Archives of Computational Methods in Engineering , pages 1–56, 2026

work page 2026

[40] [40]

W ang, P

X. W ang, P. Yin, B. Zhang, and C. Yang. AONN-2: An adjoint -oriented neural network method for PDE-constrained shape optimization. Journal of Computational Physics , 513:113160, 2024

work page 2024

[41] [41]

W ang, C

Y. W ang, C. Xu, M. Yang, and J. Zhang. Less emphasis on har d regions: curriculum learning of PINNs for singularly perturbed convection-diﬀusion-reaction problems. East Asian Journal on Applied Mathematics , 14(1):104–123, 2024

work page 2024

[42] [42]

C. W u, M. Zhu, Q. Tan, Y. Kartha, and L. Lu. A comprehensiv e study of non-adaptive and residual-based adaptive sampling for physics-informed neural networks. Computer Methods in Applied Mechanics and Engineering , 403:115671, 2023

work page 2023

[43] [43]

Xu and L

J. Xu and L. Zikatanov. A monotone ﬁnite element scheme f or convection-diﬀusion equations. Mathematics of Computation , 68(228):1429–1446, 1999

work page 1999

[44] [44]

P. Yin, G. Xiao, K. Tang, and C. Yang. AONN: An adjoint-or iented neural network method for all-at-once solutions of parametric optimal control problems. SIAM Journal on Scientiﬁc Computing , 46(1):C127–C153, 2024

work page 2024

[45] [45]

Two-scale Neural Networks for Singularly Perturbed Dynamical Systems with Multiple Parameters

Q. Zhuang, T. W ang, R. W anjiku, M. Bani-Yaghoub, and Z. Z hang. Two-scale neural networks for singularly perturbed dynamical systems with multiple parameters. arXiv:2605.02799, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026

[46] [46]

Zhuang, C

Q. Zhuang, C. Z. Yao, Z. Zhang, and G. E. Karniadakis. Two -scale neural networks for partial diﬀerential equations w ith small parameters. Communications in Computational Physics , 38(3):603–629, 2025. 18 SIJING LIU, MARCUS SARKIS, YI ZHANG, AND ZHONGQIANG ZHANG Sijing Liu, Department of Mathematical Sciences, Worcester Polytechnic Institute, 100 Institu...

work page 2025