
arxiv: 2606.17971 · v1 · pith:O62IVY76new · submitted 2026-06-16 · 💻 cs.CE · math-ph · math.MP

Online Spectral Deflation for State Constrained Optimal Control Problems

Pith reviewed 2026-06-26 21:55 UTC · model grok-4.3

classification 💻 cs.CE · math-ph · math.MP
keywords spectral deflation · Schur complement · active-set methods · state-constrained optimal control · conjugate gradient · parametric PDE · preconditioning

The pith

A single full-domain reference Schur complement supplies reusable low eigenmodes that cut the CG iteration count of every parameter-dependent inactive-set solve by about 55 to 98 percent.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Parametric optimal-control problems with pointwise state constraints generate a sequence of Schur-complement systems whose active sets, and therefore their dimensions and spectra, change with the parameter. Standard sparse-direct, multigrid, or Krylov reuse strategies become expensive because each new inactive set requires a fresh factorization or hierarchy. The paper anchors a spectral deflation basis to one fixed full-domain reference operator whose low eigenmodes are computed once offline. These modes are then restricted online to each new inactive set and inserted as an A-DEF2 deflation space inside a Jacobi-preconditioned CG iteration. The resulting solver reuses the same reference information across an entire parameter sweep while leaving the original high-fidelity discrete system and the requested solver tolerance unchanged.

Core claim

Low eigenmodes of one fixed full-domain reference Schur complement remain effective deflation vectors after they are restricted to each parameter-dependent inactive set. When these restricted modes are used inside an A-DEF2 deflation framework for Jacobi-preconditioned CG, iteration counts drop by 55 to 98 percent on diffusion, convection-diffusion, nonlinear thermal, and conjugate-heat-transfer benchmarks. The method therefore replaces repeated expensive rebuilds of preconditioners or factorizations with a single offline eigensolve whose output is reused for every active-set instance.
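The mechanics of the core claim can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: a shifted 2D Laplacian stands in for the reference Schur complement, the active set is a hypothetical 4×4 corner patch, and the names `toy_reference_operator` and `deflated_pcg` are invented for this illustration. The preconditioner follows the standard A-DEF2 form PᵀD⁻¹ + Q with Q = Z(ZᵀMZ)⁻¹Zᵀ, Pᵀ = I − QM, and D the Jacobi diagonal.

```python
import numpy as np

def toy_reference_operator(m=20, shift=1e-3):
    # Assumption: shifted 5-point Laplacian on an m x m grid as a stand-in SPD operator.
    T = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
    I = np.eye(m)
    return np.kron(I, T) + np.kron(T, I) + shift * np.eye(m * m)

def deflated_pcg(M, b, Z=None, tol=1e-8, maxiter=5000):
    """Jacobi-preconditioned CG; with Z given, adds an A-DEF2 coarse correction."""
    d_inv = 1.0 / np.diag(M)                  # Jacobi preconditioner D^{-1}
    if Z is None:
        coarse = lambda v: np.zeros_like(v)   # no deflation: Q = 0
    else:
        E = Z.T @ (M @ Z)                     # coarse Gram matrix Z^T M Z
        coarse = lambda v: Z @ np.linalg.solve(E, Z.T @ v)  # Q v

    def precond(r):                           # (P^T D^{-1} + Q) r, P^T = I - Q M
        w = d_inv * r
        return w - coarse(M @ w) + coarse(r)

    x = coarse(b)                             # deflated initial guess x0 = Q b
    r = b - M @ x
    b_norm = np.linalg.norm(b)
    if np.linalg.norm(r) <= tol * b_norm:
        return x, 0
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for it in range(1, maxiter + 1):
        Mp = M @ p
        alpha = rz / (p @ Mp)
        x = x + alpha * p
        r = r - alpha * Mp
        if np.linalg.norm(r) <= tol * b_norm:
            return x, it
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxiter

# Offline: low eigenmodes of the full-domain reference operator, computed once.
m = 20
M_ref = toy_reference_operator(m)
_, V = np.linalg.eigh(M_ref)
Z_ref = V[:, :20]

# Online: a hypothetical active set (corner patch) defines the inactive-set system.
grid = np.arange(m * m).reshape(m, m)
active = grid[:4, :4].ravel()
inactive = np.setdiff1d(np.arange(m * m), active)
M_I = M_ref[np.ix_(inactive, inactive)]       # exact restricted operator
Z_I, _ = np.linalg.qr(Z_ref[inactive])        # restrict modes, then re-orthonormalize

b = np.random.default_rng(0).standard_normal(inactive.size)
x_plain, it_plain = deflated_pcg(M_I, b)
x_defl, it_defl = deflated_pcg(M_I, b, Z_I)
print("plain Jacobi-CG iterations:", it_plain, "| deflated:", it_defl)
```

Both solves act on the same restricted matrix `M_I` and stop at the same relative residual, so the deflation changes only the iteration count, not the system being solved. The QR step is one simple way to keep the coarse Gram matrix well conditioned after restriction; the paper's own safeguards may differ.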

What carries the argument

The reusable A-DEF2 deflation basis obtained by restricting low eigenmodes of a single full-domain reference Schur complement to each new inactive set.

If this is right

  • The same reference basis works without modification on diffusion, convection-diffusion, nonlinear thermal, and conjugate-heat-transfer problems.
  • GPU wall-time gains appear because the reference basis is built once while competing solver structures must be rebuilt per instance.
  • Coarse-grid or analytical reference modes can amortize the offline cost inside a single parameter sweep.
  • The framework admits POD enrichment and Rayleigh-Ritz reselection while still preserving the exact inactive-set operator.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The spectral-coherence argument may apply to other active-set problems whose inactive sets evolve with a parameter, such as obstacle or contact problems.
  • If the restricted reference modes remain accurate, similar deflation could be tested on time-dependent or stochastic control problems where the active set evolves continuously rather than parametrically.
  • Pairing the deflation with a stronger base preconditioner than Jacobi could produce further gains, though the paper isolates the contribution of the deflation step itself.

Load-bearing premise

Low eigenmodes computed on the full-domain reference Schur complement remain effective deflation vectors after they are restricted to each parameter-dependent inactive set.

What would settle it

A benchmark instance in which the principal angles between the restricted reference modes and the true low eigenmodes of the new inactive-set operator become large enough that the deflation iteration count exceeds that of plain Jacobi-preconditioned CG.
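The diagnostic behind this test, principal angles between two subspaces, can be computed from the singular values of the product of orthonormal bases. The sketch below is a generic NumPy version under that textbook definition; the function name `principal_angles_deg` and the toy vectors are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def principal_angles_deg(U, V):
    """Principal angles (degrees, ascending) between range(U) and range(V)."""
    Qu, _ = np.linalg.qr(U)                   # orthonormal basis of range(U)
    Qv, _ = np.linalg.qr(V)                   # orthonormal basis of range(V)
    s = np.linalg.svd(Qu.T @ Qv, compute_uv=False)  # cosines, descending
    return np.degrees(np.arccos(np.clip(s, 0.0, 1.0)))

# Toy check: two planes in R^3 sharing e1, with the second direction tilted by 30 degrees.
e1, e2, e3 = np.eye(3)
t = np.radians(30.0)
U = np.column_stack([e1, e2])
V = np.column_stack([e1, np.cos(t) * e2 + np.sin(t) * e3])
angles = principal_angles_deg(U, V)
print(angles)
```

In the setting described above, `U` would hold the restricted reference modes and `V` the true low eigenmodes of the current inactive-set operator. The arccos form loses accuracy for very small angles, so a sine-based variant may be preferable when angles near zero matter.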

Figures

Figures reproduced from arXiv: 2606.17971 by Francesco Ballarin, Sanghyun Lee, Teeratorn Kadeethum, Youngsoo Choi.

Figure 1
Figure 1: Overview of the proposed reusable spectral deflation strategy: compute a reference spectral basis once, restrict it to each per-instance inactive set, and use it as a deflation basis for the online Krylov solve. Full Schur-complement, active-set, and A-DEF2 details are in Section 2; empirical support is in Section 4. (Kadeethum et al.: Preprint submitted to Elsevier.)
Figure 2
Figure 2: Reduction pathway used throughout the paper. The continuous state constrained optimal control problem (OCP) is discretized, reduced to the SPD Schur complement 𝑀 = 𝛼𝐴⊤𝐴 + 𝐼, and then restricted by the primal active-set identification to the inactive set system (4). The proposed method accelerates this last solve across many parameter instances. Panel (c) is drawn after reordering the degrees of freedom so …
Figure 3
Figure 3: Schematic of A-DEF2 deflated CG. The deflation vectors 𝑍 split the current inactive space into a small coarse subspace (range(𝑍), blue), handled in closed form by the Gram matrix solve via 𝑄, and a complementary subspace (range(𝑃), orange), handled by preconditioned CG on the projected operator 𝑃⊤𝑀. The bottom box depicts only the deflated initial guess (the coarse-correction predictor in Algorithm 1,…
Figure 4
Figure 4: Three deflation basis sources. Eigenmodes provide a fixed reusable reference basis, POD modes accumulate online from previous solves, and the safe combined basis appends POD information only while the restricted Gram matrix remains well conditioned. By construction, every accepted update keeps 𝑍 orthonormal: line 5 removes the component of the candidate inside range(𝑍) and line 9 normalizes the residual. T…
Figure 5
Figure 5: Coarse-grid eigenmode prolongation. Eigenmodes are computed on a reduced grid, prolongated to the fine grid through the tensor-product interpolation operator, and then orthonormalized before use. Optional Ritz cleanup can be applied after prolongation. Remark 6 (Coarse-grid regularization: empirical observation). Across the benchmark families tested in this paper, coarse-grid prolongation often produces be…
Figure 6
Figure 6: Schematic online protocol. At query 𝑚, the reusable reference eigenbasis 𝑍_eig is already available because it is computed once from a fixed reference parameter. Optional POD enrichment can only use previously solved states {𝐲^(1), …, 𝐲^(𝑚−1)}, and optional warm start transfers the previous solution to the current query. The same protocol is used with 𝑁 = 30 sequential instances for solver benchmarks and w…
Figure 7
Figure 7: Spectral coherence across three problems. (a) Subspace angles vs active set distance 𝛿. Each problem appears with two marker styles in the same colour: solid markers show the principal angle at the deflation cutoff 𝜃_20, which stays below 8.4° across the sweep; faded markers show the worst-case angle 𝜃_max, which can reach 90°. (b) Deflation effectiveness (eff = 1 − 𝑛_defl/𝑛_cold) is stable across the para…
Figure 8
Figure 8: Per-instance wall-time ratio AMG-RS (CPU) / Kronecker (GPU) at the three grids where AMG-RS was evaluated (153×10, 203×10, 253×20). Each dot is one Re × 𝜅_r configuration; orange bars show the median. This is a deployment ratio (CPU AMG-RS vs GPU Kronecker), not a per-iteration algorithmic comparison. The GPU advantage widens with DOF because the measured wall-time follows 𝑡 ∼ 𝑁^1.22 for AMG-RS and 𝑡 ∼ 𝑁^0.75 …
Figure 9
Figure 9: Eigenvalue trajectories 𝜆_i(𝜇) for the leading 50 eigenvalues across all parametric instances. All three problems exhibit smooth, crossing-free evolution even as active set membership changes discretely.
Figure 10
Figure 10: Condition number 𝜅(𝑀) and spectral gap evolution across parametric instances. For 2d_nonsep, 𝜅 decreases as the active set grows (spectral self-regulation).
Figure 11
Figure 11: Maximum and mean subspace angles vs active set distance 𝛿 for all modes and the first 20 modes. 𝜃_max reaches 90° but 𝜃_20 stays small.
Figure 12
Figure 12: Two-regime structure: mean subspace angle by mode quintile and instance index, one panel per problem. The stable (blue) to randomized (red) transition is problem-dependent: earliest for 2d_nonsep (mode ∼20–30) and progressively later for thermal_ra100 and 2d_asym. For 2d_nonsep, cold CG drops from 16.6K to 10.6K as 𝜅 falls with the growing active set; deflated CG drops in parallel (6.2K→4.3K). Effectiven…
Figure 13
Figure 13: Eigenmode mass in the active region vs mode index at maximum 𝛿. For 2d_asym (𝛿 = 2.1%), mass is negligible at all modes. For 2d_nonsep (𝛿 = 22%), mass grows with mode index (6% to 35%), consistent with trailing modes concentrating at the active set boundary. For thermal (𝛿 = 15%), mass is moderate and flatter (6–13%).
Figure 14
Figure 14: Cold and deflated CG iterations vs active set distance 𝛿, with 𝜃_20 overlaid (right axis, red). The cutoff angle 𝜃_20 is a useful diagnostic for deflation behavior; baseline CG difficulty also affects the effectiveness ratio, particularly at small 𝛿 (e.g., 2d_asym) and in configurations where the cold count itself varies. For the linear problems, operator drift is identically zero (𝜀 ≡ 0) across all instanc…
Figure 15
Figure 15: Two-source decomposition: active set distance 𝛿 vs operator drift 𝜀, colored by deflation effectiveness. For linear problems, 𝜀 ≡ 0; for thermal_ra100, 𝜀 is at most ∼5 × 10⁻⁸ (𝒪(10⁻⁸)).
Figure 16
Figure 16: Per-instance scatter of deflation effectiveness against uniform active-set distance 𝛿 (blue circles) and the eigenmode-weighted variant 𝛿_w (orange triangles), one panel per problem; legend entries report |𝑟| with the corresponding metric. Eigenmode weighting mainly helps by rescuing the under-resolved 2d_asym interior sweep (𝛿_w correlation −0.55 vs uniform −0.18 on the interior, max uniform 𝛿 is only 2.1%…
Figure 17
Figure 17: Correlation of per-quintile mean subspace angle with deflation effectiveness across all three problems. The dashed red horizontal line separates the deflation subspace (modes 1–20, above the line) from trailing modes (21–50, below). For 2d_nonsep, the strongest correlation is in quintile 21–30 (|𝑟| = 0.94), immediately below the cutoff; see the caveat in the body text on causal interpretation.
Figure 18
Figure 18: Coarse Gram conditioning 𝜅(𝑍⊤𝑀𝑍) and 𝜃_20 vs active fraction. The coarse-solve conditioning improves as the active set grows, remaining well below the empirical divergence threshold (∼10⁷).
Figure 19
Figure 19: Block structure of the space–time operators for the parabolic extension. (a) The forward operator 𝐹 is block lower bidiagonal under implicit Euler time stepping (Pearson et al., 2012). (b) The space–time Schur complement 𝑀_st = 𝛼𝐹⊤𝐹 + 𝐼 is block tridiagonal and SPD. Each block is 𝑁_s × 𝑁_s (spatial DOF), and the full system has 𝑛_t 𝑁_s unknowns.
Original abstract

Parametric PDE-constrained optimal control with pointwise state constraints requires repeated solution of restricted Schur-complement systems on parameter-dependent inactive sets. In a primal active-set method, each inactive-set system is symmetric positive definite, but the active set can change nonsmoothly with the parameter. The resulting operator may vary in dimension, sparsity pattern, and spectrum, limiting reuse of sparse factorizations, multigrid hierarchies, and Krylov information. We propose a reusable spectral-deflation strategy anchored to one full-domain reference Schur complement. Low reference eigenmodes are computed once, restricted online to each inactive set, and used as an A-DEF2 deflation basis for Jacobi-preconditioned CG. The framework also supports POD enrichment, Rayleigh-Ritz reselection, coarse-grid or analytical reference modes, and conditioning safeguards. Given the active set, the method preserves the high-fidelity inactive-set system and solves it to the prescribed CG tolerance; it accelerates the linear algebra rather than replacing the optimal-control solve with a surrogate. We explain the method through a spectral-coherence view, motivated by interlacing and perturbation arguments and assessed with principal-angle diagnostics. Across diffusion, convection-diffusion, nonlinear thermal, and conjugate-heat-transfer benchmarks, deflation reduces CG iterations by about 55 to 98 percent. GPU deployments also show wall-time gains over CPU sparse-direct and algebraic-multigrid baselines, because the reference basis is built once whereas competing solver structures are rebuilt per instance. Coarse-grid or analytical modes amortize the offline cost within a single parameter sweep; fine-grid eigensolves remain more precompute-limited. Timings isolate the inactive-set linear-solve kernel; reducing the active-set outer loop is outside the present scope.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a reusable spectral-deflation preconditioner for the sequence of symmetric positive definite Schur-complement systems that arise inside a primal active-set solver for parametric PDE-constrained optimal control problems with pointwise state constraints. Low eigenmodes are computed once on a single full-domain reference Schur complement, restricted online to each parameter-dependent inactive set, and employed as an A-DEF2 deflation basis inside Jacobi-preconditioned CG. The method is motivated by spectral-coherence, interlacing and perturbation arguments, assessed via principal-angle diagnostics, and reported to reduce CG iterations by 55–98 % on diffusion, convection-diffusion, nonlinear thermal and conjugate-heat-transfer benchmarks while preserving the original high-fidelity inactive-set operator and solving it to the prescribed tolerance.

Significance. If the restricted reference modes remain effective deflation vectors across the full range of inactive-set configurations encountered in a parameter sweep, the approach would amortize the dominant precomputation cost and yield substantial wall-time savings for repeated high-fidelity solves inside active-set optimal-control loops, particularly on GPU hardware. The framework’s support for POD enrichment, Rayleigh-Ritz reselection and analytical reference modes is a practical strength.

major comments (2)
  1. [spectral-coherence view / principal-angle diagnostics] The central iteration-reduction claim (55–98 %) rests on the assertion that low eigenmodes of the single full-domain reference Schur complement remain effective A-DEF2 vectors after restriction to each parameter-dependent inactive-set operator. The spectral-coherence section motivates this via interlacing and perturbation bounds, yet supplies no worst-case analysis or quantitative thresholds on principal angles when the inactive set is small, disconnected or topologically dissimilar from the reference; without such guarantees the reported savings cannot be asserted uniformly.
  2. [numerical results / benchmark descriptions] Although the abstract states that the method “preserves the high-fidelity inactive-set system and solves it to the prescribed CG tolerance,” the manuscript reports no quantitative verification (e.g., comparison of optimal-control objective values, constraint violation norms, or solution differences with and without deflation) that the deflation step does not degrade accuracy relative to a direct or AMG solve of the same inactive-set system.
minor comments (2)
  1. [GPU timings / baseline comparisons] Baseline solver details (exact AMG hierarchy construction, fill-in levels, and convergence tolerances) are not tabulated; this makes it difficult to reproduce the reported wall-time comparisons.
  2. [abstract and results tables] The phrase “about 55 to 98 percent” should be replaced by precise per-benchmark percentages together with the number of CG iterations and the corresponding reference values.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful comments on our manuscript. We address the major concerns below and outline the revisions we will make.

Point-by-point responses
  1. Referee: [spectral-coherence view / principal-angle diagnostics] The central iteration-reduction claim (55–98 %) rests on the assertion that low eigenmodes of the single full-domain reference Schur complement remain effective A-DEF2 vectors after restriction to each parameter-dependent inactive-set operator. The spectral-coherence section motivates this via interlacing and perturbation bounds, yet supplies no worst-case analysis or quantitative thresholds on principal angles when the inactive set is small, disconnected or topologically dissimilar from the reference; without such guarantees the reported savings cannot be asserted uniformly.

    Authors: We agree that a rigorous worst-case analysis for arbitrary inactive-set configurations is not provided. Our spectral-coherence arguments based on interlacing and perturbation theory offer insight into why the reference modes remain effective, and the principal-angle diagnostics in the manuscript quantify the alignment for the benchmark problems. These benchmarks include varying inactive-set sizes and topologies arising from different parameter values. In the revised version we will expand the discussion to include additional principal-angle plots for the most dissimilar inactive sets encountered and clarify that the 55-98% iteration reductions are empirical observations for the tested problem families rather than a uniform guarantee. revision: partial

  2. Referee: [numerical results / benchmark descriptions] Although the abstract states that the method “preserves the high-fidelity inactive-set system and solves it to the prescribed CG tolerance,” the manuscript reports no quantitative verification (e.g., comparison of optimal-control objective values, constraint violation norms, or solution differences with and without deflation) that the deflation step does not degrade accuracy relative to a direct or AMG solve of the same inactive-set system.

    Authors: The deflation is used strictly as a preconditioner inside CG, which is converged to the same tolerance on the exact inactive-set matrix; therefore the computed solution satisfies the same residual criterion as a direct or AMG solve. Nevertheless, we acknowledge that explicit verification of the downstream optimal-control quantities would be valuable. We will add a new subsection or table in the numerical results section that compares the optimal objective values, maximum constraint violations, and L2 solution differences for selected parameter instances solved with and without deflation (using a direct solver as reference). revision: yes

Circularity Check

0 steps flagged

No circularity: reference eigenmodes and spectral arguments are independent of online instances

full rationale

The paper's core construction computes low eigenmodes once from a single full-domain reference Schur complement (independent of any parameter-dependent inactive set), then restricts them for A-DEF2 deflation. Justification rests on interlacing/perturbation arguments and principal-angle diagnostics rather than any fitted parameter, self-definition, or self-citation chain. No equation or claim reduces the reported iteration savings to a quantity defined by the method itself; the inactive-set system remains the exact high-fidelity operator. This is the common case of a self-contained algorithmic proposal whose central claims do not collapse by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Based solely on the abstract, the approach rests on standard properties of symmetric positive definite Schur complements and the effectiveness of low-mode deflation; no free parameters, ad-hoc entities, or paper-specific axioms are stated.

axioms (2)
  • standard math: The inactive-set Schur complement remains symmetric positive definite for each active-set configuration.
    Invoked implicitly when stating that each inactive-set system is symmetric positive definite and suitable for CG.
  • domain assumption: Low eigenmodes of the full-domain reference operator provide useful deflation information after restriction to subsets.
    Central to the spectral-coherence view and interlacing arguments mentioned in the abstract.
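Both ledger entries lean on a property of principal submatrices that is easy to check numerically. The sketch below is a generic illustration, assuming a random SPD matrix as a stand-in for the reference operator and a random index subset as a hypothetical inactive set. It verifies Cauchy interlacing, λ_i ≤ μ_i ≤ λ_{i+n−m}, between the eigenvalues λ of the full matrix and μ of the restricted one, which also implies that the restricted matrix stays SPD.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 12, 9                                   # full size and inactive-set size (toy values)

G = rng.standard_normal((n, n))
M_ref = G @ G.T + np.eye(n)                    # assumption: random SPD stand-in operator

keep = np.sort(rng.choice(n, size=m, replace=False))  # hypothetical inactive indices
M_I = M_ref[np.ix_(keep, keep)]                # principal submatrix on the inactive set

lam = np.linalg.eigvalsh(M_ref)                # ascending eigenvalues of the full operator
mu = np.linalg.eigvalsh(M_I)                   # ascending eigenvalues of the restriction

tol = 1e-10                                    # slack for floating-point round-off
lower_ok = bool(np.all(lam[:m] <= mu + tol))       # lambda_i <= mu_i
upper_ok = bool(np.all(mu <= lam[n - m:] + tol))   # mu_i <= lambda_{i+n-m}
print("interlacing holds:", lower_ok and upper_ok, "| restricted SPD:", bool(mu[0] > 0))
```

Interlacing bounds the restricted eigenvalues but says nothing about eigenvectors, which is why the second axiom is listed as a domain assumption and assessed separately with principal angles.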

pith-pipeline@v0.9.1-grok · 5851 in / 1430 out tokens · 24227 ms · 2026-06-26T21:55:20.400709+00:00 · methodology

