arxiv: 2606.12337 · v1 · pith:5XUYLWIXnew · submitted 2026-06-10 · 🧮 math.NA · cs.LG · cs.NA

Adjoint Method versus Physics-Informed Neural Networks in PDE-Constrained Inverse Problems

Pith reviewed 2026-06-27 08:57 UTC · model grok-4.3

classification 🧮 math.NA · cs.LG · cs.NA
keywords inverse problems · PDE-constrained optimization · adjoint methods · physics-informed neural networks · Burgers equation · Darcy flow · Allen-Cahn equation · Navier-Stokes

The pith

The representation of the unknown largely determines whether adjoint methods or PINNs perform better in PDE-constrained inverse problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a head-to-head comparison of adjoint-based optimization and physics-informed neural networks on identical PDE inverse problems. It matches domains, equations, observations, regularization, optimizers, parameterizations, and precision across four benchmarks. The central finding is that grid-based unknowns favor discrete adjoints while neural representations align naturally with PINNs, especially for constitutive modeling. For time-dependent cases, PINNs avoid the trajectory storage and differentiation costs that can dominate adjoint inversion, and a PINN-warm-started adjoint recovers adjoint-level accuracy at lower cost.

Core claim

From a common abstract formulation the authors instantiate both methods on identical domains, governing equations, observation models, and regularization terms while matching the optimizer, unknown parameterization, and arithmetic precision. The results show that the representation of the unknown largely determines the preferred method: grid-based fields favor the discrete adjoint, whereas neural representations are native to PINNs and relevant for closure and constitutive modeling. For time-dependent problems, adjoint inversion can be dominated by trajectory storage and differentiation, while PINNs provide satisfactory reconstructions at lower cost. A PINN-warm-started adjoint strategy then recovers adjoint-level accuracy at substantially reduced cost.

What carries the argument

The fair comparison protocol that enforces identical domains, governing equations, observation models, regularization terms, optimizer, unknown parameterization, and arithmetic precision to isolate the effect of unknown representation.

If this is right

  • Grid-based fields make the discrete adjoint the stronger choice.
  • Neural representations make PINNs the natural fit for closure and constitutive modeling tasks.
  • Time-dependent problems favor PINNs when trajectory storage dominates adjoint cost.
  • A PINN-warm-started adjoint recovers adjoint-level accuracy at substantially reduced cost.
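
To make the trajectory-storage point concrete, here is a minimal sketch of a discrete adjoint on a toy scalar ODE, du/dt = f − u², stepped with explicit Euler. It is not the paper's Crank–Nicolson Burgers solver: the model, the function names (`forward`, `objective`, `adjoint_gradient`), and all parameter values are assumptions chosen for illustration. The backward sweep reads every stored forward state, which is the cost that grows with the number of time steps.

```python
def forward(f, u0=0.5, dt=0.01, n_steps=200):
    """Explicit Euler for du/dt = f - u**2; returns the full stored trajectory."""
    traj = [u0]
    for _ in range(n_steps):
        u = traj[-1]
        traj.append(u + dt * (f - u * u))
    return traj


def objective(f, y, **kw):
    """Terminal-state misfit J(f) = 0.5 * (u_N(f) - y)**2."""
    return 0.5 * (forward(f, **kw)[-1] - y) ** 2


def adjoint_gradient(f, y, u0=0.5, dt=0.01, n_steps=200):
    """Return (dJ/df, number of stored states) via one backward sweep."""
    traj = forward(f, u0, dt, n_steps)   # every state is kept for the sweep
    lam = traj[-1] - y                   # lambda_N = dJ/du_N
    grad = 0.0
    for n in range(n_steps - 1, -1, -1):
        grad += lam * dt                 # lambda_{n+1} * d(u_{n+1})/df
        lam *= 1.0 - 2.0 * dt * traj[n]  # lambda_n = lambda_{n+1} * d(u_{n+1})/d(u_n)
    return grad, len(traj)


if __name__ == "__main__":
    y = forward(1.0)[-1]                 # synthetic datum from a reference f = 1.0
    g, n_stored = adjoint_gradient(1.3, y)
    print("adjoint gradient:", g, "stored states:", n_stored)
```

Comparing `adjoint_gradient(1.3, y)[0]` against a central finite difference of `objective` is a quick way to check the backward sweep. A warm start in this setting would simply replace the initial guess `1.3` with a value obtained from a cheaper first-stage estimate.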

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Hybrid strategies may become standard when the unknown mixes grid and neural elements.
  • The cost advantage of PINNs in evolutionary problems could extend to other time-dependent systems such as reaction-diffusion or fluid-structure interaction.
  • Future work could test whether the representation preference persists when the unknown must satisfy additional physical constraints such as positivity or monotonicity.

Load-bearing premise

That the matched settings and four chosen benchmarks produce a comparison whose outcomes generalize beyond those specific cases.

What would settle it

A new benchmark, run under the same matched protocol, in which the PINN outperforms the discrete adjoint on a grid-based unknown, or the adjoint outperforms the PINN on a neural-network-parameterized unknown.

Figures

Figures reproduced from arXiv: 2606.12337 by Alessandro Alla, George Em Karniadakis, Zhen Zhang.

Figure 1
Figure 1: Test 1 - 1D unsteady Burgers. Convergence histories. Losses for PINN, objectives for adjoint, forcing errors, and the wall-clock times are shown. Horizontal axes indicate quasi-Newton iteration.
… of roughly five (189 s for PINNs versus 934 s for adjoint). Each Adjoint NN iteration propagates sensitivities backwards through the full Crank–Nicolson trajectory, and the parameter space is the high-dimensional s…
Figure 2
Figure 2: Test 1 - 1D unsteady Burgers. Recovered forcing field (top) and terminal-state field (bottom). ε_u is calculated based on the terminal state u^{N_t}.
… forward discretization allows. The parameter error, however, increases by an order of magnitude (ε_f = 1.55 × 10^{-3}), reflecting the limited expressive resolution of the piecewise-linear basis for the smooth target f^⋆(x) = sin(2x). This is a first concrete illu…
Figure 3
Figure 3: Test 2 - 2D Darcy. Convergence histories with noisy observations. Losses for PINN, objectives for adjoint, log-permeability errors, and the wall-clock times are shown. Horizontal axes indicate quasi-Newton iteration.
The observation is sparse and noisy: the data y ∈ ℝ^{81} are pointwise pressure measurements at a 9×9 uniform grid of interior locations, obtained …
Figure 4
Figure 4: Test 2 - 2D Darcy. Recovered log-permeability f_recovered (top) and pointwise error f_recovered − f^⋆ (bottom), shown for the adjoint, PINN, and EKI estimators. The 9×9 observation grid is overlaid on the reference panel (green dots).
… by applying the exact P2 interpolation matrix P ∈ ℝ^{81×N_x} to the state u^⋆ produced by the same solver under the reference log-permeability f^⋆ sampled on the forward mesh, and …
Figure 5
Figure 5: Test 2 - 2D Darcy. Recovered state u_recovered (top) and pointwise error u_recovered − u^⋆ (bottom).
The convergence histories of the two gradient-based estimators are reported in …
Figure 6
Figure 6: Test 3 - 3D Allen–Cahn. Convergence histories. Losses for PINN, the objective for adjoint, reaction-term errors, and the wall-clock times are shown. Horizontal axes indicate quasi-Newton iteration.
… a second-order IMEX BDF2/AB2 scheme with Δt = 5 × 10^{-2}. Because the implicit operator is the constant-coefficient Neumann Laplacian, it is diagonalized exactly by the discrete cosine basis [21], so each step is…
Figure 7
Figure 7: Test 3 - 3D Allen–Cahn. Recovered reaction term f_recovered (left) and pointwise error f_recovered − f^⋆ (right) on the physical range u ∈ [−1, 1].
… adjoint method with the same MLP parameterization f(θ_f) = N_{θ_f}(u) (“Adjoint NN”). The parameter network is an MLP with two hidden layers of width 32 and tanh activations, while the state network has three hidden layers of width 32 and the same activation. The con…
Figure 8
Figure 8: Test 3 - 3D Allen–Cahn. Slice (z = 0.336) of the recovered terminal state u^{N_t}_recovered (top) and pointwise error relative to u^{⋆,N_t} (bottom). ε_u is calculated based on the full terminal state u^{N_t}.
… adjoint from the PINN-recovered reaction network, the “Adjoint restart” history in …
Figure 9
Figure 9: Test 4 - 2D unsteady Navier–Stokes cylinder wake. Left: PINN composite loss (top) and trainable viscosity (bottom), Adam→SSBroyden transition marked. Right: adjoint reduced objective (top) and recovered viscosity (bottom), for both the baseline adjoint and the PINN-seeded restart. Wall-clock times are annotated in the legends.
The convergence histories are reported in …
Figure 10
Figure 10: Test 4 - 2D unsteady Navier–Stokes cylinder wake. Velocity-magnitude snapshots at t ∈ {0, 2.5, 5}. Left: FEM reference with the 16 wake probe locations (green crosses). Middle: inverse-PINN field at the converged viscosity. Right: pointwise absolute error. The probes constrain only a small wake window, yet drive the recovered viscosity to within 5.2 × 10^{-3} relative error.
… magnitude faster than the adjoint, to…
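
The Figure 6 excerpt states that the constant-coefficient Neumann Laplacian is diagonalized by the discrete cosine basis [21]. A minimal one-dimensional sketch of that fact follows. It assumes a cell-centered second-difference stencil with mirrored ghost cells and a backward-Euler-style operator I − Δt·L, not the paper's 3D IMEX BDF2/AB2 scheme; the grid size, spacing, and function names are hypothetical, and the naive O(n²) transform stands in for a fast DCT.

```python
import math


def neumann_laplacian_apply(v, h):
    """Apply the 1D second-difference operator with reflecting (Neumann) ends."""
    n = len(v)
    out = []
    for j in range(n):
        left = v[j - 1] if j > 0 else v[0]        # ghost cell mirrors v[0]
        right = v[j + 1] if j < n - 1 else v[-1]  # ghost cell mirrors v[-1]
        out.append((left - 2.0 * v[j] + right) / h**2)
    return out


def cosine_mode(k, n):
    """k-th DCT-II basis vector sampled at cell centers."""
    return [math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)]


def eigenvalue(k, n, h):
    """Eigenvalue of the stencil above for the k-th cosine mode."""
    return -(4.0 / h**2) * math.sin(math.pi * k / (2 * n)) ** 2


def max_eigen_residual(n=16, h=0.1):
    """Largest entry of L v_k - lambda_k v_k over all modes (should be ~0)."""
    worst = 0.0
    for k in range(n):
        v = cosine_mode(k, n)
        lv = neumann_laplacian_apply(v, h)
        lam = eigenvalue(k, n, h)
        worst = max(worst, max(abs(a - lam * b) for a, b in zip(lv, v)))
    return worst


def implicit_solve(rhs, dt, h):
    """Solve (I - dt*L) u = rhs by rescaling cosine coefficients."""
    n = len(rhs)
    u = [0.0] * n
    for k in range(n):
        v = cosine_mode(k, n)
        norm2 = sum(x * x for x in v)
        coeff = sum(r * x for r, x in zip(rhs, v)) / norm2  # project onto mode k
        coeff /= 1.0 - dt * eigenvalue(k, n, h)             # diagonal solve
        for j in range(n):
            u[j] += coeff * v[j]
    return u


if __name__ == "__main__":
    print("max eigen-residual:", max_eigen_residual())
```

Because the operator is diagonal in this basis, the implicit solve reduces to a forward transform, a pointwise division, and an inverse transform, with no linear system to factorize.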
Original abstract

Inverse problems governed by partial differential equations (PDEs) are central to computational mechanics and are commonly solved by adjoint-based optimization, while physics-informed neural networks (PINNs) have emerged as a flexible alternative. Their relative performance remains difficult to assess because the two approaches are often compared under different formulations, parameterizations, optimizers, and regularization choices. We present a fair comparison of adjoint optimization and PINNs for PDE-constrained inverse problems. From a common abstract formulation, we instantiate both methods on identical domains, governing equations, observation models, and regularization terms, while matching the optimizer, unknown parameterization, and arithmetic precision wherever applicable. The benchmarks include unsteady Burgers, noisy Darcy permeability inversion, three-dimensional Allen–Cahn reaction identification, and unsteady Navier–Stokes viscosity identification. The results show that the representation of the unknown largely determines the preferred method: grid-based fields favor the discrete adjoint, whereas neural representations are native to PINNs and relevant for closure and constitutive modeling. For time-dependent problems, adjoint inversion can be dominated by trajectory storage and differentiation, while PINNs provide satisfactory reconstructions at lower cost. A PINN-warm-started adjoint strategy then recovers adjoint-level accuracy at substantially reduced cost.
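
The abstract does not spell out the common formulation, so the following is only a generic sketch of how the two methods are usually derived from one problem statement. The symbols (state u, unknown f, PDE residual operator \mathcal{F}, observation operator \mathcal{O}, data y, regularizer \mathcal{R}, weight \alpha, PINN weights w_d and w_r, network parameters \theta_u and \theta_f) are assumptions, not the paper's notation, and boundary and initial-condition terms are omitted.

```latex
% Shared inverse problem
\min_{f}\; J(u,f) = \tfrac{1}{2}\,\lVert \mathcal{O}(u) - y \rVert^{2} + \alpha\,\mathcal{R}(f)
\quad \text{subject to} \quad \mathcal{F}(u,f) = 0 .

% Adjoint route: u = u(f) solves the PDE, and \hat{J}(f) = J(u(f), f) is the reduced objective
\left(\frac{\partial \mathcal{F}}{\partial u}\right)^{\!*} \lambda = -\,\frac{\partial J}{\partial u},
\qquad
\frac{d\hat{J}}{df} = \frac{\partial J}{\partial f}
  + \left(\frac{\partial \mathcal{F}}{\partial f}\right)^{\!*} \lambda .

% PINN route: the PDE is enforced as a penalty on networks u_{\theta_u} and f_{\theta_f}
\mathcal{L}(\theta_u,\theta_f) = w_d\,\lVert \mathcal{O}(u_{\theta_u}) - y \rVert^{2}
  + w_r\,\lVert \mathcal{F}(u_{\theta_u}, f_{\theta_f}) \rVert^{2}
  + \alpha\,\mathcal{R}(f_{\theta_f}) .
```

In the adjoint route the PDE holds exactly at every iterate and each gradient costs one forward and one adjoint solve; in the PINN route the PDE is satisfied only approximately, but no time-stepping trajectory has to be stored.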

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to deliver a fair empirical comparison of discrete adjoint optimization versus PINNs for PDE-constrained inverse problems. Starting from a common abstract formulation, both methods are instantiated on identical domains, governing equations, observation models, and regularization terms, with optimizer, unknown parameterization, and arithmetic precision matched wherever applicable. Four benchmarks (unsteady Burgers, noisy Darcy, 3D Allen-Cahn, unsteady Navier-Stokes) are used to conclude that the representation of the unknown largely determines the preferred method: grid-based fields favor the discrete adjoint while neural representations are native to PINNs; adjoint methods suffer from trajectory storage costs in time-dependent cases, while a PINN-warm-started adjoint recovers accuracy at reduced cost.

Significance. If the comparisons are shown to be free of confounding mismatches, the work supplies concrete, benchmark-level guidance on method selection in computational mechanics and constitutive modeling. The multi-benchmark design and the hybrid warm-start strategy are positive features that could be useful to practitioners.

major comments (2)
  1. [Abstract] Abstract: the qualifier 'wherever applicable' for matching optimizer, parameterization, arithmetic precision, and gradient computation is load-bearing for the central claim that performance differences can be attributed to representation type alone. Without an explicit enumeration (in §2 or §3) of which cells of the comparison grid could not be matched and the quantitative effect of any residual differences, the attribution to representation remains unverified and the generalization beyond the four benchmarks is weakened.
  2. [Results] Results section (benchmarks): the claim that grid-based fields favor the discrete adjoint while neural representations favor PINNs requires explicit side-by-side runs in which the same representation is inverted by both methods (e.g., discrete adjoint applied to a neural parameterization of the unknown). If such cross-representation experiments are absent, the reported performance gaps cannot be cleanly separated from representation effects.
minor comments (2)
  1. [Methods] Clarify in the methods section how arithmetic precision and optimizer hyperparameters were enforced to be identical when one method uses automatic differentiation and the other uses discrete adjoints.
  2. [Figures] Figure captions should state the exact number of optimization iterations, wall-clock times, and error norms reported so that readers can reproduce the cost-accuracy trade-offs without re-deriving them from the text.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the attribution of performance differences. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core claims.

  1. Referee: [Abstract] Abstract: the qualifier 'wherever applicable' for matching optimizer, parameterization, arithmetic precision, and gradient computation is load-bearing for the central claim that performance differences can be attributed to representation type alone. Without an explicit enumeration (in §2 or §3) of which cells of the comparison grid could not be matched and the quantitative effect of any residual differences, the attribution to representation remains unverified and the generalization beyond the four benchmarks is weakened.

    Authors: We agree that an explicit enumeration would make the 'wherever applicable' qualifier more transparent and support the central claim. In the revised manuscript we will add a dedicated subsection (or table) in §2 that lists, for each benchmark, every element of the comparison grid (optimizer, parameterization, arithmetic precision, gradient computation) together with whether it was matched and the quantitative effect of any residual mismatch. This directly addresses the concern about unverified attribution.
    revision: yes

  2. Referee: [Results] Results section (benchmarks): the claim that grid-based fields favor the discrete adjoint while neural representations favor PINNs requires explicit side-by-side runs in which the same representation is inverted by both methods (e.g., discrete adjoint applied to a neural parameterization of the unknown). If such cross-representation experiments are absent, the reported performance gaps cannot be cleanly separated from representation effects.

    Authors: The manuscript compares the two methods in their standard, computationally practical configurations (grid-based unknowns with the discrete adjoint; neural representations with PINNs), which is the setting most relevant to practitioners. Nevertheless, the referee correctly notes that cross-representation runs would further isolate representation effects from method effects. We will therefore add a short discussion subsection that (i) explains the technical obstacles to performing a full cross (e.g., memory and differentiability requirements when applying a discrete adjoint to a neural field) and (ii) reports any feasible preliminary cross-experiments or explicitly flags their absence as a limitation. This revision clarifies the scope of the attribution claim.
    revision: partial

Circularity Check

0 steps flagged

Empirical method comparison with no derivation chain

This paper conducts an empirical benchmark comparison of adjoint optimization versus PINNs on four PDE inverse problems, matching conditions wherever applicable and reporting observed performance differences. No first-principles derivations, fitted predictions, or self-citation chains are present that reduce any central claim to its inputs by construction. The representation-based conclusion follows directly from the numerical results on external benchmarks rather than from any tautological redefinition or imported uniqueness theorem.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Empirical benchmarking study; no new mathematical axioms, free parameters, or invented entities are introduced.

pith-pipeline@v0.9.1-grok · 5749 in / 1100 out tokens · 15438 ms · 2026-06-27T08:57:06.491215+00:00 · methodology

Reference graph

Works this paper leans on

23 extracted references · 12 canonical work pages

  [1] J. L. Lions, Optimal Control of Systems Governed by Partial Differential Equations, Springer-Verlag, 1971.
  [2] M. Hinze, R. Pinnau, M. Ulbrich, S. Ulbrich, Optimization with PDE Constraints, Vol. 23 of Mathematical Modelling: Theory and Applications, Springer, Dordrecht, 2009. doi:10.1007/978-1-4020-8839-1
  [3] K. Duraisamy, G. Iaccarino, H. Xiao, Turbulence modeling in the age of data, Annual Review of Fluid Mechanics 51 (2019) 357–377. doi:10.1146/annurev-fluid-010518-040547
  [4] G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang, L. Yang, Physics-informed machine learning, Nature Reviews Physics 3 (6) (2021) 422–440. doi:10.1038/s42254-021-00314-5
  [5] A. Jameson, Aerodynamic design via control theory, Journal of Scientific Computing 3 (3) (1988) 233–260. doi:10.1007/BF01061285
  [6] M. B. Giles, N. A. Pierce, An introduction to the adjoint approach to design, Flow, Turbulence and Combustion 65 (3) (2000) 393–415.
  [7] C. S. Skene, M. F. Eggl, P. J. Schmid, A parallel-in-time approach for accelerating direct-adjoint studies, Journal of Computational Physics 429 (2021) 110033. doi:10.1016/j.jcp.2020.110033
  [8] R. D. Nzoyem, D. A. W. Barton, T. Deakin, A comparison of mesh-free differentiable programming and data-driven strategies for optimal control under PDE constraints, in: Proceedings of the SC ’23 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis (SC-W ’23), Association for Computing Machinery, 2023, ...
  [9] A. Alla, A. Pacifico, M. Palladino, A. Pesare, Online identification and control of PDEs via reinforcement learning methods, Adv Comput Math 50 (2024).
  [10] A. Alla, A. Pacifico, A POD approach to identify and control PDEs online through state dependent Riccati equations, Dyn Games Appl 15 (1) (2025) 481–502.
  [11] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707.
  [12] A. G. Baydin, B. A. Pearlmutter, A. A. Radul, J. M. Siskind, Automatic differentiation in machine learning: a survey, Journal of Machine Learning Research 18 (1) (2018) 5595–5637.
  [13] E. Kiyani, K. Shukla, J. F. Urbán, J. Darbon, G. E. Karniadakis, Optimizing the optimizer for physics-informed neural networks and Kolmogorov-Arnold networks, Computer Methods in Applied Mechanics and Engineering 446 (2025) 118308.
  [14] A. Alla, G. Bertaglia, E. Calzola, A PINN approach for the online identification and control of unknown PDEs, Journal of Optimization Theory and Applications 206 (1) (2025) 8.
  [15] S. Mowlavi, S. Nabi, Optimal control of PDEs using physics-informed neural networks, Journal of Computational Physics 473 (2023) 111731. doi:10.1016/j.jcp.2022.111731
  [16] S. Gratton, A. Sartenaer, P. L. Toint, Recursive trust-region methods for multiscale nonlinear optimization, SIAM Journal on Optimization 19 (1) (2008) 414–444. doi:10.1137/050623012
  [17] C. Bunks, F. M. Saleck, S. Zaleski, G. Chavent, Multiscale seismic waveform inversion, Geophysics 60 (5) (1995) 1457–1473. doi:10.1190/1.1443880
  [18] Z. Zhang, S. Liu, A. Alla, J. Darbon, G. E. Karniadakis, PINNs in PDE constrained optimal control problems: Direct vs indirect methods, arXiv preprint arXiv:2604.04920 (2026).
  [19] X. Glorot, Y. Bengio, Understanding the difficulty of training deep feedforward neural networks, in: Proceedings of the thirteenth international conference on artificial intelligence and statistics, JMLR Workshop and Conference Proceedings, 2010, pp. 249–256.
  [20] M. A. Iglesias, K. J. Law, A. M. Stuart, Ensemble Kalman methods for inverse problems, Inverse Problems 29 (4) (2013) 045001.
  [21] G. Strang, The discrete cosine transform, SIAM Review 41 (1) (1999) 135–147. doi:10.1137/S0036144598336745
  [22] M. Raissi, Z. Wang, M. S. Triantafyllou, G. E. Karniadakis, Deep learning of vortex-induced vibrations, Journal of Fluid Mechanics 861 (2019) 119–137. doi:10.1017/jfm.2018.872
  [23] S. Cai, Z. Wang, S. Wang, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks for heat transfer problems, Journal of Heat Transfer 143 (6) (2021) 060801. doi:10.1115/1.4050542

Appendix A. Implementation Details

This appendix collects the discretization, parameterization, regularization, and optimization details deferred from Section 3...