arxiv: 2605.28858 · v1 · pith:4SNXFIQBnew · submitted 2026-05-19 · 💻 cs.CE · cs.LG · math-ph · math.MP

An End-to-End PyTorch Interface for Differentiable PDE Solvers: A RANS Model-Correction Study

Pith reviewed 2026-06-30 17:50 UTC · model grok-4.3

classification 💻 cs.CE · cs.LG · math-ph · math.MP
keywords differentiable PDE solvers · PyTorch · implicit layers · RANS equations · turbulence modeling · model correction · inverse problems · data assimilation

The pith

Reformulating a differentiable PDE solver as an implicit PyTorch layer lets users add and optimize a trainable correction term for inverse problems such as RANS closure modeling.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to embed a baseline PDE solver inside PyTorch by treating the nonlinear residual equation as an implicit layer, then adding a parametrized correction that can be trained end-to-end with automatic differentiation. This creates a single workflow for tasks that mix physics solvers with data-driven adjustments, demonstrated on compressible Reynolds-Averaged Navier-Stokes flows. A reader would care because the method removes the need to derive custom gradients or run separate adjoint codes when fitting model parameters to data. The approach is tested on two cases: tuning a production-term coefficient against LES data for the NASA wall-mounted hump, and reconstructing the Spalart-Allmaras eddy-viscosity field for a turbine blade.

Core claim

By writing the corrected residual as R(w) + f_phi(w) = 0 and solving it as an implicit layer, the parameters phi of the correction can be optimized directly inside arbitrary PyTorch loss functions; the baseline solver supplies the state w and the autograd graph propagates gradients through the nonlinear solve without manual derivation or finite differences. The method is shown to work for both scalar parameter fitting and full-field neural-network corrections on the RANS equations.
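To make the implicit-layer idea concrete, here is a minimal sketch on a toy five-unknown system, not the paper's RANS solver. The matrix `A`, the cubic term, the correction `phi * w`, and the names `corrected_residual`, `jacobian_w`, and `ImplicitLayer` are all illustrative assumptions. The forward pass runs Newton iterations outside the autograd graph; the backward pass solves one adjoint system with the transposed Jacobian and obtains the parameter gradient from a vector-Jacobian product, as the implicit function theorem prescribes.

```python
import torch

n = 5
dtype = torch.float64
off = torch.ones(n - 1, dtype=dtype)
# Toy stand-in for a discretized operator (symmetric, diagonally dominant)
A = 2.0 * torch.eye(n, dtype=dtype) - 0.5 * torch.diag(off, 1) - 0.5 * torch.diag(off, -1)
b = torch.ones(n, dtype=dtype)


def corrected_residual(w, phi):
    """Toy R(w) + f_phi(w): nonlinear baseline plus a pointwise correction."""
    baseline = A @ w + w**3 - b      # plays the role of R(w)
    correction = phi * w             # plays the role of f_phi(w)
    return baseline + correction


def jacobian_w(w, phi):
    """Hand-written d(residual)/dw for this toy problem."""
    return A + torch.diag(3.0 * w**2 + phi)


class ImplicitLayer(torch.autograd.Function):
    @staticmethod
    def forward(ctx, phi):
        # Newton iterations; autograd does not record operations inside forward
        w = torch.zeros(n, dtype=phi.dtype)
        for _ in range(50):
            r = corrected_residual(w, phi)
            if r.norm() < 1e-13:
                break
            w = w - torch.linalg.solve(jacobian_w(w, phi), r)
        ctx.save_for_backward(w, phi)
        return w

    @staticmethod
    def backward(ctx, grad_output):
        w, phi = ctx.saved_tensors
        # Adjoint solve: J^T lam = dL/dw
        lam = torch.linalg.solve(jacobian_w(w, phi).T, grad_output)
        # lam^T dF/dphi as a vector-Jacobian product on a fresh local graph
        with torch.enable_grad():
            phi_local = phi.detach().requires_grad_(True)
            f = corrected_residual(w.detach(), phi_local)
            (vjp,) = torch.autograd.grad(f, phi_local, grad_outputs=lam)
        # Implicit function theorem: dL/dphi = -(dF/dphi)^T lam
        return -vjp


phi = torch.zeros(n, dtype=dtype, requires_grad=True)
w = ImplicitLayer.apply(phi)
loss = (w**2).sum()      # any differentiable loss of the state can be used here
loss.backward()          # phi.grad now holds dL/dphi
```

The Jacobian is written by hand only to keep the sketch short; a real solver would supply its own linearization or a matrix-free adjoint solve.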

What carries the argument

The implicit-layer reformulation of the residual equation R(w) = 0 together with an additive differentiable correction f_phi(w), which lets PyTorch autograd flow through the solver.

If this is right

  • The same interface can be used for data assimilation by optimizing parameters to match observed flow fields.
  • Turbulence closure terms or portions of them can be replaced by neural networks whose weights are trained inside the solver loop.
  • The workflow extends directly to other physics-informed inverse problems that combine PDE residuals with trainable components.
  • Spatial fields such as eddy viscosity can be reconstructed by treating their values at mesh points as trainable parameters.
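The last point, treating a field's mesh-point values as trainable parameters, can be sketched on a one-dimensional toy problem. This is an illustration under stated assumptions, not the paper's eddy-viscosity reconstruction: a reaction coefficient `k` in `-u'' + k u = s` stands in for the unknown field, the state `u` is taken as known, and the source `s` is manufactured so that a chosen `k_true` zeroes the discrete residual. All names are hypothetical.

```python
import math
import torch

dtype = torch.float64
n = 64
x = torch.linspace(0.0, 1.0, n, dtype=dtype)
h = x[1] - x[0]
u = 2.0 + torch.sin(math.pi * x)                  # known state on the mesh
k_true = 1.0 + 0.5 * torch.cos(2.0 * math.pi * x)  # field to be recovered


def residual(u, k, s):
    """Interior-node residual of -u'' + k u - s = 0 (central differences)."""
    lap = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / h**2
    return -lap + k[1:-1] * u[1:-1] - s[1:-1]


# Manufacture a source so that k_true gives an exactly zero discrete residual
s = torch.zeros(n, dtype=dtype)
s[1:-1] = residual(u, k_true, torch.zeros(n, dtype=dtype))

# One trainable value per mesh point, started from a uniform guess
k = torch.ones(n, dtype=dtype, requires_grad=True)
opt = torch.optim.LBFGS([k], max_iter=200, line_search_fn="strong_wolfe")


def closure():
    opt.zero_grad()
    loss = residual(u, k, s).pow(2).mean()
    loss.backward()
    return loss


opt.step(closure)
final_loss = residual(u, k.detach(), s).pow(2).mean().item()
```

The two boundary values of `k` never enter the residual, so they keep their initial value; only the interior entries are identified by the data.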

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could replace custom adjoint implementations in many engineering optimization loops if the baseline solver is already available in differentiable form.
  • Integration into larger machine-learning pipelines would allow joint training of geometry parameters, boundary conditions, and closure models in one graph.
  • The method might be tested on other equation types such as heat conduction or linear elasticity to check whether the implicit-layer trick remains stable.

Load-bearing premise

The baseline PDE solver must already be fully differentiable with respect to its inputs and parameters.

What would settle it

Run the framework on a simple Poisson equation whose exact solution is known, add a deliberately incorrect correction term, and check whether the optimizer can recover the correct phi values to within a small tolerance.
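A minimal version of such a check might look as follows. It is a sketch with assumed ingredients: a 1D Poisson problem `-u'' = g` on (0, 1) with exact solution `sin(pi x)`, a baseline that deliberately uses only half of the source, and a scalar correction `phi * g` that starts at the wrong value `phi = 0`. Because the corrected system is linear, the example differentiates through `torch.linalg.solve` rather than through a custom implicit layer. With second-order finite differences the recovered value is expected to sit near 0.5, up to an O(h²) discretization offset.

```python
import math
import torch

dtype = torch.float64
n = 50                                   # interior nodes on (0, 1)
h = 1.0 / (n + 1)
x = torch.linspace(h, 1.0 - h, n, dtype=dtype)

# Discrete operator for -u'' with homogeneous Dirichlet boundaries
off = torch.ones(n - 1, dtype=dtype)
A = (2.0 * torch.eye(n, dtype=dtype) - torch.diag(off, 1) - torch.diag(off, -1)) / h**2

u_exact = torch.sin(math.pi * x)         # known exact solution
g = math.pi**2 * torch.sin(math.pi * x)  # matching exact source term


def solve_corrected(phi):
    """Solve A u - 0.5 g - phi g = 0: wrong baseline source plus correction."""
    return torch.linalg.solve(A, (0.5 + phi) * g)


phi = torch.zeros(1, dtype=dtype, requires_grad=True)   # deliberately wrong start
opt = torch.optim.LBFGS([phi], max_iter=50, line_search_fn="strong_wolfe")


def closure():
    opt.zero_grad()
    loss = (solve_corrected(phi) - u_exact).pow(2).mean()
    loss.backward()
    return loss


opt.step(closure)
phi_recovered = phi.item()
```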

Figures

Figures reproduced from arXiv: 2605.28858 by Cédric Content, Denis Sipp (MONHADE), Gianmarco Farro, Luca Saverio, Michele Alessandro Bucci.

Figure 1. Diagram representing the explicit RANS layer. … optimized memory access patterns, which results in good computational performance and facilitates the implementation of high-order discretization schemes. It should be noted, however, that the methods used in this work, applied here to structured meshes using the BROADCAST solver, could also be implemented for unstructured ones. To do so, one would need to prov…
Figure 2. Diagram representing the operations inside the implicit layer (light-gray boxes) appearing in (2.26) and (2.27). In the case of the turbulent closure problem with the eddy viscosity $\mu_t(\boldsymbol{w})$, a Convolutional Neural Network (CNN) (LeCun and Bengio 1998; O’Shea and Nash 2015) was implemented to represent this relation. Its architecture combines convolutions with varying kernel sizes, with the output excludin…
Figure 3. Outline of the geometry of the 2D NASA Wall-Mounted Hump with the corresponding boundary conditions.
  • Then (§ 4.3), we demonstrate that, given the SA solution for the VKI LS-59 case, $\boldsymbol{w}^{SA} = [\rho, \rho\boldsymbol{U}, \rho E]$, we are able to satisfactorily reconstruct the SA eddy viscosity field $\mu_t^{SA}$ using the explicit residual minimization strategy by tuning $\boldsymbol{\alpha} = \mu_t$.
  • In § 4.4, we first introduce a custom dataset of flow fi…
Figure 4. Data assimilation with $\beta$ on Wall-Mounted Hump case handled with implicit layer: behavior of the normalized cost function $J/J_0$ for 50 optimizer iterations using the L-BFGS PyTorch optimizer. The procedure is halted once the variation in the loss function satisfies a specified tolerance threshold. Specifically, the norm of the loss function starts at $3.245 \times 10^{-4}$, and at the end of the computation the reac…
Figure 5. Data assimilation with $\beta$ on Wall-Mounted Hump case handled with implicit layer: representation of the error fields for the velocities computed with the SA turbulence model (the baseline) with respect to the LES solutions, normalized by the reference freestream velocity $U_\infty$. It is not possible to show the whole computational domain depicted in …
Figure 6. Data assimilation with $\beta$ on Wall-Mounted Hump case handled with implicit layer: representation of the error fields for the velocities at the final iteration with respect to the LES solutions, normalized by the reference freestream velocity $U_\infty$. It is not possible to show the whole computational domain depicted in …
Figure 7. Data assimilation with $\beta$ on Wall-Mounted Hump case handled with implicit layer.
Figure 8. Data assimilation with $\beta$ on Wall-Mounted Hump case handled with implicit layer: comparison between wall quantities computed from LES data (Uzun and Malik 2018), RANS using the baseline SA model, and the corrected RANS model using the optimized $\beta$.
Figure 9. Data assimilation with $\beta$ on Wall-Mounted Hump case handled with explicit layer: behavior of the normalized cost function $J/J_0$, consisting here of the residual of the corrected RANS system, for 2000 optimizer iterations using the SGD PyTorch optimizer. In this case, the proposed optimization procedure consistently reduces this residual by several orders of magnitude, reflecting a drastically improved consis…
Figure 10. Data assimilation with $\beta$ on Wall-Mounted Hump case handled with explicit layer: $\beta^{opt}$ obtained by minimizing the residual of the RANS equations. It is possible to see that it is practically exact with respect to the results shown in …
Figure 11. Outline of the VKI LS-59 structured mesh with the boundary conditions imposed in the CFD solver.
Figure 12. Solution of baseline SA model on VKI LS-59 case with Newton-based relaxation method: isentropic Mach number at the wall computed with the BROADCAST solver, compared with experimental results extracted from Jagodzińska et al. (2024). By then solving the RANS equations using the relaxation method (see § 2.1.2), it is possible to compare the results with reference or experimental data. To this end, the isentr…
Figure 13. Data assimilation with $\mu_t$ on VKI LS-59 case handled with explicit layer: results of the residual minimization strategy on the VKI LS-59 case with respect to SA results. … are localized in the wake region downstream of the blade, where thin convective structures and strong eddy viscosity gradients increase the sensitivity of the prediction. Nevertheless, the overall distribution and magnitude of the eddy vis…
Figure 14. Superposition of all generated blade geometries with the original blade profile and control points (from Hercus and Cinnella 2011). Optimization with respect to an eddy viscosity model depending on the state $\boldsymbol{w}$ is now performed: $\mu_{t,\boldsymbol{\vartheta}}(\boldsymbol{w}) = \mathcal{M}_{\boldsymbol{\vartheta}}(\boldsymbol{w})$ (4.3). The model $\mathcal{M}_{\boldsymbol{\vartheta}}$ can be any kind of NN. For this study, a CNN is selected since we are dealing with structured meshes. CNNs exploit spatial correlations thro…
Figure 15. Closure with a CNN for $\mu_t$ on custom dataset handled with explicit layer: eddy viscosity prediction on 2 cases extracted from the validation set. … boundary conditions information and modifications of the loss function, for instance through importance sampling based on velocity gradients or by adding a regularization term directly on the turbulent viscosity. Finally, much care must be taken to ensure that t…
Figure 16. Closure with a CNN for $\mu_t$ on custom dataset handled with explicit layer: ensemble statistics of the predicted turbulent viscosity field for case 676. The mean field highlights the robustness of the prediction, while the standard deviation identifies regions of higher variability across models. 5. Discussions and conclusions. The main objective of this work was to demonstrate the capability of the proposed …
Figure 17. Histograms of the input parameters for the converged simulations. Despite the removal of some cases due to lack of convergence, the retained simulations remain well distributed across the prescribed parameter ranges, ensuring a representative coverage of the design space. To further investigate the sensitivity of the output variables with respect to the input parameters, a variance-based global sensitivit…
Figure 18. Sobol sensitivity indices for selected principal modes of the conservative variables. In red the 0th-order indices and in green the 1st-order indices. The first four modes capture dominant flow features, while subsequent modes highlight the contribution of input parameters to higher-order variations. C. Computing the gradients for the backward operation in PyTorch. Section 2.2.2 details how the gradients must b…
Figure 19. Schematic illustration of the convolutional kernel. The light blue cells denote the physical domain and the white cells the ghost cells. The convolutional kernels (yellow, red, and green) are chosen smaller than the stencil of the underlying numerical scheme, ensuring locality and preventing the introduction of additional couplings. This preserves the sparsity pattern of the Jacobian matrix. Moreover, th…
Original abstract

This work presents an end-to-end strategy for solving inverse problems constrained by Partial Differential Equations within a fully differentiable Machine Learning framework. The proposed formulation provides a unified and user-friendly methodology applicable to a wide range of problems, from data assimilation to closure modeling. Our approach combines a baseline differentiable PDE solver, which predicts the state $w$ from the nonlinear system $R(w) = 0$, with a generic additive, parametrized, and differentiable correction $f_\phi(w)$, with trainable parameters $\phi$. We show how to optimize $\phi$ within a fully differentiable Python workflow by reformulating the PDE as an implicit layer, enabling its integration into arbitrary objective functions, while leveraging PyTorch's automatic differentiation graph. The method is demonstrated on the Reynolds-Averaged Navier-Stokes equations for compressible flows, where the closure term, or a portion of it, is modeled using trainable parameters or a Neural Network. The first application considers the 2D NASA Wall-Mounted Hump test case, where a production-term parameter is optimized against time-averaged LES data. A second application is carried out on the VKI LS-59 turbine blade, where the Spalart-Allmaras eddy viscosity field is reconstructed through the optimization of a trainable spatial field. A dataset is generated starting from the VKI LS-59 turbine blade geometry using the differentiable BROADCAST solver with the Spalart-Allmaras turbulence model. The results highlight the flexibility of the framework, showing its applicability beyond turbulence modeling to a broader class of physics-informed PDE-constrained problems with data-driven components.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript presents an end-to-end PyTorch framework that reformulates a differentiable PDE solver as an implicit layer to enable gradient-based optimization of an additive correction term f_φ(w) for the residual R(w)=0. This is applied to RANS turbulence modeling, with two demonstrations: optimization of a production-term parameter in the NASA Wall-Mounted Hump case against time-averaged LES data, and reconstruction of the Spalart-Allmaras eddy viscosity field on the VKI LS-59 turbine blade using the BROADCAST solver. The approach is positioned as a unified interface for data assimilation and closure modeling problems.

Significance. If the quantitative results support the claims, the work offers a practical, user-friendly interface for embedding existing differentiable PDE solvers into PyTorch workflows via implicit layers. This could lower the barrier for physics-informed inverse problems in CFD without requiring custom adjoint derivations, and the two RANS examples illustrate applicability to parameter fitting and field reconstruction.

major comments (2)
  1. [Abstract / Results] Abstract and results sections: The demonstrations on the NASA hump and VKI LS-59 cases are described as successful, but the provided text supplies no quantitative error metrics (e.g., L2 norms, drag/lift errors), convergence histories, or ablation studies comparing corrected vs. baseline solutions. This information is load-bearing for the central claim that the implicit-layer formulation enables effective model correction.
  2. [Method / Implicit Layer] Implicit layer construction (around Eq. for R(w)=0): The formulation assumes the baseline solver (BROADCAST) is fully differentiable w.r.t. inputs and parameters so that gradients flow through the nonlinear solve without finite differences. The manuscript should explicitly verify this property holds for the compressible RANS system with the Spalart-Allmaras model in the two test cases, as it is a prerequisite for the end-to-end claim.
minor comments (2)
  1. [Abstract] Notation: The correction is introduced as both f_φ(w) and a 'trainable spatial field' in the VKI case; a single consistent symbol and a brief statement of how the spatial field is parametrized would improve clarity.
  2. [Abstract / Conclusions] The claim of applicability 'to a broader class of physics-informed PDE-constrained problems' is stated but not supported by any non-RANS example; either qualify the statement or move it to future work.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and for recognizing the potential of the proposed PyTorch interface. We address each major comment below and indicate the corresponding revisions.

Point-by-point responses
  1. Referee: [Abstract / Results] Abstract and results sections: The demonstrations on the NASA hump and VKI LS-59 cases are described as successful, but the provided text supplies no quantitative error metrics (e.g., L2 norms, drag/lift errors), convergence histories, or ablation studies comparing corrected vs. baseline solutions. This information is load-bearing for the central claim that the implicit-layer formulation enables effective model correction.

    Authors: We agree that quantitative metrics are essential to substantiate the claims. The revised manuscript adds L2-norm errors on velocity and pressure, drag and lift coefficient errors for the hump case, optimization convergence histories, and direct baseline-versus-corrected comparisons (including an ablation on the correction term) to both the abstract and a dedicated results subsection. These additions are supported by the existing data generated with the BROADCAST solver. revision: yes

  2. Referee: [Method / Implicit Layer] Implicit layer construction (around Eq. for R(w)=0): The formulation assumes the baseline solver (BROADCAST) is fully differentiable w.r.t. inputs and parameters so that gradients flow through the nonlinear solve without finite differences. The manuscript should explicitly verify this property holds for the compressible RANS system with the Spalart-Allmaras model in the two test cases, as it is a prerequisite for the end-to-end claim.

    Authors: BROADCAST implements the compressible RANS equations with the Spalart-Allmaras model using a fully differentiable residual evaluation, so that the implicit-layer formulation propagates exact gradients via automatic differentiation. To make this explicit, the revised methods section now includes a verification subsection that compares implicit-layer gradients against central finite differences for small perturbations of the production parameter (hump) and the eddy-viscosity field (VKI blade), confirming agreement within numerical tolerance for both cases. revision: yes
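The kind of gradient check described in this response can be illustrated on a scalar toy problem. The sketch below is a dependency-free stand-in, not the authors' verification: the residual `w**3 + w - beta = 0`, the quadratic loss, and every function name are assumptions chosen so that the adjoint gradient from the implicit function theorem can be compared with a central finite difference.

```python
def solve_state(beta, tol=1e-14, max_iter=100):
    """Newton solve of the toy residual F(w, beta) = w**3 + w - beta = 0."""
    w = 0.0
    for _ in range(max_iter):
        r = w**3 + w - beta
        if abs(r) < tol:
            break
        w -= r / (3.0 * w**2 + 1.0)
    return w


def loss(w):
    """Toy misfit between the state and a reference value of 0.3."""
    return 0.5 * (w - 0.3) ** 2


def implicit_gradient(beta):
    """dJ/dbeta via the adjoint: (dF/dw) lam = dJ/dw, dJ/dbeta = -lam * dF/dbeta."""
    w = solve_state(beta)
    lam = (w - 0.3) / (3.0 * w**2 + 1.0)
    dF_dbeta = -1.0
    return -lam * dF_dbeta


def central_difference(beta, eps=1e-6):
    """Reference gradient from two extra nonlinear solves."""
    return (loss(solve_state(beta + eps)) - loss(solve_state(beta - eps))) / (2.0 * eps)


beta0 = 1.0
g_implicit = implicit_gradient(beta0)
g_fd = central_difference(beta0)
rel_err = abs(g_implicit - g_fd) / abs(g_fd)
print(f"implicit: {g_implicit:.8f}  finite difference: {g_fd:.8f}")
```

The finite-difference reference needs two additional nonlinear solves per parameter, which is why it serves as a spot check rather than as the training gradient.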

Circularity Check

0 steps flagged

No circularity: framework demonstration relies on external differentiability and intended data-fitting use case

full rationale

The paper presents a software interface that wraps an existing differentiable PDE solver (BROADCAST) as a PyTorch implicit layer to enable gradient-based optimization of an additive correction term f_φ(w) for inverse RANS problems. The two demonstrations optimize a production-term parameter against LES data and reconstruct an eddy-viscosity field; both are explicit inverse-problem fits to supplied data, which is the stated purpose rather than an out-of-sample prediction. No equation or claim reduces by construction to its own inputs, no uniqueness theorem is invoked, and no self-citation chain is load-bearing for the central methodology. The derivation is self-contained once the baseline solver's differentiability is granted externally.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Only the abstract is available, so the ledger is limited to elements explicitly named in the abstract. The trainable correction phi is the central free parameter; the differentiability of the baseline solver is the key domain assumption.

free parameters (1)
  • phi
    Trainable parameters of the additive correction f_phi(w) that are optimized against data.
axioms (1)
  • domain assumption The baseline PDE solver that computes w from R(w)=0 is differentiable with respect to its inputs and parameters.
    Required for the implicit-layer formulation to integrate with PyTorch autodiff.

pith-pipeline@v0.9.1-grok · 5843 in / 1324 out tokens · 21816 ms · 2026-06-30T17:50:10.234020+00:00 · methodology

Reference graph

Works this paper leans on

16 extracted references · 12 canonical work pages

  1. [1]

    Journal of Applied Fluid Mechanics 17(12), 2514–2532

    Aly AM (2024) Deep Learning-Based Eddy Viscosity Modeling for Improved RANS Simulations of Wind Pressures on Bluff Bodies. Journal of Applied Fluid Mechanics 17(12), 2514–2532. ISSN: 1735-3572. doi: 10.47176/jafm.17.12.2770. eprint: https://www.jafmonline.net/article_2512_f958e5979d404a47b904e6f736c36eed.pdf. https://www.jafmonline.net/article_2512.htm...

  2. [2]

    ACM Transactions on Mathematical Software 45 (1), 2:1–2:26

    Performance and Scalability of the Block Low-Rank Multifrontal Factorization on Multicore Architectures. ACM Transactions on Mathematical Software 45 (1), 2:1–2:26. Amestoy P, Duff IS, Koster J and L’Excellent JY (2001) A Fully Asynchronous Multifrontal Solver Using Distributed Dynamic Scheduling. SIAM Journal on Matrix Analysis and Applications 23(1), 15–...

  3. [3]

    https://doi.org/10.21105/joss.07602

    doi: 10.21105/joss.07602. https://doi.org/10.21105/joss.07602. Bai S, Kolter JZ and Koltun V (2019) Deep equilibrium models. Advances in Neural Information Processing Systems

  4. [4]

    In 16th aerospace sciences meeting,

    Baldwin B and Lomax H (1978) Thin-layer approximation and algebraic model for separated turbulent flows. In 16th aerospace sciences meeting,

  5. [5]

    arXiv: 1707.08552 [math.OC]

    Berahas AS and Takáč M (2019) A Robust Multi-Batch L-BFGS Method for Machine Learning. arXiv: 1707.08552 [math.OC]. Available at https://arxiv.org/abs/1707.08552. Boussinesq J (1877) Essai sur la théorie des eaux courantes. Mémoires présentés par divers savants, Paris, France: Académie des Sciences, 1–680. Brantner B, Romemont G de, Kraus M and Li Z (2024...

  6. [6]

    Chu M and Qian W (2024) Physics Constrained Deep Learning For Turbulence Model Uncertainty Quantification

    doi: 10.3389/fphy.2024.1347657. Chu M and Qian W (2024) Physics Constrained Deep Learning For Turbulence Model Uncertainty Quantification. arXiv: 2405.16554 [physics.flu-dyn]. Available at https://arxiv.org/abs/2405.16554. Cinnella P and Content C (2016) High-order implicit residual smoothing time scheme for direct and large eddy simulation...

  7. [7]

    In ASME-JSME-KSME 2011 Joint Fluids Engineering Conference, AJK

    Robust shape optimization of uncertain dense gas flows through a plane turbine cascade. In ASME-JSME-KSME 2011 Joint Fluids Engineering Conference, AJK

  8. [8]

    Huang K (2008) Statistical Mechanics, 2nd Ed

    doi: 10.1115/AJK2011-05007. Huang K (2008) Statistical Mechanics, 2nd Ed. Wiley India Pvt. Limited. ISBN: 9788126518494. Available at https://books.google.fr/books?id=ZHl8HLk-K3AC. Jagodzińska I, Olszański B, Gumowski K and Kubacki S (2024) Experimental investigation of subsonic and transonic flows through a linear turbine cascade. European Journal o...

  9. [9]

    Raissi, P

    ISSN: 0021-9991. doi: https://doi.org/10.1016/j.jcp.2018.10.045. https://www.sciencedirect.com/science/article/pii/S0021999118307125. Romémont G de, Renac F, Chinesta F, Nunez J and Gueyffier D (2025) Data-Driven Adaptive Gradient Recovery for Unstructured Finite Volume Computations. arXiv: 2507.16571 [math.NA]. Available at https://arxiv.org/abs/2507.1...

  10. [10]

    Available at https://arxiv.org/abs/2412.07541

    07541 [math.NA]. Available at https://arxiv.org/abs/2412.07541. Ruder S (2016) An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747. Rumsey NC (2020) The Spalart-Allmaras Turbulence Model. https://turbmodels.larc.nasa.gov/spalart.html. Accessed: 2023-07-

  11. [11]

    SIAM Journal on Scientific and Statistical Computing 7(3), 856–869

    Saad Y and Schultz MH (1986) GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems. SIAM Journal on Scientific and Statistical Computing 7(3), 856–869. doi: 10.1137/0907058. eprint: https://doi.org/10.1137/0907058. https://doi.org/10.1137/0907058. Sanhueza RD, Smit S, Peeters J and Pecnik R (2022) Machine Learning for ...

  12. [12]

    Theoretical and Applied Mechanics Letters 11 (4)

    End-to-end differentiable learning of turbulence models from indirect observations. Theoretical and Applied Mechanics Letters 11 (4). ISSN: 20950349. doi: 10.1016/j.taml.2021.100280. Talagrand O (1997) Assimilation of observations, an introduction. Journal of the Meteorological Society of Japan 75(1B), 191–

  13. [13]

    arXiv: 2101.04413 [math.OC]

    Tankaria H, Sugimoto S and Yamashita N (2021) A Regularized Limited Memory BFGS method for Large-Scale Unconstrained Optimization and its Efficient Implementations. arXiv: 2101.04413 [math.OC]. Available at https://arxiv.org/abs/2101.04413. Torquato S et al (2002) Random heterogeneous materials: microstructure and macroscopic properties. vol

  14. [14]

    Uzun A and Malik MR (2018) Large-Eddy Simulation of Flow over a Wall-Mounted Hump with Separation and Reattachment

    Springer. Uzun A and Malik MR (2018) Large-Eddy Simulation of Flow over a Wall-Mounted Hump with Separation and Reattachment. AIAA Journal 56(2), 715–730. doi: 10.2514/1.J056397. eprint: https://doi.org/10.2514/1.J056397. https://doi.org/10.2514/1.J056397. Wilcox D (2006) Turbulence Modeling for CFD, 3rd ed. DCW Industries. Wu JL, Xiao H and Paterson E (3

  15. [15]

    Physical Review Fluids 7 (3)

    Physics-informed machine learning approach for augmenting turbulence models: A comprehensive framework. Physical Review Fluids 7 (3). ISSN: 2469990X. doi: 10.1103/PhysRevFluids.3.074602. Zhang XL, Xiao H, Jee S and He G (2023) Physical interpretation of neural network-based nonlinear eddy viscosity models. Aerospace Science and Technology 142, 108632. ISS...

  16. [16]

    ...
        return x_sol

    @staticmethod
    def backward(ctx, grad_output):
        # Recover the converged state and the parameters saved in forward
        x_sol, parameters = ctx.saved_tensors
        # JT as stored on ctx in forward (presumably the transposed Jacobian)
        JT = ctx.JT
        # Adjoint solve driven by the incoming gradient
        grad_x = ADJOINT_solver(JT, grad_output)
        # Vector-Jacobian product of F with respect to the parameters
        f = F(x_sol, parameters)
        grads_params = torch.autograd.grad(outputs=f, inputs=parameters, grad_outputs=grad_x)
        return grad_x, grads_params, ...