A differentiable software suite for accelerated simulation of turbulent flows
Pith reviewed 2026-05-10 03:25 UTC · model grok-4.3
The pith
A Julia package for incompressible Navier-Stokes simulations supplies hand-written adjoint kernels that make the full solver differentiable for neural network training.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The package achieves efficient reverse-mode automatic differentiation through the entire solver by using hand-written adjoint kernels for all discrete operators, combined with matrix-free, hardware-agnostic implementations compiled for multi-threaded CPU or GPU execution, allowing neural network closure models to be trained a-posteriori while embedded in large-eddy simulation.
What carries the argument
Hand-written adjoint kernels for the discrete staggered-grid operators, which supply the exact discrete derivatives required for reverse-mode automatic differentiation of the full solver.
If this is right
- Neural network closure models can be trained end-to-end inside large-eddy simulations without needing separate forward and adjoint solvers.
- Direct numerical simulations of incompressible turbulence reach 840 cubed resolution in double precision on a single GPU.
- Sensitivity analysis and optimization of flow quantities become practical because the entire discrete operator is differentiable.
- The single-source kernel design reduces the effort required to maintain or extend the code across CPU and GPU platforms.
Where Pith is reading between the lines
- The same pattern of hand-written adjoints could be added to other existing CFD codes to make them compatible with embedded machine-learning models.
- The differentiability opens the door to solving inverse problems that infer boundary data or model parameters from measured flow statistics.
- Extending the memory optimizations to distributed multi-GPU runs would allow even larger grids while preserving the single-source code structure.
Load-bearing premise
The hand-written adjoint kernels compute derivatives that match the discrete operators closely enough that they do not introduce significant errors into the simulation or into the training of any embedded neural network.
What would settle it
Compute the gradient of a simple output functional with respect to an input parameter using the adjoint kernels, then recompute the same gradient with finite differences on an identical small grid; a mismatch larger than discretization error would falsify the claim that the adjoints are accurate.
Figures
read the original abstract
We present IncompressibleNavierStokes.jl, an open-source Julia package for solving the incompressible Navier--Stokes equations on staggered Cartesian grids. The package features matrix-free, hardware-agnostic kernels that are compiled from a single source for multi-threaded CPU or GPU execution, and hand-written adjoint kernels for all discrete operators, enabling efficient reverse-mode automatic differentiation through the entire solver. This differentiability allows neural network closure models to be trained a-posteriori while embedded in a large-eddy simulation. Memory optimizations permit double-precision direct numerical simulations at resolutions up to $840^3$ on a single GPU. The software design, numerical methods, hardware performance, and integration of neural network closure models are described, and results for turbulent channel flow are validated against reference data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces IncompressibleNavierStokes.jl, an open-source Julia package for solving the incompressible Navier-Stokes equations on staggered Cartesian grids. It features matrix-free, hardware-agnostic kernels compiled from a single source for multi-threaded CPU or GPU execution, hand-written adjoint kernels for all discrete operators to enable reverse-mode automatic differentiation through the full solver, memory optimizations supporting double-precision DNS up to 840^3 on a single GPU, and integration of neural network closure models for a-posteriori training in large-eddy simulations, with validation results for turbulent channel flow against reference data.
Significance. If the differentiability and performance claims hold, the work provides a practical open-source platform for accelerated turbulent flow simulations that embeds neural network models directly in the solver loop. The single-source kernel design and high-resolution single-GPU capability are concrete strengths for reproducible CFD research.
major comments (2)
- Abstract: The central claim that hand-written adjoint kernels for all discrete operators enable reliable reverse-mode AD through the solver is load-bearing for the a-posteriori NN training application, yet no verification is reported (e.g., the adjoint consistency test v·(J u) = (Jᵀ v)·u within machine epsilon for random vectors u,v on each operator, or a finite-difference gradient check on the full solver). Without such checks, implementation errors in the adjoints could silently corrupt gradients used for NN closure training.
- Validation section (implied by abstract): The turbulent channel flow results are stated to be validated against reference data, but the manuscript provides no quantitative error metrics, grid resolutions used in the comparison, or assessment of how memory optimizations affect accuracy at high Reynolds numbers; this information is required to substantiate the accuracy claims that underpin the package's utility.
minor comments (1)
- The abstract and methods description would benefit from explicit cross-references to the specific discrete operators (e.g., divergence, gradient, convection) for which adjoints were implemented.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and positive assessment of the work's significance. We address each major comment below, indicating the revisions that will be incorporated to strengthen the manuscript.
read point-by-point responses
-
Referee: Abstract: The central claim that hand-written adjoint kernels for all discrete operators enable reliable reverse-mode AD through the solver is load-bearing for the a-posteriori NN training application, yet no verification is reported (e.g., the adjoint consistency test v·(J u) = (Jᵀ v)·u within machine epsilon for random vectors u,v on each operator, or a finite-difference gradient check on the full solver). Without such checks, implementation errors in the adjoints could silently corrupt gradients used for NN closure training.
Authors: We agree that explicit verification of the adjoint kernels is essential to substantiate the reliability of reverse-mode AD for a-posteriori NN training. The manuscript describes the hand-written adjoints but does not report numerical checks. We will add a dedicated subsection (likely in the Numerical Methods or Software Design section) presenting adjoint consistency tests of the form v·(J u) = (Jᵀ v)·u within machine epsilon for random vectors on each operator, along with a finite-difference gradient check on the full solver for a representative test case. These additions will directly address the concern regarding potential silent errors in the gradients. revision: yes
-
Referee: Validation section (implied by abstract): The turbulent channel flow results are stated to be validated against reference data, but the manuscript provides no quantitative error metrics, grid resolutions used in the comparison, or assessment of how memory optimizations affect accuracy at high Reynolds numbers; this information is required to substantiate the accuracy claims that underpin the package's utility.
Authors: We acknowledge that the current manuscript states validation against reference data without providing the requested quantitative details. We will revise the Validation section to include the specific grid resolutions employed in the comparisons, quantitative error metrics (such as L2 norms of mean velocity profiles and skin-friction coefficients relative to reference DNS), and a short assessment of the impact of the memory optimizations on accuracy, particularly at the higher Reynolds numbers considered. These changes will better support the accuracy claims. revision: yes
Circularity Check
Software implementation paper with no mathematical derivation chain
full rationale
The manuscript is a description of the IncompressibleNavierStokes.jl package, its matrix-free kernels, hand-written adjoints, memory optimizations, and integration with neural-network closures for a-posteriori training. No first-principles derivation, parameter fitting, or predictive claim is presented that could reduce to its own inputs by construction. Validation is performed against external reference data for turbulent channel flow. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results appear in the provided text. The hand-written adjoint kernels are an implementation choice whose correctness is an engineering verification issue rather than a circularity issue in any derivation.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Incompressible Navier-Stokes equations accurately model the fluid flows of interest.
- domain assumption The discrete operators on staggered grids can be differentiated via hand-written adjoints.
Reference graph
Works this paper leans on
-
[1]
IEEEStandardforFloating-PointArithmetic.IEEEStd 754-2019(RevisionofIEEE754-2008),pages1–84,July 2019
work page 2019
-
[2]
Syver Døving Agdestein and Benjamin Sanderse. Dis- cretizefirst,filternext: Learningdivergence-consistent closure models for large-eddy simulation.Journal of ComputationalPhysics,522:113577,February2025
-
[3]
Asci- entificsoftwaresuiteforacceleratedsimulationoftur- bulentflows.preparation,February2026
SyverDøvingAgdesteinandBenjaminSanderse. Asci- entificsoftwaresuiteforacceleratedsimulationoftur- bulentflows.preparation,February2026
-
[4]
Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B. Shah. Julia: A Fresh Approach to Numerical Computing.SIAMReview,February2017. 20
-
[5]
ValentinChuravy,DilumAluthge,AntonSmirnov,Ju- lian Samaroo, James Schloss, Lucas C Wilcox, Simon Byrne, Tim Besard, Maciej Waruszewski, Ali Ramad- han, Meredith, Simeon Schaub, Navid C. Constanti- nou, Jake Bolewski, Max Ng, William Moses, Carsten Bauer, Ben Arthur, Charles Kawczynski, Chris Hill, ChristopherRackauckas,GunnarFarnebäck,Hendrik Ranocha,Jame...
-
[6]
SimonDanischandJuliusKrumbiegel. Makie.jl: Flexi- blehigh-performancedatavisualizationforJulia.Jour- nal of Open Source Software, 6(65):3349, September 2021
work page 2021
-
[7]
Hardware Trends Impacting Floating-PointComputationsInScientificApplications, December2024
JackDongarra,JohnGunnels,HarunBayraktar,Azzam Haidar, and Dan Ernst. Hardware Trends Impacting Floating-PointComputationsInScientificApplications, December2024
-
[8]
FrancisH.HarlowandJ.EddieWelch. NumericalCal- culation of Time-Dependent Viscous Incompressible FlowofFluidwithFreeSurface.ThePhysicsofFluids, 8(12):2182–2189,December1965
-
[9]
Nicholas J. Higham and Theo Mary. Mixed precision algorithmsinnumericallinearalgebra.ActaNumerica, 31:347–414,May2022
-
[10]
Rik Hoekstra and Wouter Edeling. Reduced subgrid scale terms in three-dimensional turbulence.Com- puter Methods in Applied Mechanics and Engineering, 449:118506,February2026
-
[11]
Don’t Unroll Adjoint: Differentiating SSA-FormPrograms,March2019
Michael Innes. Don’t Unroll Adjoint: Differentiating SSA-FormPrograms,March2019
-
[12]
Anna Ivagnes, Toby Van Gastelen, Syver Døving Agdestein,BenjaminSanderse,GiovanniStabile,andGi- anluigiRozza. Anewdata-drivenenergy-stableevolve- filter-relaxmodelforturbulentflowsimulation.Com- puter Methods in Applied Mechanics and Engineering, 450:118654,March2026
-
[13]
Effects of lower floating-point precision on scale-resolving numerical simulations of turbulence
MartinKarp,RonithStanly,TimofeyMukha,LucaGal- imberti, Siavash Toosi, Hang Song, Lisandro Dalcin, SalehRezaeiravesh,NiclasJansson,StefanoMarkidis, MatteoParsani,SanjeebBose,SanjivaLele,andPhilipp Schlatter. Effects of lower floating-point precision on scale-resolving numerical simulations of turbulence. Journal of Computational Physics, 549:114600, March 2026
work page 2026
-
[14]
Adam Paxton, Matthew Chantry, and Tim Palmer
Tom Kimpson, E. Adam Paxton, Matthew Chantry, and Tim Palmer. Climate-change modelling at re- duced floating-point precision with stochastic round- ing.QuarterlyJournaloftheRoyalMeteorologicalSociety, 149(752):843–855,2023
work page 2023
-
[15]
Marius Kurz, Philipp Offenhäuser, Dominic Viola, MichaelResch,andAndreaBeck. Relexi—Ascalable opensourcereinforcementlearningframeworkforhigh- performancecomputing.SoftwareImpacts,14:100422, December2022
-
[16]
The representation of small-scale turbulence innumericalsimulationexperiments
D Lilly. The representation of small-scale turbulence innumericalsimulationexperiments. Technicalreport, UCAR/NCAR,November1966
-
[17]
DouglasK.Lilly. Onthecomputationalstabilityofnu- mericalsolutionsoftime-dependentnon-lineargeophys- icalfluiddynamicsproblems.MonthlyWeatherReview, 93(1):11–25,January1965
-
[18]
F. Nicoud and F. Ducros. Subgrid-Scale Stress Mod- ellingBasedontheSquareoftheVelocityGradientTen- sor.Flow,TurbulenceandCombustion,62(3):183–200, September1999
-
[19]
FranckNicoud,HubertBayaToda,OlivierCabrit,San- jeebBose,andJungilLee. Usingsingularvaluestobuild asubgrid-scalemodelforlargeeddysimulations.Physics ofFluids,23(8):085106,August2011
-
[20]
J.BlairPerot. DiscreteConservationPropertiesofUn- structuredMeshSchemes.AnnualReviewofFluidMe- chanics,43(1):299–318,January2011
-
[21]
B.Sanderse. Energy-conservingRunge–Kuttamethods fortheincompressibleNavier–Stokesequations.Journal ofComputationalPhysics,233:100–131,January2013
-
[22]
B.SanderseandF.X.Trias. Energy-consistentdiscretiza- tion of viscous dissipation with application to natural convectionflow.Computers&Fluids,286:106473,Jan- uary2025
-
[23]
B.(Benjamin)Sanderse. Energy-conservingdiscretiza- tionmethodsfortheincompressibleNavier-Stokesequa- tions:application to the simulation of wind-turbine wakes,2013
work page 2013
-
[24]
J. Smagorinsky. General circulation experiments with the primitive equations: I. The basic experiment. MonthlyWeatherReview,91(3):99–164,March1963
-
[25]
TobiasThornes,PeterDüben,andTimPalmer. Onthe useofscale-dependentprecisioninEarthSystemmod- elling.Quarterly Journal of the Royal Meteorological Society,143(703):897–908,January2017
-
[26]
F.X.Trias,D.Folch,A.Gorobets,andA.Oliva. Build- ing proper invariants for eddy-viscosity subgrid-scale models.PhysicsofFluids,27(6):065103,June2015
-
[27]
F. X. Trias, M. Soria, A. Oliva, and C. D. Pérez- Segarra. Directnumericalsimulationsoftwo-andthree- dimensional turbulent natural convection flows in a differentiallyheatedcavityofaspectratio4.Journalof FluidMechanics,586:259–293,September2007. 21
-
[28]
T.VanGastelen,W.Edeling,andB.Sanderse. Energy- conservingneuralnetworkforturbulenceclosuremod- eling.Journal of Computational Physics, 508:113003, July2024
-
[29]
RoelVerstappen. WhenDoesEddyViscosityDampSub- filterScalesSufficiently?JournalofScientificComputing, 49(1):94–110,October2011
-
[30]
R.W.C.P.VerstappenandA.E.P.Veldman.DirectNumer- icalSimulationofTurbulenceatLowerCosts.Journalof EngineeringMathematics,32(2):143–159,October1997
-
[31]
R.W.C.P. Verstappen and A.E.P. Veldman. Symmetry- preservingdiscretizationofturbulentflow.Journalof ComputationalPhysics,187(1):343–368,May2003
-
[32]
PhysicsofFluids,16(10):3670–3681,October2004
A.W.Vreman.Aneddy-viscositysubgrid-scalemodelfor turbulentshearflow: Algebraictheoryandapplications. PhysicsofFluids,16(10):3670–3681,October2004
-
[33]
A.W.VremanandJ.G.M.Kuerten. Comparisonofdi- rectnumericalsimulationdatabasesofturbulentchan- nel flow at Re𝜏= 180.PhysicsofFluids, 26(1):015102, January2014
-
[34]
Gabriel D. Weymouth and Bernat Font. WaterLily.jl: A differentiable and backend-agnostic Julia solver for incompressible viscous flow around dynamic bodies. ComputerPhysicsCommunications,315:109748,October 2025
work page 2025
-
[35]
Alan A. Wray. Minimal storage time advancement schemes for spectral methods.NASA Ames Research Center,California,ReportNo.MS,202,1990
work page 1990
-
[36]
MaochaoXiao,YuningWang,FelixRodach,BernatFont, MariusKurz,PolSuárez,DiZhou,FranciscoAlcántara- Ávila,TingZhu,JunleLiu,RicardMontalà,JiaweiChen, Jean Rabault, Oriol Lehmkuhl, Andrea Beck, Johan Larsson,RicardoVinuesa,andSergioPirozzoli. Smart- Flow: ACFD-solver-agnosticdeepreinforcementlearn- ing framework for computational fluid dynamics on HPCplatforms,A...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.