Data-driven modeling of multiscale phenomena with applications to fluid turbulence
Pith reviewed 2026-05-17 22:56 UTC · model grok-4.3
The pith
Data from 2D turbulence simulations trains closed equations that capture small-scale backscatter without added physics rules.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Direct numerical simulations of freely decaying incompressible turbulence in two dimensions are used to infer an effective field theory that supplies explicit, interpretable evolution equations for both the large (resolved) and small (modeled) scales. The closed system of equations obtained this way accurately describes the effect of small scales on large scales, including backscatter, the transfer of energy from small to large scales that is pronounced in two-dimensional flows.
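To make the notion of backscatter concrete, the sketch below computes the subgrid-scale energy flux Π = -τ_ij S̄_ij from a 2D periodic velocity field and reports the fraction of grid points where Π < 0, that is, where energy flows from small to large scales. It is an illustration only and not the paper's procedure: the Gaussian filter, the 2π-periodic box, the random test field, and the function names (`wavenumbers`, `gaussian_filter`, `ddx`, `sgs_flux`) are all assumptions made here.

```python
import numpy as np

def wavenumbers(n, length=2 * np.pi):
    """Angular wavenumber grids (kx, ky) for an n x n periodic box."""
    k = 2 * np.pi * np.fft.fftfreq(n, d=length / n)
    return np.meshgrid(k, k, indexing="ij")

def gaussian_filter(f, kx, ky, delta):
    """Low-pass filter with the Gaussian transfer function exp(-k^2 delta^2 / 24)."""
    g = np.exp(-(kx**2 + ky**2) * delta**2 / 24.0)
    return np.real(np.fft.ifft2(g * np.fft.fft2(f)))

def ddx(f, k):
    """Spectral derivative of f along the axis whose wavenumber grid is k."""
    return np.real(np.fft.ifft2(1j * k * np.fft.fft2(f)))

def sgs_flux(u, v, delta, length=2 * np.pi):
    """Pointwise flux Pi = -tau_ij * S_ij; Pi < 0 marks backscatter."""
    kx, ky = wavenumbers(u.shape[0], length)
    filt = lambda f: gaussian_filter(f, kx, ky, delta)
    ub, vb = filt(u), filt(v)
    # Subgrid stress: filtered product minus product of filtered fields.
    txx = filt(u * u) - ub * ub
    txy = filt(u * v) - ub * vb
    tyy = filt(v * v) - vb * vb
    # Resolved strain-rate tensor.
    sxx = ddx(ub, kx)
    syy = ddx(vb, ky)
    sxy = 0.5 * (ddx(ub, ky) + ddx(vb, kx))
    return -(txx * sxx + 2.0 * txy * sxy + tyy * syy)

# Hypothetical test field: a smooth random streamfunction (not DNS data).
rng = np.random.default_rng(0)
n = 64
kx, ky = wavenumbers(n)
psi = gaussian_filter(rng.standard_normal((n, n)), kx, ky, delta=0.5)
u, v = ddx(psi, ky), -ddx(psi, kx)  # divergence-free by construction
pi = sgs_flux(u, v, delta=1.0)
frac = float(np.mean(pi < 0))
print("fraction of points with backscatter:", frac)
```

With real DNS snapshots in place of the random field, the same diagnostic applied to a closure's predicted stress gives one direct check of whether the closure reproduces backscatter.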
What carries the argument
Inference of an equivariant effective field theory from simulation data, supplying explicit evolution equations for both resolved large scales and modeled small scales.
If this is right
- The closed system accurately accounts for backscatter in two-dimensional turbulence.
- No additional physical assumptions or closure hypotheses are required beyond the training data.
- The inferred equations remain stable and predictive when applied to regimes outside the training simulations.
- The approach supplies interpretable equations rather than black-box predictions for the subgrid-scale effects.
Where Pith is reading between the lines
- The same inference procedure could be tested on three-dimensional turbulence datasets to check whether the resulting models also handle forward energy cascades correctly.
- The framework might be applied to other multiscale systems such as atmospheric flows or plasma turbulence by supplying appropriate simulation data.
- If the learned equations prove robust, they could replace conventional subgrid-scale models in large-eddy simulations of engineering flows.
Load-bearing premise
Data from freely decaying 2D turbulence simulations alone is sufficient to infer a general equivariant model that remains accurate and stable when used outside the original training conditions.
What would settle it
Running the inferred equations on a forced 2D turbulence simulation and observing that the predicted energy spectrum deviates significantly from direct numerical simulation results or that the solution becomes unstable would show the claim is false.
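The spectrum comparison described above can be sketched as follows. This is a minimal example under stated assumptions: a square grid on a 2π-periodic box, integer wavenumber shells, and a deviation measure (largest absolute log10 ratio between the two spectra) chosen here for illustration. The names `energy_spectrum` and `spectrum_deviation` are hypothetical and do not come from the paper.

```python
import numpy as np

def energy_spectrum(u, v):
    """Shell-summed kinetic energy spectrum E(k) for integer shells k = 0..n/2."""
    n = u.shape[0]
    uh = np.fft.fft2(u) / n**2
    vh = np.fft.fft2(v) / n**2
    e2d = 0.5 * (np.abs(uh) ** 2 + np.abs(vh) ** 2)
    k = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers on a 2*pi box
    kx, ky = np.meshgrid(k, k, indexing="ij")
    shell = np.rint(np.hypot(kx, ky)).astype(int)
    spectrum = np.bincount(shell.ravel(), weights=e2d.ravel())
    return spectrum[: n // 2 + 1]

def spectrum_deviation(u_model, v_model, u_dns, v_dns, eps=1e-30):
    """Largest |log10(E_model / E_dns)| over shells k >= 1.

    eps keeps shells that are empty in both fields from dominating the result.
    """
    e_m = energy_spectrum(u_model, v_model)
    e_d = energy_spectrum(u_dns, v_dns)
    return float(np.max(np.abs(np.log10((e_m[1:] + eps) / (e_d[1:] + eps)))))

# Hypothetical stand-in for a DNS snapshot: a single shear mode u = sin(3y).
n = 32
x = 2 * np.pi * np.arange(n) / n
_, yy = np.meshgrid(x, x, indexing="ij")
u_dns, v_dns = np.sin(3 * yy), np.zeros((n, n))
spec = energy_spectrum(u_dns, v_dns)
dev_same = spectrum_deviation(u_dns, v_dns, u_dns, v_dns)
dev_scaled = spectrum_deviation(10 * u_dns, v_dns, u_dns, v_dns)
print(spec[3], dev_same, dev_scaled)
```

What counts as a "significant" deviation is a judgment call that the source does not quantify; any threshold applied to this measure would be an additional assumption.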
Original abstract
This paper introduces a novel data driven framework for constructing accurate and general equivariant models of multiscale phenomena which does not rely on specific assumptions about the underlying physics. This framework is illustrated using incompressible fluid turbulence as an example that is representative, practically important, reasonably simple, and exceedingly well studied. We use direct numerical simulations of freely decaying turbulence in two spatial dimensions to infer an effective field theory comprising explicit, interpretable evolution equations for both the large (resolved) and small (modeled) scales. The resulting closed system of equations is capable of accurately describing the effect of small scales, including backscatter -- the flow of energy from small to large scales, which is particularly pronounced in two dimensions -- which is an outstanding challenge that, to our knowledge, no existing alternative successfully tackles.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces a data-driven framework for constructing equivariant effective field theories of multiscale phenomena without specific physics assumptions. It is illustrated on incompressible fluid turbulence by inferring closed evolution equations for resolved and subgrid scales from direct numerical simulations of freely decaying two-dimensional turbulence. The central claim is that the resulting system accurately captures small-scale effects, including energy backscatter, addressing an outstanding challenge unmet by existing closures.
Significance. If the inferred equations prove stable, accurate, and generalizable beyond the training data, the approach could advance turbulence modeling by providing an interpretable, physics-agnostic closure that handles backscatter in two dimensions. The data-driven inference from DNS offers potential for reproducible and falsifiable predictions, though these strengths are not yet demonstrated through quantitative out-of-sample validation.
major comments (2)
- [Abstract] The claim that the closed system 'accurately describes the effect of small scales, including backscatter' and 'successfully tackles' an outstanding challenge is not supported by any quantitative validation metrics, error norms, energy spectra comparisons, or stability diagnostics. This absence prevents assessment of the central claim.
- [Results / Validation] The inference relies exclusively on freely decaying 2D DNS data. No evidence is presented that the resulting equations remain stable and accurate when integrated in forced turbulence, different initial spectra, or longer times outside the training regime, undermining the assertion of a general equivariant effective field theory.
minor comments (2)
- [Methods] Clarify the precise definition of the equivariant operators and the procedure for inferring coefficients to avoid ambiguity in reproducibility.
- [Results] Include explicit comparisons to at least one standard subgrid-scale model (e.g., Smagorinsky or dynamic Smagorinsky) in the turbulence results to quantify improvement.
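As a point of reference for the baseline comparison requested above, here is a minimal sketch of the standard Smagorinsky closure in 2D, τ_ij = -2 (C_s Δ)² |S̄| S̄_ij with |S̄| = sqrt(2 S̄_ij S̄_ij). Its energy flux Π = -τ_ij S̄_ij = (C_s Δ)² |S̄|³ is non-negative everywhere, so this closure cannot represent backscatter by construction. The constant `cs=0.17` is a placeholder value, the Taylor-Green test field is hypothetical, and the function names are assumptions of this sketch.

```python
import numpy as np

def strain_rate(u, v, length=2 * np.pi):
    """Strain-rate components (sxx, sxy, syy) via spectral derivatives."""
    n = u.shape[0]
    k = 2 * np.pi * np.fft.fftfreq(n, d=length / n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    d = lambda f, kk: np.real(np.fft.ifft2(1j * kk * np.fft.fft2(f)))
    return d(u, kx), 0.5 * (d(u, ky) + d(v, kx)), d(v, ky)

def smagorinsky_flux(u_bar, v_bar, delta, cs=0.17):
    """Pointwise flux Pi = -tau_ij * S_ij for the Smagorinsky stress."""
    sxx, sxy, syy = strain_rate(u_bar, v_bar)
    s_mag = np.sqrt(2.0 * (sxx**2 + 2.0 * sxy**2 + syy**2))
    nu_t = (cs * delta) ** 2 * s_mag  # eddy viscosity
    txx, txy, tyy = -2.0 * nu_t * sxx, -2.0 * nu_t * sxy, -2.0 * nu_t * syy
    return -(txx * sxx + 2.0 * txy * sxy + tyy * syy)

# Hypothetical resolved field: a Taylor-Green vortex on a 2*pi-periodic box.
n = 32
x = 2 * np.pi * np.arange(n) / n
xx, yy = np.meshgrid(x, x, indexing="ij")
u_bar, v_bar = np.sin(xx) * np.cos(yy), -np.cos(xx) * np.sin(yy)
delta = 4 * (2 * np.pi / n)  # filter width, chosen arbitrarily
flux = smagorinsky_flux(u_bar, v_bar, delta)
print("minimum flux:", flux.min())
```

Applying the backscatter diagnostic to both a learned closure and this baseline on the same DNS snapshots would be one way to quantify the improvement the referee asks about.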
Simulated Author's Rebuttal
We thank the referee for their constructive comments on our manuscript. We have revised the paper to address the concerns regarding quantitative validation and the scope of the demonstrated generalizability. Our point-by-point responses are provided below.
- Referee: [Abstract] The claim that the closed system 'accurately describes the effect of small scales, including backscatter' and 'successfully tackles' an outstanding challenge is not supported by any quantitative validation metrics, error norms, energy spectra comparisons, or stability diagnostics. This absence prevents assessment of the central claim.
  Authors: We agree that the abstract would benefit from more explicit quantitative support to allow readers to assess the central claims. In the revised manuscript, we have updated the abstract to reference specific validation metrics, including L2 error norms between the model predictions and DNS data, as well as comparisons of energy spectra that demonstrate the capture of backscatter. We have also added stability diagnostics in the results section showing bounded behavior over long integration times.
  Revision: yes
- Referee: [Results / Validation] The inference relies exclusively on freely decaying 2D DNS data. No evidence is presented that the resulting equations remain stable and accurate when integrated in forced turbulence, different initial spectra, or longer times outside the training regime, undermining the assertion of a general equivariant effective field theory.
  Authors: The demonstration in the manuscript uses freely decaying turbulence to provide a controlled setting for inferring the effective equations and highlighting the backscatter effect. The framework itself is general due to its equivariant and data-driven nature, independent of specific flow regimes. To address the concern, we have added results for longer integration times beyond the training regime and for different initial spectra, confirming stability and accuracy. A discussion has been included on how the same approach can be applied to forced turbulence, although full validation in that regime is planned for future work.
  Revision: partial
Circularity Check
Data-driven inference from DNS data yields closed EFT without circular reduction
The paper presents a data-driven framework that infers an effective field theory directly from DNS of freely decaying 2D turbulence to produce explicit evolution equations for resolved and modeled scales. The resulting closed system is asserted to capture backscatter and other small-scale effects. No quoted equations or steps in the provided abstract or context demonstrate that any prediction or first-principles result reduces by construction to the input data or to a self-citation chain. The method is described as physics-agnostic and general, with the derivation chain relying on empirical fitting from simulation data rather than tautological self-definition or renaming of known results. This constitutes a standard data-driven modeling workflow that remains self-contained against external benchmarks when out-of-sample tests are performed, yielding no significant circularity.
Axiom & Free-Parameter Ledger
free parameters (1)
- inferred coefficients in effective equations
axioms (1)
- domain assumption: DNS data of freely decaying 2D turbulence is representative for inferring general multiscale dynamics
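To illustrate what "inferred coefficients" can mean in practice, the sketch below fits a sparse set of coefficients to a library of candidate terms using sequentially thresholded least squares. This is a generic technique swapped in for illustration and is not claimed to be the paper's inference procedure; the synthetic data, the threshold, and the function name `stlsq` are all assumptions.

```python
import numpy as np

def stlsq(library, target, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares.

    library: (n_samples, n_terms) matrix of candidate terms.
    target:  (n_samples,) vector to be explained.
    Returns a coefficient vector with small entries set exactly to zero.
    """
    coeffs = np.linalg.lstsq(library, target, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(coeffs) < threshold
        coeffs[small] = 0.0
        keep = ~small
        if not keep.any():
            break
        # Refit using only the terms that survived thresholding.
        coeffs[keep] = np.linalg.lstsq(library[:, keep], target, rcond=None)[0]
    return coeffs

# Synthetic example: the target depends on terms 0 and 2 only.
rng = np.random.default_rng(1)
library = rng.standard_normal((200, 4))
target = 2.0 * library[:, 0] - 0.5 * library[:, 2]
target = target + 0.01 * rng.standard_normal(200)  # small measurement noise
coeffs = stlsq(library, target)
print(coeffs)
```

Each surviving coefficient is a free parameter in the sense of the ledger above; the threshold controls how many terms remain and is itself a modeling choice.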
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "We use direct numerical simulations of freely decaying turbulence in two spatial dimensions to infer an effective field theory comprising explicit, interpretable evolution equations for both the large (resolved) and small (modeled) scales."
- IndisputableMonolith/Foundation/DimensionForcing.lean: alexander_duality_circle_linking
  Relation between the paper passage and the cited Recognition theorem: unclear
  Paper passage: "the resulting closed system of equations is capable of accurately describing the effect of small scales, including backscatter"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] includes variables other than $\bar{u}_i$, $\bar{p}$ and $\tau_{ij}$, generated by the nonlinearity, so that the resulting system of equations is not closed. Construction of a closed system of equations requires identifying (1) proper variables describing small scales, (2) approximate governing equations for these variables, and (3) an approximate constitutive relation for the ...
- [2]
- [3] E. Fradkin, Field Theories of Condensed Matter Physics (Cambridge University Press, 2013)
- [4] S. Chapman and T. G. Cowling, The Mathematical Theory of Non-uniform Gases: An Account of The Kinetic Theory of Viscosity, Thermal Conduction And Diffusion In Gases (Cambridge University Press, 1990)
- [5] L. F. Richardson, Weather Prediction by Numerical Process (Franklin Classics, 1922)
- [6] J. Smagorinsky, General circulation experiments with the primitive equations, Mon. Weather Rev. 91, 99 (1963)
- [7] M. Germano, U. Piomelli, P. Moin, and W. H. Cabot, A dynamic subgrid-scale eddy viscosity model, Phys. Fluids A 3, 1760 (1991)
- [8] D. K. Lilly, A proposed modification of the Germano subgrid-scale closure method, Phys. Fluids A 4, 633 (1992)
- [9] J. Bardina, J. H. Ferziger, and W. C. Reynolds, Improved subgrid-scale models for large-eddy simulation, AIAA PAPER 80-1357 (1980)
- [10] S. Liu, C. Meneveau, and J. Katz, On the properties of similarity subgrid-scale models as deduced from measurements in a turbulent jet, J. Fluid Mech. 275, 83 (1994)
- [11]
- [12] D. Kochkov, J. A. Smith, A. Alieva, Q. Wang, M. P. Brenner, and S. Hoyer, Machine learning–accelerated computational fluid dynamics, Proc. Nat. Acad. Sci. 118, e2101784118 (2021)
- [13] J. Ling, R. Jones, and J. Templeton, Machine learning strategies for systems with invariance properties, J. Comp. Phys. 318, 22 (2016)
- [14] M. Schmelzer, R. Dwight, and P. Cinnella, Data-driven deterministic symbolic regression of nonlinear stress-strain relation for RANS turbulence modelling, in 2018 Fluid Dynamics Conference, p. 2900 (2018)
- [15]
- [16] J. Weatheritt and R. D. Sandberg, The development of algebraic stress models using a novel evolutionary algorithm, Int. J. Heat Fluid Flow 68, 298 (2017)
- [17] M. Reissmann, J. Hasslberger, R. D. Sandberg, and M. Klein, Application of gene expression programming to a-posteriori LES modeling of a Taylor-Green vortex, J. Comp. Phys. 424, 109859 (2021)
- [18] A. Ross, Z. Li, P. Perezhogin, C. Fernandez-Granda, and L. Zanna, Benchmarking of machine learning ocean subgrid parameterizations in an idealized model, JAMES 15, e2022MS003258 (2023)
- [19] M. E. Pessah, C.-K. Chan, and D. Psaltis, The signature of the magnetorotational instability in the Reynolds and Maxwell stress tensors in accretion discs, Mon. Not. R. Astron. Soc. 372, 183 (2006)
- [20] S. B. Pope, Turbulent Flows (Cambridge University Press, 2000)
- [21] P. Sagaut, Large Eddy Simulation for Incompressible Flows: An Introduction (Springer, 2006)
- [22] P. Y. Chou, On velocity correlations and the solutions of the equations of turbulent fluctuation, Q. Appl. Math. 3, 38 (1945)
- [23] R. A. Clark, J. H. Ferziger, and W. C. Reynolds, Evaluation of subgrid-scale models using an accurately simulated turbulent flow, J. Fluid Mech. 91, 1 (1979)
- [24] M. Germano, A proposal for a redefinition of the turbulent stresses in the filtered Navier–Stokes equations, Phys. Fluids A 29, 2323 (1986)
- [25] D. Gurevich, Data-driven inference of symmetry-equivariant models of natural phenomena, Ph.D. thesis, Princeton University (2025)
- [26]
- [27] D. R. Gurevich, M. R. Golden, P. A. Reinbold, and R. O. Grigoriev, Learning fluid physics from highly turbulent data using sparse physics-informed discovery of empirical relations (SPIDER), J. Fluid Mech. 996, A25 (2024)
- [28] C. J. Wareing, A. T. Roy, M. Golden, R. O. Grigoriev, and S. M. Tobias, Data-driven discovery of the equations of turbulent convection, GAFD, 1 (2025)
- [29] G. Boffetta and R. E. Ecke, Two-dimensional turbulence, Ann. Rev. Fluid Mech. 44, 427 (2011)
- [30] C. Meneveau and J. Katz, Scale-invariance and turbulence models for large-eddy simulation, Ann. Rev. Fluid Mech. 32, 1 (2000)
- [31] S. B. Pope, A more general effective-viscosity hypothesis, J. Fluid Mech. 72, 331 (1975)
- [32]
- [33] A. Leonard and G. Winckelmans, A tensor-diffusivity subgrid model for large eddy simulation, in Proc. Isaac Newton Institute Symposium/ERCOFTAC Workshop, p. 147 (1999)
- [34] C. G. Speziale, S. Sarkar, and T. B. Gatski, Modelling the pressure–strain correlation of turbulence: an invariant dynamical systems approach, J. Fluid Mech. 227, 245 (1991)
- [35] https://github.com/google/jax-cfd
- [36] https://github.com/sibirica/PySPIDER
- [37] https://github.com/fnsnad/2dSGS
  SUPPLEMENTARY INFORMATION. The accuracy and resolution of DNS data. Unlike conventional approaches, such as numerical convergence studies, which only quantify the relative accuracy of the numerical solutions obtained at different resolutions, SPIDER provides an absolute measure of the accuracy. For each inferred equati...
- [38] The NGM2 model is well-known to yield Π = 0, while the accuracy of the NGM4 model trails the NGMR model despite reproducing the SGS stress tensor with high precision. This illustrates that high values of $C_\tau$ are a necessary but not sufficient condition for the accuracy of a SGS parameterization. Both the DS and the DM model completely fail to reproduce...