
arxiv: 2511.09847 · v3 · submitted 2025-11-13 · ⚛️ physics.flu-dyn · physics.comp-ph

Data-driven modeling of multiscale phenomena with applications to fluid turbulence

Pith reviewed 2026-05-17 22:56 UTC · model grok-4.3

classification ⚛️ physics.flu-dyn physics.comp-ph
keywords data-driven modeling · multiscale phenomena · fluid turbulence · equivariant models · backscatter · effective field theory · 2D turbulence · closed equations

The pith

Closed equations inferred from 2D turbulence simulation data capture small-scale backscatter without added physical assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a data-driven framework to construct general equivariant models for multiscale phenomena such as fluid turbulence, relying solely on simulation data rather than built-in physical assumptions. Direct numerical simulations of freely decaying two-dimensional incompressible turbulence are used to infer explicit evolution equations for both the large resolved scales and the small modeled scales. The resulting closed system accurately represents the net effect of the small scales on the large ones, including the backscatter of energy from small to large scales. This addresses a longstanding difficulty in turbulence modeling where conventional closures often fail to handle such inverse energy transfers correctly, especially in two dimensions.

Core claim

Direct numerical simulations of freely decaying incompressible turbulence in two dimensions are used to infer an effective field theory that supplies explicit, interpretable evolution equations for both the large (resolved) and small (modeled) scales. The closed system of equations obtained this way accurately describes the effect of small scales on large scales, including backscatter, the transfer of energy from small to large scales that is pronounced in two-dimensional flows.
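To make the notion of backscatter concrete, the following is a minimal sketch of how a local energy flux can be diagnosed from a velocity field. It is not taken from the paper. It assumes the textbook definitions of the subgrid-scale stress, τ_ij = filter(u_i u_j) − ū_i ū_j, and of the flux, Π = −τ_ij S̄_ij, with the sign convention that Π < 0 marks energy moving from small to large scales. The Gaussian filter, the 2π-periodic box, the function names, and the synthetic random flow are all illustrative assumptions.

```python
import numpy as np

L_BOX = 2 * np.pi  # assumed size of the periodic square domain

def wavenumbers(n):
    """Angular wavenumber grids (kx, ky) for an n x n periodic grid."""
    k = 2 * np.pi * np.fft.fftfreq(n, d=L_BOX / n)
    return np.meshgrid(k, k, indexing="ij")

def lowpass(f, delta):
    """Gaussian filter of width delta, applied in Fourier space."""
    kx, ky = wavenumbers(f.shape[0])
    kernel = np.exp(-(kx**2 + ky**2) * delta**2 / 24.0)
    return np.real(np.fft.ifft2(kernel * np.fft.fft2(f)))

def grad(f):
    """Spectral derivatives (df/dx, df/dy) of a periodic field."""
    kx, ky = wavenumbers(f.shape[0])
    fh = np.fft.fft2(f)
    return (np.real(np.fft.ifft2(1j * kx * fh)),
            np.real(np.fft.ifft2(1j * ky * fh)))

def sgs_stress(u, v, delta):
    """Filtered velocity and subgrid stress tau_ij = filter(u_i u_j) - ubar_i ubar_j."""
    ub, vb = lowpass(u, delta), lowpass(v, delta)
    txx = lowpass(u * u, delta) - ub * ub
    txy = lowpass(u * v, delta) - ub * vb
    tyy = lowpass(v * v, delta) - vb * vb
    return ub, vb, txx, txy, tyy

def energy_flux(u, v, delta):
    """Local flux Pi = -tau_ij Sbar_ij (assumed convention: Pi < 0 is backscatter)."""
    ub, vb, txx, txy, tyy = sgs_stress(u, v, delta)
    ux, uy = grad(ub)
    vx, vy = grad(vb)
    sxy = 0.5 * (uy + vx)
    return -(txx * ux + 2.0 * txy * sxy + tyy * vy)

def random_flow(n, seed=0, kmax=8):
    """Synthetic divergence-free flow from a band-limited random streamfunction."""
    rng = np.random.default_rng(seed)
    kx, ky = wavenumbers(n)
    mask = (kx**2 + ky**2) <= kmax**2
    psi = np.real(np.fft.ifft2(np.fft.fft2(rng.standard_normal((n, n))) * mask))
    dpsi_dx, dpsi_dy = grad(psi)
    return dpsi_dy, -dpsi_dx

u, v = random_flow(64)
flux = energy_flux(u, v, delta=L_BOX / 8)
print("fraction of grid points with backscatter:", np.mean(flux < 0))
```

On real DNS data the same diagnostic would be applied to the simulated velocity field rather than to the synthetic one used here.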

What carries the argument

Inference of an equivariant effective field theory from simulation data, supplying explicit evolution equations for both resolved large scales and modeled small scales.
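As an illustration of what equivariance demands of a candidate model term, the sketch below checks rotation equivariance numerically in 2D. A tensor-valued term τ(A), built from the velocity gradient A, should satisfy τ(R A Rᵀ) = R τ(A) Rᵀ for every rotation R. The candidate term A Aᵀ, the counterexample, and all function names are hypothetical choices made for this example and are not taken from the paper.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def candidate_stress(grad_u):
    """Hypothetical term tau_ij = d_k u_i d_k u_j, i.e. A A^T with A_ij = d_j u_i."""
    return matmul(grad_u, transpose(grad_u))

def elementwise_square(grad_u):
    """Counterexample: squaring components is not a tensor operation."""
    return [[x * x for x in row] for row in grad_u]

def is_equivariant(model, grad_u, theta, tol=1e-12):
    """Check model(R A R^T) == R model(A) R^T for a single rotation angle."""
    r = rotation(theta)
    rt = transpose(r)
    lhs = model(matmul(matmul(r, grad_u), rt))
    rhs = matmul(matmul(r, model(grad_u)), rt)
    return all(abs(lhs[i][j] - rhs[i][j]) < tol
               for i in range(2) for j in range(2))

# Trace-free velocity gradient, consistent with incompressibility.
A = [[0.3, -1.2], [0.7, -0.3]]
print(is_equivariant(candidate_stress, A, 0.4))
print(is_equivariant(elementwise_square, A, 0.4))
```

Restricting the library of candidate terms to those that pass such a check is one way a symmetry constraint can shrink the search space before any data are used.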

If this is right

  • The closed system accurately accounts for backscatter in two-dimensional turbulence.
  • No additional physical assumptions or closure hypotheses are required beyond the training data.
  • The inferred equations remain stable and predictive when applied to regimes outside the training simulations.
  • The approach supplies interpretable equations rather than black-box predictions for the subgrid-scale effects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same inference procedure could be tested on three-dimensional turbulence datasets to check whether the resulting models also handle forward energy cascades correctly.
  • The framework might be applied to other multiscale systems such as atmospheric flows or plasma turbulence by supplying appropriate simulation data.
  • If the learned equations prove robust, they could replace conventional subgrid-scale models in large-eddy simulations of engineering flows.

Load-bearing premise

Data from freely decaying 2D turbulence simulations alone is sufficient to infer a general equivariant model that remains accurate and stable when used outside the original training conditions.

What would settle it

Running the inferred equations on a forced 2D turbulence simulation and observing that the predicted energy spectrum deviates significantly from direct numerical simulation results or that the solution becomes unstable would show the claim is false.
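A test of this kind needs a way to compare energy spectra. The sketch below shows one possible version, assuming a 2π-periodic square grid, shell-summed spectra, and a relative L2 deviation over a chosen wavenumber range. The two fields are stand-ins for DNS and model output. The function names and the choice of metric are assumptions, and the threshold for a "significant" deviation is left open.

```python
import numpy as np

def energy_spectrum(u, v):
    """Shell-summed kinetic energy spectrum E(k) on a 2*pi-periodic n x n grid."""
    n = u.shape[0]
    uh = np.fft.fft2(u) / n**2
    vh = np.fft.fft2(v) / n**2
    k = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="ij")
    shell = np.rint(np.sqrt(kx**2 + ky**2)).astype(int)
    mode_energy = 0.5 * (np.abs(uh)**2 + np.abs(vh)**2)
    # Sum the energy of all Fourier modes falling in each integer shell.
    return np.bincount(shell.ravel(), weights=mode_energy.ravel())

def spectrum_deviation(e_model, e_ref, k_max):
    """Relative L2 deviation between two spectra over shells 1..k_max."""
    diff = e_model[1:k_max + 1] - e_ref[1:k_max + 1]
    return np.linalg.norm(diff) / np.linalg.norm(e_ref[1:k_max + 1])

n = 64
x = 2 * np.pi * np.arange(n) / n
X, Y = np.meshgrid(x, x, indexing="ij")

# Stand-in "reference" flow and a deliberately wrong "model" flow.
u_ref, v_ref = np.sin(3 * Y), np.zeros((n, n))
u_mod, v_mod = 0.5 * u_ref, v_ref

e_ref = energy_spectrum(u_ref, v_ref)
e_mod = energy_spectrum(u_mod, v_mod)
print("relative spectrum deviation:", spectrum_deviation(e_mod, e_ref, k_max=n // 3))
```

Summing the returned spectrum over all shells recovers the mean kinetic energy of the field, which is a quick sanity check on the normalization.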

Figures

Figures reproduced from arXiv: 2511.09847 by Brandon Choi, Daniel R. Gurevich, Mateo Reynoso, Matteo Ugliotti, Roman O. Grigoriev.

Figure 1. Representative flow fields used to generate the test data. Shown is the initial vorticity field ...
Figure 2. The energy flux Π describing the flow F2 shown in Figure 1(b) at ...
Figure 3. The accuracy of the evolution equation (11), ...
Original abstract

This paper introduces a novel data driven framework for constructing accurate and general equivariant models of multiscale phenomena which does not rely on specific assumptions about the underlying physics. This framework is illustrated using incompressible fluid turbulence as an example that is representative, practically important, reasonably simple, and exceedingly well studied. We use direct numerical simulations of freely decaying turbulence in two spatial dimensions to infer an effective field theory comprising explicit, interpretable evolution equations for both the large (resolved) and small (modeled) scales. The resulting closed system of equations is capable of accurately describing the effect of small scales, including backscatter -- the flow of energy from small to large scales, which is particularly pronounced in two dimensions -- which is an outstanding challenge that, to our knowledge, no existing alternative successfully tackles.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces a data-driven framework for constructing equivariant effective field theories of multiscale phenomena without specific physics assumptions. It is illustrated on incompressible fluid turbulence by inferring closed evolution equations for resolved and subgrid scales from direct numerical simulations of freely decaying two-dimensional turbulence. The central claim is that the resulting system accurately captures small-scale effects, including energy backscatter, addressing an outstanding challenge unmet by existing closures.

Significance. If the inferred equations prove stable, accurate, and generalizable beyond the training data, the approach could advance turbulence modeling by providing an interpretable, physics-agnostic closure that handles backscatter in two dimensions. The data-driven inference from DNS offers potential for reproducible and falsifiable predictions, though these strengths are not yet demonstrated through quantitative out-of-sample validation.

major comments (2)
  1. [Abstract] The claim that the closed system 'accurately describes the effect of small scales, including backscatter' and 'successfully tackles' an outstanding challenge is not supported by any quantitative validation metrics, error norms, energy spectra comparisons, or stability diagnostics. This absence prevents assessment of the central claim.
  2. [Results / Validation] The inference relies exclusively on freely decaying 2D DNS data. No evidence is presented that the resulting equations remain stable and accurate when integrated in forced turbulence, different initial spectra, or longer times outside the training regime, undermining the assertion of a general equivariant effective field theory.
minor comments (2)
  1. [Methods] Clarify the precise definition of the equivariant operators and the procedure for inferring coefficients to avoid ambiguity in reproducibility.
  2. [Results] Include explicit comparisons to at least one standard subgrid-scale model (e.g., Smagorinsky or dynamic Smagorinsky) in the turbulence results to quantify improvement.
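
The comparison requested in the last comment is informative because of a textbook property of the Smagorinsky closure, sketched here in its standard form and not taken from the manuscript. The sign convention, in which Π > 0 denotes transfer from resolved to subgrid scales, is an assumption of this sketch. With τ^d_ij denoting the trace-free part of the subgrid stress, C_s the model constant, and Δ the filter width:

```latex
\begin{aligned}
\tau^{d}_{ij} &= -2\,(C_s \Delta)^2\,|\bar S|\,\bar S_{ij},
\qquad |\bar S| = \sqrt{2\,\bar S_{kl}\bar S_{kl}},\\[4pt]
\Pi &= -\tau_{ij}\bar S_{ij} = -\tau^{d}_{ij}\bar S_{ij}
\quad (\text{since } \bar S_{kk} = 0 \text{ for incompressible flow}),\\[4pt]
\Pi &= 2\,(C_s \Delta)^2\,|\bar S|\,\bar S_{ij}\bar S_{ij}
     = (C_s \Delta)^2\,|\bar S|^{3} \;\ge\; 0 .
\end{aligned}
```

The flux is therefore non-negative at every point, so the plain Smagorinsky model cannot represent local backscatter. The dynamic variant allows the coefficient to change sign, which is why both are natural baselines for a model that claims to capture backscatter.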

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We have revised the paper to address the concerns regarding quantitative validation and the scope of the demonstrated generalizability. Our point-by-point responses are provided below.

  1. Referee: [Abstract] The claim that the closed system 'accurately describes the effect of small scales, including backscatter' and 'successfully tackles' an outstanding challenge is not supported by any quantitative validation metrics, error norms, energy spectra comparisons, or stability diagnostics. This absence prevents assessment of the central claim.

    Authors: We agree that the abstract would benefit from more explicit quantitative support to allow readers to assess the central claims. In the revised manuscript, we have updated the abstract to reference specific validation metrics, including L2 error norms between the model predictions and DNS data, as well as comparisons of energy spectra that demonstrate the capture of backscatter. We have also added stability diagnostics in the results section showing bounded behavior over long integration times.
    revision: yes

  2. Referee: [Results / Validation] The inference relies exclusively on freely decaying 2D DNS data. No evidence is presented that the resulting equations remain stable and accurate when integrated in forced turbulence, different initial spectra, or longer times outside the training regime, undermining the assertion of a general equivariant effective field theory.

    Authors: The demonstration in the manuscript uses freely decaying turbulence to provide a controlled setting for inferring the effective equations and highlighting the backscatter effect. The framework itself is general due to its equivariant and data-driven nature, independent of specific flow regimes. To address the concern, we have added results for longer integration times beyond the training regime and for different initial spectra, confirming stability and accuracy. A discussion has been included on how the same approach can be applied to forced turbulence, although full validation in that regime is planned for future work.
    revision: partial

Circularity Check

0 steps flagged

Data-driven inference from DNS data yields closed EFT without circular reduction


The paper presents a data-driven framework that infers an effective field theory directly from DNS of freely decaying 2D turbulence to produce explicit evolution equations for resolved and modeled scales. The resulting closed system is asserted to capture backscatter and other small-scale effects. No quoted equations or steps in the provided abstract or context demonstrate that any prediction or first-principles result reduces by construction to the input data or to a self-citation chain. The method is described as physics-agnostic and general, with the derivation chain relying on empirical fitting from simulation data rather than tautological self-definition or renaming of known results. This is a standard data-driven modeling workflow: provided the model is evaluated on out-of-sample tests against external benchmarks, it involves no significant circularity.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on data from direct numerical simulations to determine the model terms. No explicit physics axioms are stated, but implicit assumptions about data representativeness and equivariance are required for the effective theory to hold.

free parameters (1)
  • inferred coefficients in effective equations
    Determined from DNS data to close the system for large and small scales.
axioms (1)
  • domain assumption: DNS data of freely decaying 2D turbulence is representative enough to infer general multiscale dynamics
    Invoked to justify using the simulations as training source for the effective field theory.
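
To illustrate what "coefficients determined from data" can look like in practice, here is a toy sketch of sparse regression over a library of candidate terms. It uses sequentially thresholded least squares, a generic technique chosen for this example. It is not the paper's inference procedure, and the one-dimensional test field, the library, the target equation, and the threshold are all hypothetical.

```python
import numpy as np

def stlsq(theta, y, threshold, n_iter=10):
    """Sequentially thresholded least squares: fit y ~ theta @ c with sparse c."""
    c = np.linalg.lstsq(theta, y, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(c) < threshold
        c[small] = 0.0            # discard terms with negligible coefficients
        keep = ~small
        if not keep.any():
            break
        # Refit using only the surviving library terms.
        c[keep] = np.linalg.lstsq(theta[:, keep], y, rcond=None)[0]
    return c

# Synthetic 1D periodic field with analytically known derivatives.
x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
w = np.sin(x) + 0.5 * np.cos(2 * x)
w_x = np.cos(x) - np.sin(2 * x)
w_xx = -np.sin(x) - 2.0 * np.cos(2 * x)

names = ["w", "w_x", "w_xx", "w*w_x"]
library = np.column_stack([w, w_x, w_xx, w * w_x])

# Hypothetical "true" dynamics plus small measurement noise.
rng = np.random.default_rng(0)
target = 0.1 * w_xx - 1.0 * w * w_x + 1e-3 * rng.standard_normal(x.size)

coeffs = stlsq(library, target, threshold=0.05)
for name, value in zip(names, coeffs):
    print(f"{name:6s} {value: .4f}")
```

In this toy setting the regression keeps only the two terms that generated the data. On real DNS data the quality of the fit depends on the library, the noise level, and how representative the training flows are, which is exactly the assumption recorded in the ledger above.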

pith-pipeline@v0.9.0 · 5445 in / 1210 out tokens · 34286 ms · 2026-05-17T22:56:35.222518+00:00 · methodology


