
arxiv: 1907.10035 · v1 · pith:LXOX263Mnew · submitted 2019-07-23 · ⚛️ physics.flu-dyn · physics.comp-ph

Unsupervised Machine Learning to Teach Fluid Dynamicists to Think in 15 Dimensions

Pith reviewed 2026-05-24 16:52 UTC · model grok-4.3

classification ⚛️ physics.flu-dyn physics.comp-ph
keywords autoencoder · stratified turbulence · reconstruction error · vertical velocity · bleed-over · rate-of-strain tensor · density gradient · unsupervised learning

The pith

Reconstruction errors from an autoencoder on 15-variable stratified turbulence data indicate that vertical velocity marks key local turbulence features.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

An autoencoder compresses and reconstructs a three-dimensional dataset of stably stratified turbulence (SST) comprising fifteen fluid variables on roughly 6.9 × 10^10 grid points. The method looks for bleed-over, where information from one input variable appears in the reconstruction errors of several output variables. This bleed-over persists when the number of layers in the network is changed. The errors consistently carry spatial information about vertical velocity into most components of the reconstructed rate-of-strain tensor and density gradient. The result matches what fluid dynamicists already know about the role of vertical velocity in SST and is offered as a proof of concept for using reconstruction errors to discover which variables matter in complex flows.

Core claim

An autoencoder is used to compress and then reconstruct three-dimensional stratified turbulence data in order to better understand fluid dynamics by studying the errors in the reconstruction. The original single data set is resolved on approximately 6.9 × 10^10 grid points, and 15 fluid variables in three spatial dimensions are used. By observing flow features that appear in one input variable but then bleed over to multiple output variables, the errors in the reconstruction are shown to include information about the spatial variation of vertical velocity in most of the components of the reconstructed rate-of-strain tensor and density gradient, which suggests that vertical velocity is an important marker for turbulence features of interest in stably stratified turbulence (SST).

What carries the argument

An autoencoder trained on fifteen fluid variables from stratified turbulence, where bleed-over of reconstruction errors from a single input field into multiple output reconstructions identifies physically important variables.
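The compress-then-reconstruct step and the per-variable error it produces can be illustrated with a toy example. This is a minimal sketch under loud assumptions: the paper uses a convolutional autoencoder on three-dimensional fields, whereas the block below swaps in a linear autoencoder (truncated SVD) on synthetic tabular data. The sample count, code size, and hidden-factor count are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
# 15 variables as in the paper; every other number here is a toy choice.
n_samples, n_vars, code_size = 2000, 15, 4

# Synthetic stand-in: 15 correlated variables driven by 6 hidden factors.
latent = rng.standard_normal((n_samples, 6))
mixing = rng.standard_normal((6, n_vars))
x = latent @ mixing + 0.1 * rng.standard_normal((n_samples, n_vars))

# Standardize each variable so no channel dominates the loss by scale alone.
x = (x - x.mean(axis=0)) / x.std(axis=0)

# ASSUMPTION: a linear autoencoder (truncated SVD) stands in for the paper's
# convolutional autoencoder. It gives the best rank-`code_size` linear
# reconstruction in the least-squares sense.
_, _, vt = np.linalg.svd(x, full_matrices=False)
encoder = vt[:code_size].T      # shape (n_vars, code_size)
code = x @ encoder              # compress to `code_size` numbers per sample
x_hat = code @ encoder.T        # reconstruct all 15 variables

# Reconstruction error, kept separately for every variable.
error = x_hat - x
mse_per_variable = (error ** 2).mean(axis=0)
print(np.round(mse_per_variable, 3))
```

The quantity of interest in the paper is not the size of these errors but their spatial structure, so `error` rather than `mse_per_variable` is the object one would inspect for bleed-over.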

If this is right

  • Vertical velocity should be treated as a primary variable when analyzing local regimes in stably stratified turbulence.
  • Reconstruction-error bleed-over provides an unsupervised ranking of which input fields carry the most information about turbulence structures.
  • The same error-analysis procedure can be applied to other large fluid datasets to discover relevant variables without labeled data.
  • Detailed examination of reconstruction errors offers a route to understanding turbulence that supplements conventional statistical and dynamical approaches.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The bleed-over technique could be tested on datasets with controlled changes in stratification strength to check whether vertical velocity remains the dominant marker.
  • Combining the autoencoder outputs with physical constraints on the rate-of-strain tensor might produce reduced-order models that preserve the identified role of vertical velocity.
  • If the same pattern appears in simulations of other turbulence classes, the method would supply a general tool for variable selection across fluid problems.

Load-bearing premise

That systematic bleed-over of reconstruction error from one input variable to multiple output variables reliably indicates the physical importance of that variable rather than arising from network architecture, normalization choices, or the statistics of this single dataset.

What would settle it

Repeat the autoencoder experiment on an independent stratified turbulence dataset. If vertical velocity then produces no consistent bleed-over into the rate-of-strain tensor and density gradient components, the claim that it is an important marker would be falsified.

Figures

Figures reproduced from arXiv: 1907.10035 by D. J. Saunders, E. A. Rietman, G. D. Portwood, S. M. de Bruyn Kops.

Figure 1: Vertical velocity on a horizontal plane in stratified turbulence normalized by
Figure 2: A: Normalized common logarithm of the potential enstrophy on a vertical
Figure 3: Diagram of convolutional autoencoder architecture with two encoding and
Figure 4: Comparison of loss curves for CAE model #1 trained for 10K iterations
Figure 5: Input data and reconstructions by networks of various depth.
Figure 6: Slices of the code tensor activation by networks of various depth. 64 filters
Figure 7: Comparison of the power spectra densities for the input data, a good recon
Original abstract

An autoencoder is used to compress and then reconstruct three-dimensional stratified turbulence data in order to better understand fluid dynamics by studying the errors in the reconstruction. The original single data set is resolved on approximately $6.9\times10^{10}$ grid points, and 15 fluid variables in three spatial dimensions are used, for a total of about $10^{12}$ input quantities in three dimensions. The objective is to understand which of the input variables contains the most relevant information about the local turbulence regimes in stably stratified turbulence (SST). This is accomplished by observing flow features that appear in one input variable but then `bleed over' to multiple output variables. The bleed over is shown to be robust with respect to the number of layers in the autoencoder. In this proof of concept, the errors in the reconstruction include information about the spatial variation of vertical velocity in most of the components of the reconstructed rate-of-strain tensor and density gradient, which suggests that vertical velocity is an important marker for turbulence features of interest in SST. This result is consistent with what fluid dynamicists already understand about SST and, therefore, suggests an approach to understanding turbulence based on more detailed analyses of the reconstruction on errors in an autoencoding algorithm.
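The abstract's figure of about $10^{12}$ input quantities follows directly from the grid size and the variable count. The short check below reproduces it. The storage estimate assumes 4 bytes per value, which is an assumption made here and is not stated in the abstract.

```python
# Back-of-the-envelope check of the dataset size quoted in the abstract.
grid_points = 6.9e10   # approximate number of grid points (from the abstract)
n_variables = 15       # fluid variables per grid point (from the abstract)

total_values = grid_points * n_variables

# ASSUMPTION: single precision, 4 bytes per value; the abstract gives no dtype.
bytes_per_value = 4
total_terabytes = total_values * bytes_per_value / 1e12

print(f"input quantities: {total_values:.3e}")
print(f"storage at {bytes_per_value} B/value: {total_terabytes:.2f} TB")
```

The product is 1.035 × 10^12, consistent with the "about $10^{12}$" stated in the abstract.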

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper presents a proof-of-concept study in which an autoencoder is trained to compress and reconstruct a single large 3D dataset of stably stratified turbulence (SST) involving 15 fluid variables on ~6.9e10 grid points. By examining patterns of reconstruction error, the authors observe that errors associated with vertical velocity systematically appear in multiple components of the reconstructed rate-of-strain tensor and density gradient; they interpret this 'bleed-over' as evidence that vertical velocity is an important marker for local turbulence regimes. The result is stated to be robust to the number of autoencoder layers and consistent with existing fluid-dynamics knowledge, suggesting the method as a general unsupervised route to identifying physically relevant variables in high-dimensional turbulence data.

Significance. If the bleed-over interpretation can be shown to be robust rather than an artifact of architecture or preprocessing, the approach would supply a data-driven, unsupervised technique for ranking the physical importance of input variables in complex flows without requiring labeled regimes or explicit feature engineering. The manuscript already demonstrates the method on an extremely large dataset and notes consistency with domain knowledge, which are positive attributes for a proof-of-concept.

major comments (3)
  1. [Abstract] Abstract and method description: the central claim rests on the observation of systematic 'bleed over' of reconstruction error from vertical velocity into multiple output fields, yet no quantitative definition of bleed-over (e.g., a threshold, correlation measure, or error metric) is supplied, nor is its statistical significance or robustness to data preprocessing reported.
  2. [Abstract] Robustness paragraph (mentioned in abstract): the only robustness test described is variation in the number of layers; the manuscript does not report controls for input normalization/scaling, alternative bottleneck sizes, loss weightings, or control datasets containing known null variables, leaving open the possibility that the observed pattern arises from network inductive bias or single-dataset statistics rather than physical importance.
  3. [Abstract] Interpretation of results: the inference that bleed-over encodes physical relevance for SST turbulence regimes is not supported by any falsification test (e.g., injecting a synthetic null variable or comparing against a shuffled-input baseline), so the mapping from error pattern to 'important marker' remains an untested assumption.
minor comments (1)
  1. [Abstract] Abstract, final sentence: 'analyses of the reconstruction on errors' appears to be a typographical error and should read 'analyses of the reconstruction errors'.
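The first major comment asks for a quantitative definition of bleed-over. One possible form, offered only as a sketch and not taken from the paper, is the absolute Pearson correlation between an input field and the reconstruction-error field of each output variable. The function name `bleed_over_scores`, the label `S_13`, and the synthetic data are all hypothetical.

```python
import numpy as np

def bleed_over_scores(source, errors):
    """Absolute Pearson correlation between one input field and each error field.

    source: array of any shape, e.g. vertical velocity on the grid.
    errors: dict mapping an output-variable name to its reconstruction-error
            field (reconstruction minus input), same shape as `source`.
    """
    s = source.ravel() - source.mean()
    scores = {}
    for name, err in errors.items():
        e = err.ravel() - err.mean()
        denom = np.sqrt((s @ s) * (e @ e))
        scores[name] = float(abs(s @ e) / denom) if denom > 0 else 0.0
    return scores

# Synthetic stand-in data; none of these numbers come from the paper.
rng = np.random.default_rng(0)
w = rng.standard_normal((16, 16, 16))                  # toy "vertical velocity"
errors = {
    "S_13": 0.5 * w + rng.standard_normal(w.shape),    # error carrying w structure
    "null": rng.standard_normal(w.shape),              # error unrelated to w
}
scores = bleed_over_scores(w, errors)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

In this toy setup the error field built from `w` scores well above the unrelated one. A single global correlation discards the spatial localization that the paper inspects visually, so it would complement the error maps rather than replace them.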

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive report. The comments correctly identify areas where the proof-of-concept can be made more rigorous through quantitative metrics and additional controls. We address each point below and will revise the manuscript accordingly.

  1. Referee: [Abstract] Abstract and method description: the central claim rests on the observation of systematic 'bleed over' of reconstruction error from vertical velocity into multiple output fields, yet no quantitative definition of bleed-over (e.g., a threshold, correlation measure, or error metric) is supplied, nor is its statistical significance or robustness to data preprocessing reported.

    Authors: We agree that the absence of a quantitative definition limits the clarity of the central claim. In revision we will introduce an explicit metric (cross-correlation between the vertical-velocity reconstruction-error field and the error fields of the other variables) together with a significance estimate derived from the large sample size. We will also report results under alternative preprocessing choices (different normalizations and scalings) to address robustness to data preparation. revision: yes

  2. Referee: [Abstract] Robustness paragraph (mentioned in abstract): the only robustness test described is variation in the number of layers; the manuscript does not report controls for input normalization/scaling, alternative bottleneck sizes, loss weightings, or control datasets containing known null variables, leaving open the possibility that the observed pattern arises from network inductive bias or single-dataset statistics rather than physical importance.

    Authors: The reported test (variation of layer count) was intended only as an initial check for this proof-of-concept. We accept that a broader set of controls is needed and will add results for alternative bottleneck dimensions, loss weightings, and at least one normalization variant. Full null-variable control datasets are computationally expensive on a grid of roughly 6.9 × 10^10 points (about 10^12 input quantities); we will therefore include a limited but informative control (one synthetic null variable) rather than an exhaustive suite. revision: partial

  3. Referee: [Abstract] Interpretation of results: the inference that bleed-over encodes physical relevance for SST turbulence regimes is not supported by any falsification test (e.g., injecting a synthetic null variable or comparing against a shuffled-input baseline), so the mapping from error pattern to 'important marker' remains an untested assumption.

    Authors: The manuscript presents the bleed-over observation as consistent with established SST phenomenology rather than as a standalone proof of variable importance. To strengthen the mapping we will add a falsification experiment (shuffled vertical-velocity field and/or a synthetic null variable) and show that the systematic bleed-over pattern disappears under these controls. This will be reported in the revised version. revision: yes
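The shuffled-input control mentioned in the third response can be sketched as a permutation test. The block below is a hypothetical illustration on synthetic data, not the authors' procedure: it compares an observed correlation against the correlations obtained after randomly permuting the source field. The helper names and the number of shuffles are assumptions.

```python
import numpy as np

def abs_corr(a, b):
    """Absolute Pearson correlation between two equally sized fields."""
    return float(abs(np.corrcoef(a.ravel(), b.ravel())[0, 1]))

def shuffled_baseline(source, error, n_shuffles=200, seed=0):
    """Scores obtained after destroying the spatial structure of `source`."""
    rng = np.random.default_rng(seed)
    flat = source.ravel()
    return np.array(
        [abs_corr(rng.permutation(flat), error) for _ in range(n_shuffles)]
    )

# Synthetic stand-in data; none of these numbers come from the paper.
rng = np.random.default_rng(1)
w = rng.standard_normal((16, 16, 16))              # toy "vertical velocity"
error = 0.5 * w + rng.standard_normal(w.shape)     # toy error field carrying w

observed = abs_corr(w, error)
baseline = shuffled_baseline(w, error)

# Fraction of shuffles that match or beat the observed score (+1 smoothing).
p_value = (1 + np.sum(baseline >= observed)) / (1 + baseline.size)
print(f"observed={observed:.3f}  shuffled max={baseline.max():.3f}  p<={p_value:.3f}")
```

Point-wise shuffling ignores spatial autocorrelation, which real turbulence fields have in abundance, so on actual data this baseline would likely be too lenient. A block-wise or phase-randomized shuffle may be a fairer control.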

Circularity Check

0 steps flagged

No circularity; interpretation relies on external domain knowledge


The paper trains an autoencoder to reconstruct 15-variable stratified turbulence data and observes bleed-over of reconstruction errors from vertical velocity into rate-of-strain and density-gradient outputs. This pattern is interpreted as evidence of physical importance, but only by noting consistency with existing SST understanding rather than by any internal reduction. No derivation equates a claimed prediction to its own fitted inputs by construction, no self-citation chain supplies a uniqueness theorem, and the autoencoder objective does not smuggle in the final physical conclusion. The central inference therefore remains an external mapping from observed error structure to domain knowledge.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract provides no explicit free parameters, axioms, or invented entities; the method implicitly assumes that the autoencoder architecture and training procedure do not themselves induce the observed bleed-over pattern.


