Unsupervised Machine Learning to Teach Fluid Dynamicists to Think in 15 Dimensions
Pith reviewed 2026-05-24 16:52 UTC · model grok-4.3
The pith
Reconstruction errors from an autoencoder on 15-variable stratified turbulence data indicate that vertical velocity marks key local turbulence features.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
An autoencoder is used to compress and then reconstruct three-dimensional stratified turbulence data in order to better understand fluid dynamics by studying the errors in the reconstruction. The original single data set is resolved on approximately $6.9\times10^{10}$ grid points, and 15 fluid variables in three spatial dimensions are used. By observing flow features that appear in one input variable but then bleed over to multiple output variables, the errors in the reconstruction are shown to include information about the spatial variation of vertical velocity in most of the components of the reconstructed rate-of-strain tensor and density gradient, which suggests that vertical velocity is an important marker for turbulence features of interest in stably stratified turbulence.
What carries the argument
An autoencoder trained on fifteen fluid variables from stratified turbulence, where bleed-over of reconstruction errors from a single input field into multiple output reconstructions identifies physically important variables.
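One way to make the bleed-over idea concrete is sketched below. The source gives no quantitative definition, so the metric here (absolute Pearson correlation between one input field and each output's reconstruction error) is an assumption, and the field names `S13` and `drho_dz` and the toy 1-D data are hypothetical stand-ins for flattened 3-D fields. No autoencoder is trained; the "reconstructions" are constructed by hand to show what the metric would report.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def bleed_over_scores(source, inputs, outputs):
    """|corr| between one input field and each variable's reconstruction error."""
    scores = {}
    for name, x in inputs.items():
        err = [o - i for i, o in zip(x, outputs[name])]
        scores[name] = abs(pearson(source, err))
    return scores

# Toy 1-D "fields" (hypothetical; real data would be flattened 3-D arrays).
k = range(200)
w = [math.sin(0.3 * i) for i in k]  # stand-in for vertical velocity
inputs = {
    "S13": [math.cos(0.11 * i) for i in k],
    "drho_dz": [math.sin(0.07 * i) for i in k],
}
# Hand-made reconstructions: S13 picks up a scaled copy of w (bleed-over),
# while drho_dz receives an error unrelated to w.
outputs = {
    "S13": [s + 0.1 * wi for s, wi in zip(inputs["S13"], w)],
    "drho_dz": [d + 0.05 * math.cos(1.7 * i) for d, i in zip(inputs["drho_dz"], k)],
}

scores = bleed_over_scores(w, inputs, outputs)
print(scores)
```

A score near 1 for `S13` and near 0 for `drho_dz` is what this toy setup is built to produce; on real data the scores would sit somewhere in between and need a baseline to interpret.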
If this is right
- Vertical velocity should be treated as a primary variable when analyzing local regimes in stably stratified turbulence.
- Reconstruction-error bleed-over provides an unsupervised ranking of which input fields carry the most information about turbulence structures.
- The same error-analysis procedure can be applied to other large fluid datasets to discover relevant variables without labeled data.
- Detailed examination of reconstruction errors offers a route to understanding turbulence that supplements conventional statistical and dynamical approaches.
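The ranking mentioned in the list above could be sketched as follows. This is an illustration under assumptions, not the paper's procedure: each candidate input is scored by the mean absolute correlation between that input and the reconstruction errors of all other variables, and candidates are sorted by that score. The function name `rank_sources`, the variable names, and the synthetic fields are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def rank_sources(inputs, outputs):
    """Rank inputs by how strongly they show up in the errors of the other outputs."""
    errors = {n: [o - i for i, o in zip(inputs[n], outputs[n])] for n in inputs}
    ranking = []
    for src, field in inputs.items():
        others = [n for n in inputs if n != src]
        score = sum(abs(pearson(field, errors[n])) for n in others) / len(others)
        ranking.append((src, score))
    return sorted(ranking, key=lambda item: item[1], reverse=True)

# Synthetic 1-D fields (hypothetical stand-ins for three of the 15 variables).
k = range(300)
inputs = {
    "w": [math.sin(0.3 * i) for i in k],
    "u": [math.cos(0.11 * i) for i in k],
    "rho_z": [math.sin(0.07 * i + 1.0) for i in k],
}
# Hand-made reconstructions in which only w leaks into the other outputs.
outputs = {
    "w": list(inputs["w"]),
    "u": [a + 0.1 * b for a, b in zip(inputs["u"], inputs["w"])],
    "rho_z": [a + 0.2 * b for a, b in zip(inputs["rho_z"], inputs["w"])],
}

ranking = rank_sources(inputs, outputs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Averaging over the other outputs is one design choice among several; a maximum or a count of outputs above a threshold would rank differently when bleed-over is concentrated in a few fields.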
Where Pith is reading between the lines
- The bleed-over technique could be tested on datasets with controlled changes in stratification strength to check whether vertical velocity remains the dominant marker.
- Combining the autoencoder outputs with physical constraints on the rate-of-strain tensor might produce reduced-order models that preserve the identified role of vertical velocity.
- If the same pattern appears in simulations of other turbulence classes, the method would supply a general tool for variable selection across fluid problems.
Load-bearing premise
That systematic bleed-over of reconstruction error from one input variable to multiple output variables reliably indicates the physical importance of that variable rather than arising from network architecture, normalization choices, or the statistics of this single dataset.
What would settle it
Repeating the autoencoder experiment on an independent stratified turbulence dataset in which vertical velocity produces no consistent bleed-over into the rate-of-strain tensor and density gradient components would falsify the claim that it is an important marker.
Original abstract
An autoencoder is used to compress and then reconstruct three-dimensional stratified turbulence data in order to better understand fluid dynamics by studying the errors in the reconstruction. The original single data set is resolved on approximately $6.9\times10^{10}$ grid points, and 15 fluid variables in three spatial dimensions are used, for a total of about $10^{12}$ input quantities in three dimensions. The objective is to understand which of the input variables contains the most relevant information about the local turbulence regimes in stably stratified turbulence (SST). This is accomplished by observing flow features that appear in one input variable but then `bleed over' to multiple output variables. The bleed over is shown to be robust with respect to the number of layers in the autoencoder. In this proof of concept, the errors in the reconstruction include information about the spatial variation of vertical velocity in most of the components of the reconstructed rate-of-strain tensor and density gradient, which suggests that vertical velocity is an important marker for turbulence features of interest in SST. This result is consistent with what fluid dynamicists already understand about SST and, therefore, suggests an approach to understanding turbulence based on more detailed analyses of the reconstruction on errors in an autoencoding algorithm.
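The "about $10^{12}$ input quantities" figure in the abstract follows directly from the grid size and the number of variables. The symbols $N_{\text{grid}}$, $N_{\text{vars}}$, and $N_{\text{inputs}}$ are labels introduced here for illustration and do not appear in the source.

```latex
\[
N_{\text{inputs}} = N_{\text{grid}} \times N_{\text{vars}}
\approx \left(6.9\times10^{10}\right) \times 15
= 1.035\times10^{12}
\approx 10^{12}
\]
```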
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a proof-of-concept study in which an autoencoder is trained to compress and reconstruct a single large 3D dataset of stably stratified turbulence (SST) involving 15 fluid variables on ~6.9e10 grid points. By examining patterns of reconstruction error, the authors observe that errors associated with vertical velocity systematically appear in multiple components of the reconstructed rate-of-strain tensor and density gradient; they interpret this 'bleed-over' as evidence that vertical velocity is an important marker for local turbulence regimes. The result is stated to be robust to the number of autoencoder layers and consistent with existing fluid-dynamics knowledge, suggesting the method as a general unsupervised route to identifying physically relevant variables in high-dimensional turbulence data.
Significance. If the bleed-over interpretation can be shown to be robust rather than an artifact of architecture or preprocessing, the approach would supply a data-driven, unsupervised technique for ranking the physical importance of input variables in complex flows without requiring labeled regimes or explicit feature engineering. The manuscript already demonstrates the method on an extremely large dataset and notes consistency with domain knowledge, which are positive attributes for a proof-of-concept.
major comments (3)
- [Abstract] Abstract and method description: the central claim rests on the observation of systematic 'bleed over' of reconstruction error from vertical velocity into multiple output fields, yet no quantitative definition of bleed-over (e.g., a threshold, correlation measure, or error metric) is supplied, nor is its statistical significance or robustness to data preprocessing reported.
- [Abstract] Robustness paragraph (mentioned in abstract): the only robustness test described is variation in the number of layers; the manuscript does not report controls for input normalization/scaling, alternative bottleneck sizes, loss weightings, or control datasets containing known null variables, leaving open the possibility that the observed pattern arises from network inductive bias or single-dataset statistics rather than physical importance.
- [Abstract] Interpretation of results: the inference that bleed-over encodes physical relevance for SST turbulence regimes is not supported by any falsification test (e.g., injecting a synthetic null variable or comparing against a shuffled-input baseline), so the mapping from error pattern to 'important marker' remains an untested assumption.
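The shuffled-input baseline requested above could take roughly the following form. This is a minimal sketch under assumptions: bleed-over is measured as an absolute Pearson correlation (the manuscript defines no metric), the function name `permutation_p_value` and the toy fields are hypothetical, and the error fields are constructed by hand rather than produced by an autoencoder. Note that pointwise shuffling ignores spatial autocorrelation, so on real turbulence data a block-wise or phase-randomized shuffle would likely be more appropriate.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def permutation_p_value(source, err, n_perm=200, seed=0):
    """Fraction of shuffled-source correlations at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(pearson(source, err))
    shuffled = list(source)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys any spatial alignment with err
        if abs(pearson(shuffled, err)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)

# Toy fields (hypothetical): one error field contains w, the other does not.
k = range(200)
w = [math.sin(0.3 * i) for i in k]
err_bleed = [0.1 * wi + 0.02 * math.cos(1.7 * i) for wi, i in zip(w, k)]
err_null = [0.05 * math.cos(1.7 * i) for i in k]

p_bleed = permutation_p_value(w, err_bleed)
p_null = permutation_p_value(w, err_null)
print(f"p (error contains w): {p_bleed:.3f}")
print(f"p (null error field): {p_null:.3f}")
```

If the bleed-over pattern reflected only network inductive bias, one would expect a synthetic null variable to score comparably to vertical velocity under the same test.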
minor comments (1)
- [Abstract] Abstract, final sentence: 'analyses of the reconstruction on errors' appears to be a typographical error and should read 'analyses of the reconstruction errors'.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. The comments correctly identify areas where the proof-of-concept can be made more rigorous through quantitative metrics and additional controls. We address each point below and will revise the manuscript accordingly.
- Referee: [Abstract] Abstract and method description: the central claim rests on the observation of systematic 'bleed over' of reconstruction error from vertical velocity into multiple output fields, yet no quantitative definition of bleed-over (e.g., a threshold, correlation measure, or error metric) is supplied, nor is its statistical significance or robustness to data preprocessing reported.
  Authors: We agree that the absence of a quantitative definition limits the clarity of the central claim. In revision we will introduce an explicit metric (cross-correlation between the vertical-velocity reconstruction-error field and the error fields of the other variables) together with a significance estimate derived from the large sample size. We will also report results under alternative preprocessing choices (different normalizations and scalings) to address robustness to data preparation. revision: yes
- Referee: [Abstract] Robustness paragraph (mentioned in abstract): the only robustness test described is variation in the number of layers; the manuscript does not report controls for input normalization/scaling, alternative bottleneck sizes, loss weightings, or control datasets containing known null variables, leaving open the possibility that the observed pattern arises from network inductive bias or single-dataset statistics rather than physical importance.
  Authors: The reported test (variation of layer count) was intended only as an initial check for this proof-of-concept. We accept that a broader set of controls is needed and will add results for alternative bottleneck dimensions, loss weightings, and at least one normalization variant. Full null-variable control datasets are computationally expensive at this scale (about $6.9\times10^{10}$ grid points and $10^{12}$ input quantities); we will therefore include a limited but informative control (one synthetic null variable) rather than an exhaustive suite. revision: partial
- Referee: [Abstract] Interpretation of results: the inference that bleed-over encodes physical relevance for SST turbulence regimes is not supported by any falsification test (e.g., injecting a synthetic null variable or comparing against a shuffled-input baseline), so the mapping from error pattern to 'important marker' remains an untested assumption.
  Authors: The manuscript presents the bleed-over observation as consistent with established SST phenomenology rather than as a standalone proof of variable importance. To strengthen the mapping we will add a falsification experiment (shuffled vertical-velocity field and/or a synthetic null variable) and show that the systematic bleed-over pattern disappears under these controls. This will be reported in the revised version. revision: yes
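The significance estimate "derived from the large sample size" in the first response deserves some care, because neighbouring grid points in a turbulent field are correlated and the effective number of independent samples can be far smaller than the grid count. The sketch below uses the standard Fisher z-transform to put an approximate 95% interval on a correlation coefficient; the function name `fisher_ci` and the effective sample sizes are hypothetical, and how to estimate an effective sample size for this dataset is left open.

```python
import math

def fisher_ci(r, n_eff, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation r.

    n_eff is the effective number of independent samples (an assumption the
    user must supply; it is not the raw grid-point count for correlated data).
    """
    if n_eff <= 3:
        raise ValueError("n_eff must exceed 3")
    z = math.atanh(r)                      # Fisher z-transform
    half = z_crit / math.sqrt(n_eff - 3)   # half-width in z space
    return math.tanh(z - half), math.tanh(z + half)

# Same observed correlation, two hypothetical effective sample sizes.
lo_big, hi_big = fisher_ci(0.3, 103)
lo_small, hi_small = fisher_ci(0.3, 19)
print(f"n_eff=103: ({lo_big:.3f}, {hi_big:.3f})")
print(f"n_eff=19:  ({lo_small:.3f}, {hi_small:.3f})")
```

With the larger effective sample the interval excludes zero, while with the smaller one it does not, which is why the choice of effective sample size matters more than the nominal grid size.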
Circularity Check
No circularity; interpretation relies on external domain knowledge
The paper trains an autoencoder to reconstruct 15-variable stratified turbulence data and observes bleed-over of reconstruction errors from vertical velocity into rate-of-strain and density-gradient outputs. This pattern is interpreted as evidence of physical importance, but only by noting consistency with existing SST understanding rather than by any internal reduction. No derivation equates a claimed prediction to its own fitted inputs by construction, no self-citation chain supplies a uniqueness theorem, and the autoencoder objective does not smuggle in the final physical conclusion. The central inference therefore remains an external mapping from observed error structure to domain knowledge.