pith. machine review for the scientific record.

arxiv: 2605.07444 · v1 · submitted 2026-05-08 · 💻 cs.CE · cs.AI

Recognition: 2 theorem links · Lean Theorem

Accelerated and data-efficient flow prediction in stirred tanks via physics-informed learning

Authors on Pith: no claims yet

Pith reviewed 2026-05-11 01:53 UTC · model grok-4.3

classification 💻 cs.CE cs.AI
keywords physics-informed neural networks · stirred tanks · flow field prediction · implicit neural representations · RANS simulations · data-efficient learning · tracer transport

The pith

Physics constraints on implicit neural representations improve accuracy and reduce variability in low-data flow predictions for stirred tanks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper generates steady flow datasets from RANS simulations across impeller speeds and liquid heights, then trains implicit neural representations of the velocity and pressure fields. It compares purely data-driven training against versions that add residuals of the steady RANS equations as soft constraints. The central result is that these constraints lower global error and cut run-to-run variability when the training set is small, produce more stable tracer transport, and still allow reasonable interpolation between operating conditions. The benefit shrinks once moderate amounts of data are available, while training becomes more expensive.

Core claim

Implicit neural representations of steady flow fields in an industrial stirred vessel achieve lower mean-squared error and more consistent downstream tracer transport when trained with added residuals of the steady RANS equations, particularly in low-data regimes; the same models also interpolate reasonably across impeller speeds and liquid heights.

What carries the argument

Implicit neural representations of flow fields regularized by soft constraints from the residuals of the steady Reynolds-Averaged Navier-Stokes equations.
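The soft-constraint setup described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: it assumes a 2D steady incompressible flow, a constant effective viscosity `nu_eff`, and a simple coordinate MLP, where the actual work uses 3D stirred-tank fields and a RANS-consistent residual.

```python
# Minimal sketch (assumed setup, not the paper's implementation) of an
# implicit neural representation trained with a soft physics residual.
import torch

torch.manual_seed(0)

# Coordinate-based network: (x, y) -> (u, v, p)
model = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 3),
)
nu_eff = 1e-3  # placeholder effective viscosity (molecular + turbulent)

def physics_residual(xy):
    """Mean squared residual of steady momentum + continuity equations."""
    xy = xy.requires_grad_(True)
    out = model(xy)
    u, v, p = out[:, 0], out[:, 1], out[:, 2]

    def grad(f):
        return torch.autograd.grad(f.sum(), xy, create_graph=True)[0]

    du, dv, dp = grad(u), grad(v), grad(p)
    u_x, u_y = du[:, 0], du[:, 1]
    v_x, v_y = dv[:, 0], dv[:, 1]
    u_xx, u_yy = grad(du[:, 0])[:, 0], grad(du[:, 1])[:, 1]
    v_xx, v_yy = grad(dv[:, 0])[:, 0], grad(dv[:, 1])[:, 1]

    mom_x = u * u_x + v * u_y + dp[:, 0] - nu_eff * (u_xx + u_yy)
    mom_y = u * v_x + v * v_y + dp[:, 1] - nu_eff * (v_xx + v_yy)
    cont = u_x + v_y
    return (mom_x**2 + mom_y**2 + cont**2).mean()

# One illustrative optimization step on stand-in data.
xy_data = torch.rand(256, 2)     # sampled coordinates with reference fields
target = torch.zeros(256, 3)     # stand-in for RANS ground truth
xy_colloc = torch.rand(512, 2)   # collocation points for the residual term
lam = 0.1                        # physics-loss weight (a hyperparameter)

opt = torch.optim.Adam(model.parameters(), lr=1e-3)
opt.zero_grad()
loss = torch.nn.functional.mse_loss(model(xy_data), target) \
       + lam * physics_residual(xy_colloc)
loss.backward()
opt.step()
```

The key design point the paper exploits is that the residual needs no labeled data: collocation points can be sampled anywhere in the domain, which is why the constraint helps most when reference simulations are scarce.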

If this is right

  • Prediction error falls with added training examples but shows clear diminishing returns past moderate dataset sizes.
  • Physics constraints reduce both absolute error and variability across independent training runs when data is scarce.
  • Tracer transport simulations driven by the learned fields remain more stable under the constrained models.
  • The representations support interpolation across different impeller speeds and fill levels without retraining.
  • Training cost rises when physics constraints are active.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The approach could serve as a fast surrogate for real-time process monitoring or optimization loops in chemical engineering plants.
  • Similar constraint strategies may be worth testing on other expensive CFD domains where ground-truth data is limited.
  • The diminishing-returns curve suggests a practical stopping point for dataset generation in future stirred-tank studies.

Load-bearing premise

The RANS simulations supply sufficiently accurate ground truth, and the chosen physics residuals are correctly implemented and strong enough to guide the network without introducing new biases.

What would settle it

On a held-out set of impeller speeds or liquid heights, the physics-constrained model shows equal or higher error than the data-only model, or produces visibly more divergent tracer trajectories.

Figures

Figures reproduced from arXiv: 2605.07444 by David M.J. Tax, Liang Wu, Mahdi Naderibeni.

Figure 1. Tank geometry and four representative velocity magnitude profiles.
Figure 2. Learning curves for MLP, C-MLP, and CM-MLP models.
Figure 3. Predictions made by MLP models (with 10 layers of 100 nodes).
Figure 4. Predictions from the MLP, C-MLP, and CM-MLP models (10 hidden layers).
Figure 5. Effect of training-set size on tracer concentration predictions at the probe.
Original abstract

The simulation of fluid flows is computationally expensive due to the complexity of its governing partial differential equations. Machine learning models offer a potential surrogate, enabling learning from simulations and significantly faster predictions of flow fields. However, these models require large training datasets, which introduces a trade-off between dataset generation cost and predictive accuracy. In this work, we investigate the relationship between the size of the training-set and accuracy of the prediction when learning steady flow fields in an industrial-scale stirred vessel. A data set of steady flows is generated using Reynolds Averaged Navier Stokes (RANS) simulations in a range of realistic operating conditions, including impeller speeds and liquid heights. We train implicit neural representations of flow fields and compare purely data-driven and constrained variants. Model performance is evaluated using global mean squared error (MSE), qualitative spatial comparisons of predicted and reference flow fields, and tracer transport simulations. We find that the prediction error decreases monotonically with increasing training data, but also that it exhibits clear diminishing returns beyond moderate dataset sizes. Physics-based constraints significantly improve accuracy and reduce variability across training runs in low-data regimes, and they lead to more stable tracer-transport behavior. Furthermore, reasonable interpolation can be achieved over different impeller speeds and liquid heights. However, these benefits come with an increase in the complexity of training, and their relative advantage diminishes as the training set grows.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript trains implicit neural representations on RANS-generated steady flow fields in stirred tanks and compares purely data-driven models against physics-constrained variants. It reports that prediction error decreases with dataset size but shows diminishing returns, that physics constraints improve accuracy and reduce run-to-run variability in low-data regimes, and that the constrained models yield more stable downstream tracer transport while permitting reasonable interpolation across impeller speeds and liquid heights.

Significance. If the central claims hold, the work provides evidence that physics-informed implicit representations can serve as data-efficient surrogates for industrial CFD, lowering the cost of generating large RANS datasets while preserving physical consistency in both field predictions and derived transport behavior. The systematic sweep over dataset sizes, use of multiple metrics (global MSE, spatial fields, tracer transport), and explicit comparison of constrained versus unconstrained models are concrete strengths that support the data-efficiency narrative.

major comments (2)
  1. [Physics-informed loss formulation (Methods)] The central claim attributes accuracy gains and reduced variability to the addition of physics-based constraints. However, the training targets are steady RANS fields that embed a turbulence closure. The manuscript must explicitly state (and ideally show the equation for) whether the residual loss enforces the full steady RANS equations, including the modeled Reynolds stresses or effective viscosity, or only the laminar Navier-Stokes residuals. If the latter, the observed improvements may reflect generic regularization rather than consistent physics enforcement, undermining the interpretation that the constrained model is meaningfully physics-informed. This point is load-bearing for the paper's primary conclusion.
  2. [Results, low-data regime] §4 (low-data regime results): The reported reduction in variability across training runs is a key supporting observation, yet no quantitative measure (standard deviation over multiple random seeds, or statistical test) is provided to establish that the difference between constrained and unconstrained models is significant rather than anecdotal.
minor comments (2)
  1. [Abstract] Abstract: The phrase 'reasonable interpolation' over impeller speeds and liquid heights should be accompanied by quantitative error metrics on the held-out operating conditions rather than a qualitative statement.
  2. [Evaluation metrics] Evaluation: Clarify whether the global MSE is computed on all velocity components and pressure, whether fields are normalized, and how the tracer-transport simulations are initialized and integrated.
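The ambiguity flagged in minor comment 2 is easy to make concrete. One unambiguous convention (an assumption for illustration, not the paper's stated definition) is a global MSE over all velocity components and pressure, with each field normalized by its reference standard deviation so no single field dominates:

```python
# Sketch of a normalized global MSE over (u, v, w, p); the field
# values here are synthetic stand-ins, not data from the paper.
import numpy as np

rng = np.random.default_rng(0)
ref = rng.normal(size=(1000, 4))              # columns: u, v, w, p (reference)
pred = ref + 0.1 * rng.normal(size=ref.shape) # a slightly perturbed prediction

sigma = ref.std(axis=0)                       # per-field normalization scale
global_mse = (((pred - ref) / sigma) ** 2).mean()
```

Under this convention a perfect prediction gives 0 and a prediction no better than the field mean gives roughly 1, which makes values comparable across fields and operating conditions.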

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed report. The two major comments identify important points of clarification and evidence that strengthen the manuscript. We address each below and have revised the manuscript to incorporate the requested details.

Point-by-point responses
  1. Referee: [Physics-informed loss formulation (Methods)] The central claim attributes accuracy gains and reduced variability to the addition of physics-based constraints. However, the training targets are steady RANS fields that embed a turbulence closure. The manuscript must explicitly state (and ideally show the equation for) whether the residual loss enforces the full steady RANS equations, including the modeled Reynolds stresses or effective viscosity, or only the laminar Navier-Stokes residuals. If the latter, the observed improvements may reflect generic regularization rather than consistent physics enforcement, undermining the interpretation that the constrained model is meaningfully physics-informed. This point is load-bearing for the paper's primary conclusion.

    Authors: We appreciate this observation and agree that the exact form of the residual must be stated unambiguously. The physics-informed loss enforces the steady RANS equations under the Boussinesq approximation, using the effective viscosity (molecular viscosity plus turbulent viscosity) obtained from the same turbulence closure employed to generate the training data. The residual is formed from the divergence of the total stress tensor together with the continuity equation; no additional source terms from the turbulence transport equations are included. We have added the explicit residual equations as new Equation (5) in the revised Methods section and have clarified the consistency with the data-generating RANS model. This revision removes any ambiguity and supports the interpretation that the constraints are physically aligned rather than generic regularization. revision: yes
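The residual the rebuttal describes (steady RANS under the Boussinesq approximation, with an effective viscosity and continuity) admits one consistent written form. This is a sketch of that form, not a reproduction of the revised manuscript's Equation (5):

```latex
% Continuity and momentum residuals for the mean fields
% \bar{\mathbf{u}}, \bar{p}, with effective viscosity
% \nu_{\mathrm{eff}} = \nu + \nu_t from the same turbulence closure
% used to generate the training data.
R_c = \nabla \cdot \bar{\mathbf{u}},
\qquad
R_m = (\bar{\mathbf{u}} \cdot \nabla)\,\bar{\mathbf{u}}
    + \frac{1}{\rho}\,\nabla \bar{p}
    - \nabla \cdot \left[ \nu_{\mathrm{eff}}
      \left( \nabla \bar{\mathbf{u}} + \nabla \bar{\mathbf{u}}^{\mathsf{T}} \right) \right]
```

Both residuals are driven toward zero as soft penalty terms; no source terms from the turbulence transport equations appear, consistent with the authors' statement.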

  2. Referee: [Results, low-data regime] §4 (low-data regime results): The reported reduction in variability across training runs is a key supporting observation, yet no quantitative measure (standard deviation over multiple random seeds, or statistical test) is provided to establish that the difference between constrained and unconstrained models is significant rather than anecdotal.

    Authors: We agree that a quantitative measure is required. In the revised manuscript we now report the standard deviation of the global MSE across five independent training runs (different random seeds) for both model variants at each dataset size. We have also added a two-sample t-test comparing the means of the constrained and unconstrained models; the difference is statistically significant (p < 0.05) for the two smallest dataset sizes. These statistics appear in the updated §4 text and in a new supplementary table. revision: yes
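The statistics the rebuttal commits to can be sketched directly. The per-seed MSE values below are hypothetical placeholders (the paper's numbers are not reproduced here); the structure is what matters: a spread across seeds per model, and a Welch two-sample t statistic comparing the two means.

```python
# Sketch (not the authors' code) of seed-variability statistics:
# sample std of global MSE across seeds, plus Welch's t-test.
import math

# Hypothetical per-seed global MSE at one dataset size.
mse_plain = [0.042, 0.055, 0.038, 0.061, 0.049]   # data-only model
mse_constr = [0.021, 0.023, 0.020, 0.024, 0.022]  # physics-constrained

def mean(xs):
    return sum(xs) / len(xs)

def var(xs):  # unbiased sample variance
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom."""
    va, vb = var(a) / len(a), var(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df

t, df = welch_t(mse_plain, mse_constr)
spread_plain = math.sqrt(var(mse_plain))
spread_constr = math.sqrt(var(mse_constr))
```

Reporting both the spread and the test statistic, per dataset size, is exactly what converts "reduced variability" from an anecdote into an auditable claim.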

Circularity Check

0 steps flagged

No significant circularity; central claims rest on empirical held-out evaluation against independent RANS data.

Full rationale

The paper generates steady RANS flow fields as ground truth, trains implicit neural representations (data-driven vs. physics-constrained), and evaluates on held-out cases using MSE, field comparisons, and tracer transport. Physics constraints are stated as residuals of the governing equations (standard RANS/NS form) rather than being fitted to or defined from the prediction targets. No load-bearing self-citation chains, no self-definitional loops, and no renaming of known results as new derivations appear in the abstract or described methodology. The evaluation therefore does not reduce to a tautology by the paper's own equations.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The work rests on the standard RANS approximation for generating ground-truth data and on the assumption that implicit neural representations can faithfully encode steady incompressible flow fields; no new free parameters or invented physical entities are introduced beyond typical neural-network hyperparameters.

axioms (2)
  • domain assumption Reynolds-averaged Navier-Stokes equations with appropriate turbulence closure yield sufficiently accurate steady flow fields for the operating conditions studied
    Used to generate all training and test data; invoked in the dataset-generation paragraph of the abstract.
  • domain assumption Implicit neural representations can represent continuous three-dimensional velocity and pressure fields without discretization artifacts
    Core modeling choice stated in the methods description.

pith-pipeline@v0.9.0 · 5541 in / 1470 out tokens · 37114 ms · 2026-05-11T01:53:13.773687+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
