NORi: An ML-Augmented Ocean Boundary Layer Parameterization
Pith reviewed 2026-05-21 18:52 UTC · model grok-4.3
The pith
A physics-based Richardson number closure augmented by neural ODEs captures ocean boundary layer entrainment and remains stable over century-scale integrations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
NORi combines a Richardson-number-dependent local diffusivity and viscosity with neural ODEs that learn the additional entrainment flux needed at the base of the boundary layer. The scheme is trained a posteriori on short (two-day) large-eddy simulation (LES) trajectories. According to the paper, it reproduces the LES entrainment under changing convective intensity, background stratification, rotation, and wind stress; it matches the seasonal cycle at Ocean Weather Station Papa about as well as a conventional k-ε closure; and it remains numerically stable for at least 100 years in a double-gyre configuration with time steps as long as one hour.
What carries the argument
The NORi closure, a hybrid model whose physical base is a Richardson-number-dependent diffusivity and viscosity and whose neural ODE component supplies the non-local entrainment flux across the mixed-layer base.
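The split between a local, Richardson-number-dependent diffusive flux and an added non-local flux can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the piecewise form of `local_diffusivity`, all parameter values, the explicit Euler step, and the hand-written `extra_flux` list (a stand-in for the neural ODE output) are assumptions made here. The gradient Richardson number uses the standard definition Ri = (∂b/∂z) / ((∂u/∂z)² + (∂v/∂z)²).

```python
def richardson_number(dbdz, dudz, dvdz, eps=1e-12):
    """Gradient Richardson number: buoyancy gradient over squared shear."""
    return dbdz / (dudz ** 2 + dvdz ** 2 + eps)


def local_diffusivity(ri, kappa_conv=1.0, kappa_shear=0.1,
                      kappa_bg=1e-5, ri_c=0.25):
    """Hypothetical piecewise Ri-dependent diffusivity in m^2/s.

    The functional form and constants are illustrative assumptions,
    not the closure calibrated in the paper.
    """
    if ri < 0.0:      # unstable column: strong convective mixing
        return kappa_conv
    if ri < ri_c:     # shear-driven mixing tapering to the background value
        return kappa_bg + (kappa_shear - kappa_bg) * (1.0 - ri / ri_c)
    return kappa_bg   # stable column: background mixing only


def step_tracer(T, kappa_faces, nonlocal_flux, dz, dt, surface_flux=0.0):
    """Advance cell-mean tracer values one explicit step in flux form.

    Cells are indexed 0..n-1 from bottom to top, faces 0..n.
    The total face flux is the local diffusive flux plus a non-local term;
    here the non-local term is just a list supplied by the caller.
    """
    n = len(T)
    flux = [0.0] * (n + 1)                 # upward-positive, zero at the bottom
    for i in range(1, n):                  # interior faces only
        gradient = (T[i] - T[i - 1]) / dz
        flux[i] = -kappa_faces[i] * gradient + nonlocal_flux[i]
    flux[n] = surface_flux                 # surface boundary condition
    # The update is a flux divergence, so interior fluxes cancel in the sum.
    return [T[i] - dt * (flux[i + 1] - flux[i]) / dz for i in range(n)]


if __name__ == "__main__":
    T0 = [10.0, 10.5, 11.0, 12.0, 11.8]
    kappa = [local_diffusivity(0.1)] * 6
    extra_flux = [0.0, 0.0, 0.0, 1e-3, 2e-3, 0.0]   # placeholder values
    T1 = step_tracer(T0, kappa, extra_flux, dz=1.0, dt=1.0)
    print("change in column inventory:", sum(T1) - sum(T0))
```

Because both flux contributions enter only through a face-flux divergence, the column inventory changes only through the boundary fluxes, whatever the non-local term returns. That is one simple way in which conservation "by design" can be obtained; the paper may achieve it differently.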
If this is right
- Ocean components of climate models could run the boundary-layer scheme with time steps as long as one hour without losing stability.
- Training data volume and computational cost for developing new closures are sharply reduced because only short, high-resolution segments are required.
- Inference performance on quantities of direct interest (mixed-layer depth, entrainment rate) can be optimized as the primary training objective.
- Tracer conservation and realistic nonlinear thermodynamics are preserved by design rather than enforced after the fact.
Where Pith is reading between the lines
- The same hybrid architecture could be applied to other sub-grid processes such as cloud microphysics or gravity-wave drag where local closures miss non-local transport.
- Because stability emerges from the training process rather than from added constraints, similar methods may shorten the development cycle for parameterizations in other Earth-system components.
- Long integrations without retraining suggest the learned entrainment correction may remain valid across a wider range of ocean states than the original training set explicitly covered.
Load-bearing premise
Neural networks trained only on two-day large-eddy simulation segments will continue to produce physically consistent, non-drifting solutions when integrated for decades or centuries inside a different ocean model.
What would settle it
A multi-decade integration of a global ocean model using NORi that develops a systematic drift in temperature, salinity, or tracer inventories larger than the drift seen in a comparable k-ε simulation would falsify the long-term stability claim.
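The test described above amounts to comparing long-term trends in a conserved inventory between two runs. A minimal sketch of such a comparison follows, assuming the inventories are already available as annual time series; the function names, the least-squares trend as the drift measure, and the `tolerance` argument are illustrative choices made here, not a procedure taken from the paper.

```python
def linear_trend(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    den = sum((t - t_mean) ** 2 for t in times)
    return num / den


def drifts_more(times, candidate, reference, tolerance=0.0):
    """True if the candidate's absolute trend exceeds the reference's.

    `candidate` and `reference` are inventory time series (for example,
    volume-mean temperature per model year) sampled at the same times.
    """
    cand = abs(linear_trend(times, candidate))
    ref = abs(linear_trend(times, reference))
    return cand > ref + tolerance


if __name__ == "__main__":
    # Synthetic series for illustration only; these are not model results.
    years = list(range(10))
    run_a = [4.0 + 0.002 * y for y in years]
    run_b = [4.0 + 0.001 * y for y in years]
    print(drifts_more(years, run_a, run_b))
```

A real assessment would also need an uncertainty estimate on each trend, since internal variability in a multi-decade run can mimic or mask a small systematic drift.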
Original abstract
NORi is a machine learning (ML) parameterization of ocean boundary layer turbulence that is physics-based and augmented with neural networks. NORi stands for neural ordinary differential equations (NODEs) Richardson number (Ri) closure. The physical parameterization is controlled by a Richardson number-dependent diffusivity and viscosity. The neural ODEs are trained to capture the entrainment through the base of the boundary layer, which cannot be represented with a local diffusive closure. The parameterization is trained using large-eddy simulations in an "a posteriori" fashion, where parameters are calibrated with a loss function that explicitly depends on the actual time-integrated variables of interest rather than the instantaneous subgrid fluxes, which are inherently noisy. NORi conserves tracers by design, uses realistic nonlinear thermodynamics, and demonstrates excellent prediction and generalization capabilities in capturing entrainment dynamics under different convective strengths, background stratifications, rotation, and wind forcings. NORi is shown to simulate the seasonal evolution of the boundary layer at Ocean Weather Station Papa with similar performance to the state-of-the-art two-equation $k$-$\epsilon$ closure. When implemented in a double-gyre simulation, it is numerically stable for at least 100 years, despite only being trained on two-day horizons, and can be run with time steps as long as one hour. The highly expressive neural networks, combined with a physically rigorous base closure, prove to be a robust paradigm for designing parameterizations for climate models: data required and training cost are drastically reduced, inference performance can be directly optimized as a primary objective, and numerical stability is implicitly promoted through training.
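The difference between the "a posteriori" loss described in the abstract and a loss on instantaneous tendencies can be illustrated with a toy problem. The sketch below assumes a one-parameter model dT/dt = -kT integrated with forward Euler; the model, the function names, and the mean-squared-error form of both losses are assumptions made for illustration and have nothing to do with the actual NORi equations or training code.

```python
def integrate(k, T0, dt, n_steps):
    """Forward-Euler trajectory of the toy model dT/dt = -k * T."""
    trajectory = [T0]
    for _ in range(n_steps):
        trajectory.append(trajectory[-1] + dt * (-k * trajectory[-1]))
    return trajectory


def a_priori_loss(k, reference, dt):
    """Error in instantaneous tendencies, evaluated on reference states."""
    errors = []
    for T_now, T_next in zip(reference[:-1], reference[1:]):
        reference_tendency = (T_next - T_now) / dt
        errors.append((-k * T_now - reference_tendency) ** 2)
    return sum(errors) / len(errors)


def a_posteriori_loss(k, reference, dt):
    """Error in the time-integrated state after rolling the model forward."""
    model = integrate(k, reference[0], dt, len(reference) - 1)
    return sum((m - r) ** 2 for m, r in zip(model, reference)) / len(reference)


if __name__ == "__main__":
    reference = integrate(0.5, 1.0, 0.1, 20)   # stands in for LES output
    for k in (0.3, 0.5, 0.8):
        print(k, a_priori_loss(k, reference, 0.1),
              a_posteriori_loss(k, reference, 0.1))
```

In the a posteriori case the model's own errors feed back into later states, so the loss penalizes parameter choices that drift or destabilize the rollout. In a real training setup the gradient of this loss would be propagated through the time stepper; the sketch only evaluates the loss.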
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents NORi, a hybrid physics-ML parameterization for ocean boundary layer turbulence. It augments a local Richardson-number-dependent diffusivity and viscosity closure with neural ODEs that are trained a posteriori on short (two-day) LES trajectories to reproduce integrated entrainment at the base of the mixed layer. The approach conserves tracers by design, incorporates nonlinear thermodynamics, and is evaluated for generalization across convective forcing, stratification, rotation, and wind stress. Results include comparable performance to a k-ε closure in seasonal simulations at Ocean Weather Station Papa and numerical stability over 100 years in a double-gyre configuration, despite training only on short horizons and allowing time steps up to one hour.
Significance. If the long-term stability and lack of bias accumulation hold under broader testing, the work offers a concrete example of how a physically grounded base closure combined with targeted neural corrections can achieve data-efficient training, direct optimization of integrated quantities, and implicit numerical stability. This paradigm could meaningfully reduce the data and compute burden for developing hybrid parameterizations suitable for climate models while preserving conservation properties.
major comments (2)
- The central claim that short-horizon a posteriori training on integrated entrainment produces unbiased long-term evolution rests on the 100-year double-gyre stability result and the OWS Papa seasonal match. However, these tests do not directly address possible slow accumulation of entrainment or stratification biases over many turnover times, as the neural correction is non-local and learned rather than derived from an asymptotic limit. A quantitative assessment of mixed-layer depth drift or buoyancy flux bias relative to observations or a reference closure over multi-year periods would be required to support the extrapolation.
- The abstract and results sections report promising generalization across convective strengths, stratifications, rotation, and wind forcings, yet no quantitative error metrics (e.g., RMSE on entrainment rate or mixed-layer depth), baseline comparisons against purely local Ri closures or other ML parameterizations, or ablation studies on the neural ODE component are provided. This makes it difficult to judge the magnitude of improvement attributable to the NODE augmentation versus the underlying physical closure.
minor comments (2)
- Clarify the precise architecture and input features of the neural ODEs (e.g., which variables are passed to the network and how the entrainment flux is injected into the prognostic equations) to allow reproducibility.
- The manuscript would benefit from an explicit statement of the loss function weights and any regularization terms used during a posteriori training, as these choices directly affect the balance between short-term fidelity and long-term stability.
Simulated Author's Rebuttal
We thank the referee for the constructive comments, which help clarify the strengths and limitations of our work. We address each major comment below and indicate revisions to strengthen the manuscript.
point-by-point responses
- Referee: The central claim that short-horizon a posteriori training on integrated entrainment produces unbiased long-term evolution rests on the 100-year double-gyre stability result and the OWS Papa seasonal match. However, these tests do not directly address possible slow accumulation of entrainment or stratification biases over many turnover times, as the neural correction is non-local and learned rather than derived from an asymptotic limit. A quantitative assessment of mixed-layer depth drift or buoyancy flux bias relative to observations or a reference closure over multi-year periods would be required to support the extrapolation.
Authors: We agree that the existing tests, while supportive, do not explicitly quantify potential slow bias accumulation. The 100-year double-gyre run demonstrates stability with no visible drift, and the OWS Papa case matches the reference closure over a seasonal cycle, but a dedicated multi-year bias analysis would provide stronger evidence. We will add quantitative comparisons of mixed-layer depth and buoyancy flux drift relative to the k-ε closure over extended periods in the revised manuscript.
revision: yes
- Referee: The abstract and results sections report promising generalization across convective strengths, stratifications, rotation, and wind forcings, yet no quantitative error metrics (e.g., RMSE on entrainment rate or mixed-layer depth), baseline comparisons against purely local Ri closures or other ML parameterizations, or ablation studies on the neural ODE component are provided. This makes it difficult to judge the magnitude of improvement attributable to the NODE augmentation versus the underlying physical closure.
Authors: We acknowledge that the manuscript emphasizes qualitative generalization and visual agreement rather than explicit error metrics or controlled ablations. To address this, we will incorporate RMSE values for entrainment rates and mixed-layer depth in the generalization experiments, add direct baseline comparisons against the purely local Ri closure, and include ablation studies isolating the neural ODE contribution in the revised results section.
revision: yes
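The error metric requested in this exchange could be computed along the following lines. This is a hedged sketch: the temperature-threshold definition of mixed-layer depth, the 0.2 °C default, and the function names are assumptions chosen for illustration, and the paper may diagnose mixed-layer depth in a different way.

```python
import math


def mixed_layer_depth(depths, temperatures, threshold=0.2):
    """Shallowest depth where temperature is `threshold` below the surface.

    `depths` increase downward (metres, positive) and `temperatures` are
    given at the same levels. If the threshold is never exceeded, the
    deepest level is returned.
    """
    surface = temperatures[0]
    for z, T in zip(depths, temperatures):
        if surface - T > threshold:
            return z
    return depths[-1]


def rmse(predicted, reference):
    """Root-mean-square error between two equally long sequences."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Illustrative profile only; not data from the paper.
    depths = [0.0, 10.0, 20.0, 30.0, 40.0]
    profile = [12.0, 12.0, 11.9, 11.5, 10.0]
    print(mixed_layer_depth(depths, profile))
```

For an ablation, the same `rmse` would be evaluated twice against the LES mixed-layer depths: once for the full scheme and once for the local Ri closure alone, so that the difference isolates the contribution of the neural ODE term.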
Circularity Check
Neural-network weights are fitted a posteriori to LES-integrated entrainment, so the 'prediction' of entrainment dynamics reduces to the training fit for regimes inside the training distribution.
specific steps
- fitted input called prediction [Abstract]
"The neural ODEs are trained to capture the entrainment through the base of the boundary layer, which cannot be represented with a local diffusive closure. The parameterization is trained using large-eddy simulations in an 'a posteriori' fashion, where parameters are calibrated with a loss function that explicitly depends on the actual time-integrated variables of interest rather than the instantaneous subgrid fluxes"
The NN correction is calibrated directly against the time-integrated entrainment signal from the same LES data it is later asked to reproduce; for regimes inside the training distribution the reported 'prediction' of entrainment is therefore the fitted mapping rather than an independent derivation from the Ri base closure.
full rationale
The paper explicitly trains neural ODEs on short LES trajectories using a loss on time-integrated quantities to reproduce entrainment that a local Ri closure cannot capture. This is standard supervised ML augmentation rather than a first-principles derivation, so the entrainment 'prediction' in the coupled model is the learned correction by construction on the training distribution. However, the base Ri-dependent diffusivity/viscosity remains an independent physical closure, conservation is enforced by design, and generalization claims are supported by tests on varied forcings and a 100-year run. No self-citation load-bearing steps or self-definitional reductions appear in the provided text; the central result does not collapse to its inputs by definition.
Axiom & Free-Parameter Ledger
free parameters (1)
- Neural network weights and biases
axioms (1)
- domain assumption: A local Richardson-number-dependent diffusivity and viscosity provides an adequate base representation of turbulent mixing.
invented entities (1)
- Neural ODE component for entrainment: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
The physical parameterization is controlled by a Richardson number-dependent diffusivity and viscosity. The neural ODEs are trained to capture the entrainment through the base of the boundary layer...
- IndisputableMonolith/Foundation/ArrowOfTime.lean · arrow_from_z · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
NORi is numerically stable for at least 100 years of integration time... despite only being trained on 2-day horizons
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.