arxiv: 2509.21196 · v3 · pith:LQJVDKCBnew · submitted 2025-09-25 · 💻 cs.LG · cs.CV

Differential-Integral Neural Operator for Long-Term Turbulence Forecasting

Pith reviewed 2026-05-21 22:15 UTC · model grok-4.3

classification 💻 cs.LG cs.CV
keywords neural operators · turbulence forecasting · long-term prediction · Kolmogorov flow · operator learning · differential operators · integral operators · physics-informed machine learning

The pith

Decomposing turbulence into local differential and global integral operators allows accurate forecasts over hundreds of timesteps.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes a neural operator framework that addresses the problem of error accumulation in long-term turbulence predictions by explicitly separating the local and global components of the dynamics. The method learns a local differential operator using a constrained convolutional network designed to converge to mathematical derivatives, while a Transformer-based integral operator captures non-local interactions through a learned kernel. By modeling these distinct physical structures in parallel branches, the approach maintains physical fidelity in both small-scale vorticity and large-scale energy spectra. A sympathetic reader would care because accurate long-term forecasts are essential for practical applications such as climate modeling and aerospace engineering, where existing deep learning methods typically break down after short times. The decomposition is motivated by first-principles operator analysis rather than purely data-driven fitting.

Core claim

We introduce the Differential-Integral Neural Operator (DINO) that models the evolution of turbulent flows by decomposing the governing operator into a local differential branch and a global integral branch. The differential branch uses a constrained convolutional network that is proven to converge to a derivative operator, while the integral branch employs a Transformer to learn a data-driven global kernel. This physics-informed separation provides stability in autoregressive forecasting, as validated on the 2D Kolmogorov flow where it outperforms prior methods by suppressing error growth over extended periods.

What carries the argument

The parallel differential-integral operator branches, with the differential part implemented via a provably convergent constrained convolution and the integral part via a Transformer-learned kernel.
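
To make the two-branch idea concrete, here is a minimal NumPy sketch of a single forecast step, not the authors' implementation. It assumes a periodic square grid, stands in a fixed 5-point Laplacian stencil for the constrained convolution, and uses one untrained softmax-attention head as the learned global kernel. The names `laplacian_periodic`, `attention_integral`, `dino_like_step` and the parameters `h`, `dt`, `nu` are all hypothetical. The Figure 1 caption describes a sequential arrangement; this sketch follows the parallel wording of the abstract.

```python
import numpy as np

def laplacian_periodic(u, h):
    """Local differential branch (assumed form): fixed 5-point Laplacian stencil."""
    neighbours = (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                  + np.roll(u, 1, 1) + np.roll(u, -1, 1))
    return (neighbours - 4.0 * u) / h**2

def attention_integral(u, w_q, w_k, w_v):
    """Global integral branch (assumed form): single-head softmax attention,
    read as a discretized kernel integral sum_y K(x, y) v(y) over all grid points."""
    x = u.reshape(-1, 1)                          # one scalar feature per grid point
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # shapes (n, d), (n, d), (n, 1)
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)   # numerically stable softmax
    kernel = np.exp(scores)
    kernel /= kernel.sum(axis=1, keepdims=True)   # each row of K sums to 1
    return (kernel @ v).reshape(u.shape)

def dino_like_step(u, h, dt, nu, params):
    """Residual update combining both branches additively (illustrative only)."""
    local_part = nu * laplacian_periodic(u, h)
    global_part = attention_integral(u, *params)
    return u + dt * (local_part + global_part)

# Usage with random, untrained weights on a 16 x 16 grid.
rng = np.random.default_rng(0)
field = rng.normal(size=(16, 16))
weights = (rng.normal(size=(1, 4)), rng.normal(size=(1, 4)), np.array([[1.0]]))
next_field = dino_like_step(field, h=0.1, dt=1e-3, nu=1e-2, params=weights)
```

The additive combination in `dino_like_step` is exactly the design choice the referee's first major comment questions, since advection couples the local and global parts multiplicatively.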

If this is right

  • Suppresses error accumulation over hundreds of timesteps in autoregressive predictions.
  • Maintains high fidelity in vorticity fields and energy spectra.
  • Outperforms state-of-the-art neural operators on the 2D Kolmogorov flow benchmark.
  • Provides a new standard for physically consistent long-range turbulence forecasting.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This separation principle may extend to forecasting other chaotic systems such as atmospheric flows or ocean currents.
  • Combining DINO with traditional physics simulators could create more robust hybrid prediction systems.
  • Applying the method to three-dimensional turbulence datasets would reveal whether the decomposition scales to higher dimensions.

Load-bearing premise

The local dissipative effects and global non-local interactions in turbulence can be learned independently by separate operators without significant interference or violation of physical consistency.
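
For orientation, the standard vorticity form of the 2D incompressible Navier-Stokes equations shows where each kind of term arises. This is textbook material and is not quoted from the paper. The symbols are this note's choice: ω is the vorticity, ψ the stream function, u the velocity, ν the viscosity, and f the curl of the forcing.

```latex
\begin{aligned}
\partial_t \omega + (\mathbf{u}\cdot\nabla)\,\omega &= \nu\,\nabla^{2}\omega + f, \\
\mathbf{u} &= \nabla^{\perp}\psi = (\partial_y \psi,\; -\partial_x \psi), \qquad \nabla^{2}\psi = -\omega .
\end{aligned}
```

The viscous term ν∇²ω is local and differential. Recovering u from ω requires inverting the Laplacian, which is a non-local integral operation. The advection term multiplies the two, and that product is the coupling the premise has to survive.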

What would settle it

If experiments on the 2D Kolmogorov flow show that DINO's prediction errors grow at a similar rate to standard neural operators after 100 or more timesteps, this would indicate that the decomposition does not provide the claimed stability advantage.
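
A test of this kind could be scored roughly as follows. The sketch computes a per-step Pearson correlation between a rollout and the reference trajectory and counts how many leading steps stay above a threshold. The 0.9 threshold echoes the "Corr > 0.9" wording in the Figure 4 caption, but the paper's exact definition of high-correlation time is not given here, so the function names and the metric definition are assumptions.

```python
import numpy as np

def pearson_corr(a, b):
    """Pearson correlation between two fields, flattened over space."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def correlation_curve(pred, truth):
    """Correlation at every rollout step; inputs have shape (steps, H, W)."""
    return np.array([pearson_corr(p, t) for p, t in zip(pred, truth)])

def high_correlation_steps(corr, threshold=0.9):
    """Number of leading steps whose correlation stays at or above the threshold."""
    below = np.nonzero(corr < threshold)[0]
    return int(below[0]) if below.size else len(corr)
```

Comparing `high_correlation_steps` for DINO and for a baseline on the same initial conditions, over several seeds, would show whether the error growth rates actually differ after 100 or more steps.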

Figures

Figures reproduced from arXiv: 2509.21196 by Fan Xu, Fan Zhang, Hao Wu, Kun Wang, Qingsong Wen, Xian Wu, Xiaomeng Huang, Yuan Gao.

Figure 1: Architecture of DINO for a single forecast step. The model employs a sequential refinement pipeline. An initial Lifting Operator maps the input field u_t to a latent space. The model then processes this representation with a Global Corrector, followed by a Local Refiner. The final forecast û_{t+Δt} is produced via a residual update with a skip connection.
Figure 2: Overview of the experimental benchmark datasets. We employ three benchmarks with distinct physical characteristics to comprehensively evaluate model performance: (a) 2D Kolmogorov flow, a statistically stationary forced turbulence, tests for spectral fidelity. (b) 2D isotropic turbulence probes for long-term stability and the accurate modeling of physical dissipation. (c) The Prometheus benchmar…
Figure 3: Qualitative visualization of DINO's performance. (I) In long-term forecasting of 2D Kolmogorov flow (99 steps), DINO maintains high physical fidelity by preserving fine-scale vortices, effectively avoiding catastrophic failures like oversmoothing (FNO) and simulation collapse (SimVP). (II) For out-of-distribution generalization on the Prometheus fire simulation, DINO accurately captures sharp physical fron…
Figure 4: Comprehensive performance of DINO against state-of-the-art models across three distinct benchmarks. (a) High-correlation time on the 2D Kolmogorov flow, a rigorous test for long-term stability. DINO is the only model to maintain accuracy over the full 99-step rollout. (b) Performance on 2D isotropic turbulence, evaluating short-term physical fidelity. DINO sustains the highest accuracy (Corr > 0.9) through…
Figure 5: Enstrophy spectra for 2D Kolmogorov turbulence. DINO's prediction accurately reproduces the spectrum across all wavenumbers, matching the Ground Truth and the theoretical k^{-3} scaling law, which demonstrates superior physical fidelity. In contrast, FNO loses energy at high wavenumbers due to oversmoothing, while SimVP generates non-physical energy artifacts from simulation collapse. (Note. All results are…
Figure 6: Robustness of Geo-DINO on sparse ocean forecasting. (Left) The plot compares 10-day forecast MSE across varying data densities, showing Geo-DINO consistently outperforms baselines. (Center) A 7-day forecast from 50% sparse data demonstrates that DINO's prediction closely matches the ground truth, accurately capturing features like the Kuroshio Current. (Right) The forecast error map confirms high fidelity,…
Original abstract

Accurately forecasting the long-term evolution of turbulence represents a grand challenge in scientific computing and is crucial for applications ranging from climate modeling to aerospace engineering. Existing deep learning methods, particularly neural operators, often fail in long-term autoregressive predictions, suffering from catastrophic error accumulation and a loss of physical fidelity. This failure stems from their inability to simultaneously capture the distinct mathematical structures that govern turbulent dynamics: local, dissipative effects and global, non-local interactions. In this paper, we propose the Differential-Integral Neural Operator (DINO), a novel framework designed from a first-principles approach of operator decomposition. DINO explicitly models the turbulent evolution through parallel branches that learn distinct physical operators: a local differential operator, realized by a constrained convolutional network that provably converges to a derivative, and a global integral operator, captured by a Transformer architecture that learns a data-driven global kernel. This physics-based decomposition endows DINO with exceptional stability and robustness. Through extensive experiments on the challenging 2D Kolmogorov flow benchmark, we demonstrate that DINO significantly outperforms state-of-the-art models in long-term forecasting. It successfully suppresses error accumulation over hundreds of timesteps, maintains high fidelity in both the vorticity fields and energy spectra, and establishes a new benchmark for physically consistent, long-range turbulence forecast.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces the Differential-Integral Neural Operator (DINO), a framework that decomposes turbulent evolution operators into parallel branches: a constrained convolutional network for local differential effects (claimed to converge to a derivative) and a Transformer for global integral effects. It applies this to autoregressive long-term forecasting on the 2D Kolmogorov flow benchmark, claiming significant outperformance over state-of-the-art neural operators through reduced error accumulation over hundreds of timesteps and preserved fidelity in vorticity fields and energy spectra.

Significance. If the decomposition demonstrably captures the required dynamics without missing cross-terms, the approach would represent a meaningful step toward physics-structured neural operators for long-horizon PDE forecasting, offering improved stability for applications in fluid dynamics, climate, and engineering. The explicit local-global separation and the constrained differential branch are strengths that could be leveraged more broadly if validated.

major comments (3)
  1. [Architecture and operator decomposition (likely §3)] The parallel-branch architecture assumes additive separability of local dissipative and global non-local operators. However, the Navier-Stokes advection term (u·∇)u is multiplicative. The manuscript should clarify how the independent differential and integral branches capture this coupling (e.g., via explicit analysis of the learned operators or targeted ablations) rather than relying on the Transformer's attention alone; this is load-bearing for the 'first-principles' stability claim.
  2. [Experimental results and tables] The abstract and results claim outperformance and error suppression over hundreds of timesteps on Kolmogorov flow, yet quantitative error bars, full ablation tables, and complete baseline comparisons are not provided. This prevents rigorous assessment of whether gains exceed those from generic regularization or attention mechanisms.
  3. [Differential operator branch definition] The constrained convolutional branch is described as 'provably convergent to a derivative.' The convergence statement, supporting theorem or derivation, and verification on discretized turbulent fields should be stated explicitly with reference to the relevant equation or appendix.
minor comments (2)
  1. [Abstract and notation] Ensure consistent use of the DINO acronym and notation for the differential and integral branches across the abstract, equations, and figures.
  2. [Figures] Energy spectra and vorticity visualizations would benefit from quantitative metrics (e.g., integrated error norms) alongside qualitative plots.
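
One way to attach a number to the spectra plots, in the spirit of minor comment 2, is sketched below. It bins the Fourier-space enstrophy density of a square periodic vorticity field into integer wavenumber shells and reports a root-mean-square log-spectral error between prediction and reference. The normalization, the shell binning, and the names `enstrophy_spectrum` and `log_spectral_error` are assumptions of this sketch, not the paper's metric.

```python
import numpy as np

def enstrophy_spectrum(omega):
    """Radially binned enstrophy spectrum of a square periodic vorticity field."""
    n = omega.shape[0]
    w_hat = np.fft.fft2(omega) / omega.size        # normalized Fourier coefficients
    density = 0.5 * np.abs(w_hat) ** 2             # enstrophy per Fourier mode
    k1d = np.fft.fftfreq(n, d=1.0 / n)             # integer wavenumbers
    kx, ky = np.meshgrid(k1d, k1d, indexing="ij")
    shell = np.rint(np.hypot(kx, ky)).astype(int)  # nearest integer shell |k|
    spectrum = np.bincount(shell.ravel(), weights=density.ravel())
    return spectrum[: n // 2 + 1]                  # keep shells up to the Nyquist wavenumber

def log_spectral_error(pred, truth, eps=1e-12):
    """RMS difference of log-spectra over shells k >= 1 (k = 0 is the mean)."""
    sp = enstrophy_spectrum(pred)[1:]
    st = enstrophy_spectrum(truth)[1:]
    return float(np.sqrt(np.mean((np.log(sp + eps) - np.log(st + eps)) ** 2)))
```

Working in log space weights the high-wavenumber tail as heavily as the energetic scales, which is where oversmoothing of the kind attributed to FNO in the Figure 5 caption would show up.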

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating planned revisions where appropriate to strengthen the manuscript.

point-by-point responses
  1. Referee: [Architecture and operator decomposition (likely §3)] The parallel-branch architecture assumes additive separability of local dissipative and global non-local operators. However, the Navier-Stokes advection term (u·∇)u is multiplicative. The manuscript should clarify how the independent differential and integral branches capture this coupling (e.g., via explicit analysis of the learned operators or targeted ablations) rather than relying on the Transformer's attention alone; this is load-bearing for the 'first-principles' stability claim.

    Authors: We appreciate the referee pointing out the distinction between additive operator decomposition and the multiplicative nature of the advection term. The differential branch is constrained to approximate local derivatives of the input fields, while the integral Transformer branch operates on features derived from the full state to learn non-local kernels; their parallel outputs are combined to approximate the composite evolution operator, allowing the attention mechanism to encode coupling effects. To make this explicit and substantiate the stability claim, we will add a new subsection with visualizations and quantitative analysis of the learned operators together with targeted ablations that disable one branch or the other and measure impact on advection-dominated regimes. revision: yes

  2. Referee: [Experimental results and tables] The abstract and results claim outperformance and error suppression over hundreds of timesteps on Kolmogorov flow, yet quantitative error bars, full ablation tables, and complete baseline comparisons are not provided. This prevents rigorous assessment of whether gains exceed those from generic regularization or attention mechanisms.

    Authors: We agree that the current presentation lacks sufficient statistical detail and exhaustive comparisons. In the revised manuscript we will augment all reported metrics with error bars (standard deviation over at least five independent random seeds), provide a complete ablation table that systematically removes or replaces each component, and expand the baseline section to include additional recent neural-operator variants so that readers can directly evaluate whether the observed gains are attributable to the differential-integral decomposition. revision: yes

  3. Referee: [Differential operator branch definition] The constrained convolutional branch is described as 'provably convergent to a derivative.' The convergence statement, supporting theorem or derivation, and verification on discretized turbulent fields should be stated explicitly with reference to the relevant equation or appendix.

    Authors: The convolutional kernels are constrained so that their weights correspond to finite-difference coefficients; under standard assumptions on grid spacing the operator converges to the continuous derivative in the appropriate norm. We will insert an explicit statement of this result (including the relevant equation and proof sketch) into Section 3 and add a short verification subsection in the appendix that applies the constrained branch to the discretized vorticity and velocity fields from the 2D Kolmogorov simulations. revision: yes
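
The convergence argument in the third response can be checked numerically on a toy case. The sketch below is an illustration under stated assumptions, not the paper's constrained layer: it treats a 3-point central-difference stencil as a fixed convolution kernel on a periodic 1D grid, verifies the moment conditions that make it a first-derivative approximation, and estimates the observed order of accuracy on sin(x). The helper names are hypothetical.

```python
import numpy as np

def conv1d_periodic(u, stencil):
    """Cross-correlate u with an odd-length stencil using periodic wrap-around."""
    r = len(stencil) // 2
    # np.roll(u, -s)[i] == u[i + s], so this sums c_s * u[i + s] over offsets s.
    return sum(c * np.roll(u, -(j - r)) for j, c in enumerate(stencil))

def stencil_moments(stencil, h, max_order=2):
    """Moments sum_s c_s * (s*h)**m; a first-derivative stencil needs 0, 1, 0."""
    r = len(stencil) // 2
    offsets = (np.arange(len(stencil)) - r) * h
    return [float(np.sum(stencil * offsets**m)) for m in range(max_order + 1)]

def max_derivative_error(n):
    """Max error of the central-difference kernel applied to sin(x) on n points."""
    h = 2 * np.pi / n
    x = h * np.arange(n)
    stencil = np.array([-1.0, 0.0, 1.0]) / (2 * h)
    return np.max(np.abs(conv1d_periodic(np.sin(x), stencil) - np.cos(x)))

# Halving the grid spacing should cut the error by about 4 (second order).
errors = [max_derivative_error(n) for n in (32, 64, 128)]
orders = [float(np.log2(errors[i] / errors[i + 1])) for i in range(2)]
```

By Taylor expansion, a stencil whose zeroth, first, and second moments are 0, 1, and 0 reproduces the first derivative with an error proportional to h². A constraint on learned kernel weights could enforce such moment conditions, though whether the paper does it this way is not stated here.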

Circularity Check

0 steps flagged

No significant circularity in operator decomposition or long-term forecasting claims

full rationale

The paper motivates DINO via a first-principles decomposition of turbulent evolution into parallel local differential (constrained conv net claimed to converge to derivative) and global integral (Transformer kernel) branches, then validates long-term stability on 2D Kolmogorov flow via experiments. No equations or sections reduce the claimed predictions, stability over hundreds of timesteps, or physical fidelity to a fitted parameter chosen from the target data itself, a self-referential definition, or a load-bearing self-citation chain. The architectural separation is an explicit modeling choice whose correctness is tested externally rather than forced by construction.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on the separability of local and global operators and on the provable convergence of the constrained CNN to a derivative; the training hyperparameters are tuned, and the Transformer kernel is learned from data.

free parameters (2)
  • network widths and depths
    Hyperparameters of the convolutional and transformer branches are chosen and optimized during training.
  • training loss weights
    Relative weighting between local and global branches is tuned to data.
axioms (2)
  • domain assumption Constrained convolutional network provably converges to a derivative operator
    Invoked in the abstract as the justification for the local branch.
  • domain assumption Turbulent evolution can be additively decomposed into local dissipative and global non-local components
    Stated as the first-principles motivation for the parallel-branch design.

pith-pipeline@v0.9.0 · 5803 in / 1315 out tokens · 23381 ms · 2026-05-21T22:15:04.179630+00:00 · methodology


