Spatiotemporal decoupled physics-informed Stone-Weierstrass neural operator for long-time prediction of time-dependent parametric PDEs

Guofeng Su; Hongxiang Ma; Lang Qin; Rui Yang; Shan Ding; Yongfu Tian

arxiv: 2605.15754 · v1 · pith:PLEFZJW5new · submitted 2026-05-15 · ⚛️ physics.comp-ph · cs.CE

Pith reviewed 2026-05-19 18:51 UTC · model grok-4.3

classification ⚛️ physics.comp-ph cs.CE

keywords neural operators · physics-informed learning · partial differential equations · long-time prediction · spatiotemporal decoupling · Stone-Weierstrass approximation · time-marching sampling · parametric PDEs

The pith

Encoding spatial and temporal information in separate subnetworks is intended to let a physics-informed neural operator mitigate error accumulation in long-time predictions of parametric PDEs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes the physics-informed Stone-Weierstrass neural operator to solve time-dependent parametric PDEs over long time horizons where standard neural operators lose accuracy and stability. It separates the learning of spatial structure from temporal evolution by using two subnetworks that implement time-invariant spatial basis functions paired with time-varying coefficients, drawing on the Stone-Weierstrass theorem for the approximation guarantee. A time-marching batch-wise sampling method is added to handle memory limits when training across extended time intervals while preserving solution continuity. If the separation works as intended, neural operators become practical for simulating sustained physical processes without the rapid degradation that currently restricts their use.

Core claim

The PI-SWNO architecture encodes spatial information in one subnetwork to produce time-invariant basis functions and temporal information in a second subnetwork to produce time-varying coefficients; their combination approximates the solution operator for time-dependent parametric PDEs. This decoupling, justified by the Stone-Weierstrass theorem, is claimed to structurally limit error accumulation over long intervals. The time-marching batch sampling strategy then enables full-domain training without exceeding memory constraints, yielding continuous and convergent solutions across the entire time span.
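The decoupled representation described above can be sketched as u(x, t; a) ≈ Σ_k c_k(t, a) φ_k(x). The following is a minimal NumPy illustration, not the authors' implementation: the helper names (`init_mlp`, `mlp`, `decoupled_forward`), the layer sizes, the number of basis functions `K`, and the choice to feed the PDE parameters `a` into the temporal subnetwork are all assumptions made for the example, and the weights are random rather than trained.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random weights for a small tanh MLP (illustrative only, untrained)."""
    return [(rng.standard_normal((m, n)) / np.sqrt(m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, z):
    """Apply the MLP to a batch of inputs z with shape (batch, in_dim)."""
    for W, b in params[:-1]:
        z = np.tanh(z @ W + b)
    W, b = params[-1]
    return z @ W + b

def decoupled_forward(spatial_params, temporal_params, x, t, a):
    """Return u with u[i, j] = sum_k c_k(t_i, a) * phi_k(x_j).

    x: (n_x, d) spatial points, t: (n_t,) times, a: (p,) PDE parameters.
    """
    phi = mlp(spatial_params, x)                          # (n_x, K): basis, no time input
    ta = np.column_stack([t, np.tile(a, (len(t), 1))])    # (n_t, 1 + p)
    coeff = mlp(temporal_params, ta)                      # (n_t, K): time-varying coefficients
    return coeff @ phi.T                                  # (n_t, n_x)

rng = np.random.default_rng(0)
K = 8                                                     # number of basis functions (assumed)
spatial_params = init_mlp([1, 32, K], rng)
temporal_params = init_mlp([1 + 3, 32, K], rng)

x = np.linspace(0.0, 1.0, 50)[:, None]                    # 1D spatial grid
t = np.linspace(0.0, 10.0, 200)                           # long time horizon
a = np.array([0.5, -1.0, 2.0])                            # hypothetical PDE parameters
u = decoupled_forward(spatial_params, temporal_params, x, t, a)
```

In this sketch the spatial basis is evaluated once and reused for every time, so extending the horizon only adds rows to the coefficient matrix.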

What carries the argument

Spatiotemporal decoupling realized by two separate subnetworks that learn time-invariant spatial basis functions and time-varying evolution coefficients.

If this is right

Long-time predictions remain accurate without progressive degradation from accumulated approximation errors.
Memory usage during training drops enough to allow full-domain modeling of extended time sequences.
The framework applies directly to families of parametric time-dependent PDEs encountered in physics and engineering.
Training stability improves because the separation removes one source of compounding numerical drift.
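The memory argument behind the time-marching batch-wise sampling can be illustrated with a short sketch. The page does not give the paper's exact procedure, so everything here is an assumption: the function name `time_marching_batches`, the equal-width windows, and uniform random collocation points in a 1D spatial domain.

```python
import numpy as np

def time_marching_batches(t_end, n_windows, n_points, rng, x_low=0.0, x_high=1.0):
    """Yield (window_index, x, t) collocation batches, one time window at a time.

    The generator builds the points of a single window per iteration; adjacent
    windows share an endpoint, which is where a continuity condition could be imposed.
    """
    edges = np.linspace(0.0, t_end, n_windows + 1)
    for k in range(n_windows):
        t = rng.uniform(edges[k], edges[k + 1], size=n_points)
        x = rng.uniform(x_low, x_high, size=n_points)
        yield k, x, t

rng = np.random.default_rng(0)
ranges = []
for k, x, t in time_marching_batches(t_end=10.0, n_windows=5, n_points=1000, rng=rng):
    # A training step on this window's physics-informed loss would go here.
    ranges.append((k, float(t.min()), float(t.max())))
```

Peak memory in this sketch scales with `n_points` per window rather than with the full time span.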

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same spatial-temporal split could be tested inside other neural operator families to check whether the error-mitigation benefit is architecture-specific or more general.
Application to stiff or multi-scale temporal problems would reveal whether the fixed spatial bases still capture the required dynamics without frequent retraining.
Hybrid models that combine this decoupling with classical numerical time-steppers could be examined for further gains in long-horizon accuracy.

Load-bearing premise

The assumption that time-invariant spatial basis functions combined with time-varying coefficients will inherently prevent error accumulation over long time intervals for time-dependent parametric PDEs.

What would settle it

A side-by-side run on a standard benchmark PDE such as the time-dependent Burgers equation or Navier-Stokes, comparing whether prediction error stays bounded or grows much more slowly with the decoupled architecture than with a conventional integrated neural operator when the time horizon is extended by factors of ten or more.
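A comparison of this kind could be scored as follows. This is a hedged sketch: the function names are hypothetical, and the two "predictions" are synthetic stand-ins with prescribed error growth, not results from the paper or from any trained model.

```python
import numpy as np

def relative_l2_per_step(u_pred, u_true):
    """Relative L2 error at each time step; both arrays have shape (n_t, n_x)."""
    num = np.linalg.norm(u_pred - u_true, axis=1)
    den = np.linalg.norm(u_true, axis=1)
    return num / den

def growth_rate(t, err):
    """Slope of log(err) against t, i.e. a fitted exponential growth rate."""
    slope, _ = np.polyfit(t, np.log(err), 1)
    return slope

# Synthetic stand-ins (NOT paper results): a fixed profile with injected errors.
t = np.linspace(0.0, 10.0, 101)
x = np.linspace(0.0, 1.0, 64)
u_true = np.tile(np.sin(np.pi * x), (len(t), 1))
u_model_a = u_true * (1.0 + 1e-3 * np.exp(0.30 * t))[:, None]   # fast error growth
u_model_b = u_true * (1.0 + 1e-3 * np.exp(0.03 * t))[:, None]   # slow error growth

rate_a = growth_rate(t, relative_l2_per_step(u_model_a, u_true))
rate_b = growth_rate(t, relative_l2_per_step(u_model_b, u_true))
```

On real data, the fitted rates for the decoupled and the conventional operator would be compared as the horizon is extended.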

Figures

Figures reproduced from arXiv: 2605.15754 by Guofeng Su, Hongxiang Ma, Lang Qin, Rui Yang, Shan Ding, Yongfu Tian.

**Figure 2.** The schematic diagram of PI-SWNO framework details its core structural mod…

**Figure 3.** The schematic diagram of time-marching batch-wise sampling strategy illustrates…

**Figure 4.** 1D heat conduction equation: The first column shows the randomly sampled…

**Figure 5.** Comparison of long-time ANRL2E growth trends between the baseline PI…

**Figure 6.** Statistical characteristics of mean squared error (MSE) and ANRL2E for PI…

**Figure 7.** 2D heat conduction equation: The first plot shows the randomly sampled source…

**Figure 8.** Comparison of long-time ANRL2E growth trends between the baseline PI…

**Figure 9.** Statistical characteristics of MSE and ANRL2E for PI-DeepONet and PI-SWNO…

**Figure 10.** 1D wave equation: The first column shows the randomly sampled initial con…

**Figure 11.** Comparison of long-time ANRL2E growth trends between the baseline PI…

**Figure 12.** Statistical characteristics of MSE and ANRL2E for PI-DeepONet and PI…

**Figure 13.** 2D wave equation: The first plot shows the randomly sampled initial condition;…

**Figure 14.** Comparison of long-time ANRL2E growth trends between the baseline PI…

**Figure 15.** Statistical characteristics of MSE and ANRL2E for PI-DeepONet and PI…

**Figure 16.** 1D KdV equation: The first column shows the randomly sampled initial con…

**Figure 17.** Comparison of long-time ANRL2E growth trends between the baseline PI…

**Figure 18.** Statistical characteristics of MSE and ANRL2E for PI-DeepONet and PI…

**Figure 19.** 1D Burgers equation: The first column shows the randomly sampled initial…

**Figure 20.** Comparison of long-time ANRL2E growth trends between the baseline PI…

**Figure 21.** Statistical characteristics of MSE and ANRL2E for PI-DeepONet and PI…

**Figure 22.** 2D Burgers equation: The first plot shows the randomly sampled initial…

**Figure 23.** Comparison of long-time ANRL2E growth trends between the baseline PI…

**Figure 24.** Statistical characteristics of MSE and ANRL2E for PI-DeepONet and PI…

**Figure 25.** Ablation study of the time-stepping batch-wise sampling strategy: We validate…

Original abstract

Driven by rapid advances in artificial intelligence and modern GPU computing capabilities, deep learning methods based on the optimization paradigm have provided new pathways to solve spatiotemporal physical problems, whose mathematical core lies in solving partial differential equations (PDEs). As an emerging class of function-space learning methods, neural operators (NOs) have exhibited great potential in efficient PDE solving. However, existing mainstream neural operator frameworks suffer from critical bottlenecks when modeling time-dependent PDEs over long time horizons, including accuracy degradation, insufficient stability, high training costs, and excessive memory consumption, which severely limit their practical deployment. To address these challenges in long-time prediction with neural operators, we propose a novel spatiotemporally decoupled physics-informed neural operator architecture, termed the physics-informed Stone-Weierstrass neural operator (PI-SWNO). The design is theoretically grounded in the decoupling paradigm combining time-invariant spatial basis functions with time-varying evolution coefficients, as well as the Stone-Weierstrass approximation theorem. By encoding spatial and temporal information via two separate subnetworks, the framework structurally mitigates the accumulation of errors over extended time intervals. Furthermore, we introduce a time-marching batch-wise sampling strategy to resolve the memory bottleneck of full-range modeling over extended time spans, ensuring continuity and convergence of full-time-domain solutions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The decoupling and batch sampling tackle memory use in long-time neural operators, but the architecture alone does not bound error growth under general parametric PDE dynamics.

The paper's main move is to split the operator into a time-invariant spatial subnetwork and a separate time-varying coefficient subnetwork, justified by the Stone-Weierstrass theorem, then add a time-marching batch sampler so full trajectories do not have to fit in memory at once. This combination is presented as a direct response to accuracy drop-off and memory limits in existing neural operators for time-dependent parametric PDEs. The practical framing of the sampling strategy is useful; it is a straightforward way to keep training feasible over long horizons without changing the underlying PDE solver much. The authors also keep the physics-informed loss, which aligns with the target application area in computational physics.

The central limitation is that separating the spatial basis from the coefficient evolution does not, by itself, control how small errors in the coefficient network get amplified by the dynamics. Stone-Weierstrass gives a density result on compact sets but supplies no contraction or Lipschitz control on the learned map for the coefficients. In problems with developing fine scales or sensitivity to perturbations, that separation can simply move the accumulation problem rather than remove it. The abstract states the theoretical grounding and the sampling fix but shows no error tables, baseline comparisons, or tests on stiff or chaotic cases, so the stability claim stays unverified at this level of detail.

This is for readers already working on neural operators for engineering-scale time-dependent simulations who need concrete ideas for memory and horizon extension. A serious referee could usefully check the experiments and any supporting bounds once the full derivations and results are in front of them. I would send it to peer review.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the physics-informed Stone-Weierstrass neural operator (PI-SWNO) for long-time prediction of time-dependent parametric PDEs. It proposes a spatiotemporally decoupled architecture that encodes spatial information via time-invariant basis functions in one subnetwork and temporal evolution via time-varying coefficients in a second subnetwork, grounded in the Stone-Weierstrass approximation theorem. A time-marching batch-wise sampling strategy is added to address memory bottlenecks while maintaining continuity of the full-time-domain solution.

Significance. If the decoupling can be shown to control error growth without additional stability assumptions, the approach would offer a practical route to stable long-horizon neural-operator predictions at reduced memory cost, addressing a recognized limitation of existing operator-learning frameworks for evolutionary PDEs.

major comments (2)

[Abstract] The assertion that encoding spatial bases and temporal coefficients in separate subnetworks 'structurally mitigates the accumulation of errors' is not accompanied by a Lipschitz bound, contraction mapping, or stability estimate on the learned coefficient evolution; Stone-Weierstrass supplies only density on compact sets and does not control amplification of coefficient errors under the underlying PDE dynamics.
[Theoretical grounding] The decoupling replaces one source of temporal accumulation with another whose growth rate is not shown to be bounded by the separation alone; for PDEs that develop fine-scale structures or exhibit sensitivity to initial data, a small error in the coefficient subnetwork at step n can still be amplified even if the spatial basis remains fixed.
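
The amplification concern can be made concrete with a standard error recursion. This is an illustration under assumptions that are not taken from the paper: the coefficients are advanced step by step, the exact update F is L-Lipschitz, and the learned update differs from F by at most δ per step.

```latex
% Assumptions (not from the paper):
%   exact coefficients:   c_{n+1} = F(c_n)
%   learned coefficients: \hat c_{n+1} = \hat F(\hat c_n)
%   F is L-Lipschitz and \|\hat F(c) - F(c)\| \le \delta for all c.
\begin{aligned}
e_{n+1} &= \|\hat c_{n+1} - c_{n+1}\|
        \le \|\hat F(\hat c_n) - F(\hat c_n)\| + \|F(\hat c_n) - F(c_n)\|
        \le \delta + L\, e_n, \\
e_n &\le L^n e_0 + \delta\,\frac{L^n - 1}{L - 1} \qquad (L \neq 1).
\end{aligned}
```

For L < 1 the bound saturates at δ/(1 − L); for L > 1 it grows geometrically regardless of how the spatial basis is represented. If the basis functions are orthonormal in L2, the solution error equals the coefficient error e_n, so a fixed basis transmits this growth unchanged.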

minor comments (1)

[Abstract] Including at least one schematic equation or diagram illustrating the two-subnetwork split would clarify the claimed decoupling for readers.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help clarify the theoretical scope of our claims regarding error mitigation in the PI-SWNO framework. We address each major comment below and describe the revisions planned for the manuscript.

Referee: [Abstract] The assertion that encoding spatial bases and temporal coefficients in separate subnetworks 'structurally mitigates the accumulation of errors' is not accompanied by a Lipschitz bound, contraction mapping, or stability estimate on the learned coefficient evolution; Stone-Weierstrass supplies only density on compact sets and does not control amplification of coefficient errors under the underlying PDE dynamics.

Authors: We agree that the manuscript does not supply a Lipschitz bound, contraction mapping, or other stability estimate to rigorously prove control of error growth. The abstract phrasing is motivated by the architectural separation, in which time-invariant spatial basis functions are learned independently of the time-varying coefficients, thereby avoiding repeated spatial approximation errors at each time step. Stone-Weierstrass is invoked only for the density of the basis representation on compact sets. In the revised manuscript we will replace 'structurally mitigates' with 'is designed to mitigate' in the abstract and add a short paragraph in the theoretical section noting that a complete stability analysis under the PDE dynamics remains an open question for future work.
Revision: yes

Referee: [Theoretical grounding] The decoupling replaces one source of temporal accumulation with another whose growth rate is not shown to be bounded by the separation alone; for PDEs that develop fine-scale structures or exhibit sensitivity to initial data, a small error in the coefficient subnetwork at step n can still be amplified even if the spatial basis remains fixed.

Authors: The referee is correct that fixing the spatial basis does not by itself bound the growth rate of errors in the coefficient subnetwork, and that amplification remains possible for sensitive or fine-scale PDEs. The design choice reduces one pathway of error compounding (repeated spatial re-approximation) while leaving the temporal coefficient evolution to be learned; empirical results in the paper indicate improved long-time stability, yet no general bound is derived. We will revise the theoretical grounding section to state this limitation explicitly and to clarify that the separation provides a structural advantage rather than a proven contraction property.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation grounded in external theorem and design choice

The paper's central architecture is presented as a novel combination of time-invariant spatial subnetworks and time-varying coefficient subnetworks, justified by appeal to the standard Stone-Weierstrass density theorem and a decoupling paradigm. No load-bearing step reduces by construction to a fitted parameter, self-citation chain, or renamed input; the error-mitigation claim is an asserted structural property rather than a tautological re-expression of training data. The time-marching batch strategy is a practical implementation detail without circular dependence on the predicted outputs. The derivation chain is therefore self-contained against external mathematical benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the applicability of the Stone-Weierstrass theorem to the decoupled representation and the effectiveness of the time-marching strategy for continuity; no free parameters or invented entities are explicitly named in the abstract.

axioms (2)

[standard math] The Stone-Weierstrass approximation theorem can be used to justify the encoding of spatial and temporal information via separate subnetworks in the neural operator.
Invoked to ground the decoupling paradigm for function approximation in the operator.

[domain assumption] Time-invariant spatial basis functions combined with time-varying coefficients will maintain stability over long time horizons for the PDEs considered.
Core premise of the spatiotemporal decoupling that the architecture depends on.
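
For reference, the standard statement of the theorem and the separable-sum consequence that the first axiom relies on can be written as follows. This is the textbook formulation; the paper's own formulation and notation may differ, and the symbols Ω, T, φ_k, and c_k are chosen here for illustration.

```latex
% Stone-Weierstrass (real version): let X be a compact Hausdorff space and
% A \subset C(X, \mathbb{R}) a subalgebra that contains the constants and
% separates points. Then A is dense in C(X, \mathbb{R}) in the sup norm.
%
% Consequence for X = \Omega \times [0, T] with \Omega \subset \mathbb{R}^d compact:
% finite sums of products form such a subalgebra, so for every
% u \in C(\Omega \times [0, T]) and every \varepsilon > 0 there exist K,
% \phi_k \in C(\Omega) and c_k \in C([0, T]) with
\sup_{(x,t)\,\in\,\Omega \times [0,T]}
\Bigl|\, u(x,t) - \sum_{k=1}^{K} c_k(t)\,\phi_k(x) \Bigr| < \varepsilon .
```

The result guarantees that a separable representation exists for any fixed continuous solution on a compact domain; it says nothing about how K grows with ε or about the stability of a learned coefficient evolution.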

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: The design is theoretically grounded in the decoupling paradigm combining time-invariant spatial basis functions with time-varying evolution coefficients, as well as the Stone-Weierstrass approximation theorem.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: non-decreasing theorem of fitting error for fixed-parameter neural operators

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · 5 internal anchors

[1]

A. M. Vargas, Finite difference method for solving fractional differential equations at irregular meshes, Mathematics and Computers in Simulation 193 (2022) 204–216.doi:https: //doi.org/10.1016/j.matcom.2021.10.010. URLhttps://www.sciencedirect.com/science/article/pii/ S037847542100361X

work page doi:10.1016/j.matcom.2021.10.010 2022
[2]

Kergrene, I

K. Kergrene, I. Babuška, U. Banerjee, Stable generalized finite element method and associated iterative schemes; application to interface prob- lems, Computer Methods in Applied Mechanics and Engineering 305 (2016) 1–36.doi:https://doi.org/10.1016/j.cma.2016.02.030. URLhttps://www.sciencedirect.com/science/article/pii/ S0045782516300603

work page doi:10.1016/j.cma.2016.02.030 2016
[3]

Buchmüller, J

P. Buchmüller, J. Dreher, C. Helzel, Finite volume weno methods for hyperbolic conservation laws on cartesian grids with adaptive mesh refinement, Applied Mathematics and Computation 272 (2016) 460–478.doi:https://doi.org/10.1016/j.amc.2015.03.078. URLhttps://www.sciencedirect.com/science/article/pii/ S0096300315003926

work page doi:10.1016/j.amc.2015.03.078 2016
[4]

Karniadakis

M. Raissi, P. Perdikaris, G. Karniadakis, Physics-informed neu- ral networks: A deep learning framework for solving forward 60 and inverse problems involving nonlinear partial differential equa- tions, Journal of Computational Physics 378 (2019) 686–707. doi:https://doi.org/10.1016/j.jcp.2018.10.045. URLhttps://www.sciencedirect.com/science/article/pii/ S...

work page doi:10.1016/j.jcp.2018.10.045 2019
[5]

Z. Li, K. Meidani, A. B. Farimani, Transformer for partial differen- tial equations’ operator learning, Transactions on Machine Learning Re- search (2023). URLhttps://openreview.net/forum?id=EPPqt3uERT

work page 2023
[6]

N. T. Mücke, S. M. Bohté, C. W. Oosterlee, Reduced order modeling for parameterized time-dependent pdes using spatially and memory aware deep learning, Journal of Computational Science 53 (2021) 101408. doi:https://doi.org/10.1016/j.jocs.2021.101408. URLhttps://www.sciencedirect.com/science/article/pii/ S1877750321000934

work page doi:10.1016/j.jocs.2021.101408 2021
[7]

L. Lu, P. Jin, G. Pang, Z. Zhang, G. E. Karniadakis, Learning nonlinear operators via deeponet based on the universal approximation theorem of operators, Nature Machine Intelligence 3 (3) (2021) 218–229.doi: 10.1038/s42256-021-00302-5. URLhttps://doi.org/10.1038/s42256-021-00302-5

work page doi:10.1038/s42256-021-00302-5 2021
[8]

Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stu- art, A. Anandkumar, Fourier neural operator for parametric partial dif- ferential equations (2021).arXiv:2010.08895. URLhttps://arxiv.org/abs/2010.08895

work page internal anchor Pith review Pith/arXiv arXiv 2021
[9]

Z. Li, H. Zheng, N. Kovachki, D. Jin, H. Chen, B. Liu, K. Azizzade- nesheli, A. Anandkumar, Physics-informed neural operator for learning partial differential equations (2023).arXiv:2111.03794. URLhttps://arxiv.org/abs/2111.03794

work page arXiv 2023
[10]

S. Wang, H. Wang, P. Perdikaris, Learning the solution operator of parametric partial differential equations with physics-informed deep- onets, Science Advances 7 (40) (2021) eabi8605.arXiv:https:// www.science.org/doi/pdf/10.1126/sciadv.abi8605,doi:10.1126/ sciadv.abi8605. URLhttps://www.science.org/doi/abs/10.1126/sciadv.abi8605 61

work page doi:10.1126/sciadv.abi8605 2021
[11]

You are given a context below. Your task is to generate 15 diverse questions and answers based on this context:\n\n

L. Mandl, D. Nayak, T. Ricken, S. Goswami, Physics-informed time- integrated deeponet: Temporal tangent space operator learning for high-accuracy inference (August 01, 2025 2025).doi:10.48550/arXiv. 2508.05190. URLhttps://ui.adsabs.harvard.edu/abs/2025arXiv250805190M

work page internal anchor Pith review doi:10.48550/arxiv 2025
[12]

D. N. Arnold, F. Brezzi, B. Cockburn, L. D. Marini, Unified analysis of discontinuous galerkin methods for elliptic problems, SIAM J. Numer. Anal. 39 (5) (2001) 1749–1779.doi:10.1137/S0036142901384162. URLhttps://doi.org/10.1137/S0036142901384162

work page doi:10.1137/s0036142901384162 2001
[13]

Villadsen, W

J. Villadsen, W. Stewart, Solution of boundary-value problems by orthogonal collocation, Chemical Engineering Science 50 (24) (1995) 3981–3996.doi:https://doi.org/10.1016/0009-2509(96)81831-8. URLhttps://www.sciencedirect.com/science/article/pii/ 0009250996818318

work page doi:10.1016/0009-2509(96)81831-8 1995
[14]

J. He, S. Kushwaha, J. Park, S. Koric, D. Abueidda, I. Jasiuk, Sequen- tial deep operator networks (s-deeponet) for predicting full-field solu- tions under time-dependent loads, Engineering Applications of Artifi- cial Intelligence 127 (2024) 107258.doi:https://doi.org/10.1016/ j.engappai.2023.107258

work page arXiv 2024
[15]

P. Jin, S. Meng, L. Lu, Mionet: Learning multiple-input operators via tensor product, SIAM Journal on Scientific Computing 44 (2022) A3490–A3514.doi:10.1137/22M1477751

work page doi:10.1137/22m1477751 2022
[16]

W. Diab, M. Al Kobaisi, Temporal neural operator for modeling time- dependent physical phenomena, Scientific Reports 15 (1) (2025) 32791. doi:10.1038/s41598-025-16922-5. URLhttps://doi.org/10.1038/s41598-025-16922-5

work page doi:10.1038/s41598-025-16922-5 2025
[17]

Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stu- art, A. Anandkumar, Neural operator: Graph kernel network for partial differential equations (2020).arXiv:2003.03485. URLhttps://arxiv.org/abs/2003.03485

work page internal anchor Pith review Pith/arXiv arXiv 2020
[18]

Raonić, R

B. Raonić, R. Molinaro, T. D. Ryck, T. Rohner, F. Bartolucci, R. Alai- fari, S. Mishra, E. de Bézenac, Convolutional neural operators for robust 62 and accurate learning of pdes (2023).arXiv:2302.01178. URLhttps://arxiv.org/abs/2302.01178

work page arXiv 2023
[19]

Karumuri, L

S. Karumuri, L. Graham-Brady, S. Goswami, Physics-informed latent neural operator for real-time predictions of time-dependent parametric pdes, Computer Methods in Applied Mechanics and Engineering 450 (2026) 118599.doi:https://doi.org/10.1016/j.cma.2025.118599. URLhttps://www.sciencedirect.com/science/article/pii/ S0045782525008710

work page doi:10.1016/j.cma.2025.118599 2026
[20]

T. Wang, C. Wang, Latent neural operator pretraining for solving time- dependent pdes, in: M. Mahmud, M. Doborjeh, K. Wong, A. C. S. Leung, Z. Doborjeh, M. Tanveer (Eds.), Neural Information Processing, Springer Nature Singapore, Singapore, 2025, pp. 163–178

work page 2025
[21]

Koric, D

S. Koric, D. W. Abueidda, Data-driven and physics-informed deep learning operators for solution of heat conduction equa- tion with parametric heat source, International Journal of Heat and Mass Transfer 203 (2023) 123809.doi:https: //doi.org/10.1016/j.ijheatmasstransfer.2022.123809. URLhttps://www.sciencedirect.com/science/article/pii/ S0017931022012777

work page doi:10.1016/j.ijheatmasstransfer.2022.123809 2023
[22]

S. W. Cho, H. Son, Physics-informed deep inverse operator networks for solving pde inverse problems (2025).arXiv:2412.03161. URLhttps://arxiv.org/abs/2412.03161

work page arXiv 2025
[23]

S. Ding, Y. Tian, L. Qin, H. Ma, R. Yang, Physics-informed hierar- chical neural operator for solving inverse problem of unsteady heat conduction, International Journal of Heat and Mass Transfer 258 (2026) 128335.doi:https://doi.org/10.1016/j.ijheatmasstransfer. 2026.128335. URLhttps://www.sciencedirect.com/science/article/pii/ S0017931026000116

work page doi:10.1016/j.ijheatmasstransfer 2026
[24]

G. Lei, Z. Lei, L. Shi, Long-time integration of nonlinear wave equations with neural operators (2025).arXiv:2410.15617. URLhttps://arxiv.org/abs/2410.15617 63

work page arXiv 2025
[25]

S. W. Cho, J. Y. Lee, H. J. Hwang, Learning time-dependent pde via graph neural networks and deep operator network for robust accuracy on irregular grids, Journal of Computational Physics 544 (2026) 114430. doi:https://doi.org/10.1016/j.jcp.2025.114430. URLhttps://www.sciencedirect.com/science/article/pii/ S0021999125007120

work page doi:10.1016/j.jcp.2025.114430 2026
[26]

Nayak, S

D. Nayak, S. Goswami, Ti-deeponet: Learnable time integration for stable long-term extrapolation (2025).arXiv:2505.17341. URLhttps://arxiv.org/abs/2505.17341

work page arXiv 2025
[27]

T. Dao, A. Gu, Transformers are ssms: Generalized models and efficient algorithms through structured state space duality (2024).arXiv:2405. 21060. URLhttps://arxiv.org/abs/2405.21060

work page internal anchor Pith review Pith/arXiv arXiv 2024
[28]

Z. Hu, N. A. Daryakenari, Q. Shen, K. Kawaguchi, G. E. Kar- niadakis, State-space models are accurate and efficient neu- ral operators for dynamical systems, Neural Networks (2025) 108496doi:https://doi.org/10.1016/j.neunet.2025.108496. URLhttps://www.sciencedirect.com/science/article/pii/ S0893608025013772

work page doi:10.1016/j.neunet.2025.108496 2025
[29]

A. Gu, T. Dao, Mamba: Linear-time sequence modeling with selective state spaces (2024). URLhttps://openreview.net/forum?id=AL1fq05o7H

work page 2024
[30]

Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling

J. Chung, C. Gulcehre, K. Cho, Y. Bengio, Empirical evaluation of gated recurrent neural networks on sequence modeling (2014).arXiv: 1412.3555. URLhttps://arxiv.org/abs/1412.3555

work page internal anchor Pith review Pith/arXiv arXiv 2014
[31]

H. Zhou, S. Zhang, J. Peng, S. Zhang, J. Li, H. Xiong, W. Zhang, In- former: Beyond efficient transformer for long sequence time-series fore- casting (2021).arXiv:2012.07436. URLhttps://arxiv.org/abs/2012.07436

work page arXiv 2021
[32]

Buitrago, T

R. Buitrago, T. Marwah, A. Gu, A. Risteski, On the benefits of memory for modeling time-dependent PDEs, in: The Thirteenth International 64 Conference on Learning Representations, 2025. URLhttps://openreview.net/forum?id=o9kqa5K3tB

work page 2025
[33]

Michałowska, S

K. Michałowska, S. Goswami, G. E. Karniadakis, S. Riemer-Sørensen, Neural operator learning for long-time integration in dynamical systems with recurrent neural networks (2024).arXiv:2303.02243. URLhttps://arxiv.org/abs/2303.02243

work page arXiv 2024
[34]

Z. Hu, Q. Cao, K. Kawaguchi, G. E. Karniadakis, Deepomamba: State- space model for spatio-temporal pde neural operator learning, Journal of Computational Physics 540 (2025) 114272.doi:https://doi.org/ 10.1016/j.jcp.2025.114272

work page doi:10.1016/j.jcp.2025.114272 2025
[35]

W. Wang, M. Hakimzadeh, H. Ruan, S. Goswami, Time-marching neu- ral operator–fe coupling: Ai-accelerated physics modeling, Computer Methods in Applied Mechanics and Engineering 446 (2025) 118319. doi:https://doi.org/10.1016/j.cma.2025.118319. URLhttps://www.sciencedirect.com/science/article/pii/ S0045782525005912

work page doi:10.1016/j.cma.2025.118319 2025
[36]

Y. Chen, Y. Lin, X. Sun, C. Yuan, Z. Gao, Tensor decomposition-based neural operator with dynamic mode decomposition for parameterized time-dependent problems, Journal of Computational Physics 533 (2025) 113996.doi:https://doi.org/10.1016/j.jcp.2025.113996. URLhttps://www.sciencedirect.com/science/article/pii/ S0021999125002797

work page doi:10.1016/j.jcp.2025.113996 2025
[37]

J. Chen, W. Xu, Z. Xu, N. Grande Gutiérrez, S. P. Narra, C. McComb, Enforcing the principle of locality for physical simulations with neural operators, Journal of Computational Physics 538 (2025) 114131. doi:https://doi.org/10.1016/j.jcp.2025.114131. URLhttps://www.sciencedirect.com/science/article/pii/ S0021999125004140

work page doi:10.1016/j.jcp.2025.114131 2025
[38]

T. Chen, H. Chen, Universal approximation to nonlinear operators by neural networks with arbitrary activation functions and its application to dynamical systems, IEEE Trans Neural Netw 6 (4) (1995) 911–917

[39]

W. Rudin, Real and Complex Analysis, 3rd Edition, McGraw-Hill, New York, 1987.

[40]

G. Cybenko, Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals and Systems 2 (4) (1989) 303–314. doi:10.1007/BF02551274. URL https://doi.org/10.1007/BF02551274

[41]

D. W. Abueidda, P. Pantidis, M. E. Mobasher, Deepokan: Deep operator network based on kolmogorov arnold networks for mechanics problems, Computer Methods in Applied Mechanics and Engineering 436 (2025) 117699. doi:https://doi.org/10.1016/j.cma.2024.117699. URL https://www.sciencedirect.com/science/article/pii/S0045782524009538

