Semigroup Consistency as a Diagnostic for Learned Physics Simulators
Pith reviewed 2026-06-29 22:17 UTC · model grok-4.3
The pith
Semigroup error in learned simulators tracks how much accuracy they lose over long rollouts.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For autonomous, state-complete dynamical systems, the exact solution map satisfies the semigroup property: evolving over s+t equals evolving over s and then over t. A learned predictor can therefore be diagnosed by measuring how far its direct prediction over s+t diverges from its composed prediction (s, then t). On heat and Burgers dynamics, this normalized semigroup error is positively associated with rollout degradation.
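In symbols, the semigroup property can be written as follows. The notation $\Phi_t$ for the exact solution map over elapsed time $t$ is an assumption made here for illustration and is not taken from the paper.

```latex
\Phi_0 = \mathrm{Id}, \qquad
\Phi_{s+t}(u) = \Phi_t\bigl(\Phi_s(u)\bigr)
\quad \text{for all } s, t \ge 0 \text{ and all states } u .
```

The law depends only on elapsed time, which is why it requires the dynamics to be autonomous and the state $u$ to be complete.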
What carries the argument
Normalized semigroup error, which quantifies the discrepancy between a model's direct long-step prediction and the result of composing two shorter steps.
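One plausible way to write this quantity is sketched below. The symbols $F_\theta$ (learned predictor taking a state and an elapsed time), $E_{\mathrm{sg}}$, and $\varepsilon$ are hypothetical names, and the relative L2 normalization by the direct prediction is an assumption: this page does not state the paper's normalization, and the ledger further down lists it as a free parameter.

```latex
E_{\mathrm{sg}}(u; s, t) =
\frac{\bigl\lVert F_\theta(u, s+t) - F_\theta\bigl(F_\theta(u, s), t\bigr) \bigr\rVert_2}
     {\bigl\lVert F_\theta(u, s+t) \bigr\rVert_2 + \varepsilon}
```

For the exact solution map the numerator vanishes, so any nonzero value measures a failure of temporal composition in the learned model rather than a mismatch with reference data.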
If this is right
- Semigroup error can be computed post hoc without access to ground-truth long trajectories.
- The measure flags models likely to degrade on long rollouts for time-conditioned ConvNet and FNO architectures on 1D heat and Burgers.
- Regularization toward semigroup consistency during training does not reliably improve rollout performance.
- The diagnostic applies specifically to autonomous state-complete dynamics.
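The first bullet can be illustrated with a small, self-contained sketch. Everything here is an assumption for illustration rather than the paper's setup: the state is a short list of Fourier sine coefficients of a 1D heat equation, `exact_heat_step` is the analytic propagator, `surrogate_step` is a hypothetical stand-in for a time-conditioned learned model, and the error is normalized by the L2 norm of the direct prediction.

```python
import math

def exact_heat_step(coeffs, dt, nu=0.1):
    """Exact propagator for u_t = nu * u_xx on sine modes: mode k decays by exp(-nu k^2 dt)."""
    return [c * math.exp(-nu * (k + 1) ** 2 * dt) for k, c in enumerate(coeffs)]

def surrogate_step(coeffs, dt, nu=0.1):
    """Hypothetical stand-in for a learned model: a rational decay that is not a semigroup."""
    return [c / (1.0 + nu * (k + 1) ** 2 * dt) for k, c in enumerate(coeffs)]

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def normalized_semigroup_error(step, u0, s, t, eps=1e-12):
    """Relative L2 gap between the direct (s+t) and composed (s, then t) predictions.

    Only the model `step` is called; no reference trajectory is needed.
    The choice of normalizer is an assumption, not the paper's definition.
    """
    direct = step(u0, s + t)
    composed = step(step(u0, s), t)
    diff = [a - b for a, b in zip(direct, composed)]
    return l2_norm(diff) / (l2_norm(direct) + eps)

u0 = [1.0, 0.5, 0.25, 0.125]  # arbitrary initial mode amplitudes
err_exact = normalized_semigroup_error(exact_heat_step, u0, 0.3, 0.7)
err_surrogate = normalized_semigroup_error(surrogate_step, u0, 0.3, 0.7)
print("exact propagator:", err_exact)
print("surrogate model: ", err_surrogate)
```

The exact propagator composes up to floating-point rounding, while the surrogate shows a clearly nonzero error even though it is a reasonable one-step approximation for small `dt`.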
Where Pith is reading between the lines
- The same check could be applied to select among candidate models before expensive long-rollout testing.
- If the correlation holds across higher-dimensional or chaotic systems, semigroup error could become a standard sanity check for learned simulators.
- Partial observability or external forcing would likely break the diagnostic, suggesting a need to test extensions that relax the autonomy assumption.
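The remark about external forcing can be made concrete with a scalar toy problem that is not from the paper: u' = -u + cos(t). In this sketch, `flow` is the exact two-time solution map and `elapsed_only_step` is a hypothetical predictor conditioned only on elapsed time (here, exact for start time 0). Both names are illustrative.

```python
import math

def particular(t):
    # g(t) = (cos t + sin t) / 2 satisfies g' = -g + cos(t).
    return 0.5 * (math.cos(t) + math.sin(t))

def flow(u0, t0, t1):
    """Exact solution at time t1 of u' = -u + cos(t), given u(t0) = u0."""
    return particular(t1) + math.exp(-(t1 - t0)) * (u0 - particular(t0))

def elapsed_only_step(u, dt):
    """Predictor that sees only the elapsed time and implicitly restarts the clock at 0."""
    return flow(u, 0.0, dt)

u0, s, t = 1.0, 0.5, 0.7

direct = elapsed_only_step(u0, s + t)
# Composition that tracks absolute time agrees with the direct prediction.
composed_with_clock = flow(flow(u0, 0.0, s), s, s + t)
# Composition that only knows elapsed time replays the forcing from t = 0.
composed_elapsed_only = elapsed_only_step(elapsed_only_step(u0, s), t)

gap_with_clock = abs(direct - composed_with_clock)
gap_elapsed_only = abs(direct - composed_elapsed_only)
print("gap when absolute time is tracked:", gap_with_clock)
print("gap with elapsed time only:       ", gap_elapsed_only)
```

Here even an exact model shows a nonzero semigroup gap when it is conditioned on elapsed time alone, so under forcing the gap no longer isolates model error.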
Load-bearing premise
The systems under study are autonomous and state-complete so that the true solution obeys the semigroup law.
What would settle it
Finding no positive association between semigroup error and rollout degradation on another autonomous system with complete state would falsify the diagnostic claim.
Original abstract
Learned physics simulators are often evaluated by one-step or short-horizon prediction error, but these metrics can miss failures in temporal composition and long-horizon rollout. For autonomous, state-complete systems, exact solution maps satisfy a semigroup law: direct evolution over $s+t$ should agree with evolution over $s$ followed by $t$. We propose normalized semigroup error as a post hoc, model-agnostic diagnostic comparing these direct and composed learned predictions. On one-dimensional heat and Burgers dynamics with time-conditioned ConvNet and FNO baselines, semigroup error is positively associated with rollout degradation, with trajectory-level Spearman correlation $\rho = 0.635$ and $95\%$ CI $[0.621, 0.649]$. Semigroup regularization has mixed effects, supporting semigroup consistency primarily as an evaluation diagnostic rather than a universally beneficial training objective.
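A trajectory-level Spearman correlation with a confidence interval could be computed along the following lines. This is a sketch under stated assumptions: the data are synthetic, the interval is a percentile bootstrap that resamples trajectories independently, and the function names are hypothetical. The page does not say how the paper computed its interval.

```python
import math
import random

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("Spearman correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)

def bootstrap_ci(x, y, n_boot=500, alpha=0.05, seed=0):
    """Approximate percentile bootstrap interval, resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(spearman([x[i] for i in idx], [y[i] for i in idx]))
    stats.sort()
    return stats[int(alpha / 2 * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]

# Synthetic stand-ins: one semigroup error and one rollout degradation per trajectory.
rng = random.Random(1)
sg_err = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]
degradation = [e * rng.lognormvariate(0.0, 1.0) for e in sg_err]

rho = spearman(sg_err, degradation)
lo, hi = bootstrap_ci(sg_err, degradation)
print(f"rho = {rho:.3f}, 95% bootstrap CI = [{lo:.3f}, {hi:.3f}]")
```

If trajectories are grouped by equation or architecture, resampling whole groups rather than individual trajectories may be more appropriate; the independent resampling here is a simplification.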
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes normalized semigroup error as a post-hoc, model-agnostic diagnostic for learned physics simulators on autonomous, state-complete systems. It reports that this error is positively associated with rollout degradation on 1D heat and Burgers dynamics using time-conditioned ConvNet and FNO baselines, with a trajectory-level Spearman correlation of ρ = 0.635 (95% CI [0.621, 0.649]), and finds mixed effects from semigroup regularization during training.
Significance. If the reported association holds under fuller experimental controls, the diagnostic could offer a lightweight way to flag temporal composition failures in learned simulators without full rollouts. The work correctly scopes its claims to autonomous systems and distinguishes evaluation from training use; the quantitative correlation on standard PDE benchmarks is a concrete contribution.
major comments (3)
- [Abstract] Abstract and §1: The central claim that semigroup error serves as a diagnostic for rollout degradation rests on experiments conducted exclusively on autonomous, state-complete systems where the ground-truth operator satisfies the semigroup law by construction. No counterexamples on non-autonomous (time-dependent forcing) or partially observed systems are reported, so it remains possible that the observed ρ = 0.635 reflects generic error accumulation rather than specific detection of semigroup violations.
- [§4–5] Experimental details (throughout §4–5): The manuscript provides no information on data splits, number of trajectories, normalization procedure for the semigroup error (including the free normalization factor), or whether the 95% CI accounts for multiple comparisons across equations and architectures. These omissions prevent assessment of the statistical robustness of the reported correlation.
- [§5] Results on regularization (abstract and §5): The statement that 'semigroup regularization has mixed effects' is presented without quantitative metrics, tables, or specific comparisons showing how regularization alters either the semigroup error or the rollout degradation correlation.
minor comments (2)
- [§3] Clarify the precise mathematical definition of normalized semigroup error, including how the normalization factor is chosen and whether it is fixed or data-dependent.
- [§4] Add a brief discussion of how time-conditioning is handled identically in the direct (s+t) versus composed (s then t) prediction paths for the time-conditioned baselines.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We respond point-by-point to the major comments below.
- Referee: [Abstract] Abstract and §1: The central claim that semigroup error serves as a diagnostic for rollout degradation rests on experiments conducted exclusively on autonomous, state-complete systems where the ground-truth operator satisfies the semigroup law by construction. No counterexamples on non-autonomous (time-dependent forcing) or partially observed systems are reported, so it remains possible that the observed ρ = 0.635 reflects generic error accumulation rather than specific detection of semigroup violations.
  Authors: The manuscript explicitly scopes its claims and experiments to autonomous, state-complete systems, as correctly noted in the referee summary; the semigroup law holds by construction only in this setting, which is required to define the diagnostic. We agree that testing on non-autonomous or partially observed systems would be a valuable extension, but it lies outside the current scope. The reported correlation is between semigroup error and rollout degradation within these systems, providing evidence for the diagnostic's utility where the property is well-defined. We will add a sentence in the discussion reinforcing this scope.
  revision: partial
- Referee: [§4–5] Experimental details (throughout §4–5): The manuscript provides no information on data splits, number of trajectories, normalization procedure for the semigroup error (including the free normalization factor), or whether the 95% CI accounts for multiple comparisons across equations and architectures. These omissions prevent assessment of the statistical robustness of the reported correlation.
  Authors: We will include these details in the revised manuscript: data splits, number of trajectories, the normalization procedure for semigroup error (including determination of the free normalization factor), and clarification on the 95% CI computation regarding multiple comparisons.
  revision: yes
- Referee: [§5] Results on regularization (abstract and §5): The statement that 'semigroup regularization has mixed effects' is presented without quantitative metrics, tables, or specific comparisons showing how regularization alters either the semigroup error or the rollout degradation correlation.
  Authors: We will expand §5 with quantitative metrics, tables, and specific comparisons showing how regularization affects semigroup error and the rollout degradation correlation to substantiate the mixed-effects claim.
  revision: yes
Circularity Check
No significant circularity: the diagnostic is a direct definition, and the correlation is an empirical measurement.
The normalized semigroup error is defined directly as the normalized discrepancy between a model's direct (s+t) prediction and its composed (s then t) prediction. The reported Spearman correlation (ρ=0.635) with rollout degradation is an observed statistical association computed on experimental trajectories from autonomous 1D heat/Burgers systems; it is not obtained by fitting a parameter to the target quantity or by renaming an input. No self-citations, uniqueness theorems, or ansatzes from prior author work appear in the derivation. The autonomy/state-completeness premise is an explicit modeling assumption required for the ground-truth semigroup law to hold, but the diagnostic itself and the measured association do not reduce to that premise by construction. The paper is therefore self-contained against external benchmarks with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
free parameters (1)
- normalization factor for semigroup error
axioms (1)
- domain assumption: Exact solution maps of autonomous state-complete systems satisfy the semigroup law.
A. Numerical Solvers
All reference trajectories are generated before training using det...