arxiv: 2605.26324 · v1 · pith:Q3KBCR4Gnew · submitted 2026-05-25 · 💻 cs.LG · cs.AI · cs.NA · math.NA

Semigroup Consistency as a Diagnostic for Learned Physics Simulators

Pith reviewed 2026-06-29 22:17 UTC · model grok-4.3

classification 💻 cs.LG cs.AI cs.NA math.NA
keywords semigroup consistency · learned physics simulators · rollout degradation · heat equation · Burgers equation · ConvNet · FNO · temporal composition

The pith

Semigroup error in learned simulators tracks how much accuracy they lose over long rollouts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that for autonomous systems whose true evolution obeys the semigroup law, a simple consistency check can flag learned models that are likely to fail at long-horizon prediction. The check compares a model's direct prediction over the interval s+t with the result of applying it first over s and then over t; the normalized difference is called the semigroup error. On one-dimensional heat and Burgers equations, this error is positively associated with actual rollout degradation, with a trajectory-level Spearman correlation of 0.635. Adding the same consistency term during training produces mixed outcomes, which suggests the measure works better as an evaluation tool than as a training objective.

Core claim

For autonomous, state-complete dynamical systems the exact solution map satisfies the semigroup property that evolution over s+t equals evolution over s followed by evolution over t; learned predictors can be diagnosed by measuring how far their direct and composed outputs diverge, and this normalized semigroup error is positively associated with rollout degradation on heat and Burgers dynamics.
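In symbols, the claim can be written as follows. The notation is an assumption of this note rather than the paper's: $\Phi_t$ stands for the exact solution map, $\hat\Phi_t$ for the learned predictor, and the denominator is one plausible normalization, since the exact scaling is not specified in the abstract.

```latex
% Semigroup law for the exact solution map of an autonomous system
\Phi_{s+t}(u) = \Phi_t\bigl(\Phi_s(u)\bigr), \qquad s, t \ge 0.

% One possible normalized semigroup error for a learned predictor
% (normalizing by the direct prediction is an assumption)
E_{\mathrm{sg}}(u; s, t)
  = \frac{\bigl\lVert \hat\Phi_{s+t}(u) - \hat\Phi_t\bigl(\hat\Phi_s(u)\bigr) \bigr\rVert}
         {\bigl\lVert \hat\Phi_{s+t}(u) \bigr\rVert + \varepsilon}.
```

Neither expression involves the ground-truth trajectory: the error compares the model only with itself.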

What carries the argument

Normalized semigroup error, which quantifies the discrepancy between a model's direct long-step prediction and the result of composing two shorter steps.
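A minimal sketch of how such a check might be computed, assuming a model exposed as a function `step(u, dt)`. Everything named here is hypothetical: `heat_step` is the exact periodic heat propagator standing in for a well-behaved learned model, `biased_step` is an invented imperfect model, and normalizing by the direct prediction is an assumption, not the paper's stated choice.

```python
import numpy as np

def heat_step(u, dt, nu=0.1, length=2 * np.pi):
    """Exact periodic heat-equation propagator in Fourier space (stand-in model)."""
    n = u.shape[-1]
    k = 2 * np.pi * np.fft.rfftfreq(n, d=length / n)  # angular wavenumbers
    return np.fft.irfft(np.fft.rfft(u) * np.exp(-nu * k**2 * dt), n=n)

def biased_step(u, dt, nu=0.1):
    """Hypothetical imperfect model: effective diffusivity drifts with step size."""
    return heat_step(u, dt, nu=nu * (1.0 + 0.5 * dt))

def semigroup_error(step, u0, s, t, eps=1e-12):
    """Relative L2 gap between the direct (s+t) and composed (s then t) predictions."""
    direct = step(u0, s + t)
    composed = step(step(u0, s), t)
    # Assumption: normalize by the norm of the direct prediction.
    return np.linalg.norm(direct - composed) / (np.linalg.norm(direct) + eps)

x = np.linspace(0.0, 2 * np.pi, 128, endpoint=False)
u0 = np.sin(x) + 0.5 * np.sin(3 * x)

err_exact = semigroup_error(heat_step, u0, s=0.2, t=0.3)    # near machine precision
err_biased = semigroup_error(biased_step, u0, s=0.2, t=0.3)  # clearly nonzero
print(err_exact, err_biased)
```

The exact propagator composes perfectly, so its error sits at round-off level, while the step-size-dependent bias in the second model shows up as a nonzero gap without any reference trajectory being consulted.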

If this is right

  • Semigroup error can be computed post hoc without access to ground-truth long trajectories.
  • The measure flags models likely to degrade on long rollouts, as tested with time-conditioned ConvNet and FNO architectures on 1D heat and Burgers dynamics.
  • Regularization toward semigroup consistency during training does not reliably improve rollout performance.
  • The diagnostic applies specifically to autonomous state-complete dynamics.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same check could be applied to select among candidate models before expensive long-rollout testing.
  • If the correlation holds across higher-dimensional or chaotic systems, semigroup error could become a standard sanity check for learned simulators.
  • Partial observability or external forcing would likely break the diagnostic, suggesting a need to test extensions that relax the autonomy assumption.
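The point about external forcing can be illustrated with a toy example that is not from the paper: the scalar ODE du/dt = -u + cos(t), whose closed-form solution is used below. A map that depends only on elapsed time fails the composition check even though it is built from the exact solution, because the forcing makes the dynamics depend on absolute time. All function names are hypothetical.

```python
import math

def u_particular(t):
    """Particular solution of du/dt = -u + cos(t)."""
    return 0.5 * (math.cos(t) + math.sin(t))

def flow(u, t0, tau):
    """Exact solution from state u at time t0, advanced by tau."""
    return u_particular(t0 + tau) + (u - u_particular(t0)) * math.exp(-tau)

def duration_only(u, tau):
    """Map that sees only elapsed time and assumes the clock always starts at 0."""
    return flow(u, 0.0, tau)

u0, s, t = 1.0, 0.7, 1.1
direct = duration_only(u0, s + t)
composed = duration_only(duration_only(u0, s), t)
gap_duration_only = abs(direct - composed)

# Tracking absolute time restores exact composition.
composed_with_clock = flow(flow(u0, 0.0, s), s, t)
gap_with_clock = abs(direct - composed_with_clock)
print(gap_duration_only, gap_with_clock)
```

Here a nonzero gap reflects the missing time variable rather than a defective model, which is why a semigroup check on a forced system would need the start time (or the forcing state) included in the model's input.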

Load-bearing premise

The systems under study are autonomous and state-complete so that the true solution obeys the semigroup law.

What would settle it

Finding no positive association between semigroup error and rollout degradation on another autonomous system with complete state would falsify the diagnostic claim.
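A test of this kind could be run along the following lines. This is a sketch on synthetic numbers, not the paper's data, and the percentile bootstrap is an assumed choice: the source does not say how its confidence interval was computed. The helper names are hypothetical.

```python
import numpy as np

def average_ranks(a):
    """1-based ranks; tied values share the mean of the ranks they span."""
    _, inverse, counts = np.unique(a, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)    # rank of the last member of each tie group
    first = last - counts + 1   # rank of the first member
    return ((first + last) / 2.0)[inverse]

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

def bootstrap_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling (x, y) pairs with replacement."""
    rng = np.random.default_rng(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        stats.append(spearman(x[idx], y[idx]))
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)

# Synthetic stand-ins for per-trajectory semigroup error and rollout degradation.
rng = np.random.default_rng(1)
sg_err = rng.lognormal(size=400)
rollout = sg_err * rng.lognormal(sigma=0.8, size=400)

rho = spearman(sg_err, rollout)
ci = bootstrap_ci(sg_err, rollout)
print(rho, ci)
```

On a new autonomous system, an interval that includes zero or lies below it would count against the diagnostic claim. Resampling individual trajectories treats them as independent, which may understate uncertainty if many trajectories share the same trained model.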

Figures

Figures reproduced from arXiv: 2605.26324 by Lennon J. Shikhman.

Figure 1: Evaluation pipeline for semigroup consistency. A learned simulator is trained on PDE trajectories, then evaluated by comparing direct and composed learned evolution on held-out states.

Figure 2: Relationship between unseen semigroup error and rollout AUC error across systems, architectures, and training variants. Each point shows a model-level mean, while the annotated ρ reports the global trajectory-level Spearman correlation computed across all held-out evaluations.

Figure 3: Seen and unseen semigroup error averaged across evaluated regimes. Bars show model-level means. Unseen composition pairs generally increase semigroup error, consistent with time-composition shift.

Original abstract

Learned physics simulators are often evaluated by one-step or short-horizon prediction error, but these metrics can miss failures in temporal composition and long-horizon rollout. For autonomous, state-complete systems, exact solution maps satisfy a semigroup law: direct evolution over $s+t$ should agree with evolution over $s$ followed by $t$. We propose normalized semigroup error as a post hoc, model-agnostic diagnostic comparing these direct and composed learned predictions. On one-dimensional heat and Burgers dynamics with time-conditioned ConvNet and FNO baselines, semigroup error is positively associated with rollout degradation, with trajectory-level Spearman correlation $\rho = 0.635$ and $95\%$ CI $[0.621, 0.649]$. Semigroup regularization has mixed effects, supporting semigroup consistency primarily as an evaluation diagnostic rather than a universally beneficial training objective.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes normalized semigroup error as a post-hoc, model-agnostic diagnostic for learned physics simulators on autonomous, state-complete systems. It reports that this error is positively associated with rollout degradation on 1D heat and Burgers dynamics using time-conditioned ConvNet and FNO baselines, with a trajectory-level Spearman correlation of ρ = 0.635 (95% CI [0.621, 0.649]), and finds mixed effects from semigroup regularization during training.

Significance. If the reported association holds under fuller experimental controls, the diagnostic could offer a lightweight way to flag temporal composition failures in learned simulators without full rollouts. The work correctly scopes its claims to autonomous systems and distinguishes evaluation from training use; the quantitative correlation on standard PDE benchmarks is a concrete contribution.

major comments (3)
  1. [Abstract] Abstract and §1: The central claim that semigroup error serves as a diagnostic for rollout degradation rests on experiments conducted exclusively on autonomous, state-complete systems where the ground-truth operator satisfies the semigroup law by construction. No counterexamples on non-autonomous (time-dependent forcing) or partially observed systems are reported, so it remains possible that the observed ρ = 0.635 reflects generic error accumulation rather than specific detection of semigroup violations.
  2. [§4–5] Experimental details (throughout §4–5): The manuscript provides no information on data splits, number of trajectories, normalization procedure for the semigroup error (including the free normalization factor), or whether the 95% CI accounts for multiple comparisons across equations and architectures. These omissions prevent assessment of the statistical robustness of the reported correlation.
  3. [§5] Results on regularization (abstract and §5): The statement that 'semigroup regularization has mixed effects' is presented without quantitative metrics, tables, or specific comparisons showing how regularization alters either the semigroup error or the rollout degradation correlation.
minor comments (2)
  1. [§3] Clarify the precise mathematical definition of normalized semigroup error, including how the normalization factor is chosen and whether it is fixed or data-dependent.
  2. [§4] Add a brief discussion of how time-conditioning is handled identically in the direct (s+t) versus composed (s then t) prediction paths for the time-conditioned baselines.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments. We respond point-by-point to the major comments below.

  1. Referee: [Abstract] Abstract and §1: The central claim that semigroup error serves as a diagnostic for rollout degradation rests on experiments conducted exclusively on autonomous, state-complete systems where the ground-truth operator satisfies the semigroup law by construction. No counterexamples on non-autonomous (time-dependent forcing) or partially observed systems are reported, so it remains possible that the observed ρ = 0.635 reflects generic error accumulation rather than specific detection of semigroup violations.

    Authors: The manuscript explicitly scopes its claims and experiments to autonomous, state-complete systems, as correctly noted in the referee summary; the semigroup law holds by construction only in this setting, which is required to define the diagnostic. We agree that testing on non-autonomous or partially observed systems would be a valuable extension, but it lies outside the current scope. The reported correlation is between semigroup error and rollout degradation within these systems, providing evidence for the diagnostic's utility where the property is well-defined. We will add a sentence in the discussion reinforcing this scope. revision: partial

  2. Referee: [§4–5] Experimental details (throughout §4–5): The manuscript provides no information on data splits, number of trajectories, normalization procedure for the semigroup error (including the free normalization factor), or whether the 95% CI accounts for multiple comparisons across equations and architectures. These omissions prevent assessment of the statistical robustness of the reported correlation.

    Authors: We will include these details in the revised manuscript: data splits, number of trajectories, the normalization procedure for semigroup error (including determination of the free normalization factor), and clarification on the 95% CI computation regarding multiple comparisons. revision: yes

  3. Referee: [§5] Results on regularization (abstract and §5): The statement that 'semigroup regularization has mixed effects' is presented without quantitative metrics, tables, or specific comparisons showing how regularization alters either the semigroup error or the rollout degradation correlation.

    Authors: We will expand §5 with quantitative metrics, tables, and specific comparisons showing how regularization affects semigroup error and the rollout degradation correlation to substantiate the mixed-effects claim. revision: yes

Circularity Check

0 steps flagged

No significant circularity; diagnostic is a direct definition and correlation is an empirical measurement

full rationale

The normalized semigroup error is defined directly as the normalized discrepancy between a model's direct (s+t) prediction and its composed (s then t) prediction. The reported Spearman correlation (ρ=0.635) with rollout degradation is an observed statistical association computed on experimental trajectories from autonomous 1D heat/Burgers systems; it is not obtained by fitting a parameter to the target quantity or by renaming an input. No self-citations, uniqueness theorems, or ansatzes from prior author work appear in the derivation. The autonomy/state-completeness premise is an explicit modeling assumption required for the ground-truth semigroup law to hold, but the diagnostic itself and the measured association do not reduce to that premise by construction. The paper is therefore self-contained against external benchmarks with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the assumption that the systems are autonomous and state-complete (domain_assumption) and on the existence of a well-defined normalization for the error (free parameter). No new entities are postulated.

free parameters (1)
  • normalization factor for semigroup error
    The abstract refers to 'normalized' semigroup error without specifying the exact scaling; this choice affects the reported correlation magnitude.
axioms (1)
  • domain assumption: Exact solution maps of autonomous state-complete systems satisfy the semigroup law.
    Invoked in the first paragraph of the abstract as the justification for the diagnostic.
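To see why the normalization counts as a free parameter, the sketch below compares three candidate scalings of the same direct/composed gap. None of them is confirmed as the paper's choice, and the two "models" are made-up vectors picked so that the ranking flips.

```python
import numpy as np

def gap_normalized(direct, composed, eps=1e-12):
    """L2 gap between direct and composed predictions under three candidate scalings."""
    gap = np.linalg.norm(direct - composed)
    n_direct = np.linalg.norm(direct)
    n_composed = np.linalg.norm(composed)
    return {
        "by_direct": gap / (n_direct + eps),
        "by_composed": gap / (n_composed + eps),
        "symmetric": 2 * gap / (n_direct + n_composed + eps),
    }

# Hypothetical outputs of two models on the same initial state.
model_a = gap_normalized(np.array([3.0, 4.0]), np.array([6.0, 8.0]))
model_b = gap_normalized(np.array([8.0, 0.0]), np.array([2.0, 0.0]))

print(model_a)
print(model_b)
```

Model A looks worse than model B when the gap is scaled by the direct prediction, and better when it is scaled by the composed one. A rank correlation such as Spearman's is unchanged by a single global rescaling, but a per-sample normalization like these can reorder samples and so can shift the reported ρ.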

pith-pipeline@v0.9.1-grok · 5672 in / 1311 out tokens · 20476 ms · 2026-06-29T22:17:19.885224+00:00 · methodology

