pith. machine review for the scientific record.

arxiv: 2512.14471 · v2 · submitted 2025-12-16 · 💻 cs.LG

Recognition: 2 theorem links

· Lean Theorem

Kinetic-Mamba: Mamba-Assisted Predictions of Stiff Chemical Kinetics

Authors on Pith: no claims yet

Pith reviewed 2026-05-16 21:37 UTC · model grok-4.3

classification 💻 cs.LG
keywords chemical kinetics · Mamba · neural operators · stiff ODEs · combustion modeling · time-series prediction · latent dynamics

The pith

Mamba-based neural models predict the full time evolution of stiff chemical kinetics from initial thermochemical states alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops Kinetic-Mamba, a family of Mamba architectures that learn to advance thermochemical state variables forward in time for combustion reaction mechanisms. It demonstrates that these models can deliver high-fidelity predictions of complex, stiff dynamics without invoking traditional stiff ODE solvers at inference time, relying solely on the initial values of temperature and species mass fractions. Variants enforce mass conservation, separate temperature regimes, or evolve the dynamics in a reduced latent space before reconstruction. A sympathetic reader should care because accurate kinetics govern heat release and pollutant formation in engines and flames, so replacing expensive integration steps with fast forward passes could make large-scale combustion simulations practical.

Core claim

Kinetic-Mamba integrates Mamba sequence models with neural operators to learn the temporal evolution of thermochemical states; the standalone version maps initial conditions directly to future states, the constrained version adds mass-conservation penalties, the regime-informed version deploys separate Mamba models for distinct temperature ranges, and the latent version advances dynamics in a compressed space before lifting back to the physical manifold. Experiments on Syngas and GRI-Mech 3.0 mechanisms confirm high accuracy under both time-decomposition and recursive prediction, including on out-of-distribution initial conditions.

What carries the argument

A Mamba sequence model that processes thermochemical state trajectories to predict future states while respecting the underlying reaction manifold.
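
The selective state-space recurrence at the heart of a Mamba block (Gu and Dao [24]) can be sketched in a few lines. This is a toy, single-channel version written for illustration, not the paper's architecture: the names `Wdt`, `Wb`, and `Wc` are made up here, and the point is only that the step size and the input/output projections depend on the current input, which is the "selective" mechanism.

```python
import numpy as np

def selective_ssm(u, A, Wb, Wc, Wdt):
    """Toy single-channel selective SSM scan: the step size dt and the input/
    output projections B, C are functions of the current input, so the
    recurrence chooses per step what to write into and read from the state.
    A is a diagonal (n,) state matrix with negative entries for stability."""
    h = np.zeros(A.shape[0])
    y = np.zeros_like(u)
    for t, ut in enumerate(u):
        dt = np.log1p(np.exp(Wdt * ut))        # softplus keeps the step positive
        Abar = np.exp(dt * A)                  # zero-order-hold discretization
        Bbar = (Abar - 1.0) / A * (Wb * ut)    # discretized input matrix
        h = Abar * h + Bbar * ut               # selective state update
        y[t] = (Wc * ut) @ h                   # input-dependent readout
    return y

rng = np.random.default_rng(0)
n = 8
A = -np.abs(rng.standard_normal(n)) - 0.1      # stable diagonal dynamics
y = selective_ssm(np.sin(np.linspace(0.0, 4.0, 64)),
                  A, rng.standard_normal(n), rng.standard_normal(n), 0.5)
print(y.shape)  # (64,)
```

Stacking such scans with the convolution, SiLU gating, and skip connection of Figure 1 yields the full Mamba block.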

If this is right

  • The standalone Mamba model produces direct state predictions from initial conditions without any differential-equation integration.
  • The constrained variant maintains species mass balance throughout the predicted trajectory.
  • The regime-informed pair of Mamba models separately captures low- and high-temperature kinetic regimes.
  • The latent-space version evolves a compressed representation and reconstructs the full physical state on the manifold.
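
For the constrained variant, one plausible form of the mass-conservation term (our sketch; the paper's exact penalty may differ) is a squared deviation of the predicted species mass fractions' sum from unity at every time step:

```python
import numpy as np

def mass_conservation_penalty(Y_pred):
    """Soft mass-balance penalty on predicted species mass fractions of shape
    (T, n_species): in a closed system they must sum to 1 at every time step,
    so we penalize the mean squared deviation of that sum from unity."""
    totals = Y_pred.sum(axis=1)
    return float(np.mean((totals - 1.0) ** 2))

# a perfectly conserving trajectory incurs (essentially) zero penalty,
# while a uniform 0.01 offset on every species is penalized
Y = np.array([[0.2, 0.3, 0.5],
              [0.1, 0.4, 0.5]])
print(mass_conservation_penalty(Y))
print(mass_conservation_penalty(Y + 0.01))
```

Added to the data-fit loss with a weight, such a term nudges the learned trajectory back onto the conservation constraint without hard projection.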

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If recursive stability holds, the same architecture could replace stiff integrators in other multi-scale physical systems such as atmospheric chemistry or plasma kinetics.
  • Successful out-of-distribution performance suggests the models may generalize to untested reaction mechanisms once trained on a sufficiently diverse set of mechanisms.
  • The latent variant implies that dimensionality reduction can be combined with Mamba dynamics to further lower the cost of long-horizon predictions.
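
The latent pattern the last bullet points at can be sketched with stand-in linear maps (`E`, `D`, and `A` below are illustrative placeholders, not the paper's learned networks): encode the initial state once, advance entirely in the reduced space, and decode back to the physical manifold only when a state is needed.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 24, 12                                  # physical dim (GRI: T + 23 species), latent dim
E = rng.standard_normal((m, d)) / np.sqrt(d)   # stand-in encoder
D = np.linalg.pinv(E)                          # stand-in decoder
A = 0.95 * np.eye(m)                           # stand-in latent dynamics

def latent_rollout(x0, n_steps):
    """Latent-KM pattern: encode once, advance entirely in the m-dimensional
    latent space, and lift each latent state back to the physical manifold."""
    z = E @ x0
    states = []
    for _ in range(n_steps):
        z = A @ z                  # cheap m-dim update instead of d-dim
        states.append(D @ z)       # decode to the physical state
    return np.array(states)

out = latent_rollout(rng.standard_normal(d), 100)
print(out.shape)  # (100, 24)
```

The long-horizon cost then scales with the latent dimension m rather than the full state dimension d, which is the economy the latent variant buys.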

Load-bearing premise

The learned Mamba dynamics remain stable and accurate when rolled out recursively over long times and on initial conditions outside the training distribution without accumulating errors from the stiff timescales.
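
The premise is testable without any theory: roll the one-step map out autoregressively and watch whether errors compound. A minimal harness, with a contractive linear map standing in for the learned operator:

```python
import numpy as np

def recursive_rollout(step_fn, x0, n_steps):
    """Autoregressive rollout: each prediction is fed back as the next input,
    so any per-step error can compound over the horizon."""
    traj = [np.asarray(x0, dtype=float)]
    for _ in range(n_steps):
        traj.append(step_fn(traj[-1]))
    return np.array(traj)

# stand-in for a learned step operator: contractive (spectral radius < 1),
# so the rollout stays bounded; a non-contractive map would drift instead
M = np.array([[0.9, 0.05],
              [0.0, 0.8]])
traj = recursive_rollout(lambda x: M @ x, [1.0, 1.0], 200)
print(traj.shape)  # (201, 2)
```

For a learned operator on stiff kinetics, the question is precisely whether the rollout behaves like this contractive stand-in on the fast subspace, or amplifies noise there.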

What would settle it

Compare recursive Mamba rollouts against reference stiff-ODE solutions on GRI-Mech 3.0 trajectories started from initial conditions that differ markedly in temperature or composition from the training set; divergence beyond a small tolerance at any time step would falsify the claim.
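
A sketch of that falsification protocol, with a two-timescale linear ODE and its exact solution standing in for GRI-Mech 3.0 and the trained model (`falsification_check` is a hypothetical helper, not from the paper):

```python
import numpy as np

def falsification_check(step_fn, reference_fn, x0, dt, n_steps, tol):
    """Roll step_fn out recursively from x0; compare each state against the
    reference solution at the same time. Return the first step index whose
    relative error exceeds tol, or -1 if the rollout stays within tolerance."""
    x = np.asarray(x0, dtype=float)
    for k in range(1, n_steps + 1):
        x = step_fn(x)
        ref = reference_fn(k * dt, x0)
        if np.linalg.norm(x - ref) > tol * max(np.linalg.norm(ref), 1e-30):
            return k
    return -1

# toy stiff system dx/dt = diag(-1, -1000) x: timescales three decades apart;
# the exact propagator plays the role of the trained surrogate
lam = np.array([-1.0, -1000.0])
dt = 1e-3
exact_step = lambda x: np.exp(lam * dt) * x
reference = lambda t, x0: np.exp(lam * t) * np.asarray(x0)
print(falsification_check(exact_step, reference, np.array([1.0, 1.0]), dt, 500, 1e-6))  # -1
```

In the real test, `reference_fn` would be a stiff integrator's solution from an out-of-distribution initial condition; any step returning an index rather than -1 falsifies the claim.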

Figures

Figures reproduced from arXiv: 2512.14471 by Additi Pandey, George Em Karniadakis, Hessam Babaee, Liang Wei.

Figure 1. Schematic representation of the Mamba block [24]: the block combines an SSM branch and a skip-connection branch via element-wise multiplication. The SSM branch consists of a linear projection, a convolution, a non-linear activation (SiLU), and the SSM itself; the skip-connection branch consists of a linear transformation followed by a non-linear activation.
Figure 2. Schematic representation of latent Kinetic-Mamba: for the GRI dataset, the initial conditions of 24 state variables (temperature and the mass fractions of 23 active species) are fed into the latent-KM model, which returns their temporal evolution up to the 100th time step; time decomposition is applied to the dataset before it enters the latent-KM model.
Figure 3. Sample from the Syngas A mechanism: good agreement between the predicted and ground-truth dynamics of the state variables for an arbitrary test sample.
Figure 4. Violin plot for the Syngas A mechanism: percentage relative L2 error, with the L2 norm taken over the temporal dimension for the predicted and true test datasets; on the log scale the plot shows good accuracy across all samples for all 13 state variables.
Figure 5. Violin plot for the Syngas A mechanism using the mass-conserving KM model: percentage relative L2 error over the temporal dimension; good accuracy across all samples for all 13 state variables, with a total relative L2 error of 0.017%.
Figure 6. Sample from the mass-conserving Kinetic-Mamba framework for the Syngas A mechanism: an arbitrary sample showing good agreement with ground-truth values from the Syngas A dataset.
Figure 7. Sample 1 from the Syngas B mechanism: good agreement between predicted and ground-truth dynamics for an arbitrary test sample in the non-ignition regime; the y-axis is expanded to suppress small fluctuations around constant values.
Figure 8. Sample 2 from the Syngas B mechanism: good agreement between predicted and ground-truth dynamics for an arbitrary test sample within the ignition regime.
Figure 9. Violin plot for the Syngas B mechanism: percentage relative L2 error over the temporal dimension (log-scale y-axis) for the ignition regime-informed Kinetic-Mamba standalone model.
Figure 10. Violin plot for the GRI mechanism: percentage relative L2 error over the temporal dimension; good accuracy across all samples for all 24 state variables (log-scale y-axis).
Figure 11. Sample from the GRI-Mech 3.0 mechanism: good agreement between predicted and ground-truth dynamics for an arbitrary test sample.
Figure 12. Extrapolation Sample 1 for Syngas B: predicted vs. true data for an arbitrary sample from extrapolation dataset Item 1.
Figure 13. Extrapolation Sample 2 for Syngas B: predicted vs. true data for an arbitrary sample from extrapolation dataset Item 2.
Figure 14. Extrapolation Sample 3 for Syngas B: predicted vs. true data for an arbitrary sample from extrapolation dataset Item 3.
Figure 15. Extrapolation Sample 4 for Syngas B: predicted vs. true data for an arbitrary sample from extrapolation dataset Item 4.
Figure 16. Extrapolation for the Syngas B KM model: violin plot obtained when the pre-trained ignition regime-informed KM model is applied to extrapolation datasets Item 1–4, containing 20, 30, 32, and 34 samples respectively (errors in log scale).
Figure 17. Violin plot of recursive predictions on the Syngas A dataset: percentage relative L2 error (log-scale y-axis) across the 520 test samples for each of the 13 state variables.
Figure 18. Violin plot of recursive prediction on the GRI-Mech 3.0 scheme: percentage relative L2 error across all time steps for recursive predictions from the latent Kinetic-Mamba model, which takes inputs on a 12-dimensional latent manifold and outputs on the 24-dimensional physical manifold (log-scale y-axis).
Figure 19. Adaptive time-step prediction using the latent KM model on GRI-Mech 3.0: recursive predictions over intervals of different lengths, using the last time step of the previous interval as input to the next.
Figure 20. Adaptive time-step prediction using the latent KM model on GRI-Mech 3.0 (continued): recursive predictions over intervals of different lengths, using the last time step of the previous interval as input to the next.
Figure 21. Syngas B revised error metric during extrapolation: relative L2 error for the Syngas B extrapolation dataset under the pre-trained model (log-scale y-axis), using the clipped-denominator metric max(den, ε) with ε = 10⁻³ × τ, where τ is the sample mean of the relative L2 error of the true values with respect to time.
Figure 22. Syngas A revised error metric during recursion: relative L2 error for the Syngas A test dataset when recursion is performed over the whole time domain of length 101 × 99 (log-scale y-axis), using the clipped-denominator metric.
Original abstract

Accurate chemical kinetics modeling is essential for combustion simulations, as it governs the evolution of complex reaction pathways and thermochemical states. In this work, we introduce Kinetic-Mamba, a Mamba-based neural operator framework that integrates the expressive power of neural operators with the efficient temporal modeling capabilities of Mamba architectures. The framework comprises three complementary models: (i) a standalone Mamba model that predicts the time evolution of thermochemical state variables from given initial conditions; (ii) a constrained Mamba model that enforces mass conservation while learning the state dynamics; and (iii) a regime-informed architecture employing two standalone Mamba models to capture dynamics across temperature-dependent regimes. We additionally develop a latent Kinetic-Mamba variant that evolves dynamics in a reduced latent space and reconstructs the full state on the physical manifold. The accuracy and robustness of Kinetic-Mamba was evaluated using both time-decomposition and recursive-prediction strategies. We further assess the extrapolation capabilities of the model on varied out-of-distribution datasets. Computational experiments on Syngas and GRI-Mech 3.0 reaction mechanisms demonstrate that our framework achieves high fidelity in predicting complex kinetic behavior using only the initial conditions of the state variables.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces Kinetic-Mamba, a Mamba-based neural operator framework for predicting the time evolution of thermochemical states in stiff chemical kinetics. It comprises a standalone Mamba model, a mass-constrained variant enforcing conservation, a regime-informed architecture using two Mamba models for temperature-dependent regimes, and a latent-space version that evolves in reduced space before reconstruction. The models are trained on simulation data and evaluated on Syngas and GRI-Mech 3.0 mechanisms via time-decomposition and recursive-prediction strategies, with additional tests on out-of-distribution initial conditions; the central claim is that these achieve high fidelity using only initial state variables.

Significance. If the recursive predictions prove stable, the framework could supply efficient learned surrogates for stiff ODE integration in combustion modeling, where traditional solvers like CVODE are computationally expensive. The explicit incorporation of mass conservation and regime splitting represents a constructive effort to embed domain knowledge, and the latent-space variant offers a path toward dimensionality reduction. These elements, if quantitatively validated, would strengthen the case for sequence models in multi-timescale physical systems.

major comments (2)
  1. [Section 4 (recursive-prediction results)] The recursive-prediction evaluation central to the 'only initial conditions' claim lacks any analysis demonstrating that the learned step operator remains contractive on the fast subspace or that local truncation errors remain bounded by the reference stiff integrator over long horizons. Given eigenvalues spanning 6–10 orders of magnitude in stiff kinetics, this omission leaves the robustness claim on autoregressive rollouts unverified.
  2. [Section 4 and associated figures] No quantitative error metrics (e.g., time-averaged L2 norms, maximum pointwise deviations, or integrated error growth rates) or explicit error-bar analysis are supplied for the recursive trajectories on either mechanism, despite the abstract asserting 'high fidelity.' This absence prevents direct assessment of whether predictions remain accurate on slow manifolds after many steps.
minor comments (3)
  1. [Abstract] The abstract and introduction would benefit from explicit numerical thresholds (e.g., 'relative error below 1%') rather than the qualitative phrase 'high fidelity.'
  2. [Section 3 (Methods)] Training details (optimizer, learning-rate schedule, batch size, number of epochs, and data-split ratios) are not reported, hindering reproducibility of the supervised learning procedure.
  3. [Figures in Section 4] Figure captions for prediction plots should include the exact time horizon, number of recursive steps, and reference integrator used for comparison.
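
The metric the report asks to see spelled out, as we read it from the violin-plot captions, is a per-variable percentage relative L2 error over the time axis; the appendix's clipped-denominator variant guards against near-zero species. A sketch (the `eps` value below is illustrative, not the paper's τ-based choice):

```python
import numpy as np

def rel_l2_percent(pred, true, eps=None):
    """Percentage relative L2 error per state variable for (T, n) trajectories:
    100 * ||pred - true||_2 / ||true||_2 along the time axis. If eps is given,
    the denominator is clipped to max(den, eps) so near-zero species do not
    inflate the metric (the clipped-denominator variant of Figs. 21-22)."""
    num = np.linalg.norm(pred - true, axis=0)
    den = np.linalg.norm(true, axis=0)
    if eps is not None:
        den = np.maximum(den, eps)
    return 100.0 * num / den

# two variables: one O(1), one near zero -- the second blows up unclipped
true = np.vstack([np.linspace(1.0, 2.0, 50), np.full(50, 1e-9)]).T
pred = true + 1e-3
plain = rel_l2_percent(pred, true)
clipped = rel_l2_percent(pred, true, eps=1e-2)
print(plain[1] > 1e6, clipped[1] < 100.0)  # True True
```

Reporting this per variable, plus maximum pointwise deviation and error growth per recursion step, would address the referee's second major comment.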

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major point below and will incorporate the suggested enhancements to strengthen the quantitative validation of the recursive predictions.

Point-by-point responses
  1. Referee: [Section 4 (recursive-prediction results)] The recursive-prediction evaluation central to the 'only initial conditions' claim lacks any analysis demonstrating that the learned step operator remains contractive on the fast subspace or that local truncation errors remain bounded by the reference stiff integrator over long horizons. Given eigenvalues spanning 6–10 orders of magnitude in stiff kinetics, this omission leaves the robustness claim on autoregressive rollouts unverified.

    Authors: We acknowledge the value of a formal stability analysis for stiff systems. Our empirical recursive rollouts on both Syngas and GRI-Mech 3.0 remain stable over the tested horizons without divergence, consistent with the reference integrator. In the revised manuscript we will add a supplementary discussion that estimates the effective Lipschitz constant of the learned step operator (via finite differences on the training trajectories) and compares local truncation error growth against CVODE over multiple characteristic time scales, thereby addressing contractivity on the fast subspace. revision: yes

  2. Referee: [Section 4 and associated figures] No quantitative error metrics (e.g., time-averaged L2 norms, maximum pointwise deviations, or integrated error growth rates) or explicit error-bar analysis are supplied for the recursive trajectories on either mechanism, despite the abstract asserting 'high fidelity.' This absence prevents direct assessment of whether predictions remain accurate on slow manifolds after many steps.

    Authors: We agree that explicit quantitative metrics would improve clarity. The present figures rely primarily on visual overlay; the revised version will report time-averaged L2 norms, maximum pointwise deviations, and integrated error growth rates for the recursive trajectories on both mechanisms. We will also include error bars derived from an ensemble of out-of-distribution initial conditions to quantify variability on the slow manifolds. revision: yes

Circularity Check

0 steps flagged

No circularity: standard supervised training on external simulation data

Full rationale

The paper trains Mamba-based neural operators on trajectories generated by standard stiff ODE solvers (CVODE/LSODA) for Syngas and GRI-Mech 3.0 mechanisms. Predictions are evaluated via time-decomposition and recursive rollout on held-out and OOD initial conditions. No derivation step equates a claimed prediction to a fitted parameter by construction, no self-citation supplies a uniqueness theorem or ansatz that the central claim depends on, and the architecture is presented as an explicit sequence model without internal redefinition of its outputs. The framework remains a conventional data-driven surrogate whose fidelity claims rest on empirical validation rather than algebraic identity with its training inputs.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the empirical assumption that Mamba can learn the solution operator for stiff ODE systems from data alone, without explicit reaction mechanisms or stability guarantees.

free parameters (1)
  • neural network weights and architecture hyperparameters
    All model parameters are fitted to training trajectories from Syngas and GRI-Mech 3.0 simulations.
axioms (1)
  • Domain assumption: Mamba state-space models can stably approximate the temporal evolution operator of stiff chemical kinetics.
    Invoked by the design of the standalone, constrained, and regime-informed Mamba models.

pith-pipeline@v0.9.0 · 5514 in / 1150 out tokens · 38481 ms · 2026-05-16T21:37:13.549140+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.


Reference graph

Works this paper leans on

38 extracted references · 38 canonical work pages · 4 internal anchors

  1. Nedunchezhian Swaminathan and Alessandro Parente. Machine Learning and Its Application to Reacting Flows: ML and Combustion. Springer Nature, 2023.
  2. IEA. Global EV Outlook. IEA: Paris, France, 2023. License: CC BY 4.0.
  3. S. B. Pope. Computationally efficient implementation of combustion chemistry using in situ adaptive tabulation. Combustion Theory and Modelling, 1(1):41–63, 1997. doi: 10.1080/713665229.
  4. Robert J. Kee, Michael E. Coltrin, and Peter Glarborg. Chemically Reacting Flow: Theory and Practice. John Wiley & Sons, 2005.
  5. Colin J. Aro. CHEMSODE: a stiff ODE solver for the equations of chemical kinetics. Computer Physics Communications, 97(3):304–314, 1996.
  6. Colin J. Aro. A stiff ODE preconditioner based on Newton linearization. Applied Numerical Mathematics, 21(4):335–352, 1996.
  7. Colin J. Aro, Douglas A. Rotman, and Garry H. Rodrigue. A high performance chemical kinetics algorithm for 3-D atmospheric models. International Journal of High Performance Computing Applications, 13(1):3–15, 1999.
  8. Donald Dabdub and John H. Seinfeld. Extrapolation techniques used in the solution of stiff ODEs associated with chemical kinetics of air quality models. Atmospheric Environment, 29(3):403–410, 1995.
  9. Joachim Rang. Improved traditional Rosenbrock–Wanner methods for stiff ODEs and DAEs. Journal of Computational and Applied Mathematics, 286:128–144, 2015. doi: 10.1016/j.cam.2015.03.010.
  10. Alisha J. Sharma, Ryan F. Johnson, David A. Kessler, and Adam Moses. Deep learning for scalable chemical kinetics. In AIAA SciTech 2020 Forum, page 0181, 2020.
  11. Thomas S. Brown, Harbir Antil, Rainald Löhner, Fumiya Togashi, and Deepanshu Verma. Novel DNNs for stiff ODEs with applications to chemically reacting flows. In High Performance Computing, pages 23–39, Cham, 2021. Springer International Publishing. ISBN 978-3-030-90539-2.
  12. Maziar Raissi, Paris Perdikaris, and George E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.
  13. Weiqi Ji, Weilun Qiu, Zhiyu Shi, Shaowu Pan, and Sili Deng. Stiff-PINN: Physics-informed neural network for stiff chemical kinetics. The Journal of Physical Chemistry A, 125(36):8098–8106, 2021. doi: 10.1021/acs.jpca.1c05102.
  14. Matthew Frankel, Mario De Florio, Enrico Schiassi, and Lina Sela. Hybrid chemical and data-driven model for stiff chemical kinetics using a physics-informed neural network. Engineering Proceedings, 69(1):40, 2024.
  15. Lu Lu, Pengzhan Jin, Guofei Pang, Zhongqiang Zhang, and George Em Karniadakis. Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators. Nature Machine Intelligence, 3(3):218–229, 2021. doi: 10.1038/s42256-021-00302-5.
  16. Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, and Anima Anandkumar. Fourier neural operator for parametric partial differential equations. arXiv preprint arXiv:2010.08895, 2020.
  17. Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljačić, Thomas Y. Hou, and Max Tegmark. KAN: Kolmogorov–Arnold networks. arXiv preprint arXiv:2404.19756, 2024.
  18. Khemraj Shukla, Juan Diego Toscano, Zhicheng Wang, Zongren Zou, and George Em Karniadakis. A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks. Computer Methods in Applied Mechanics and Engineering, 431:117290, 2024. doi: 10.1016/j.cma.2024.117290.
  19. Somdatta Goswami, Ameya D. Jagtap, Hessam Babaee, Bryan T. Susi, and George Em Karniadakis. Learning stiff chemical kinetics using extended deep neural operators. Computer Methods in Applied Mechanics and Engineering, 419:116674, 2024. doi: 10.1016/j.cma.2023.116674.
  20. Anuj Kumar and Tarek Echekki. Combustion chemistry acceleration with DeepONets. Fuel, 365:131212, 2024. doi: 10.1016/j.fuel.2024.131212.
  21. Yuting Weng, Han Li, Hao Zhang, Zhi X. Chen, and Dezhi Zhou. Extended Fourier neural operators to learn stiff chemical kinetics under unseen conditions. Combustion and Flame, 272:113847, 2025. doi: 10.1016/j.combustflame.2024.113847.
  22. Kamaljyoti Nath, Additi Pandey, Bryan T. Susi, Hessam Babaee, and George Em Karniadakis. AMORE: Adaptive multi-output operator network for stiff chemical kinetics, 2025. URL https://arxiv.org/abs/2510.12999.
  23. Albert Gu, Karan Goel, and Christopher Ré. Efficiently modeling long sequences with structured state spaces, 2022. URL https://arxiv.org/abs/2111.00396.
  24. Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv e-prints, 2023.
  25. Zheyuan Hu, Nazanin Ahmadi Daryakenari, Qianli Shen, Kenji Kawaguchi, and George Em Karniadakis. State-space models are accurate and efficient neural operators for dynamical systems. arXiv preprint arXiv:2409.03231, 2024.
  26. Emilio Vital Brazil, Eduardo Soares, Victor Yukio Shirasuna, Renato Cerqueira, Dmitry Zubarev, and Kristin Schmidt. A Mamba-based foundation model for chemistry. In AI for Accelerated Materials Design – NeurIPS 2024, 2024.
  27. Anri Lombard, Shane Acton, Ulrich Armel Mbou Sob, and Jan Buys. Molecular generation with state space sequence models. In NeurIPS 2024 Workshop on AI for New Drug Modalities, 2024. URL https://openreview.net/forum?id=1ib5oyTQIb.
  28. Bohao Xu, Yingzhou Lu, Chenhao Li, Ling Yue, Xiao Wang, Nan Hao, Tianfan Fu, and Jim Chen. SMILES-Mamba: Chemical Mamba foundation models for drug ADMET prediction. arXiv preprint arXiv:2408.05696, 2024.
  29. Bohao Xu, Yingzhou Lu, Yoshitaka Inoue, Namkyeong Lee, Tianfan Fu, and Jintai Chen. Protein-Mamba: Biological Mamba models for protein function prediction. arXiv preprint arXiv:2409.14617, 2024.
  30. Gregory P. Smith, David M. Golden, Michael Frenklach, Nigel W. Moriarty, Boris Eiteneer, Mikhail Goldenberg, C. Thomas Bowman, Ronald K. Hanson, Soonho Song, William C. Gardiner Jr., et al. GRI 3.0 mechanism. Gas Research Institute, 1999.
  31. David G. Goodwin, Harry K. Moffat, and Raymond L. Speth. Cantera: An object-oriented software toolkit for chemical kinetics, thermodynamics, and transport processes. Version 2.3.0. Zenodo, 2017. doi: 10.5281/zenodo.170284.
  32. Zheyuan Hu, Qianying Cao, Kenji Kawaguchi, and George Em Karniadakis. DeepOMamba: State-space model for spatio-temporal PDE neural operator learning. Journal of Computational Physics, 540:114272, 2025.
  33. George Karniadakis and Spencer J. Sherwin. Spectral/hp Element Methods for Computational Fluid Dynamics. Oxford University Press, 2005.
  34. Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  35. [Appendix timing, extracted as anchor] For the Syngas A dataset (Section 3.1.1), the total training time is 161315.47 seconds (44.81 hours) and the prediction time is 2.77 seconds.
  36. [Appendix timing, extracted as anchor] For the mass-conserving Kinetic-Mamba framework on the Syngas A dataset (Section 3.1.2), the total training time is 161677.78 seconds (44.91 hours) and the prediction time is 2.74 seconds; moving from the m-dimensional to the (m+1)-dimensional manifold takes 142.72 seconds.
  37. [Appendix timing, extracted as anchor] For the Syngas B dataset with the ignition regime-informed Kinetic-Mamba framework (Section 3.1.3), the training times are 12743.51 seconds (3.54 hours) and 185055.27 seconds (51.40 hours) for data below and above τ_K, with prediction times of 0.78 and 4.38 seconds, respectively.
  38. [Appendix timing, extracted as anchor] For the latent Kinetic-Mamba model on the GRI dataset (Section 3.1.4), the training time is 45764.18 seconds (12.71 hours) and the prediction time is 0.92 seconds; the inflation of the relative L2 error in the violin plot of Fig. 16 is attributed to small denominator values.