Quantizing Time-Series Models As Dynamical Systems: Trajectory-Based Quantization Sensitivity Score

Elizsveta Semenova; Harrison Bo Hua Zhu; Mariya Pavlova; Yingzhen Li

arXiv: 2606.13300 · v1 · pith:2WGDJ7KNnew · submitted 2026-06-11 · cs.LG

Pith reviewed 2026-06-27 07:19 UTC · model grok-4.3

classification: cs.LG

keywords: post-training quantization, time-series models, dynamical systems, sensitivity analysis, mixed-precision, error propagation, black-box models

The pith

Treating a model's rollout as a discrete-time dynamical system yields a quantization sensitivity score independent of bit-width or quantizer choice.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that quantization sensitivity for time-series models can be estimated ahead of time by viewing their sequential output as a dynamical system whose stability governs how small errors grow over many steps. This separation of sensitivity analysis from the actual quantization procedure matters because it works even when the model is black-box or has fused operators that block conventional calibration. A reader would care if the method lets practitioners allocate a limited precision budget across layers without running quantization trials or collecting calibration data. The work then builds TQS-PTQ, a mixed-precision framework that uses the score directly.

Core claim

By modeling the network's rollout as a discrete-time dynamical system, the Trajectory-based Quantization Sensitivity Score (TQS) characterizes how quantization-induced errors propagate and amplify over the rollout horizon. Unlike conventional PTQ methods where sensitivity analysis is coupled to the quantization procedure, TQS enables a priori sensitivity estimation decoupled from quantizer selection and bit-width assignment. This separation allows for quantization budget planning even for black-box or compiled networks with fused operators. Building on this, TQS-PTQ is a flexible mixed-precision framework that requires no calibration data or costly second-order approximations.

What carries the argument

The Trajectory-based Quantization Sensitivity Score (TQS), obtained by treating the network rollout as a discrete-time dynamical system to quantify error amplification over the prediction horizon.
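As a rough illustration of what "error amplification over the prediction horizon" means, the sketch below perturbs the initial state of a toy rollout and measures the average log growth per step of the gap between the perturbed and reference trajectories. This is a generic finite-time growth estimate, not the paper's TQS definition (which this page does not reproduce). The names `rollout`, `growth_rate`, `contracting`, and `expanding` are hypothetical, and the two toy maps stand in for a real model's step function.

```python
import math
import random


def rollout(step, x0, horizon):
    """Iterate x_{t+1} = step(x_t) for `horizon` steps and return the final state."""
    x = list(x0)
    for _ in range(horizon):
        x = step(x)
    return x


def growth_rate(step, x0, horizon, eps=1e-6, seed=0):
    """Average log growth per step of a small random perturbation of x0.

    Only black-box calls to `step` are used. Negative values mean the
    perturbation shrinks over the rollout; positive values mean it grows.
    """
    rng = random.Random(seed)
    direction = [rng.gauss(0.0, 1.0) for _ in x0]
    norm = math.sqrt(sum(v * v for v in direction))
    x0_pert = [xi + eps * di / norm for xi, di in zip(x0, direction)]

    ref = rollout(step, x0, horizon)
    pert = rollout(step, x0_pert, horizon)
    gap = math.sqrt(sum((a - b) ** 2 for a, b in zip(ref, pert)))
    return math.log(gap / eps) / horizon


def contracting(x):
    # Toy stable map (assumption for illustration): errors halve each step.
    return [0.5 * xi for xi in x]


def expanding(x):
    # Toy unstable map (assumption for illustration): errors grow by 1.5x each step.
    return [1.5 * xi for xi in x]


x0 = [1.0, -2.0]
print(growth_rate(contracting, x0, 20))  # negative: close to log(0.5)
print(growth_rate(expanding, x0, 20))    # positive: close to log(1.5)
```

For a real forecasting model the perturbation would presumably be injected at a specific layer rather than at the input, which is what makes a per-layer score possible; that detail is left out here.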

If this is right

Quantization budget planning becomes possible without access to model internals or calibration datasets.
Mixed-precision assignment works for compiled networks containing fused operators.
Sensitivity estimation no longer requires running the quantizer or second-order approximations.
Low-precision deployment of time-series models gains a pathway based on rollout stability rather than per-layer heuristics.
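To make the budget-planning consequence concrete, here is a minimal sketch of a greedy mixed-precision allocator driven by a precomputed sensitivity score. It is an assumption-laden illustration, not the TQS-PTQ allocator: the function name `allocate_bits`, the layer names, the scores, and the sizes are all hypothetical, and the rule (visit layers by sensitivity per parameter, give each the highest precision the remaining budget allows) is only one plausible greedy strategy.

```python
def allocate_bits(sensitivity, sizes, bit_options, budget_bits):
    """Greedy mixed-precision plan from per-layer sensitivity scores.

    sensitivity: dict layer -> score (higher means more fragile)
    sizes:       dict layer -> parameter count
    bit_options: allowed bit-widths, e.g. [4, 8, 16]
    budget_bits: total number of weight bits available
    """
    options = sorted(bit_options)
    lowest = options[0]

    # Every layer starts at the lowest precision.
    used = sum(sizes[name] * lowest for name in sensitivity)
    if used > budget_bits:
        raise ValueError("budget cannot cover the lowest precision for all layers")
    plan = {name: lowest for name in sensitivity}

    # Most sensitive layers (per parameter) get first pick of the budget.
    order = sorted(sensitivity, key=lambda n: sensitivity[n] / sizes[n], reverse=True)
    for name in order:
        for bits in reversed(options):
            extra = sizes[name] * (bits - lowest)
            if used + extra <= budget_bits:
                plan[name] = bits
                used += extra
                break
    return plan, used


# Hypothetical scores and sizes, for illustration only.
scores = {"input_embed": 9.0, "block_1": 0.5, "block_2": 0.4, "output_head": 7.0}
params = {"input_embed": 100, "block_1": 1000, "block_2": 1000, "output_head": 100}
plan, used = allocate_bits(scores, params, [4, 8, 16], budget_bits=11800)
print(plan, used)
```

Because the scores are inputs to the allocator rather than by-products of running a quantizer, the same ranking can be reused for different budgets or bit-width menus.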

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same stability view could be tested on autoregressive models outside time series, such as token-by-token generation where errors also accumulate.
If the dynamical approximation holds, training objectives that penalize unstable rollouts might improve a model's inherent quantizability.
Independence from calibration data opens the door to privacy-preserving or on-device quantization workflows.

Load-bearing premise

Modeling the network's sequential predictions as a discrete-time dynamical system accurately describes how quantization errors grow across steps.

What would settle it

Quantize each layer or operation independently and measure the resulting accuracy drop on a held-out time-series validation set. If the TQS ranking fails to predict which parts cause the largest drops, the central claim does not hold.
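One way such a check could be scored is with a Spearman rank correlation between the per-layer sensitivity scores and the measured per-layer accuracy drops. The sketch below implements it with the standard library only; the helper names and the numbers in the example are made up for illustration and are not results from the paper.

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-layer numbers, for illustration only.
sensitivity_score = [9.0, 0.5, 0.4, 7.0]
measured_drop = [0.30, 0.02, 0.01, 0.25]
print(spearman(sensitivity_score, measured_drop))
```

A correlation near zero or negative on real measurements would count against the score; a high positive value would support it, subject to the usual caveats about sample size.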

Figures

Figures reproduced from arXiv: 2606.13300 by Elizsveta Semenova, Harrison Bo Hua Zhu, Mariya Pavlova, Yingzhen Li.

**Figure 1.** TQS-PTQ extends the low-precision accuracy–compression frontier across three forecasting models. TQS-PTQ reaches competitive or improved error at substantially higher compression ratios on TimesFM-2.5 weather, Aurora-small 2 m temperature, and Pangu-Weather. Because the sensitivity ranking is computed once and reused across targets, TQS-PTQ extends the Pareto frontier without requiring a new calibration ru…

**Figure 2.** Sensitivity concentrates at the I/O boundary. γ-rank percentiles by role bucket across TimesFM-2.5, Aurora-small, and Pangu-Weather. Boxes show IQR; dots are layers; higher percentile means higher sensitivity. I/O buckets are consistently most sensitive, while body blocks are least sensitive. Bucket definitions in Appendix A.8. TQS reveals structured layer-level heterogeneity. Per-layer γ varies substantia…

**Figure 3.** Cumulative γ-shift share vs. layer share. All models show moderately heavy-tailed sensitivity.

**Figure 4.** Per-layer agreement between the Hessian curvature score tr(H_ℓ) (GPTQ-style) and the TQS score γ_ℓ^quant on the n = 70 Aurora layers where both metrics are defined. Axes use sign(x) log(1 + |x|) to handle the wide dynamic range. Hessian agrees moderately with Task-TQS-Quant (ρ = 0.47, p = 0.005, n = 34), agrees weakly with TQS-Quant, and is mildly anticorrelated with both Gaussian-probe TQS variants (ρ ≈ −0.12). …

**Figure 6.** Layer-architecture audit at C = 16, greedy allocator: tier distribution per Aurora architectural block, comparing TQS gauss / TQS quant against the Hessian-quant / Hessian-gauss baselines. Hessian methods place all output heads uniformly at BF16; TQS quant promotes five atmospheric heads to FP32, identifying the heads producing the most chaotic upper-air variables; TQS gauss protects the input-side positio…

**Figure 7.** Effective error-growth rate λ_eff per method × variable at W2-equivalent compression. Negative values indicate the quantized model’s ERA5-MAE decreases over the rollout (a known plateau / damping effect at long horizons in some variables). … based methods provide, we compare γ against the QEP error amplification ratio H∆/H across matched layers. The Spearman rank correlation is weak (ρ = 0.30, p = 0.43, n = …

**Figure 8.** Aurora-small TQS-PTQ remains close to full precision. Per-variable ERA5-MAE across C ∈ [8, 32] with 95% bootstrap confidence intervals over the 120-step rollout. TQS-PTQ overlaps the unquantized reference across all nine variables while reaching up to 32× compression.

Original abstract

We introduce the Trajectory-based Quantization Sensitivity Score (TQS), a metric that reframes post-training quantization (PTQ) through the lens of dynamical-systems stability. By modeling the network's rollout as a discrete-time dynamical system, TQS characterizes how quantization-induced errors propagate and amplify over the rollout horizon. Unlike conventional PTQ methods, where sensitivity analysis is often coupled to the quantization procedure, TQS enables a priori sensitivity estimation decoupled from quantizer selection and bit-width assignment. This separation allows for quantization budget planning even for black-box or compiled networks with fused operators. Building on this, we present TQS-PTQ, a flexible mixed-precision framework that requires no calibration data or costly second-order approximations. Our experiments show that a dynamical-systems perspective provides a robust, high-performing pathway for low-precision deployment in resource-constrained settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

TQS reframes PTQ sensitivity via dynamical systems for time-series models, which is a clean separation if it holds, but the paper leaves the black-box construction and error propagation details thin.

The main point is that this work treats the network rollout as a discrete dynamical system to compute a sensitivity score ahead of any quantization choice. That decoupling from the quantizer and bit-width assignment is the actual novelty, and it targets a practical pain point in mixed-precision PTQ for compiled or black-box time-series networks.

What stands out is the avoidance of calibration data and second-order approximations. Standard sensitivity methods often require running the quantizer or access to internals; if TQS really works from trajectories alone, that flexibility matters for deployment on constrained hardware.

The soft spot is the weakest assumption flagged in the stress-test: whether the state-transition map built from the rollout accurately tracks how discrete, state-dependent quantization errors interact with non-linearities and fused operators. The abstract asserts the score remains predictive and independent, but without the derivation, bounds, or explicit construction for black-box cases, it is not obvious this holds. Experiments are claimed to show robustness, yet the lack of visible baselines, effect sizes, or ablation on the dynamical modeling step makes it difficult to judge whether the gains are real or setup-specific.

The paper is aimed at engineers doing post-training quantization for sequential models on edge devices. A reader already working on PTQ variants could extract the TQS-PTQ framework and test it directly. It is coherent on its own terms and engages the literature enough to deserve referee time, even if the central claim needs stronger verification on the math and the black-box applicability.

Referee Report

2 major / 1 minor

Summary. The paper introduces the Trajectory-based Quantization Sensitivity Score (TQS) that reframes post-training quantization (PTQ) of time-series models by modeling network rollout as a discrete-time dynamical system to characterize propagation and amplification of quantization-induced errors over the rollout horizon. It claims this yields a priori sensitivity estimation decoupled from quantizer selection and bit-width assignment, enabling budget planning for black-box or compiled networks with fused operators, and presents the TQS-PTQ mixed-precision framework requiring no calibration data or second-order approximations.

Significance. If the dynamical-systems modeling and perturbation analysis hold, the result would be significant for PTQ in time-series settings by enabling sensitivity analysis without internal model access or calibration data, which is valuable for proprietary or compiled deployments; the separation of sensitivity from quantizer choice could simplify mixed-precision planning in resource-constrained environments.

major comments (2)

[Abstract] The central claim that TQS enables a priori sensitivity estimation decoupled from quantizer selection requires that the state-transition map be constructible without internal access and that the trajectory sensitivity remain predictive for discrete, state-dependent quantization errors; no derivation or bound is supplied showing validity once non-linearities and fused operators are present.
[Abstract] The modeling of rollout as a discrete-time dynamical system is asserted to characterize error propagation, but the stress-test concern that this may fail to capture interactions with non-linearities and state-dependent activation paths is not addressed by any provided analysis or counter-example.

minor comments (1)

The abstract references experiments demonstrating robust performance but supplies no information on datasets, baselines, or metrics, making it impossible to evaluate empirical support.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their comments highlighting important aspects of the abstract claims. We address each point below, providing clarifications from the manuscript and indicating where revisions will strengthen the presentation.

Referee: [Abstract] The central claim that TQS enables a priori sensitivity estimation decoupled from quantizer selection requires that the state-transition map be constructible without internal access and that the trajectory sensitivity remain predictive for discrete, state-dependent quantization errors; no derivation or bound is supplied showing validity once non-linearities and fused operators are present.

Authors: Section 3 constructs the state-transition map via finite-difference approximations on observed rollout trajectories, requiring only black-box forward passes and thus applicable to compiled networks with fused operators. The decoupling follows directly from computing trajectory sensitivity independently of any specific quantizer. While no general closed-form bound is derived for arbitrary discrete state-dependent errors (the analysis relies on local linearization of the perturbation dynamics), predictive validity is demonstrated empirically across non-linear models in Section 5. We will revise the abstract to note the empirical validation and add a limitations paragraph discussing the scope of the linearization assumption. revision: partial
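The finite-difference construction mentioned in this response can be pictured with a short sketch that estimates the Jacobian of a step map using only forward evaluations. It is a generic central-difference estimator under stated assumptions, not the procedure from the paper's Section 3 (which this page does not show); `fd_jacobian` and the toy `step` map are hypothetical names.

```python
import math


def fd_jacobian(step, x, h=1e-5):
    """Central-difference Jacobian of `step` at `x`, using black-box calls only.

    Returns J as a list of rows with J[i][j] ~ d step(x)[i] / d x[j].
    Costs 2 * len(x) forward passes plus one to size the output.
    """
    n_in = len(x)
    n_out = len(step(x))
    jac = [[0.0] * n_in for _ in range(n_out)]
    for j in range(n_in):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus = step(x_plus)
        f_minus = step(x_minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * h)
    return jac


def step(x):
    # Toy non-linear state-transition map (assumption for illustration).
    return [
        math.tanh(0.9 * x[0] + 0.2 * x[1]),
        math.tanh(-0.3 * x[0] + 0.8 * x[1]),
    ]


# At the origin tanh'(0) = 1, so the estimate should be close to
# [[0.9, 0.2], [-0.3, 0.8]].
print(fd_jacobian(step, [0.0, 0.0]))
```

A full Jacobian needs two forward passes per state dimension, which is impractical for large models; probing along a small number of random directions is the usual way to keep the cost manageable, at the price of a noisier estimate.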

Referee: [Abstract] The modeling of rollout as a discrete-time dynamical system is asserted to characterize error propagation, but the stress-test concern that this may fail to capture interactions with non-linearities and state-dependent activation paths is not addressed by any provided analysis or counter-example.

Authors: The discrete-time dynamical system model and its error-propagation analysis are formalized in Section 3 using the Jacobian of the state-transition map. Section 5 reports results on multiple time-series architectures containing non-linear activations and state-dependent paths (e.g., LSTMs, GRUs), where TQS remains correlated with observed quantization error. To address the referee's concern directly, we will add a dedicated subsection in the revision that discusses potential breakdown cases for highly state-dependent non-linearities and includes a counter-example together with mitigation via trajectory sampling. revision: yes
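For readers unfamiliar with the argument, the Jacobian-based error-propagation analysis referred to here usually takes the following textbook form. The notation is an assumption, since this page does not reproduce the paper's symbols: $F$ is the state-transition map, $x_t$ the full-precision state, $\tilde{x}_t$ the quantized state, $\varepsilon_t$ the error injected by quantization at step $t$, and $e_t = \tilde{x}_t - x_t$.

```latex
\begin{align}
  x_{t+1} &= F(x_t), \qquad \tilde{x}_{t+1} = F(\tilde{x}_t) + \varepsilon_t, \\
  e_{t+1} &\approx J_t\, e_t + \varepsilon_t,
    \qquad J_t = \left.\frac{\partial F}{\partial x}\right|_{x = x_t}, \\
  e_T &\approx \Bigl(\prod_{k=0}^{T-1} J_k\Bigr) e_0
    + \sum_{s=0}^{T-1} \Bigl(\prod_{k=s+1}^{T-1} J_k\Bigr) \varepsilon_s ,
\end{align}
```

Here each product is ordered with the latest Jacobian on the left, $J_{T-1} \cdots J_{s+1}$, and the empty product (for $s = T-1$) is the identity. The second line is a first-order Taylor expansion and is only accurate while $e_t$ stays small, which is exactly the linearization caveat the referee raises.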

Circularity Check

0 steps flagged

No significant circularity detected

The abstract presents TQS as a modeling reframing of PTQ via discrete-time dynamical systems without any visible equations, fitting procedures, or self-citations that reduce the central claim to its inputs by construction. The decoupling from quantizer selection is stated as a direct consequence of the modeling choice rather than derived from a fitted parameter or prior self-result. No load-bearing steps are identifiable from the provided text that would trigger any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no information on free parameters, axioms, or invented entities; ledger left empty.

[1] [1]

Proceedings of the 17th International Conference on Machine Learning , editor =

Langley, Pat , title =. Proceedings of the 17th International Conference on Machine Learning , editor =. 2000 , address =

2000

[2] [2]

, title =

Mitchell, Tom M. , title =. 1980 , address =

1980

[3] [3]

, title =

Kearns, Michael J. , title =

[4] [4]

I , publisher =

Machine Learning: An Artificial Intelligence Approach, Vol. I , publisher =. 1983 , address =

1983

[5] [5]

and Hart, Peter E

Duda, Richard O. and Hart, Peter E. and Stork, David G. , title =

[6] [6]

, title =

Newell, Allen and Rosenbloom, Paul S. , title =. Cognitive Skills and Their Acquisition , editor =. 1981 , address =

1981

[7] [7]

, title =

Samuel, Arthur L. , title =. IBM Journal of Research and Development , volume =

[8] [8]

Inverse Problems , volume =

Stable architectures for deep neural networks , author =. Inverse Problems , volume =. 2018 , doi =

2018

[9] [9]

Communications in Mathematics and Statistics , volume =

E, Weinan , title =. Communications in Mathematics and Statistics , volume =

[10] [10]

Chen, Ricky T. Q. and Rubanova, Yulia and Bettencourt, Jesse and Duvenaud, David K. , title =. Advances in Neural Information Processing Systems , volume =

[11] [11]

Finite-Time

Storm, Lina and Linander, Hampus and Bec, J. Finite-Time. Physical Review Letters , volume =. 2024 , doi =

2024

[12] [12]

arXiv preprint arXiv:2403.02579 , year =

Geometric Dynamics of Signal Propagation Predict Trainability of Transformers , author =. arXiv preprint arXiv:2403.02579 , year =

arXiv

[13] [13]

Tracking Finite-Time

W. Tracking Finite-Time. arXiv preprint arXiv:2602.09613 , year =

arXiv

[14] [14]

arXiv preprint arXiv:2602.16864 , year =

Position: Why a Dynamical Systems Perspective is Needed to Advance Time Series Modeling , author =. arXiv preprint arXiv:2602.16864 , year =

Pith/arXiv arXiv

[15] [15]

arXiv preprint arXiv:2502.05656 , year =

Flowing Through Layers: A Continuous Dynamical Systems Perspective on Transformers , author =. arXiv preprint arXiv:2502.05656 , year =

arXiv

[16] [16]

Proceedings of the 42nd International Conference on Machine Learning , year =

He, Yi and Yang, Yiming and Cheng, Xiaoyuan and Wang, Hai and Xue, Xiao and Chen, Boli and Hu, Yukun , title =. Proceedings of the 42nd International Conference on Machine Learning , year =

[17] [17]

arXiv preprint arXiv:2511.00663 , year =

Dobra, Alex and Pidstrigach, Jakiw and Reichelt, Tim and Fraccaro, Paolo and Jakubik, Johannes and Jones, Anne and Schroeder de Witt, Christian and Torr, Philip and Stier, Philip , title =. arXiv preprint arXiv:2511.00663 , year =

arXiv

[18] [18]

, title =

Hoover, Brett T. , title =

[19] [19]

arXiv preprint arXiv:2410.05988 , year =

Mittra, Tirthankar , title =. arXiv preprint arXiv:2410.05988 , year =

arXiv

[20] [20]

arXiv preprint arXiv:2308.09955 , year =

To prune or not to prune: A chaos-causality approach to principled pruning of dense neural networks , author =. arXiv preprint arXiv:2308.09955 , year =

arXiv

[21] [21]

arXiv preprint arXiv:2506.07975 , year =

Hyperpruning: Efficient Search through Pruned Variants of Recurrent Neural Networks Leveraging Lyapunov Spectrum , author =. arXiv preprint arXiv:2506.07975 , year =

arXiv

[22] [22]

2026 , journal =

Structural Sensitivity in Compressed Transformers: Relative Error Propagation and Layer Removal , author =. 2026 , journal =

2026

[23] [23]

and Lucic, Ana and others , title =

Bodnar, Cristian and Bruinsma, Wessel P. and Lucic, Ana and others , title =. Nature , year =

[24] [24]

Nature , volume =

Bi, Kaifeng and Xie, Lingxi and Zhang, Hengheng and Chen, Xin and Gu, Xiaotao and Tian, Qi , title =. Nature , volume =

[25] [25]

Journal of Computer Science and Technology , volume =

Chen, Yao-Dong and Zheng, Kai-Jun and Guo, Zhen-Hua and others , title =. Journal of Computer Science and Technology , volume =. 2026 , doi =

2026

[26] [26]

arXiv preprint arXiv:2602.02110 , year =

Fu, Zhongqian and Zhao, Tianyi and Han, Kai and Zhou, Hang and Chen, Xinghao and Wang, Yunhe , title =. arXiv preprint arXiv:2602.02110 , year =

arXiv

[27] [27]

International Conference on Learning Representations , year =

Frantar, Elias and Ashkboos, Saleh and Hoefler, Torsten and Alistarh, Dan , title =. International Conference on Learning Representations , year =

[28] [28]

arXiv preprint arXiv:2504.02692 , year =

Li, Yuhang and others , title =. arXiv preprint arXiv:2504.02692 , year =

arXiv

[29] Arai, Yamato and Ichikawa, Yuma. arXiv preprint arXiv:2504.09629.

[30] Lin, Ji and Tang, Jiaming and Tang, Haotian and Yang, Shang and Chen, Wei-Ming and Wang, Wei-Chen and Xiao, Guangxuan and Dang, Xingyu and Gan, Chuang and Han, Song. Proceedings of Machine Learning and Systems.

[31] Dong, Zhen and Yao, Zhewei and Gholami, Amir and Mahoney, Michael W. and Keutzer, Kurt. Proceedings of the IEEE/CVF International Conference on Computer Vision.

[32] Dong, Zhen and Yao, Zhewei and Cai, Yaohui and Arfeen, Daiyaan and Gholami, Amir and Mahoney, Michael W. and Keutzer, Kurt. Advances in Neural Information Processing Systems.

[33] Dettmers, Tim and Lewis, Mike and Belkada, Younes and Zettlemoyer, Luke. Advances in Neural Information Processing Systems.

[34] Gholami, Amir and Kim, Sehoon and Dong, Zhen and Yao, Zhewei and Mahoney, Michael W. and Keutzer, Kurt. arXiv preprint arXiv:2103.13630.

[35] Gong, Ruihao and others. arXiv preprint arXiv:2409.16694.

[36] Liu, Jiawei and Niu, Lin and Yuan, Zhihang and Yang, Dawei and Wang, Xinggang and Liu, Wenyu. arXiv preprint arXiv:2212.07048.

[37] Kim, Jinuk and El Halabi, Marwa and Park, Wonpyo and Schaefer, Clemens J. S. and Lee, Deokjae and Park, Yeonhong and Lee, Jae W. and Song, Hyun Oh. Proceedings of the 42nd International Conference on Machine Learning.

[38] Kogan, Maxim and others. arXiv preprint.

[39] Sun, Mingjie and Liu, Zhuang and Bair, Anna and Kolter, J. Zico. arXiv preprint arXiv:2306.11695.

[40] Anonymous. arXiv preprint arXiv:2410.03294.

[41] Das, Abhimanyu and Kong, Weihao and Sen, Rajat and Zhou, Yichen. Proceedings of the 41st International Conference on Machine Learning.

[42] Hersbach, Hans and Bell, Bill and Berrisford, Paul and Hirahara, Shoji and Hor. The. Quarterly Journal of the Royal Meteorological Society.

[43] Engelken, Rainer and Wolf, Fred and Abbott, L. F. Physical Review Research, 2023.

[44] Lorenz, Edward N. Journal of the Atmospheric Sciences.

[45] Vannitsem, St. Predictability of Large-Scale Atmospheric Motions:. Chaos: An Interdisciplinary Journal of Nonlinear Science.

[46] The General Problem of the Stability of Motion. 1992.

[47] A multiplicative ergodic theorem. Characteristic Ljapunov, exponents of dynamical systems. Trudy Moskovskogo Matematicheskogo Obshchestva, 1968.

[48] Up or Down? Adaptive Rounding for Post-Training Quantization. Proceedings of the 37th International Conference on Machine Learning.

[49] Xiao, Guangxuan and Lin, Ji and Seznec, Mickael and Wu, Hao and Demouth, Julien and Han, Song. 2023.

[50] Ranjan, Navin and Savakis, Andreas. Mix-

[51] A Proposal on Machine Learning via Dynamical Systems. Communications in Mathematics and Statistics, 2017.

[52] A Dynamical Systems Perspective on the Analysis of Neural Networks. arXiv preprint arXiv:2507.05164.

[53] On Lyapunov Exponents for RNNs: Understanding Information Propagation Using Dynamical Systems Tools. Frontiers in Applied Mathematics and Statistics, 2022.

[54] Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting. Proceedings of the AAAI Conference on Artificial Intelligence.

[55] Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks. Proceedings of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, 2018.

[56] Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting. Advances in Neural Information Processing Systems.

[57] Chee, Jerry and Cai, Yaohui and Kuleshov, Volodymyr and De Sa, Christopher.

[58] Yao, Zhewei and Aminabadi, Reza Yazdani and Zhang, Minjia and Wu, Xiaoxia and Li, Conglong and He, Yuxiong.

[59] Identifying Sensitive Weights via Post-quantization Integral. arXiv preprint arXiv:2503.01901.

[60] Neural General Circulation Models for Weather and Climate. Nature, 2024.

[61] Pathak, Jaideep and Subramanian, Shashank and Harrington, Peter and Raja, Sanjeev and Chattopadhyay, Ashesh and Mardani, Morteza and Kurth, Thorsten and Hall, David and Li, Zongyi and Azizzadenesheli, Kamyar and Hassanzadeh, Pedram and Kashinath, Karthik and Anandkumar, Animashree. 2022.

[62] Learning Skillful Medium-Range Global Weather Forecasting. Science, 2023.

[63] Early Warnings for All. 2022.

[64] June 2025.

[65] Enforcing Analytic Constraints in Neural Networks Emulating Physical Systems. Physical Review Letters, 2021.

[66] Sturm, Philipp O. and Wexler, Anthony S. Conservation Laws in a Neural Network Architecture: Enforcing the Atom Balance of a Julia-Based Photochemical Model to Assess Stratospheric. 2022.

[67] Hard-Constrained Deep Learning for Climate Downscaling. Journal of Machine Learning Research.

[68] On Some Limitations of Current Machine Learning Weather Prediction Models. Geophysical Research Letters, 2024.

[69] 2022.

[70] Multiple choice knapsack problem. Operations Research, 1979.