Learning Koopman Models From Data Under General Noise Conditions
Pith reviewed 2026-05-19 05:00 UTC · model grok-4.3
The pith
A multiple-shooting formulation with deep encoders yields statistically consistent Koopman models from input-output data under general noise.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that their approach produces Koopman models whose parameters can be estimated consistently from input-output data alone, with the estimation error vanishing as the data volume grows. The approach combines deep state-space encoders based on state reconstructability, a multiple-shooting formulation of the squared prediction-error loss, and a model structure that includes an innovation noise term. Because the multiple-shooting loss can be evaluated in parallel over subsections of the data, it supports efficient optimization while retaining strong long-horizon predictions.
What carries the argument
Deep state-space encoders that exploit the state reconstructability property to recover the Koopman lifted state, paired with a multiple-shooting formulation of the prediction error loss and an innovation noise model.
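As a reading aid, the three ingredients named above can be written in one possible notation. This is a sketch under assumptions, not the paper's own equations: the symbols $A$, $B(\cdot)$, $C$, $K$, $\psi_\theta$ and the lag counts $n_a$, $n_b$ are chosen here for illustration, and the input term may take a different form in the paper.

```latex
\begin{aligned}
z_{k+1} &= A z_k + B(z_k)\,u_k + K e_k, \\
y_k &= C z_k + e_k, \\
\hat z_k &= \psi_\theta\!\left(y_{k-n_a}, \dots, y_{k-1},\; u_{k-n_b}, \dots, u_{k-1}\right).
\end{aligned}
```

Here $z_k$ is the lifted state, $e_k$ the innovation sequence that stands in for process and measurement noise, and $\psi_\theta$ the deep encoder that estimates the lifted state from past inputs and outputs.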
If this is right
- The estimator is statistically consistent, so the estimation error tends to zero as the number of data points goes to infinity.
- Multiple shooting enables parallel computation of the loss over data segments for faster batch training.
- Obtained models achieve excellent long-term prediction on nonlinear benchmarks and quadcopter flight data.
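The parallel evaluation over data segments mentioned above can be sketched as follows. This is a minimal NumPy illustration under assumptions, not the authors' implementation: the function name `multiple_shooting_loss`, the callable `encode`, the constant input matrix `B`, and the optional innovation gain `K` are all hypothetical choices made for this example.

```python
import numpy as np

def multiple_shooting_loss(u, y, encode, A, B, C, n_lags, horizon, K=None):
    """Mean squared prediction error over independent sections of one record.

    u, y    : arrays of shape (N, nu) and (N, ny)
    encode  : hypothetical callable mapping stacked past windows,
              shape (S, n_lags * (nu + ny)), to lifted states of shape (S, nz)
    A, B, C : matrices of an assumed linear lifted model (B kept constant here)
    K       : optional innovation gain of shape (nz, ny); None means pure simulation
    """
    N = len(y)
    # Each section needs n_lags past samples and `horizon` samples to predict.
    starts = np.arange(n_lags, N - horizon + 1, horizon)
    # One row per section: [u_{k-n_lags..k-1}, y_{k-n_lags..k-1}].
    past = np.stack([
        np.concatenate([u[k - n_lags:k].ravel(), y[k - n_lags:k].ravel()])
        for k in starts
    ])
    z = encode(past)                     # initial lifted state of every section
    sq_err = 0.0
    for t in range(horizon):             # sequential in time ...
        idx = starts + t                 # ... but all sections advance together
        err = y[idx] - z @ C.T
        sq_err += np.sum(err ** 2)
        z = z @ A.T + u[idx] @ B.T
        if K is not None:                # innovation-form correction (assumed form)
            z = z + err @ K.T
    return sq_err / (len(starts) * horizon)

# Usage on a noise-free toy system y[k+1] = 0.9 y[k] + 0.5 u[k] (illustrative only).
rng = np.random.default_rng(0)
u = rng.standard_normal((201, 1))
y = np.zeros((201, 1))
for k in range(200):
    y[k + 1] = 0.9 * y[k] + 0.5 * u[k]
A, B, C = np.array([[0.9]]), np.array([[0.5]]), np.array([[1.0]])
exact_encoder = lambda p: 0.9 * p[:, 1:2] + 0.5 * p[:, 0:1]   # z_k = y_k in this toy case
loss = multiple_shooting_loss(u, y, exact_encoder, A, B, C, n_lags=1, horizon=20)
```

In this toy case the lifted state equals the output, so the exact encoder and the true matrices give a loss at rounding level. In the setting the paper describes, `encode` would be a trained deep network and the model matrices would be learned jointly with it.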
Where Pith is reading between the lines
- Such models could support predictive control without requiring direct state sensors.
- Similar techniques might extend to other lifted representations beyond Koopman.
- Validation on systems with structured noise could show where the general innovation term suffices or needs refinement.
Load-bearing premise
The nonlinear system must allow state reconstruction from inputs and outputs through the deep encoder, and the single innovation noise term must sufficiently describe all process and measurement disturbances.
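To make the premise concrete, an encoder of this kind needs a window of past inputs and outputs as its argument. The sketch below shows one way to build such windows with NumPy; the function name `past_io_windows` and the window layout are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def past_io_windows(u, y, n_lags):
    """Stack the n_lags most recent inputs and outputs before each time k.

    u, y : arrays of shape (N, nu) and (N, ny)
    Returns shape (N - n_lags, n_lags * (nu + ny)); row i holds
    [u_{k-n_lags..k-1}, y_{k-n_lags..k-1}] for k = n_lags + i.
    """
    # sliding_window_view appends the window axis last: (N - n_lags + 1, nu, n_lags).
    # The final window has no following sample to predict, so it is dropped.
    u_win = sliding_window_view(u, n_lags, axis=0)[:-1]
    y_win = sliding_window_view(y, n_lags, axis=0)[:-1]
    # Put time before channel inside each window, then flatten each row.
    u_flat = np.moveaxis(u_win, -1, 1).reshape(len(u_win), -1)
    y_flat = np.moveaxis(y_win, -1, 1).reshape(len(y_win), -1)
    return np.hstack([u_flat, y_flat])

# Small example with N = 5 samples and one input and one output channel.
u_demo = np.arange(5.0).reshape(5, 1)
y_demo = 10.0 * u_demo
windows = past_io_windows(u_demo, y_demo, n_lags=2)
```

Whether a finite window of this kind determines the lifted state is exactly what the reconstructability premise asserts; the code only prepares the data and does not test that property.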
What would settle it
If increasing the amount of training data from the quadcopter or benchmark systems does not reduce the long-term prediction error or the parameter estimation discrepancy, the statistical consistency claim would be refuted.
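The shape of such a check can be illustrated on a deliberately simple stand-in. The sketch below is not the paper's method: it swaps in ordinary least squares on a scalar linear system with white equation noise, where consistency is a textbook result, only to show how parameter error can be tracked against the number of data points. The function name `ls_error` and all numerical values are assumptions.

```python
import numpy as np

def ls_error(N, a=0.8, b=0.5, noise_std=0.1, seed=0):
    """Parameter error of least squares on y[k+1] = a*y[k] + b*u[k] + e[k]."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(N)
    e = noise_std * rng.standard_normal(N)
    y = np.zeros(N + 1)
    for k in range(N):
        y[k + 1] = a * y[k] + b * u[k] + e[k]
    Phi = np.column_stack([y[:-1], u])            # regressors [y[k], u[k]]
    theta, *_ = np.linalg.lstsq(Phi, y[1:], rcond=None)
    return np.linalg.norm(theta - np.array([a, b]))

sizes = [100, 1_000, 10_000, 100_000]
errors = [ls_error(N) for N in sizes]
for N, err in zip(sizes, errors):
    print(f"N = {N:>7d}   parameter error = {err:.2e}")
```

For the approach under review, the analogous experiment would retrain the encoder and model on growing subsets of the benchmark or quadcopter data and record long-horizon prediction error on held-out data, since true parameters are not available for experimental systems.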
Original abstract
This paper presents a novel identification approach of Koopman models of nonlinear systems with inputs under rather general noise conditions. The method uses deep state-space encoders based on the concept of state reconstructability and an efficient multiple-shooting formulation of the squared loss of the prediction error to estimate the dynamics and the lifted state only from input-output data. Furthermore, the Koopman model structure includes an innovation noise term that is used to handle process and measurement noise. It is shown that the proposed approach is statistically consistent (estimation error tends to zero when the number of data points goes to infinity) and computationally efficient due to the multiple-shooting formulation, by which the prediction error of the model can be calculated on multiple subsections of the data in parallel. The latter allows for efficient batch optimization of the network parameters and, at the same time, excellent long-term prediction capabilities of the obtained models. The performance of the approach is illustrated by nonlinear benchmark examples and experimental data from a Crazyflie 2.1 quadcopter.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a method for identifying Koopman models of nonlinear systems with inputs from noisy input-output data. It uses deep state-space encoders based on state reconstructability, augments the model with an innovation noise term to handle process and measurement disturbances, and employs a multiple-shooting formulation of the squared prediction-error loss. The central claims are statistical consistency (estimation error tends to zero as the number of data points N goes to infinity) and computational efficiency due to parallelizable multiple shooting, with demonstrated long-term prediction performance on nonlinear benchmarks and Crazyflie 2.1 quadcopter experiments.
Significance. If the consistency result is rigorously established, this would advance data-driven modeling and control of nonlinear systems in realistic noisy settings by extending Koopman operator methods with deep encoders and efficient optimization. The multiple-shooting approach for scalable batch optimization and long-term prediction is a clear strength. Experimental validation on hardware data adds practical value.
major comments (2)
- [Theoretical analysis / consistency result] The statistical consistency claim (estimation error → 0 as N → ∞) is load-bearing but rests on the deep encoders recovering an exact lifted representation under the reconstructability property while the innovation term absorbs all disturbances. For general (non-white, possibly state-dependent) noise, reconstructability from I/O data alone is not guaranteed for arbitrary nonlinear systems, and no mechanism enforces it; residual reconstruction error would prevent convergence of the empirical minimizer to true parameters. A detailed proof sketch, error bounds, or additional identifiability conditions are needed to support this.
- [Method / multiple-shooting formulation] The multiple-shooting formulation is credited with both efficiency and excellent long-term prediction, but it is unclear how the parallel subsection loss interacts with the innovation noise term to preserve consistency. The abstract states the loss is computed on multiple subsections in parallel; a specific derivation showing that this does not introduce bias or violate the asymptotic properties would clarify the central efficiency-consistency tradeoff.
minor comments (2)
- [Abstract] The abstract refers to 'rather general noise conditions' without specifying the precise class (e.g., bounded moments, independence assumptions); a short clarifying sentence would improve precision.
- [Method] Notation for the lifted state and innovation term should be introduced with explicit definitions early in the method section to aid readability.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment below with clarifications on our approach and indicate planned revisions to strengthen the presentation.
- Referee: [Theoretical analysis / consistency result] The statistical consistency claim (estimation error → 0 as N → ∞) is load-bearing but rests on the deep encoders recovering an exact lifted representation under the reconstructability property while the innovation term absorbs all disturbances. For general (non-white, possibly state-dependent) noise, reconstructability from I/O data alone is not guaranteed for arbitrary nonlinear systems, and no mechanism enforces it; residual reconstruction error would prevent convergence of the empirical minimizer to true parameters. A detailed proof sketch, error bounds, or additional identifiability conditions are needed to support this.
Authors: We appreciate the referee's emphasis on rigorously supporting the consistency result. The method relies on deep encoders designed around the state reconstructability property to recover a lifted representation from input-output data, with the innovation term explicitly modeling general disturbances (including non-white and state-dependent noise) so that the prediction-error objective remains well-defined. While the manuscript establishes consistency under these modeling assumptions as N tends to infinity, we agree that the current presentation would benefit from greater detail. In the revision we will add an appendix containing a proof sketch that outlines the key steps: (i) uniform approximation of the reconstructability map by the deep network class, (ii) convergence of the empirical minimizer of the innovation-augmented loss to the population risk, and (iii) identifiability of the Koopman parameters once the lifted state is recovered. We will also state the precise technical conditions (e.g., persistence of excitation and network capacity) under which the result holds. revision: yes
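Steps (ii) and (iii) of the promised proof sketch follow a standard prediction-error pattern, which can be written out as below. This is a generic sketch under assumptions rather than the paper's theorem: the symbols $V_N$, $\bar V$ and $\Xi$ are introduced here, and $\Xi_o$ is assumed to denote the set of minimizers of the limiting criterion.

```latex
\begin{aligned}
V_N(\xi) &= \frac{1}{N}\sum_{k=1}^{N} \bigl\| y_k - \hat y_k(\xi) \bigr\|_2^2, \qquad
\hat\xi_N = \arg\min_{\xi \in \Xi} V_N(\xi), \\
\bar V(\xi) &= \lim_{N\to\infty} \mathbb{E}\, V_N(\xi), \qquad
\Xi_o = \arg\min_{\xi\in\Xi} \bar V(\xi), \\
&\sup_{\xi\in\Xi} \bigl| V_N(\xi) - \bar V(\xi) \bigr| \to 0 \ \text{w.p. 1}
\;\Longrightarrow\; \hat\xi_N \to \Xi_o \ \text{w.p. 1 as } N \to \infty .
\end{aligned}
```

The implication in the last line typically requires a compact parameter set $\Xi$ and a criterion that is continuous in $\xi$; the technical conditions the authors promise to state would need to cover these points.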
- Referee: [Method / multiple-shooting formulation] The multiple-shooting formulation is credited with both efficiency and excellent long-term prediction, but it is unclear how the parallel subsection loss interacts with the innovation noise term to preserve consistency. The abstract states the loss is computed on multiple subsections in parallel; a specific derivation showing that this does not introduce bias or violate the asymptotic properties would clarify the central efficiency-consistency tradeoff.
Authors: The multiple-shooting loss is formed by partitioning the data into contiguous subsections and summing the squared prediction errors (each computed with its own innovation sequence) across all subsections. Because the subsections are non-overlapping and together exhaust the full dataset, the total objective is mathematically identical to the single-shooting prediction-error loss; the innovation term is simply applied segment-wise. Consequently, the empirical risk minimizer and its asymptotic convergence properties remain unchanged. In the revised manuscript we will include a short derivation in the main text (or appendix) that explicitly shows the equivalence of the summed subsection losses to the full-trajectory loss and confirms that the parallel formulation introduces neither bias nor alteration to the consistency argument. revision: yes
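The partition argument in this response can be stated compactly. The notation is assumed for illustration: $\mathcal{I}_i$ is the index set of subsection $i$ and $S$ the number of subsections.

```latex
\sum_{k=1}^{N} \bigl\| y_k - \hat y_k \bigr\|_2^2
= \sum_{i=1}^{S} \sum_{k \in \mathcal{I}_i} \bigl\| y_k - \hat y_k \bigr\|_2^2,
\qquad \mathcal{I}_i \cap \mathcal{I}_j = \emptyset \ (i \neq j), \quad
\bigcup_{i=1}^{S} \mathcal{I}_i = \{1,\dots,N\}.
```

The identity holds term by term only if the prediction $\hat y_k$ at each time step is the same on both sides. In a multiple-shooting scheme each subsection starts from its own encoder-estimated state, so the promised derivation would need to show under which conditions the two predictions coincide or differ only by asymptotically negligible terms.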
Circularity Check
No significant circularity: consistency claim rests on explicit assumptions and multiple-shooting loss without reduction to fitted inputs
The paper states that statistical consistency holds under the state reconstructability property and with an added innovation noise term that absorbs disturbances. The multiple-shooting formulation is introduced solely for computational efficiency in parallel loss evaluation on data subsections; it does not define or force the consistency result. No equations or self-citations in the provided abstract reduce the consistency claim to a tautology or to parameters fitted from the target quantity itself. The derivation therefore remains self-contained against external benchmarks once the reconstructability assumption is granted.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The nonlinear system admits a state reconstructability property that allows recovery of the lifted Koopman state from input-output trajectories.
- domain assumption: An additive innovation noise term inside the Koopman model structure is sufficient to capture both process and measurement disturbances.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: The method uses deep state-space encoders based on the concept of state reconstructability and an efficient multiple-shooting formulation of the squared loss of the prediction error to estimate the dynamics and the lifted state only from input-output data.
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · tag: unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Theorem 4.2. Under the conditions of Theorem 4.1 and Condition 3, lim_{N→∞} ξ̂_N ∈ Ξ_o with probability 1.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.