
arXiv: 2403.02290 · v2 · submitted 2024-03-04 · 💻 cs.AI · cs.LG · math.DS · math.OC

Koopman-Assisted Reinforcement Learning

Pith reviewed 2026-05-24 02:47 UTC · model grok-4.3

classification 💻 cs.AI · cs.LG · math.DS · math.OC
keywords Koopman operator · reinforcement learning · soft actor-critic · value function · Hamilton-Jacobi-Bellman · nonlinear dynamics · controlled systems · data-driven methods

The pith

The controlled Koopman tensor linearizes value function evolution so that soft actor-critic and value iteration become tractable for nonlinear systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops two reinforcement learning algorithms that lift nonlinear dynamics into approximately linear coordinates using the data-driven Koopman operator. By parameterizing this operator with control actions to form a controlled Koopman tensor, the expectation of the value function's time evolution is captured through linear dynamics, which makes the Hamilton-Jacobi-Bellman equation easier to handle. This reformulation of soft value iteration and soft actor-critic applies to deterministic and stochastic systems in both discrete and continuous time. A sympathetic reader would care because the standard Bellman equations are intractable for high-dimensional nonlinear cases, and the approach yields measurable gains over neural-network baselines on concrete test systems.
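
To make the lifting idea concrete, here is a minimal sketch on a toy system that is not taken from the paper. The map `step`, the constants `LAM` and `MU`, and the dictionary `lift` are all illustrative assumptions, chosen so that the nonlinear dynamics close exactly on the observables (x1, x2, x1^2). A plain least-squares fit, in the spirit of extended dynamic mode decomposition, then recovers a matrix that advances the lifted state linearly.

```python
import numpy as np

# Hypothetical toy system (not from the paper): a discrete-time nonlinear map
# whose dynamics close exactly on the observables (x1, x2, x1**2).
LAM, MU = 0.9, 0.5


def step(x):
    """One step of the nonlinear map."""
    x1, x2 = x
    return np.array([LAM * x1, MU * x2 + (LAM**2 - MU) * x1**2])


def lift(x):
    """Dictionary of observables phi(x) = [x1, x2, x1^2]."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])


def fit_koopman(n_samples=200, seed=0):
    """Least-squares fit of K such that phi(x') ~= K @ phi(x)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n_samples, 2))
    Phi = np.stack([lift(x) for x in X])             # shape (n, 3)
    Phi_next = np.stack([lift(step(x)) for x in X])  # shape (n, 3)
    # Solve Phi @ K.T ~= Phi_next in the least-squares sense.
    K_T, *_ = np.linalg.lstsq(Phi, Phi_next, rcond=None)
    return K_T.T


K = fit_koopman()

# Roll the nonlinear map and the lifted linear model forward side by side.
x = np.array([0.5, -0.3])
z = lift(x)
for _ in range(5):
    x = step(x)
    z = K @ z
```

In this constructed case the lifted prediction `z` matches `lift(x)` up to numerical precision. For systems such as Lorenz or cylinder flow no finite dictionary is known to close exactly, so the same fit would only be approximate.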

Core claim

By constructing a controlled Koopman tensor from data, the method reformulates soft value iteration and soft actor-critic to estimate the optimal value function using linear dynamics in lifted coordinates. The paper reports state-of-the-art performance relative to neural-network soft actor-critic baselines on a linear state-space system, the Lorenz system, fluid flow past a cylinder, and a double-well potential with non-isotropic stochastic forcing.

What carries the argument

The controlled Koopman tensor, which parameterizes the data-driven Koopman operator by control actions so that the expected time evolution of the value function is captured by linear dynamics.
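
One way to picture an action-parameterized operator is sketched below. It rests on assumptions that are not confirmed by this page: a scalar system x' = A*x + B*u, a state dictionary `phi`, an action dictionary `psi`, and a fit by ordinary least squares on Kronecker-product features. The names `fit_tensor` and `koopman_for_action` are hypothetical. The fitted three-way array is contracted with `psi(u)` to give a matrix K(u) that advances the lifted state for that action.

```python
import numpy as np

A, B = 0.9, 0.5  # hypothetical scalar system x' = A*x + B*u (not from the paper)


def phi(x):
    """State dictionary [1, x]."""
    return np.array([1.0, x])


def psi(u):
    """Action dictionary [1, u]."""
    return np.array([1.0, u])


def fit_tensor(n_samples=300, seed=0):
    """Fit T so that phi(x') ~= sum_k psi_k(u) * T[:, k, :] @ phi(x)."""
    rng = np.random.default_rng(seed)
    xs = rng.uniform(-2.0, 2.0, n_samples)
    us = rng.uniform(-1.0, 1.0, n_samples)
    # Regressors are Kronecker products psi(u) (x) phi(x); targets are phi(x').
    Z = np.stack([np.kron(psi(u), phi(x)) for x, u in zip(xs, us)])
    Y = np.stack([phi(A * x + B * u) for x, u in zip(xs, us)])
    M_T, *_ = np.linalg.lstsq(Z, Y, rcond=None)  # Z @ M_T ~= Y
    d_phi, d_psi = 2, 2
    # Column index k * d_phi + j of M maps to tensor entry T[i, k, j].
    return M_T.T.reshape(d_phi, d_psi, d_phi)


def koopman_for_action(T, u):
    """Contract the tensor with psi(u) to get the action-dependent matrix K(u)."""
    return np.einsum("ikj,k->ij", T, psi(u))


T = fit_tensor()
K_u = koopman_for_action(T, 0.3)
```

For this linear toy system the dictionaries are closed, so `K_u @ phi(x)` reproduces `phi(A * x + B * 0.3)` exactly. With richer dynamics the choice of both dictionaries becomes the main modeling decision.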

If this is right

  • The framework covers deterministic and stochastic systems as well as discrete and continuous dynamics.
  • Reformulated soft actor-critic exceeds traditional neural-network soft actor-critic on the four tested systems.
  • Value-function estimation reduces to linear operations once the system is lifted by the controlled Koopman tensor.
  • The same tensor construction supports both soft value iteration and soft actor-critic.
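
The reduction to linear operations can be illustrated with a small sketch. If the value function is written as V(x) = w^T phi(x), then the expected next-step value under action u becomes w^T K(u) phi(x), and a soft Bellman backup over a discrete action set is a log-sum-exp followed by a regression. Everything specific here is an assumption for illustration: the scalar system, the quadratic reward, the three-element action set, a temperature of 1, and the fitted-iteration loop. It is not the paper's algorithm.

```python
import numpy as np

A, B, GAMMA = 0.9, 0.5, 0.9            # hypothetical system and discount factor
ACTIONS = np.array([-1.0, 0.0, 1.0])   # hypothetical discrete action set


def phi(x):
    """State dictionary [1, x, x^2]."""
    return np.array([1.0, x, x**2])


def koopman_for_action(u):
    """Exact K(u) for x' = A*x + B*u on the dictionary [1, x, x^2]."""
    m = B * u
    return np.array([[1.0, 0.0, 0.0],
                     [m, A, 0.0],
                     [m**2, 2 * A * m, A**2]])


def reward(x, u):
    """Hypothetical quadratic reward penalizing state and action."""
    return -(x**2) - 0.1 * u**2


def soft_value_target(w, x):
    """Log-sum-exp over actions of r(x, u) + gamma * w^T K(u) phi(x)."""
    q = np.array([reward(x, u) + GAMMA * (w @ koopman_for_action(u) @ phi(x))
                  for u in ACTIONS])
    q_max = q.max()  # subtract the max for numerical stability
    return q_max + np.log(np.exp(q - q_max).sum())


STATES = np.linspace(-2.0, 2.0, 41)
PHI = np.stack([phi(x) for x in STATES])


def soft_backup(w):
    """One fitted soft Bellman backup: regress the targets onto the dictionary."""
    targets = np.array([soft_value_target(w, x) for x in STATES])
    w_new, *_ = np.linalg.lstsq(PHI, targets, rcond=None)
    return w_new


w = np.zeros(3)
for _ in range(200):
    w = soft_backup(w)
```

Each sweep costs only matrix-vector products and one small least-squares solve. The regression step is an approximation, because the log-sum-exp of quadratics is not itself quadratic.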

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The linear structure in lifted space could make learned policies easier to analyze or verify than black-box neural policies.
  • The method might combine with other data-driven linearization techniques for hybrid control algorithms on robotic systems.
  • Performance on fluid and chaotic examples suggests the tensor could scale to additional physical domains with similar lifting properties.

Load-bearing premise

The Koopman operator, once parameterized by control actions into a controlled Koopman tensor, accurately captures the expectation of the time evolution of the value function for the systems considered.

What would settle it

Running the Koopman-assisted soft actor-critic on the fluid flow past a cylinder or Lorenz system and finding no performance improvement over standard neural-network soft actor-critic would falsify the state-of-the-art claim.

Figures

Figures reproduced from arXiv: 2403.02290 by Edward Mehrez, Ludger Paehler, Preston Rozwood, Steven L. Brunton, Wen Sun.

Figure 1: Koopman-assisted reinforcement learning in the example of the Soft Actor Koopman
Figure 2: Soft Koopman Value Iteration, a Koopman variant of the widely used value iteration
Figure 3: Construction of action-dependent Koopman operators
Figure 4: Schematic of Koopman with control. (left) A nominal trajectory is shown for a given …
Figure 5: Four benchmark problems investigated: (a) simple linear system; (b) Lorenz 1963 model; …
Figure 6: Episodic returns of the evaluation environments for the compared algorithms. We consider …
Figure 7: Zoomed in episodic return of the Linear System. The first 24000, as well as the last 10000 steps in the environment have been omitted to focus on the performance of the algorithms
Figure 8: Zoomed in episodic return for the Fluid Flow, and the Stochastic Double Well Potential, where the first 10000 steps in the environment have not been included due to the large variability of the episodic return during the initial 10000 steps. The Soft Koopman Value Iteration (SKVI) shows here that its performance lags behind that of the actor-critic based algorithms. The value-based Soft Actor-Critic (SAC …
Figure 9: Episodic returns rollouts associated with the soft-max policy of the optimal value function, …
Figure 10: Ablation analysis of the compute afforded to the Soft Koopman Value Iteration (SKVI)
Figure 11: Ablation analysis of the order of monomials used for SAKC. The …
Figure 12: Ablation over the amount of compute allocated to construct the Koopman tensor in …
Original abstract

The Bellman equation and its continuous form, the Hamilton-Jacobi-Bellman equation, are ubiquitous in reinforcement learning and control theory. However, these equations become intractable for high-dimensional or nonlinear systems. This paper develops two new reinforcement learning algorithms based on the data-driven Koopman operator, which lifts a nonlinear system into new coordinates where the dynamics become approximately linear, and where Hamilton-Jacobi-Bellman-based methods are more tractable. In particular, the Koopman operator captures the expectation of the time evolution of the value function via linear dynamics in the lifted coordinates. By parameterizing the Koopman operator with the control actions, we construct a "controlled Koopman tensor" that facilitates the estimation of the optimal value function. This enables us to reformulate two max-entropy RL algorithms: soft value iteration and soft actor-critic. This flexible and interpretable framework includes deterministic and stochastic systems, as well as discrete and continuous dynamics. Koopman Assisted reinforcement learning attains state-of-the-art performance with respect to traditional neural network-based soft actor-critic baselines on a linear state-space system, the Lorenz system, fluid flow past a cylinder, and a double-well potential with non-isotropic stochastic forcing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper introduces two max-entropy RL algorithms (Koopman-assisted soft value iteration and soft actor-critic) that construct a data-driven controlled Koopman tensor by parameterizing the Koopman operator with control inputs. This tensor is used to obtain linear dynamics in lifted coordinates that approximate the expectation of the value-function evolution under the Bellman operator, thereby making HJB-based methods tractable for nonlinear and stochastic systems. The central empirical claim is that the resulting algorithms attain state-of-the-art performance relative to standard neural-network soft actor-critic baselines on a linear state-space system, the Lorenz attractor, fluid flow past a cylinder, and a double-well potential with non-isotropic stochastic forcing.

Significance. If the controlled Koopman tensor accurately recovers (or bounds) the controlled expectation of the value function after finite lifting and data-driven fitting, the framework supplies an interpretable, potentially lower-dimensional alternative to deep RL that inherits linear-algebraic tools while retaining the max-entropy objective. The explicit handling of both deterministic/stochastic and discrete/continuous cases, together with the reported outperformance on the four benchmark systems, would constitute a concrete advance for systems where suitable observables exist.

major comments (3)
  1. [§3.2] §3.2 (Controlled Koopman tensor construction): the claim that the tensor 'facilitates the estimation of the optimal value function' by capturing the expectation of the time evolution under the Bellman operator is load-bearing for all four experimental systems, yet the manuscript provides neither an a-priori error bound on the lifted approximation nor a quantitative residual analysis of ||K_u V - E[V(x_{t+1}) | u]|| for the stochastic double-well and cylinder-flow cases.
  2. [§4.3, Table 3] §4.3 and Table 3 (Lorenz and cylinder results): the reported SOTA margins are obtained after finite data-driven fitting of the tensor; without an ablation on observable choice, lifting dimension, or tensor rank, it is impossible to determine whether the performance gain is attributable to the Koopman linearization or to implicit regularization that the neural SAC baseline does not receive.
  3. [§5.1] §5.1 (Soft actor-critic reformulation): the actor update is derived under the assumption that the controlled tensor yields exact linear dynamics for the soft value function; any mismatch between the tensor action and the true controlled expectation propagates directly into the policy gradient, yet no sensitivity analysis or robustness check against tensor approximation error is supplied.
minor comments (3)
  1. [§3.2] Notation for the controlled tensor K_u is introduced without an explicit statement of its dimensions or the precise least-squares objective used for its data-driven estimation.
  2. [Figure 4] Figure 4 (double-well trajectories) lacks error bars or multiple random seeds, making it difficult to assess statistical significance of the reported improvement.
  3. [Abstract and §4] The abstract states 'state-of-the-art performance' but the main text compares only against NN-SAC; a brief discussion of other Koopman or linear-embedding baselines would strengthen the positioning.
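
The residual requested in major comment 1 can be estimated by comparing the lifted prediction with a Monte Carlo average, as in the sketch below. The stochastic scalar system, the Gaussian noise level `SIGMA`, the dictionary, and the function `residual` are assumptions for illustration, not the paper's setup. For this toy case an exact expectation operator exists, so the residual only reflects sampling error.

```python
import numpy as np

# Hypothetical stochastic system x' = A*x + B*u + SIGMA * noise (not from the paper).
A, B, SIGMA = 0.9, 0.5, 0.1


def phi(x):
    """State dictionary [1, x, x^2]."""
    return np.array([1.0, x, x**2])


def koopman_for_action(u):
    """Exact expectation operator on [1, x, x^2] under Gaussian noise."""
    m = B * u
    return np.array([[1.0, 0.0, 0.0],
                     [m, A, 0.0],
                     [m**2 + SIGMA**2, 2 * A * m, A**2]])


def residual(w, x, u, n_samples=200_000, seed=0):
    """|w^T K(u) phi(x) - Monte Carlo estimate of E[V(x') | x, u]|."""
    rng = np.random.default_rng(seed)
    x_next = A * x + B * u + SIGMA * rng.standard_normal(n_samples)
    v_next = w[0] + w[1] * x_next + w[2] * x_next**2  # V(x') = w^T phi(x')
    predicted = w @ koopman_for_action(u) @ phi(x)
    return abs(predicted - v_next.mean())


w = np.array([0.5, -0.2, -1.0])  # hypothetical value-function weights
r = residual(w, x=1.0, u=0.5)
```

With a fitted rather than exact operator, the same comparison over a grid of states and actions would expose where the lifted model departs from the true conditional expectation.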

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments on the controlled Koopman tensor framework. We address each major point below and outline revisions to strengthen the empirical and robustness aspects of the manuscript.

Point-by-point responses
  1. Referee: [§3.2] the claim that the tensor 'facilitates the estimation of the optimal value function' by capturing the expectation of the time evolution under the Bellman operator is load-bearing for all four experimental systems, yet the manuscript provides neither an a-priori error bound on the lifted approximation nor a quantitative residual analysis of ||K_u V - E[V(x_{t+1}) | u]|| for the stochastic double-well and cylinder-flow cases.

    Authors: We agree that quantitative residual analysis would strengthen validation of the approximation quality. Deriving a general a-priori error bound for arbitrary nonlinear stochastic systems is challenging without further assumptions on observables and is left for future work. In revision we will add explicit residual computations ||K_u V - E[V(x_{t+1}) | u]|| for the double-well and cylinder cases.
    revision: partial

  2. Referee: [§4.3, Table 3] the reported SOTA margins are obtained after finite data-driven fitting of the tensor; without an ablation on observable choice, lifting dimension, or tensor rank, it is impossible to determine whether the performance gain is attributable to the Koopman linearization or to implicit regularization that the neural SAC baseline does not receive.

    Authors: We concur that systematic ablations would help isolate the contribution of the Koopman linearization. The revised manuscript will include additional results varying lifting dimension and tensor rank (with discussion of observable selection) on the Lorenz and cylinder benchmarks.
    revision: yes

  3. Referee: [§5.1] the actor update is derived under the assumption that the controlled tensor yields exact linear dynamics for the soft value function; any mismatch between the tensor action and the true controlled expectation propagates directly into the policy gradient, yet no sensitivity analysis or robustness check against tensor approximation error is supplied.

    Authors: We will add a sensitivity study in the revision that perturbs the fitted tensor entries and reports resulting changes in policy performance and value estimates, thereby quantifying robustness to approximation error.
    revision: yes
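
A sensitivity study of the kind described in response 3 might start from something like the following sketch. It perturbs an action-dependent matrix K(u) by a scaled noise matrix and records the largest change in the predicted next-step value over a grid. The system, the weights `W`, the noise model, and the function `value_error` are hypothetical, and the sketch measures value-prediction error only, not policy performance.

```python
import numpy as np

A, B = 0.9, 0.5  # hypothetical scalar system x' = A*x + B*u (not from the paper)


def phi(x):
    """State dictionary [1, x, x^2]."""
    return np.array([1.0, x, x**2])


def koopman_for_action(u):
    """Exact K(u) for x' = A*x + B*u on the dictionary [1, x, x^2]."""
    m = B * u
    return np.array([[1.0, 0.0, 0.0],
                     [m, A, 0.0],
                     [m**2, 2 * A * m, A**2]])


def value_error(w, eps, noise, states, actions):
    """Largest change in w^T K(u) phi(x) when K(u) is perturbed by eps * noise."""
    worst = 0.0
    for x in states:
        for u in actions:
            K = koopman_for_action(u)
            clean = w @ K @ phi(x)
            perturbed = w @ (K + eps * noise) @ phi(x)
            worst = max(worst, abs(perturbed - clean))
    return worst


rng = np.random.default_rng(0)
NOISE = rng.standard_normal((3, 3))   # fixed perturbation direction
W = np.array([0.0, 0.0, -1.0])        # hypothetical value weights, V(x) = -x^2
STATES = np.linspace(-2.0, 2.0, 9)
ACTIONS = [-1.0, 0.0, 1.0]
EPS_GRID = [0.0, 1e-3, 1e-2, 1e-1]

errors = [value_error(W, eps, NOISE, STATES, ACTIONS) for eps in EPS_GRID]
```

Because the value prediction is linear in K(u), the error here scales in proportion to `eps`. The effect on the learned policy passes through the soft-max and would need a separate rollout experiment.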

standing simulated objections not resolved
  • Deriving a general a-priori error bound on the lifted approximation for arbitrary nonlinear stochastic systems without additional assumptions.

Circularity Check

0 steps flagged

No significant circularity; derivation builds on independent Koopman theory and standard RL

full rationale

The paper constructs a controlled Koopman tensor from data-driven approximation of the Koopman operator to linearize value-function evolution for reformulating soft value iteration and actor-critic. This relies on established Koopman lifting (not self-defined here) and standard Bellman operators; performance is evaluated on external benchmark systems (Lorenz, cylinder flow, etc.) rather than reducing any claimed result to a fitted quantity defined by the same equations. No self-citation chains, ansatz smuggling, or fitted-input-as-prediction patterns appear in the derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Abstract-only view limits visibility; the central claim rests on the domain assumption that a data-driven Koopman operator can be controlled and used to estimate value-function expectations.

axioms (1)
  • domain assumption: The Koopman operator can be parameterized with control actions to form a controlled Koopman tensor that captures the expectation of the time evolution of the value function.
    This premise is invoked to enable the reformulation of the two RL algorithms.

pith-pipeline@v0.9.0 · 5755 in / 1248 out tokens · 20034 ms · 2026-05-24T02:47:52.575514+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?

  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

    cs.LG · 2025-02 · unverdicted · novelty 5.0

    SALSA-RL introduces latent-space stability analysis for actions of pretrained RL agents using encoder-decoder and state-dependent linear dynamics to enable non-invasive interpretability.

Reference graph

Works this paper leans on

98 extracted references · 98 canonical work pages · cited by 1 Pith paper · 3 internal anchors

  1. [1]

    Reinforcement learning: An introduction, volume 1

    Richard S Sutton and Andrew G Barto. Reinforcement learning: An introduction, volume 1. MIT press Cambridge, 1998

  2. [2]

    Data-driven science and engineering: Machine learning, dynamical systems, and control

    Steven L Brunton and J Nathan Kutz. Data-driven science and engineering: Machine learning, dynamical systems, and control. Cambridge University Press, 2022

  3. [3]

    Continuous control with deep reinforcement learning

    Timothy P Lillicrap, Jonathan J Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. Continuous control with deep reinforcement learning. arxiv:1509.02971, 2015

  4. [4]

    Asynchronous methods for deep reinforcement learning

    Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lilli- crap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In ICML, pages 1928–1937. PMLR, 2016

  5. [5]

    Deep reinforcement learning with double q-learning

    Hado Van Hasselt, Arthur Guez, and David Silver. Deep reinforcement learning with double q-learning. In Proceedings of the AAAI conference on artificial intelligence, volume 30, 2016

  6. [6]

    Dueling network architectures for deep reinforcement learning

    Ziyu Wang, Tom Schaul, Matteo Hessel, Hado Hasselt, Marc Lanctot, and Nando Freitas. Dueling network architectures for deep reinforcement learning. In International conference on machine learning, pages 1995–2003. PMLR, 2016

  7. [7]

    Hands-on reinforcement learning with Python: master reinforcement and deep reinforcement learning using OpenAI gym and tensorFlow

    Sudharsan Ravichandiran. Hands-on reinforcement learning with Python: master reinforcement and deep reinforcement learning using OpenAI gym and tensorFlow. Packt Publishing Ltd, 2018

  8. [8]

    Rainbow: Combining improve- ments in deep reinforcement learning

    Matteo Hessel, Joseph Modayil, Hado Van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, and David Silver. Rainbow: Combining improve- ments in deep reinforcement learning. In Thirty-second AAAI conference on artificial intelligence, 2018

  9. [9]

    Shared Autonomy via Deep Reinforcement Learning

    Siddharth Reddy, Anca D Dragan, and Sergey Levine. Shared autonomy via deep reinforce- ment learning. arxiv:1802.01744, 2018

  10. [10]

    Human-level control through deep reinforcement learning

    Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A Rusu, Joel Veness, Marc G Bellemare, Alex Graves, Martin Riedmiller, Andreas K Fidjeland, Georg Ostrovski, et al. Human-level control through deep reinforcement learning. Nature, 518(7540):529, 2015

  11. [11]

    Grandmaster level in starcraft ii using multi-agent reinforcement learning

    Oriol Vinyals, Igor Babuschkin, Wojciech M Czarnecki, Micha¨el Mathieu, Andrew Dudzik, Junyoung Chung, David H Choi, Richard Powell, , et al. Grandmaster level in starcraft ii using multi-agent reinforcement learning. Nature, 575(7782):350–354, 2019

  12. [12]

    Mastering the game of go with deep neural networks and tree search

    David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George Van Den Driessche, Julian Schrittwieser, Ioannis Antonoglou, et al. Mastering the game of go with deep neural networks and tree search. nature, 529(7587):484–489, 2016

  13. [13]

    Mastering the game of go without human knowledge

    David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, et al. Mastering the game of go without human knowledge. nature, 550(7676):354–359, 2017

  14. [14]

    A general reinforcement learning algorithm that masters chess, shogi, and go through self-play

    David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, et al. A general reinforcement learning algorithm that masters chess, shogi, and go through self-play. Science, 362(6419):1140–1144, 2018

  15. [15]

    Deep reinforcement learning for de novo drug design

    Mariya Popova, Olexandr Isayev, and Alexander Tropsha. Deep reinforcement learning for de novo drug design. Science advances, 4(7):eaap7885, 2018

  16. [16]

    Deep reinforcement learning 30 for robotic manipulation with asynchronous off-policy updates

    Shixiang Gu, Ethan Holly, Timothy Lillicrap, and Sergey Levine. Deep reinforcement learning 30 for robotic manipulation with asynchronous off-policy updates. In 2017 IEEE international conference on robotics and automation (ICRA), pages 3389–3396. IEEE, 2017

  17. [17]

    Deep rein- forcement learning framework for autonomous driving

    Ahmad EL Sallab, Mohammed Abdou, Etienne Perot, and Senthil Yogamani. Deep rein- forcement learning framework for autonomous driving. Electronic Imaging, 2017(19):70–76, 2017

  18. [18]

    Champion-level drone racing using deep reinforcement learning

    Elia Kaufmann, Leonard Bauersfeld, Antonio Loquercio, Matthias M ¨uller, Vladlen Koltun, and Davide Scaramuzza. Champion-level drone racing using deep reinforcement learning. Nature, 620(7976):982–987, 2023

  19. [19]

    Reinforcement learning and wavelet adapted vortex methods for simulations of self-propelled swimmers

    Mattia Gazzola, Babak Hejazialhosseini, and Petros Koumoutsakos. Reinforcement learning and wavelet adapted vortex methods for simulations of self-propelled swimmers. SIAM Journal on Scientific Computing, 36(3):B622–B639, 2014

  20. [20]

    Flow navigation by smart microswimmers via reinforcement learning

    Simona Colabrese, Kristian Gustavsson, Antonio Celani, and Luca Biferale. Flow navigation by smart microswimmers via reinforcement learning. Phys. Rev. Lett., 118(15):158004, 2017

  21. [21]

    Efficient collective swimming by harnessing vortices through deep reinforcement learning

    Siddhartha Verma, Guido Novati, and Petros Koumoutsakos. Efficient collective swimming by harnessing vortices through deep reinforcement learning. Proceedings of the National Academy of Sciences, 115(23):5849–5854, 2018

  22. [22]

    Controlled gliding and perching through deep-reinforcement-learning

    Guido Novati, Lakshminarayanan Mahadevan, and Petros Koumoutsakos. Controlled gliding and perching through deep-reinforcement-learning. Physical Review Fluids, 4(9):093902, 2019

  23. [23]

    Zermelo’s problem: Optimal point-to-point navigation in 2d turbulent flows using reinforcement learning

    Luca Biferale, Fabio Bonaccorso, Michele Buzzicotti, Patricio Clark Di Leoni, and Kristian Gustavsson. Zermelo’s problem: Optimal point-to-point navigation in 2d turbulent flows using reinforcement learning. Chaos, 29(10):103138, 2019

  24. [24]

    Reinforcement learning for bluff body active flow control in experiments and simulations

    Dixia Fan, Liu Yang, Zhicheng Wang, Michael S Triantafyllou, and George Em Karniadakis. Reinforcement learning for bluff body active flow control in experiments and simulations. Proceedings of the National Academy of Sciences, 117(42):26091–26098, 2020

  25. [25]

    Scientific multi-agent reinforcement learning for wall-models of turbulent flows

    H Jane Bae and Petros Koumoutsakos. Scientific multi-agent reinforcement learning for wall-models of turbulent flows. Nature Communications, 13(1):1443, 2022

  26. [26]

    Mag- netic control of tokamak plasmas through deep reinforcement learning

    Jonas Degrave, Federico Felici, Jonas Buchli, Michael Neunert, Brendan Tracey, Francesco Carpanese, Timo Ewalds, Roland Hafner, Abbas Abdolmaleki, Diego de Las Casas, et al. Mag- netic control of tokamak plasmas through deep reinforcement learning. Nature, 602(7897):414– 419, 2022

  27. [27]

    Hamiltonian systems and transformation in hilbert space

    Bernard O Koopman. Hamiltonian systems and transformation in hilbert space. Proceedings of the national academy of sciences of the united states of america, 17(5):315, 1931

  28. [28]

    Dynamical systems of continuous spectra

    Bernard O Koopman and J v Neumann. Dynamical systems of continuous spectra. Proceedings of the National Academy of Sciences, 18(3):255–263, 1932

  29. [29]

    Comparison of systems with complex behavior

    Igor Mezi´c and Andrzej Banaszuk. Comparison of systems with complex behavior. Physica D: Nonlinear Phenomena, 197(1):101–133, 2004

  30. [30]

    Spectral properties of dynamical systems, model reduction and decompositions

    Igor Mezi´c. Spectral properties of dynamical systems, model reduction and decompositions. Nonlinear Dynamics, 41(1-3):309–325, 2005

  31. [31]

    Applied Koopmanism a)

    Marko Budiˇsi´c, Ryan Mohr, and Igor Mezi´c. Applied Koopmanism a). Chaos: An Interdisci- plinary Journal of Nonlinear Science, 22(4):047510, 2012

  32. [32]

    Analysis of fluid flows via spectral properties of the Koopman operator

    Igor Mezic. Analysis of fluid flows via spectral properties of the Koopman operator. Annual Review of Fluid Mechanics, 45:357–378, 2013

  33. [33]

    Modern Koopman theory 31 for dynamical systems

    Steven L Brunton, Marko Budiˇsi´c, Eurika Kaiser, and J Nathan Kutz. Modern Koopman theory 31 for dynamical systems. SIAM Review, 64(2):229–340, 2022

  34. [34]

    C. W. Rowley, I. Mezic, S. Bagheri, P . Schlatter, and D.S. Henningson. Spectral analysis of nonlinear flows. J. Fluid Mech., 645:115–127, 2009

  35. [35]

    Dynamic mode decomposition of numerical and experimental data

    Peter J Schmid. Dynamic mode decomposition of numerical and experimental data. Journal of fluid mechanics, 656:5–28, 2010

  36. [36]

    J. H. Tu, C. W. Rowley, D. M. Luchtenburg, S. L. Brunton, and J. N. Kutz. On dynamic mode decomposition: theory and applications. Journal of Computational Dynamics, 1(2):391–421, 2014

  37. [37]

    J. N. Kutz, S. L. Brunton, B. W. Brunton, and J. L. Proctor. Dynamic Mode Decomposition: Data-Driven Modeling of Complex Systems. SIAM, 2016

  38. [38]

    Variable projection methods for an optimized dynamic mode decomposition

    Travis Askham and J Nathan Kutz. Variable projection methods for an optimized dynamic mode decomposition. SIAM Journal on Applied Dynamical Systems, 17(1):380–416, 2018

  39. [39]

    Consistent dynamic mode decomposition

    Omri Azencot, Wotao Yin, and Andrea Bertozzi. Consistent dynamic mode decomposition. SIAM Journal on Applied Dynamical Systems, 18(3):1565–1585, 2019

  40. [40]

    A data-driven ap- proximation of the Koopman operator: extending dynamic mode decomposition

    Matthew O Williams, Ioannis G Kevrekidis, and Clarence W Rowley. A data-driven ap- proximation of the Koopman operator: extending dynamic mode decomposition. Journal of Nonlinear Science, 6:1307–1346, 2015

  41. [41]

    A kernel approach to data-driven Koopman spectral analysis

    Matthew O Williams, Clarence W Rowley, and Ioannis G Kevrekidis. A kernel approach to data-driven Koopman spectral analysis. Journal of Computational Dynamics, 2(2):247–265, 2015

  42. [42]

    Extended dynamic mode decomposition with learned Koopman eigenfunctions for prediction and control

    Carl Folkestad, Daniel Pastor, Igor Mezic, Ryan Mohr, Maria Fonoberova, and Joel Burdick. Extended dynamic mode decomposition with learned Koopman eigenfunctions for prediction and control. In 2020 american control conference (acc), pages 3906–3913. IEEE, 2020

  43. [43]

    The mpedmd algorithm for data-driven computations of measure- preserving dynamical systems

    Matthew J Colbrook. The mpedmd algorithm for data-driven computations of measure- preserving dynamical systems. SIAM Journal on Numerical Analysis, 61(3):1585–1608, 2023

  44. [44]

    Beyond expectations: residual dynamic mode decomposition and variance for stochastic dynamical systems

    Matthew J Colbrook, Qin Li, Ryan V Raut, and Alex Townsend. Beyond expectations: residual dynamic mode decomposition and variance for stochastic dynamical systems. Nonlinear Dynamics, pages 1–25, 2023

  45. [45]

    Residual dynamic mode decomposition: robust and verified koopmanism

    Matthew J Colbrook, Lorna J Ayton, and M´at´e Sz˝oke. Residual dynamic mode decomposition: robust and verified koopmanism. Journal of Fluid Mechanics, 955:A21, 2023

  46. [46]

    Rigorous data-driven computation of spectral properties of koopman operators for dynamical systems

    Matthew J Colbrook and Alex Townsend. Rigorous data-driven computation of spectral properties of koopman operators for dynamical systems. Communications on Pure and Applied Mathematics, 77(1):221–283, 2024

  47. [47]

    Data-driven model reduction and transfer operator approximation

    Stefan Klus, Feliks N ¨uske, P´eter Koltai, Hao Wu, Ioannis Kevrekidis, Christof Sch ¨utte, and Frank No´e. Data-driven model reduction and transfer operator approximation. Journal of Nonlinear Science, 28:985–1010, 2018

  48. [48]

    Data-driven approximation of the Koopman generator: Model reduction, system identification, and control

    Stefan Klus, Feliks N ¨uske, Sebastian Peitz, Jan-Hendrik Niemann, Cecilia Clementi, and Christof Sch ¨utte. Data-driven approximation of the Koopman generator: Model reduction, system identification, and control. Physica D: Nonlinear Phenomena, 406:132416, 2020

  49. [49]

    Dynamic mode decomposition with control

    Joshua L Proctor, Steven L Brunton, and J Nathan Kutz. Dynamic mode decomposition with control. SIAM Journal on Applied Dynamical Systems, 15(1):142–161, 2016

  50. [51]

    Data-driven discovery of Koopman eigenfunctions for control

    Eurika Kaiser, J Nathan Kutz, and Steven L Brunton. Data-driven discovery of Koopman eigenfunctions for control. Machine Learning: Science and Technology, 2(3):035023, 2021. 32

  51. [52]

    Provably efficient maximum entropy exploration

    Elad Hazan, Sham Kakade, Karan Singh, and Abby Van Soest. Provably efficient maximum entropy exploration. In ICML, pages 2681–2691. PMLR, 2019

  52. [53]

    Soft actor-critic: Off- policy maximum entropy deep reinforcement learning with a stochastic actor

    Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. Soft actor-critic: Off- policy maximum entropy deep reinforcement learning with a stochastic actor. In International conference on machine learning, pages 1861–1870. PMLR, 2018

  53. [54]

    Soft Actor-Critic Algorithms and Applications

    Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, et al. Soft actor-critic algorithms and applications. arxiv:1812.05905, 2018

  54. [55]

    Koopman operator–based knowledge-guided reinforcement learning for safe human–robot interaction

    Anirban Sinha and Yue Wang. Koopman operator–based knowledge-guided reinforcement learning for safe human–robot interaction. Frontiers in Robotics and AI, 9:779194, 2022

  55. [56]

    Koopman Q-learning: Offline reinforcement learning via symmetries of dynamics

    Matthias Weissenbacher, Samarth Sinha, Animesh Garg, and Kawahara Yoshinobu. Koopman Q-learning: Offline reinforcement learning via symmetries of dynamics. In International Conference on Machine Learning, pages 23645–23667. PMLR, 2022

  56. [57]

    Koopman constrained policy optimization: A Koopman operator theoretic method for differentiable optimal control in robotics

    Matthew Retchin, Brandon Amos, Steven Brunton, and Shuran Song. Koopman constrained policy optimization: A Koopman operator theoretic method for differentiable optimal control in robotics. In ICML 2023 Workshop on Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators, 2023

  57. [58]

    Deep learning for universal linear embeddings of nonlinear dynamics

    Bethany Lusch, J Nathan Kutz, and Steven L Brunton. Deep learning for universal linear embeddings of nonlinear dynamics. Nature Communications, 9(1):4950, 2018

  58. [59]

    Deep dynamical modeling and control of unsteady fluid flows

    Jeremy Morton, Antony Jameson, Mykel J Kochenderfer, and Freddie Witherden. Deep dynamical modeling and control of unsteady fluid flows. Advances in Neural Information Processing Systems, 31, 2018

  59. [60]

    Linearly recurrent autoencoder networks for learning dynamics

    Samuel E Otto and Clarence W Rowley. Linearly recurrent autoencoder networks for learning dynamics. SIAM Journal on Applied Dynamical Systems, 18(1):558–593, 2019

  60. [61]

    Learning Koopman invariant subspaces for dynamic mode decomposition

    Naoya Takeishi, Yoshinobu Kawahara, and Takehisa Yairi. Learning Koopman invariant subspaces for dynamic mode decomposition. In Advances in Neural Information Processing Systems, pages 1130–1140, 2017

  61. [62]

    Learning deep neural network representations for Koopman operators of nonlinear dynamical systems

    Enoch Yeung, Soumya Kundu, and Nathan Hodas. Learning deep neural network representations for Koopman operators of nonlinear dynamical systems. In 2019 American Control Conference (ACC), pages 4832–4839. IEEE, 2019

  62. [63]

    VAMPnets: Deep learning of molecular kinetics

    Andreas Mardt, Luca Pasquali, Hao Wu, and Frank Noé. VAMPnets: Deep learning of molecular kinetics. Nature Communications, 9(5), 2018

  63. [64]

    Sparse identification of nonlinear dynamics with control (SINDYc)

    Steven L Brunton, Joshua L Proctor, and J Nathan Kutz. Sparse identification of nonlinear dynamics with control (SINDYc). IFAC-PapersOnLine, 49(18):710–715, 2016

  64. [65]

    Sparse identification of nonlinear dynamics for model predictive control in the low-data limit

    Eurika Kaiser, J Nathan Kutz, and Steven L Brunton. Sparse identification of nonlinear dynamics for model predictive control in the low-data limit. Proceedings of the Royal Society of London A, 474(2219), 2018

  65. [66]

    Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control

    Milan Korda and Igor Mezić. Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control. Automatica, 93:149–160, 2018

  66. [67]

    A data-driven Koopman model predictive control framework for nonlinear partial differential equations

    Hassan Arbabi, Milan Korda, and Igor Mezić. A data-driven Koopman model predictive control framework for nonlinear partial differential equations. In 2018 IEEE Conference on Decision and Control (CDC), pages 6409–6414. IEEE, 2018

  67. [68]

    Optimal construction of Koopman eigenfunctions for prediction and control

    Milan Korda and Igor Mezić. Optimal construction of Koopman eigenfunctions for prediction and control. IEEE Transactions on Automatic Control, 65(12):5114–5129, 2020

  68. [69]

    Koopman operator-based model predictive control with recursive online update

    Horacio M Calderón, Erik Schulz, Thimo Oehlschlägel, and Herbert Werner. Koopman operator-based model predictive control with recursive online update. In 2021 European Control Conference (ECC), pages 1543–1549. IEEE, 2021

  69. [70]

    S. L. Brunton and J. N. Kutz. Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control. Cambridge University Press, 2nd edition, 2022

  70. [71]

    Koopman operator in systems and control

    Alexandre Mauroy, Y Susuki, and I Mezić. Koopman operator in systems and control. Springer, 2020

  71. [72]

    Geometry of the ergodic quotient reveals coherent structures in flows

    Marko Budišić and Igor Mezić. Geometry of the ergodic quotient reveals coherent structures in flows. Physica D: Nonlinear Phenomena, 241(15):1255–1269, 2012

  72. [73]

    Linearization in the large of nonlinear systems and Koopman operator spectrum

    Yueheng Lan and Igor Mezić. Linearization in the large of nonlinear systems and Koopman operator spectrum. Physica D: Nonlinear Phenomena, 242(1):42–53, 2013

  73. [74]

    S. L. Brunton, B. W. Brunton, J. L. Proctor, and J. N. Kutz. Koopman invariant subspaces and finite linear representations of nonlinear dynamical systems for control. PLoS ONE, 11(2):e0150171, 2016

  74. [75]

    On convergence of extended dynamic mode decomposition to the Koopman operator

    Milan Korda and Igor Mezić. On convergence of extended dynamic mode decomposition to the Koopman operator. Journal of Nonlinear Science, 28(2):687–710, 2018

  75. [76]

    Deep learning Markov and Koopman models with physical constraints

    Andreas Mardt, Luca Pasquali, Frank Noé, and Hao Wu. Deep learning Markov and Koopman models with physical constraints. In Mathematical and Scientific Machine Learning, pages 451–475. PMLR, 2020

  76. [77]

    S. L. Brunton, B. W. Brunton, J. L. Proctor, E. Kaiser, and J. N. Kutz. Chaos as an intermittently forced linear system. Nature Communications, 8(19):1–9, 2017

  77. [78]

    Structured time-delay models for dynamical systems with connections to Frenet–Serret frame

    Seth M Hirsh, Sara M Ichinaga, Steven L Brunton, J Nathan Kutz, and Bingni W Brunton. Structured time-delay models for dynamical systems with connections to Frenet–Serret frame. Proceedings of the Royal Society A, 477(2254):20210097, 2021

  78. [79]

    Extracting reproducible time-resolved resting state networks using dynamic mode decomposition

    James M Kunert-Graf, Kristian M Eschenburg, David J Galas, J Nathan Kutz, Swati D Rane, and Bingni W Brunton. Extracting reproducible time-resolved resting state networks using dynamic mode decomposition. Frontiers in Computational Neuroscience, page 75, 2019

  79. [80]

    Centering data improves the dynamic mode decomposition

    Seth M Hirsh, Kameron Decker Harris, J Nathan Kutz, and Bingni W Brunton. Centering data improves the dynamic mode decomposition. SIAM Journal on Applied Dynamical Systems, 19(3):1920–1955, 2020

  80. [81]

    Data-driven resolvent analysis

    Benjamin Herrmann, Peter J Baddoo, Richard Semaan, Steven L Brunton, and Beverley J McKeon. Data-driven resolvent analysis. Journal of Fluid Mechanics, 918, 2021

Showing first 80 references.