arxiv: 2604.23650 · v1 · submitted 2026-04-26 · 🧮 math.OC

On Tikhonov Regularization for Direct and Indirect Data-Driven LQR Control

Pith reviewed 2026-05-08 05:40 UTC · model grok-4.3

classification 🧮 math.OC
keywords direct data-driven control · LQR control · Tikhonov regularization · covariance parameterization · certainty equivalence · Koopman embedding · linear time-invariant systems

The pith

A covariance parameterization with Tikhonov regularization makes direct data-driven LQR control equivalent to the regularized indirect approach.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a regularization method for direct data-driven linear quadratic regulator (LQR) design for unknown linear systems. It adds a regularization term to the covariance parameterization so that the resulting controller stays reliable even when the collected data produce an ill-conditioned data matrix. The authors show that this direct method matches the indirect method of first estimating the system matrices with Tikhonov regularization and then solving the LQR problem. They further apply the same idea to nonlinear systems through a linear embedding from Koopman theory. This matters because many real systems are only partially known and the available data can be limited or noisy.

Core claim

The central claim is that for unknown LTI systems, the direct data-driven LQR controller obtained via regularized covariance parameterization is identical to the certainty-equivalence LQR controller computed from a Tikhonov-regularized system identification step. This equivalence holds while providing better handling of cases where the data matrix has a large condition number. The method is also extended to unknown nonlinear systems by embedding their dynamics into a linear form using Koopman operators.
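The indirect side of this claim can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it assumes input-state data from x_next = A x + B u + w, a ridge penalty weighted by `lam`, and the control law u = K x. The plant matrices, the function names, and all parameter values are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import solve_discrete_are


def ridge_identify(X0, U0, X1, lam):
    """Tikhonov (ridge) estimate of (A, B) in x_next = A x + B u + w."""
    m = U0.shape[0]
    D0 = np.vstack([U0, X0])                 # regressor matrix, shape (m + n, T)
    T = D0.shape[1]
    Phi = D0 @ D0.T / T                      # empirical covariance of the regressors
    X1bar = X1 @ D0.T / T
    # theta = X1bar (Phi + lam I)^{-1}; the matrix is symmetric, so solve the transposed system.
    theta = np.linalg.solve(Phi + lam * np.eye(D0.shape[0]), X1bar.T).T
    return theta[:, m:], theta[:, :m]        # A_hat, B_hat


def ce_lqr_gain(A, B, Q, R):
    """Certainty-equivalence LQR gain K for the control law u = K x."""
    P = solve_discrete_are(A, B, Q, R)
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def demo(seed=0, T=100, lam=1e-3, noise_std=0.01):
    rng = np.random.default_rng(seed)
    A = np.array([[1.01, 0.1], [0.0, 0.9]])  # illustrative plant, not from the paper
    B = np.array([[0.0], [1.0]])
    X = np.zeros((2, T + 1))
    U = rng.standard_normal((1, T))
    for t in range(T):
        w = noise_std * rng.standard_normal(2)
        X[:, t + 1] = A @ X[:, t] + B @ U[:, t] + w
    A_hat, B_hat = ridge_identify(X[:, :-1], U, X[:, 1:], lam)
    K = ce_lqr_gain(A_hat, B_hat, np.eye(2), np.eye(1))
    rho = max(abs(np.linalg.eigvals(A + B @ K)))  # closed loop on the true plant
    return A_hat, B_hat, K, rho
```

Calling `demo()` returns the estimates, the gain, and the closed-loop spectral radius on the true plant; a value below 1 means the certainty-equivalence gain stabilizes this particular toy system.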

What carries the argument

The argument rests on the regularized covariance parameterization, which incorporates a regularization term directly into the empirical covariance matrix used for controller synthesis.
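One plausible reading of this mechanism, written out as a short derivation. The notation is an assumption borrowed from the covariance-parameterization literature rather than taken from the paper: input-state data matrices U_0, X_0, X_1 with T columns, a regularization weight λ > 0, and the control law u = K x. The paper's own definitions may differ in scaling or in where λ enters.

```latex
% Assumed notation (not taken from the paper)
D_0 = \begin{bmatrix} U_0 \\ X_0 \end{bmatrix}, \qquad
\Phi = \tfrac{1}{T} D_0 D_0^\top, \qquad
\bar{X}_1 = \tfrac{1}{T} X_1 D_0^\top .

% Indirect route: Tikhonov-regularized least squares
\begin{bmatrix} \hat{B}_\lambda & \hat{A}_\lambda \end{bmatrix}
  = \arg\min_{\theta} \; \tfrac{1}{T} \lVert X_1 - \theta D_0 \rVert_F^2 + \lambda \lVert \theta \rVert_F^2
  = \bar{X}_1 (\Phi + \lambda I)^{-1} .

% Direct route: parameterize the gain through the regularized covariance
\begin{bmatrix} K \\ I_n \end{bmatrix} = (\Phi + \lambda I) V
\;\Longrightarrow\;
\hat{A}_\lambda + \hat{B}_\lambda K
  = \begin{bmatrix} \hat{B}_\lambda & \hat{A}_\lambda \end{bmatrix}
    \begin{bmatrix} K \\ I_n \end{bmatrix}
  = \bar{X}_1 V .
```

Under these assumptions, the closed-loop matrix seen by the direct design, \bar{X}_1 V, coincides with the closed loop of the ridge estimate, which is the kind of identity an equivalence result of this type would rest on.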

If this is right

  • Direct data-driven controllers can be designed without separate system identification while achieving the same performance as regularized indirect methods.
  • The controllers remain effective and stable for data sets that would otherwise cause numerical issues due to high condition numbers.
  • The same regularization principle applies to controller design for nonlinear systems via linear Koopman embeddings.
  • Validation through simulations shows improved closed-loop behavior compared to unregularized or other data-driven methods.
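The conditioning point in the list above can be illustrated numerically. This is a sketch under assumptions: it takes the regularization to be an additive term `lam * I` on the empirical covariance Phi = D0 D0^T / T, and it uses synthetic, nearly collinear data (both helper functions are hypothetical). For a positive semidefinite Phi, the condition number of Phi + lam * I is at most 1 + sigma_max(Phi) / lam, no matter how small the smallest singular value of Phi is.

```python
import numpy as np


def covariance_condition_numbers(D0, lams):
    """Condition number of Phi + lam * I for each lam, with Phi = D0 D0^T / T."""
    T = D0.shape[1]
    Phi = D0 @ D0.T / T
    eye = np.eye(Phi.shape[0])
    return {lam: np.linalg.cond(Phi + lam * eye) for lam in lams}


def nearly_collinear_data(seed=0, T=50, eps=1e-6):
    """Synthetic data whose third row almost repeats the first (poor excitation)."""
    rng = np.random.default_rng(seed)
    base = rng.standard_normal((2, T))
    third = base[0] + eps * rng.standard_normal(T)
    return np.vstack([base, third])


# Condition numbers without regularization and with two illustrative weights.
conds = covariance_condition_numbers(nearly_collinear_data(), [0.0, 1e-3, 1e-1])
```

The dictionary `conds` maps each weight to the resulting condition number, so the effect of increasing `lam` can be read off directly.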

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • This bridging of direct and indirect approaches may allow control engineers to select regularization parameters using familiar identification tools and then apply them in direct designs.
  • For nonlinear systems, the quality of the Koopman embedding becomes critical, suggesting further work on choosing good observables.
  • The method could inspire similar regularizations for other direct data-driven control problems like robust or adaptive control.
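To make the remark about observables concrete, here is a toy sketch of a lifted linear model fitted with ridge regression. Everything in it is an assumption for illustration: the nonlinear dynamics, the observables (x1, x2, x1^2), and the function names are hypothetical and were chosen so that the lifted model is exactly linear. For a general nonlinear system no such finite exact lifting need exist, which is where the choice of observables matters.

```python
import numpy as np


def lift(X):
    """Hypothetical observables z = (x1, x2, x1^2), exact for the toy system below."""
    return np.vstack([X[0], X[1], X[0] ** 2])


def step(X, U, a=0.9, b=0.5, c=1.0):
    """Toy nonlinear dynamics: x1+ = a x1, x2+ = b x2 + c x1^2 + u."""
    return np.vstack([a * X[0], b * X[1] + c * X[0] ** 2 + U[0]])


def ridge_lifted_model(Z0, U0, Z1, lam):
    """Ridge estimate of (A_lift, B_lift) in z+ = A_lift z + B_lift u."""
    m = U0.shape[0]
    D0 = np.vstack([U0, Z0])
    T = D0.shape[1]
    Phi = D0 @ D0.T / T
    theta = np.linalg.solve(Phi + lam * np.eye(D0.shape[0]), (Z1 @ D0.T / T).T).T
    return theta[:, m:], theta[:, :m]


# One-step samples from random states and inputs (noise-free for clarity).
rng = np.random.default_rng(0)
X0 = rng.uniform(-1.0, 1.0, size=(2, 200))
U0 = rng.uniform(-1.0, 1.0, size=(1, 200))
A_lift, B_lift = ridge_lifted_model(lift(X0), U0, lift(step(X0, U0)), lam=1e-8)
```

With these observables the true lifted matrices are known in closed form (the third lifted state evolves as a^2 times itself), so the fit can be compared against them; with poorly chosen observables the same regression would only return an approximation.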

Load-bearing premise

The collected input-output data comes from an underlying linear time-invariant system or one that can be accurately embedded linearly, allowing the regularization to produce a controller close to the optimal one for the true system.

What would settle it

A counterexample would be a simple LTI system where data is collected with insufficient excitation, leading to a high condition number, and the closed-loop performance or stability of the proposed controller differs significantly from that of the Tikhonov-regularized indirect LQR controller.

Figures

Figures reproduced from arXiv: 2604.23650 by Raphael M. Jungers, Shuyuan Zhang, Zheming Wang.

Figure 1: Comparison of Method I and Method II in terms…
Figure 2: (a) log(S_II/S_I) as a function of T and σ_w; (b) log(M_I/M_II) as a function of T and σ_w.
… the logarithmic values of S_II/S_I and M_I/M_II. For each γ, let S_II^γ and M_II^γ denote the stabilizing percentage and the median value of J(K^(i)) over 100 independent trials, respectively. Then, S_II and M_II represent the maximum and minimum of all S_II^γ and M_II^γ, respectively. The quantities S_I and M_I are defined…
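The two statistics named in this caption, a stabilizing percentage and a median cost over repeated trials, can be computed as in the sketch below. Several details are assumptions, since the excerpt does not define them: the cost J(K) is taken to be trace(P) for the closed-loop Lyapunov solution P, the control law is u = K x, and non-stabilizing gains are assigned an infinite cost before taking the median. The function names are hypothetical.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov


def lqr_cost(A, B, K, Q, R):
    """trace(P) with P = Acl^T P Acl + Q + K^T R K; inf if u = K x is not stabilizing."""
    Acl = A + B @ K
    if max(abs(np.linalg.eigvals(Acl))) >= 1.0:
        return np.inf
    P = solve_discrete_lyapunov(Acl.T, Q + K.T @ R @ K)
    return float(np.trace(P))


def summarize_trials(A, B, gains, Q, R):
    """Return (stabilizing percentage, median cost) over a list of candidate gains."""
    costs = np.array([lqr_cost(A, B, K, Q, R) for K in gains])
    stabilizing_pct = 100.0 * np.mean(np.isfinite(costs))
    return stabilizing_pct, float(np.median(costs))
```

In an experiment of the kind described, `gains` would hold one controller per independent trial, all evaluated on the same true plant (A, B).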
Figure 3: …, the number of cases where S_II > S_I exceeds those where S_II < S_I, demonstrating that our method is more robust in finding a stabilizing controller for random systems.

TABLE I: Number of systems (out of 1000) for each case of S_I and S_II under different noise levels.

  Case                  σ_w = 0.1   σ_w = 1.0
  S_I = 0, S_II = 0            16          76
  S_I = 0, S_II > 0            62          95
  S_I > 0, S_II = 0             4           5
  S_I > 0, S_II > 0           918         824
Original abstract

In recent years, the so-called 'direct data-driven control' has been a topic of intense research, and it is expected that it will become prominent in future complex dynamical systems control. Within this framework, regularization not only implicitly enforces system identification, but also plays a crucial role in ensuring reliable closed-loop behavior. To further enhance the performance of data-driven controllers, we propose a new regularization method for direct data-driven LQR control of unknown LTI systems, based on a regularized covariance parameterization. Unlike existing data-driven techniques, the proposed method remains effective in handling ill-conditioned cases, such as when the data matrix has a large condition number. Then, we demonstrate that our method is equivalent to the indirect certainty-equivalence LQR combined with Tikhonov regularization. Furthermore, we extend our method to the design of controllers for unknown nonlinear systems using Koopman linear embedding. Finally, the simulation results validate the effectiveness and advantages of the proposed regularization method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper proposes a regularized covariance parameterization for direct data-driven LQR control of unknown LTI systems. It claims this approach remains effective for ill-conditioned data matrices (large condition numbers), is mathematically equivalent to indirect certainty-equivalence LQR with Tikhonov regularization, and extends to nonlinear systems via Koopman linear embeddings. Simulations are used to validate performance advantages over existing direct methods.

Significance. If the equivalence holds and the robustness to ill-conditioned data is established with explicit guarantees, the work provides a useful bridge between direct and indirect data-driven control, allowing direct methods to inherit regularization benefits without separate identification steps. The Koopman extension suggests broader applicability to nonlinear systems, though its practical value depends on embedding quality.

major comments (3)
  1. [§3.2, Eq. (12)–(14)] The regularized covariance parameterization is introduced to handle ill-conditioned data, but the manuscript provides no explicit bound relating the regularization parameter to the condition number of the data matrix or to closed-loop stability margins. Without such a bound, the claim that the method 'remains effective' for arbitrarily large condition numbers rests on the equivalence rather than a direct analysis of the direct formulation.
  2. [§4, Theorem 1] The equivalence to indirect Tikhonov-regularized LQR is shown algebraically, yet the proof does not address whether the regularization strength is preserved under the same data conditioning that destabilizes unregularized direct methods. If the equivalence is by construction, it is unclear what new robustness the direct parameterization adds beyond inheriting the indirect method's properties.
  3. [§5, Koopman extension] The extension assumes a useful linear embedding exists, but no error bounds are given on how embedding approximation error interacts with the covariance regularization or propagates to the LQR gain under ill-conditioned data. This assumption is load-bearing for the nonlinear claim.
minor comments (3)
  1. [§2–3] Notation for the data matrix and covariance is introduced inconsistently between §2 and §3; a single table of symbols would improve readability.
  2. [§6] Simulation figures lack error bars or multiple random seeds, making it difficult to assess variability in performance for ill-conditioned cases.
  3. [Introduction] The abstract states equivalence but the introduction does not cite prior work on Tikhonov regularization in data-driven LQR; a brief literature comparison would clarify novelty.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We thank the referee for their insightful and constructive comments. We provide a point-by-point response to each major comment below. Where we agree that additional clarification strengthens the manuscript, we have revised accordingly.

  1. Referee: [§3.2, Eq. (12)–(14)] The regularized covariance parameterization is introduced to handle ill-conditioned data, but the manuscript provides no explicit bound relating the regularization parameter to the condition number of the data matrix or to closed-loop stability margins. Without such a bound, the claim that the method 'remains effective' for arbitrarily large condition numbers rests on the equivalence rather than a direct analysis of the direct formulation.

    Authors: We agree that an explicit bound relating the regularization parameter to the condition number would provide stronger direct guarantees. Deriving such a bound is non-trivial, as the parameterization modifies the covariance in a manner that does not yield a simple closed-form stability margin independent of the equivalence. The equivalence to indirect Tikhonov-regularized LQR transfers the known robustness properties. In the revised manuscript we have added a remark in Section 3.2 suggesting a practical heuristic for selecting λ based on the condition number (e.g., proportional to the smallest singular value), supported by the numerical evidence in Section 6. revision: partial

  2. Referee: [§4, Theorem 1] The equivalence to indirect Tikhonov-regularized LQR is shown algebraically, yet the proof does not address whether the regularization strength is preserved under the same data conditioning that destabilizes unregularized direct methods. If the equivalence is by construction, it is unclear what new robustness the direct parameterization adds beyond inheriting the indirect method's properties.

    Authors: The equivalence is by construction, and the algebraic identity holds for any data matrix provided the regularized covariance remains positive definite; thus the regularization strength is preserved independently of conditioning. The direct parameterization adds the practical benefit of computing the LQR gain without an explicit identification step, which avoids potential numerical instabilities or model mismatch that can arise during separate identification. We have expanded the discussion immediately after Theorem 1 in the revised manuscript to clarify this point and to emphasize that the direct method inherits the indirect method's closed-loop guarantees while remaining fully data-driven. revision: yes

  3. Referee: [§5, Koopman extension] The extension assumes a useful linear embedding exists, but no error bounds are given on how embedding approximation error interacts with the covariance regularization or propagates to the LQR gain under ill-conditioned data. This assumption is load-bearing for the nonlinear claim.

    Authors: We acknowledge that general error bounds on the interaction between Koopman embedding error and the regularized covariance would strengthen the claim. Such bounds depend on the specific embedding method and would require additional assumptions (e.g., on the approximation error in a chosen function space) that are outside the scope of the present work and are rarely provided in the broader Koopman-control literature. Our contribution is to demonstrate how the proposed regularization integrates directly into the Koopman framework. In the revised manuscript we have added a paragraph in Section 5 discussing the role of embedding quality and noting that covariance regularization improves numerical stability against small embedding errors, together with an additional simulation example in Section 6.3. revision: partial
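The "by construction" identity discussed in the second response can be checked numerically. The sketch below rests on assumptions that are not confirmed by the excerpt: the regularized covariance is taken to be Phi + lam * I with Phi = D0 D0^T / T, the ridge estimate is X1bar (Phi + lam * I)^{-1}, and the gain is parameterized as [K; I] = (Phi + lam * I) V. The data are random and deliberately rank-deficient (fewer samples than regressors) to show that the identity does not depend on conditioning.

```python
import numpy as np


def check_identity(seed=0, n=3, m=2, T=3, lam=0.1):
    """Compare A_hat + B_hat K (ridge model) with X1bar V (assumed direct parameterization)."""
    rng = np.random.default_rng(seed)
    U0 = rng.standard_normal((m, T))
    X0 = rng.standard_normal((n, T))
    X1 = rng.standard_normal((n, T))          # arbitrary data: the identity is purely algebraic
    D0 = np.vstack([U0, X0])
    Phi_lam = D0 @ D0.T / T + lam * np.eye(m + n)
    np.linalg.cholesky(Phi_lam)               # raises LinAlgError if not positive definite
    X1bar = X1 @ D0.T / T
    theta = np.linalg.solve(Phi_lam, X1bar.T).T   # [B_hat, A_hat]
    B_hat, A_hat = theta[:, :m], theta[:, m:]
    K = rng.standard_normal((m, n))           # any gain, not necessarily optimal
    V = np.linalg.solve(Phi_lam, np.vstack([K, np.eye(n)]))
    return A_hat + B_hat @ K, X1bar @ V


closed_loop_indirect, closed_loop_direct = check_identity()
```

Here T = 3 is smaller than m + n = 5, so the unregularized covariance is singular, yet the two closed-loop matrices agree to numerical precision because lam > 0 keeps the regularized covariance positive definite.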

Circularity Check

0 steps flagged

Proposed parameterization derived independently; equivalence shown as post-hoc insight rather than definitional reduction

The paper first proposes a regularized covariance parameterization as a new direct method for data-driven LQR, then separately demonstrates its equivalence to indirect certainty-equivalence LQR with Tikhonov regularization. No equations or definitions in the provided abstract reduce the new method to the indirect one by construction; the equivalence is presented as a derived result. No self-citations are invoked as load-bearing for the core claim, and the handling of ill-conditioned data is asserted as a property of the proposed parameterization rather than inherited tautologically. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on the assumption that the plant is linear time-invariant or Koopman-embeddable and that regularization on the covariance suffices to handle ill-conditioning without further conditions on the data.

axioms (1)
  • Domain assumption: The unknown system is linear time-invariant (or admits a Koopman linear embedding for the nonlinear case).
    Explicitly stated in the abstract as the setting for both LTI and nonlinear extensions.

pith-pipeline@v0.9.0 · 5468 in / 1338 out tokens · 32536 ms · 2026-05-08T05:40:16.444755+00:00 · methodology

