Direct Data-Driven Linear Quadratic Tracking via Policy Optimization

Keyou You; Shubo Kang

arxiv: 2605.15563 · v1 · pith:5LKHMYW5new · submitted 2026-05-15 · 📡 eess.SY · cs.SY· math.OC

Direct Data-Driven Linear Quadratic Tracking via Policy Optimization

Shubo Kang , Keyou You This is my paper

Pith reviewed 2026-05-20 19:42 UTC · model grok-4.3

classification 📡 eess.SY cs.SYmath.OC

keywords data-driven controllinear quadratic trackingpolicy optimizationcertainty equivalenceconvergence analysisDeePOtracking control

0 comments

The pith

Reference decoupling renders data-driven linear quadratic tracking exactly equivalent to certainty-equivalence control.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a reference-decoupled reformulation of the LQT problem that separates the time-varying reference from the feedback-feedforward policy. This reformulation is shown to be exactly equivalent to the standard indirect certainty-equivalence approach while allowing a fixed-dimension covariance parameterization. A sympathetic reader cares because it removes the dimension barrier that previously blocked direct data-driven methods from handling tracking tasks, which appear in most real applications requiring trajectory following. If the equivalence holds, it directly supports new offline and online algorithms with linear convergence guarantees and enables practitioners to optimize policies from data without growing decision variables as horizons lengthen.

Core claim

The paper claims that a reference-decoupled reformulation of LQT is exactly equivalent to the indirect certainty-equivalence LQT solution. This reformulation accommodates the covariance parameterization with decision variables whose dimension stays fixed independent of data horizon. It supports development of offline and online DeePO algorithms, which achieve global linear convergence in the offline case via local gradient dominance and smoothness, and linear decay of the optimality gap up to an SNR-dependent bias in the online case.

What carries the argument

The reference-decoupled reformulation of LQT, which decouples the time-varying reference from the feedback-feedforward policy to enable fixed-dimension sample-covariance parameterization.

Load-bearing premise

The linear system and quadratic cost structure allow the time-varying reference to be fully decoupled from the policy without any loss of optimality.

What would settle it

Apply the proposed DeePO algorithm to a low-dimensional linear system with a known closed-form LQT solution and verify whether the achieved cost equals that of the indirect certainty-equivalence controller or whether observed convergence deviates from the predicted linear rate.

Figures

Figures reproduced from arXiv: 2605.15563 by Keyou You, Shubo Kang.

**Figure 5.** Figure 5: Evolution of σ(U0,t) and σ(Mt) during the online adaptation. bounded by the accumulating SNR bottleneck. Consequently, the real-time tracking performance dynamically improves as the policy adapts, eventually aligning with the optimal tracking trajectory. In Remark 2, we discussed the possibility of using different step sizes for the V - and H-components of the gradient. We empirically validate this by sc… view at source ↗

**Figure 2.** Figure 2: Tracking performance of the Offline DeePO algorithm [PITH_FULL_IMAGE:figures/full_fig_p010_2.png] view at source ↗

**Figure 3.** Figure 3: Convergence andTracking performance of the Online [PITH_FULL_IMAGE:figures/full_fig_p010_3.png] view at source ↗

**Figure 4.** Figure 4: Convergence with different step sizes for the [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗

read the original abstract

Direct data-driven optimal control provides an elegant end-to-end paradigm, yet its real-time applicability is often hindered by the growing dimensionality of online decision variables. Recent breakthroughs, notably Data-EnablEd Policy Optimization (DeePO), overcome this bottleneck for the Linear Quadratic Regulator (LQR) through sample-covariance parameterization; however, extending this paradigm to Linear Quadratic Tracking (LQT) poses a fundamental challenge. The core difficulty stems from the intricate coupling between time-varying references and the feedback-feedforward policy structure, which prevents a direct application of constant-dimension parameterization. We first introduce a reference-decoupled reformulation of LQT that naturally accommodates the covariance parameterization, guaranteeing a fixed dimension of decision variables independent of data horizon. This formulation is proven to be exactly equivalent to the indirect certainty-equivalence LQT solution. Leveraging this characterization, we develop offline and online DeePO algorithms. Theoretically, we prove global linear convergence for the offline algorithm using local gradient dominance and smoothness, and show that in the online setting the optimality gap decays linearly up to a bias term that scales inversely with the signal-to-noise ratio (SNR). Numerical simulations varify the theoretical results and illustrate the superior tracking performance of the proposed method.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a reference-decoupled reformulation that keeps LQT at fixed covariance dimension and claims exact match to certainty-equivalence control.

read the letter

The main point is that they've come up with a reference-decoupled reformulation of the LQT problem. This lets them parameterize the policy using sample covariances at a dimension that stays fixed regardless of how long the reference trajectory is. They prove this version matches the standard certainty-equivalence LQT exactly, which then lets them run the DeePO algorithms on it. What the paper does well is tackle the coupling issue head-on. In the original LQR DeePO work, the constant dimension came easily because there was no reference. Here, they separate the reference effect so the covariance trick still works. The offline algorithm gets global linear convergence thanks to local gradient dominance and smoothness properties. For the online case, they show the optimality gap goes down linearly but stops at a bias that shrinks with better SNR. The simulations confirm the convergence rates and demonstrate better tracking than some baselines. The soft spots are around verifying that equivalence for general cases. The decoupling has to preserve optimality without restricting the class of references too much. If the reference needs to follow specific dynamics or the feedforward term isn't fully recovered, the method might optimize something close but not identical to the original LQT. The abstract points to a proof, so the full paper should have the details. The SNR bias is a realistic acknowledgment for noisy data, but it means the online performance has a floor. This work is for people building data-driven controllers for tracking tasks in robotics or industrial processes, where real-time computation with long horizons is an issue. Anyone who has used DeePO for regulation will find the extension straightforward to follow. It has enough new technical content and verifiable claims to go to a serious referee. I would recommend putting it through peer review.

Referee Report

2 major / 2 minor

Summary. The paper introduces a reference-decoupled reformulation of the linear quadratic tracking (LQT) problem that is proven equivalent to the indirect certainty-equivalence LQT solution. This reformulation enables covariance parameterization of the policy with dimension independent of the data horizon, allowing development of offline and online Data-EnablEd Policy Optimization (DeePO) algorithms. The authors prove global linear convergence of the offline algorithm via local gradient dominance and smoothness, and show linear decay of the optimality gap up to an SNR-dependent bias in the online setting. Numerical simulations are used to verify the theoretical claims and demonstrate improved tracking performance.

Significance. If the equivalence holds without hidden restrictions on the reference class, the work provides a scalable direct data-driven extension of DeePO from LQR to LQT, with fixed-dimensional parameterization and explicit convergence rates. This could enable more practical real-time tracking controllers from data, strengthening the case for end-to-end data-driven methods in linear systems with time-varying references.

major comments (2)

[Abstract and reformulation section] The equivalence between the reference-decoupled reformulation and the indirect certainty-equivalence LQT solution is the load-bearing claim for both the fixed-dimension parameterization and the convergence results. The abstract states this equivalence is proven, but the decoupling conditions for arbitrary time-varying r_t under the quadratic cost (including whether the feedforward component is exactly recovered) require explicit statement and verification to rule out implicit restrictions on the reference class.
[Online convergence theorem] The online result claims linear decay of the optimality gap up to a bias scaling inversely with SNR. The derivation of this bias term and its dependence on data statistics (e.g., how it arises from the online bias term) should be cross-checked against the simulation quantification to confirm it does not undermine the linear rate claim for practical SNR values.

minor comments (2)

[Abstract] Abstract contains the typo 'varify' which should be corrected to 'verify'.
[Numerical simulations] The manuscript would benefit from a table summarizing the offline vs. online DeePO convergence rates and bias terms for direct comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment point by point below, providing clarifications on the equivalence and convergence results while making targeted revisions to improve explicitness and verification.

read point-by-point responses

Referee: [Abstract and reformulation section] The equivalence between the reference-decoupled reformulation and the indirect certainty-equivalence LQT solution is the load-bearing claim for both the fixed-dimension parameterization and the convergence results. The abstract states this equivalence is proven, but the decoupling conditions for arbitrary time-varying r_t under the quadratic cost (including whether the feedforward component is exactly recovered) require explicit statement and verification to rule out implicit restrictions on the reference class.

Authors: The equivalence holds for arbitrary bounded time-varying references r_t under the standard quadratic cost, with no implicit restrictions on the reference class beyond system stabilizability and the boundedness of r_t. Theorem 1 establishes that the reference-decoupled reformulation is exactly equivalent to the indirect certainty-equivalence LQT solution, exactly recovering both the feedback gain and the feedforward component. To make the decoupling conditions fully explicit, we have revised the abstract to reference Theorem 1 directly and added a clarifying remark in Section III stating the conditions and confirming exact feedforward recovery. revision: yes
Referee: [Online convergence theorem] The online result claims linear decay of the optimality gap up to a bias scaling inversely with SNR. The derivation of this bias term and its dependence on data statistics (e.g., how it arises from the online bias term) should be cross-checked against the simulation quantification to confirm it does not undermine the linear rate claim for practical SNR values.

Authors: The bias term in the online convergence result (Theorem 3) arises from the persistent covariance estimation error in the online data-driven gradient step, which is inversely proportional to SNR due to the additive noise variance in the collected trajectories. We have cross-checked the derivation against the simulation results in Section V; for practical SNR values (above approximately 15 dB), the plots show clear linear decay of the optimality gap until the predicted bias floor is reached, without undermining the linear rate. In the revision we have expanded the discussion following Theorem 3 to explicitly trace the bias to the online bias term and data statistics, and added SNR-sweep simulation figures to quantify the effect. revision: yes

Circularity Check

0 steps flagged

Reference-decoupled LQT reformulation equivalence derived via internal proof without reduction to inputs or self-citation chains.

full rationale

The paper presents the reference-decoupled reformulation as a new characterization of LQT, followed by an explicit proof of exact equivalence to the indirect certainty-equivalence solution. This equivalence is used to enable covariance parameterization of fixed dimension. Subsequent offline global linear convergence (via local gradient dominance) and online linear decay results are derived from standard policy optimization analysis applied to the reformulated problem. No equations or claims reduce by construction to fitted parameters, prior self-citations, or ansatzes; the derivation chain is self-contained and relies on the linear-quadratic structure and data-driven covariance properties as independent inputs. The SNR-dependent bias term is an explicit output of the online analysis rather than an implicit fit.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard linear-system assumptions and the existence of an equivalent certainty-equivalence solution; no new free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption The underlying system is linear time-invariant with quadratic costs.
Invoked throughout the reformulation and equivalence claim.

pith-pipeline@v0.9.0 · 5739 in / 1153 out tokens · 46071 ms · 2026-05-20T19:42:19.435483+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

186 extracted references · 186 canonical work pages · 2 internal anchors

[1]

Stabilizing Dynamical Systems via Policy Gradient Methods , volume =

Perdomo, Juan and Umenberger, Jack and Simchowitz, Max , booktitle =. Stabilizing Dynamical Systems via Policy Gradient Methods , volume =

work page
[2]

Reinforcement learning: An introduction , year =

Sutton, Richard S and Barto, Andrew G , publisher =. Reinforcement learning: An introduction , year =

work page
[3]

Learning control systems--Review and outlook , volume =

Fu, King-Sun , journal =. Learning control systems--Review and outlook , volume =

work page
[4]

arXiv preprint arXiv:2202.07187 , year=

On the sample complexity of stabilizing lti systems on a single trajectory , author=. arXiv preprint arXiv:2202.07187 , year=

work page arXiv
[5]

Data informativity: a new perspective on data-driven analysis and control , volume =

van Waarde, Henk J and Eising, Jaap and Trentelman, Harry L and Camlibel, M Kanat , journal =. Data informativity: a new perspective on data-driven analysis and control , volume =

work page
[6]

Linear System Theory and Design (4th edition) , year =

Chen, Chi-Tsong , date-modified =. Linear System Theory and Design (4th edition) , year =

work page
[7]

, date-modified =

Ljung, L. , date-modified =. System Identification: Theory for the User , year =

work page
[8]

and Camlibel, M

van Waarde, Henk J. and Camlibel, M. Kanat and Mesbahi, Mehran , doi =. From Noisy Data to Feedback Controllers: Nonconservative Design via a Matrix. IEEE Transactions on Automatic Control , number =. 2022 , bdsk-url-1 =

work page 2022
[9]

Formulas for data-driven control: Stabilization, optimality, and robustness , volume =

De Persis, Claudio and Tesi, Pietro , journal =. Formulas for data-driven control: Stabilization, optimality, and robustness , volume =

work page
[10]

Global convergence of policy gradient methods for the linear quadratic regulator , year =

Fazel, Maryam and Ge, Rong and Kakade, Sham and Mesbahi, Mehran , booktitle =. Global convergence of policy gradient methods for the linear quadratic regulator , year =

work page
[11]

Analysis of the Optimization Landscape of Linear Quadratic Gaussian

Tang, Yujie and Zheng, Yang and and Li, Na , booktitle =. Analysis of the Optimization Landscape of Linear Quadratic Gaussian. arXiv:2102.04393 , organization =

work page arXiv
[12]

Global convergence of policy gradient primal--dual methods for risk-constrained

Zhao, Feiran and You, Keyou and Ba. Global convergence of policy gradient primal--dual methods for risk-constrained. IEEE Transactions on Automatic Control , volume=. 2023 , publisher=

work page 2023
[13]

On the linear quadratic data-driven control , year =

Markovsky, Ivan and Rapisarda, Paolo , booktitle =. On the linear quadratic data-driven control , year =. doi:10.23919/ECC.2007.7068299 , pages =

work page doi:10.23919/ecc.2007.7068299 2007
[14]

Data-enabled predictive control: In the shallows of the

Coulson, Jeremy and Lygeros, John and D. Data-enabled predictive control: In the shallows of the. 18th European Control Conference (ECC) , organization =

work page
[15]

Stability analysis and control design of

Park, Un Sik and Ikeda, Masao , journal =. Stability analysis and control design of

work page
[16]

Data-based controllability and observability analysis of linear discrete-time systems , volume =

Wang, Zhuo and Liu, Derong , journal =. Data-based controllability and observability analysis of linear discrete-time systems , volume =

work page
[17]

Data-based analysis of discrete-time linear systems in noisy environment: Controllability and observability , volume =

Liu, Derong and Yan, Pengfei and Wei, Qinglai , journal =. Data-based analysis of discrete-time linear systems in noisy environment: Controllability and observability , volume =

work page
[18]

Data-driven analysis methods for controllability and observability of a class of discrete LTI systems with delays , year =

Zhou, Binquan and Wang, Zhuo and Zhai, Yueyang and Yuan, Heng , booktitle =. Data-driven analysis methods for controllability and observability of a class of discrete LTI systems with delays , year =

work page
[19]

Behavioral systems theory in data-driven analysis, signal processing, and control , volume =

Markovsky, Ivan and D. Behavioral systems theory in data-driven analysis, signal processing, and control , volume =. Annual Reviews in Control , pages =

work page
[20]

Maupong and J.C

T.M. Maupong and J.C. Mayo-Maldonado and P. Rapisarda , issn =. On Lyapunov functions and data-driven dissipativity , volume =. IFAC-PapersOnLine , number =

work page
[21]

Determining optimal input--output properties: A data-driven approach , volume =

Koch, Anne and Berberich, Julian and K. Determining optimal input--output properties: A data-driven approach , volume =. Automatica , pages =

work page
[22]

Data-driven inference on optimal input-output properties of polynomial systems with focus on nonlinearity measures , year=

Martin, Tim and Allgöwer, Frank , journal=. Data-driven inference on optimal input-output properties of polynomial systems with focus on nonlinearity measures , year=

work page
[23]

Data-driven tests for controllability , volume =

Mishra, Vikas Kumar and Markovsky, Ivan and Grossmann, Ben , journal =. Data-driven tests for controllability , volume =

work page
[24]

ArXiv preprint arXiv:2109.02090 , title =

van Waarde, Henk J and Camlibel, M Kanat and Rapisarda, Paolo and Trentelman, Harry L , date-modified =. ArXiv preprint arXiv:2109.02090 , title =

work page arXiv
[25]

, journal=

van Waarde, Henk J. , journal=. Beyond Persistent Excitation: Online Experiment Design for Data-Driven Modeling and Control , year=

work page
[26]

A note on persistency of excitation , volume =

Willems, Jan C and Rapisarda, Paolo and Markovsky, Ivan and De Moor, Bart LM , journal =. A note on persistency of excitation , volume =

work page
[27]

Data-driven model predictive control with stability and robustness guarantees , volume =

Berberich, Julian and K. Data-driven model predictive control with stability and robustness guarantees , volume =. IEEE Transactions on Automatic Control , number =

work page
[28]

1985 , issn =

Persistency of excitation, sufficient richness and parameter convergence in discrete time adaptive control , journal =. 1985 , issn =. doi:https://doi.org/10.1016/0167-6911(85)90035-0 , author =

work page doi:10.1016/0167-6911(85)90035-0 1985
[29]

From model-based control to data-driven control: Survey, classification and perspective , volume =

Zhong-Sheng Hou and Zhuo Wang , date-modified =. From model-based control to data-driven control: Survey, classification and perspective , volume =. Information Sciences , pages =

work page
[30]

Bridging direct & indirect data-driven control formulations via regularizations and relaxations , year =

D. Bridging direct & indirect data-driven control formulations via regularizations and relaxations , year =. IEEE Transactions on Automatic Control , publisher =

work page
[31]

A Tour of Reinforcement Learning: The View from Continuous Control , volume =

Recht, Benjamin , doi =. A Tour of Reinforcement Learning: The View from Continuous Control , volume =. Annual Review of Control, Robotics, and Autonomous Systems , number =. 2019 , bdsk-url-1 =

work page 2019
[32]

Data-driven control of complex networks , volume =

Baggio, Giacomo and Bassett, Danielle S and Pasqualetti, Fabio , journal =. Data-driven control of complex networks , volume =

work page
[33]

Human-level control through deep reinforcement learning , volume =

Mnih, Volodymyr and Kavukcuoglu, Koray and Silver, David and Rusu, Andrei A and Veness, Joel and Bellemare, Marc G and Graves, Alex and Riedmiller, Martin and Fidjeland, Andreas K and Ostrovski, Georg and others , journal =. Human-level control through deep reinforcement learning , volume =

work page
[34]

Mastering the game of Go with deep neural networks and tree search , volume =

Silver, David and Huang, Aja and Maddison, Chris J and Guez, Arthur and Sifre, Laurent and Van Den Driessche, George and Schrittwieser, Julian and Antonoglou, Ioannis and Panneershelvam, Veda and Lanctot, Marc and others , journal =. Mastering the game of Go with deep neural networks and tree search , volume =

work page
[35]

The gap between model-based and model-free methods on the linear quadratic regulator: An asymptotic viewpoint , year =

Tu, Stephen and Recht, Benjamin , booktitle =. The gap between model-based and model-free methods on the linear quadratic regulator: An asymptotic viewpoint , year =

work page
[36]

Convergence and Sample Complexity of Gradient Methods for the Model-Free Linear--Quadratic Regulator Problem , volume =

Mohammadi, Hesameddin and Zare, Armin and Soltanolkotabi, Mahdi and Jovanovi. Convergence and Sample Complexity of Gradient Methods for the Model-Free Linear--Quadratic Regulator Problem , volume =. IEEE Transactions on Automatic Control , number =

work page
[37]

From time series to linear system---Part I

Willems, Jan C , journal =. From time series to linear system---Part I. Finite dimensional linear time invariant systems , volume =

work page
[38]

Distributionally robust chance constrained data-enabled predictive control , volume =

Coulson, Jeremy and Lygeros, John and D. Distributionally robust chance constrained data-enabled predictive control , volume =. IEEE Transactions on Automatic Control , number =

work page
[39]

A trajectory-based framework for data-driven system analysis and control , year =

Berberich, Julian and Allg. A trajectory-based framework for data-driven system analysis and control , year =. European Control Conference (ECC) , organization =

work page
[40]

Data-driven stabilization of nonlinear polynomial systems with noisy data , volume =

Guo, Meichen and De Persis, Claudio and Tesi, Pietro , date-modified =. Data-driven stabilization of nonlinear polynomial systems with noisy data , volume =. IEEE Transactions on Automatic Control , number =

work page
[41]

Data-driven control of dynamic event-triggered systems with delays , year =

Wang, Xin and Sun, Jian and Berberich, Julian and Wang, Gang and Allgower, Frank and Chen, Jie , journal =. Data-driven control of dynamic event-triggered systems with delays , year =

work page
[42]

Control theory for linear systems , year =

Trentelman, Harry L and Stoorvogel, Anton A and Hautus, Malo , publisher =. Control theory for linear systems , year =

work page
[43]

Robust and optimal control , year =

Zhou, Kemin and Doyle, John Comstock and Glover, Keith , publisher =. Robust and optimal control , year =

work page
[44]

1994 , publisher=

Adaptive control , author=. 1994 , publisher=

work page 1994
[45]

2017 , publisher=

Model Predictive Control: Theory, Computation, and Design , author=. 2017 , publisher=

work page 2017
[46]

2003 , publisher=

Process control: modeling, design, and simulation , author=. 2003 , publisher=

work page 2003
[47]

2012 , publisher=

Robust and Adaptive Control: With Aerospace Applications , author=. 2012 , publisher=

work page 2012
[48]

导弹控制原理 , year =

陈坚 , publisher=. 导弹控制原理 , year =

work page
[49]

IFAC-PapersOnLine , volume=

Experiment design for impulse response identification with signal matrix models , author=. IFAC-PapersOnLine , volume=. 2021 , publisher=

work page 2021
[50]

IFAC Proceedings Volumes , volume=

Numerical identification of linear dynamic systems from normal operating records , author=. IFAC Proceedings Volumes , volume=. 1965 , publisher=

work page 1965
[51]

Fast identification and stabilization of unknown linear systems , year =

Dennis Gramlich, Christian Ebenbauer , journal =. Fast identification and stabilization of unknown linear systems , year =

work page
[52]

IEEE Transactions on Industrial Electronics , volume=

A failure-detection strategy for IGBT based on gate-voltage behavior applied to a motor drive system , author=. IEEE Transactions on Industrial Electronics , volume=. 2010 , publisher=

work page 2010
[53]

1996 , publisher=

Subspace identification for linear systems: Theory, Implementation, Applications , author=. 1996 , publisher=

work page 1996
[54]

2023 , title =

Kang, Shubo and You, Keyou , journal=. 2023 , title =

work page 2023
[55]

2012 , publisher=

Optimal control , author=. 2012 , publisher=

work page 2012
[56]

Automatica , volume=

Minimum input design for direct data-driven property identification of unknown linear systems , author=. Automatica , volume=. 2023 , publisher=

work page 2023
[57]

2012 , publisher=

Dynamic programming and optimal control , author=. 2012 , publisher=

work page 2012
[58]

On the certainty-equivalence approach to direct data-driven

D. On the certainty-equivalence approach to direct data-driven. IEEE Transactions on Automatic Control , volume=. 2023 , publisher=

work page 2023
[59]

IEEE Control Systems Magazine , volume=

Control for societal-scale challenges: Road map 2030 , author=. IEEE Control Systems Magazine , volume=. 2024 , publisher=

work page 2030
[60]

Proceedings of the 24th Annual Conference on Learning Theory , pages=

Regret bounds for the adaptive control of linear quadratic systems , author=. Proceedings of the 24th Annual Conference on Learning Theory , pages=. 2011 , organization=

work page 2011
[61]

Foundations of Computational Mathematics , volume=

On the sample complexity of the linear quadratic regulator , author=. Foundations of Computational Mathematics , volume=. 2020 , publisher=

work page 2020
[62]

Almost Surely

Lu, Yiwen and Mo, Yilin , journal=. Almost Surely. 2025 , publisher=

work page 2025
[63]

2023 62nd IEEE Conference on Decision and Control (CDC) , pages=

Data-enabled policy optimization for the linear quadratic regulator , author=. 2023 62nd IEEE Conference on Decision and Control (CDC) , pages=. 2023 , organization=

work page 2023
[64]

IEEE Transactions on Automatic Control , year=

Convergence and sample complexity of policy gradient methods for stabilizing linear systems , author=. IEEE Transactions on Automatic Control , year=

work page
[65]

Annual Review of Control, Robotics, and Autonomous Systems , volume=

Toward a theoretical foundation of policy optimization for learning control policies , author=. Annual Review of Control, Robotics, and Autonomous Systems , volume=. 2023 , publisher=

work page 2023
[66]

2019 , publisher=

Reinforcement learning and optimal control , author=. 2019 , publisher=

work page 2019
[67]

IEEE Control Systems Magazine , volume=

Data-driven control based on the behavioral approach: From theory to applications in power systems , author=. IEEE Control Systems Magazine , volume=. 2023 , publisher=

work page 2023
[68]

arXiv preprint arXiv:2312.14788 , year=

Harnessing the final control error for optimal data-driven predictive control , author=. arXiv preprint arXiv:2312.14788 , year=

work page arXiv
[69]

Automatica , volume=

Low-complexity learning of linear quadratic regulators from noisy data , author=. Automatica , volume=. 2021 , publisher=

work page 2021
[70]

Advances in Neural Information Processing Systems , volume=

Certainty equivalence is efficient for linear quadratic control , author=. Advances in Neural Information Processing Systems , volume=

work page
[71]

2025 American Control Conference (ACC) , pages=

Linear convergence of data-enabled policy optimization for linear quadratic tracking , author=. 2025 American Control Conference (ACC) , pages=. 2025 , organization=

work page 2025
[72]

IEEE Control Systems Letters , volume=

Data-driven design of explicit predictive controllers with structural priors , author=. IEEE Control Systems Letters , volume=. 2023 , publisher=

work page 2023
[73]

Data-Enabled Policy Optimization for Direct Adaptive Learning of the

Zhao, Feiran and Dörfler, Florian and Chiuso, Alessandro and You, Keyou , journal=. Data-Enabled Policy Optimization for Direct Adaptive Learning of the. 2025 , volume=

work page 2025
[74]

Regularization for Covariance Parameterization of Direct Data-Driven LQR Control , year=

Zhao, Feiran and Chiuso, Alessandro and Dörfler, Florian , journal=. Regularization for Covariance Parameterization of Direct Data-Driven LQR Control , year=

work page
[75]

On the Role of Regularization in Direct Data-Driven

Dörfler, Florian and Tesi, Pietro and De Persis, Claudio , booktitle=. On the Role of Regularization in Direct Data-Driven. 2022 , volume=

work page 2022
[76]

van and Lygeros, John and Dörfler, Florian , journal=

Coulson, Jeremy and Waarde, Henk J. van and Lygeros, John and Dörfler, Florian , journal=. A Quantitative Notion of Persistency of Excitation and the Robust Fundamental Lemma , year=

work page
[77]

2023 , publisher=

Topics in random matrix theory , author=. 2023 , publisher=

work page 2023
[78]

Policy Gradient Adaptive Control for the

Zhao, Feiran and Chiuso, Alessandro and D. Policy Gradient Adaptive Control for the. arXiv preprint arXiv:2505.03706 , year=

work page arXiv
[79]

Mathematics of Control, Signals and Systems , volume=

Small-gain theorem for ISS systems and applications , author=. Mathematics of Control, Signals and Systems , volume=. 1994 , publisher=

work page 1994
[80]

2002 , publisher=

Nonlinear systems , author=. 2002 , publisher=

work page 2002

Showing first 80 references.

[1] [1]

Stabilizing Dynamical Systems via Policy Gradient Methods , volume =

Perdomo, Juan and Umenberger, Jack and Simchowitz, Max , booktitle =. Stabilizing Dynamical Systems via Policy Gradient Methods , volume =

work page

[2] [2]

Reinforcement learning: An introduction , year =

Sutton, Richard S and Barto, Andrew G , publisher =. Reinforcement learning: An introduction , year =

work page

[3] [3]

Learning control systems--Review and outlook , volume =

Fu, King-Sun , journal =. Learning control systems--Review and outlook , volume =

work page

[4] [4]

arXiv preprint arXiv:2202.07187 , year=

On the sample complexity of stabilizing lti systems on a single trajectory , author=. arXiv preprint arXiv:2202.07187 , year=

work page arXiv

[5] [5]

Data informativity: a new perspective on data-driven analysis and control , volume =

van Waarde, Henk J and Eising, Jaap and Trentelman, Harry L and Camlibel, M Kanat , journal =. Data informativity: a new perspective on data-driven analysis and control , volume =

work page

[6] [6]

Linear System Theory and Design (4th edition) , year =

Chen, Chi-Tsong , date-modified =. Linear System Theory and Design (4th edition) , year =

work page

[7] [7]

, date-modified =

Ljung, L. , date-modified =. System Identification: Theory for the User , year =

work page

[8] [8]

and Camlibel, M

van Waarde, Henk J. and Camlibel, M. Kanat and Mesbahi, Mehran , doi =. From Noisy Data to Feedback Controllers: Nonconservative Design via a Matrix. IEEE Transactions on Automatic Control , number =. 2022 , bdsk-url-1 =

work page 2022

[9] [9]

Formulas for data-driven control: Stabilization, optimality, and robustness , volume =

De Persis, Claudio and Tesi, Pietro , journal =. Formulas for data-driven control: Stabilization, optimality, and robustness , volume =

work page

[10] [10]

Global convergence of policy gradient methods for the linear quadratic regulator , year =

Fazel, Maryam and Ge, Rong and Kakade, Sham and Mesbahi, Mehran , booktitle =. Global convergence of policy gradient methods for the linear quadratic regulator , year =

work page

[11] [11]

Analysis of the Optimization Landscape of Linear Quadratic Gaussian

Tang, Yujie and Zheng, Yang and and Li, Na , booktitle =. Analysis of the Optimization Landscape of Linear Quadratic Gaussian. arXiv:2102.04393 , organization =

work page arXiv

[12] [12]

Global convergence of policy gradient primal--dual methods for risk-constrained

Zhao, Feiran and You, Keyou and Ba. Global convergence of policy gradient primal--dual methods for risk-constrained. IEEE Transactions on Automatic Control , volume=. 2023 , publisher=

work page 2023

[13] [13]

On the linear quadratic data-driven control , year =

Markovsky, Ivan and Rapisarda, Paolo , booktitle =. On the linear quadratic data-driven control , year =. doi:10.23919/ECC.2007.7068299 , pages =

work page doi:10.23919/ecc.2007.7068299 2007

[14] [14]

Data-enabled predictive control: In the shallows of the

Coulson, Jeremy and Lygeros, John and D. Data-enabled predictive control: In the shallows of the. 18th European Control Conference (ECC) , organization =

work page

[15] [15]

Stability analysis and control design of

Park, Un Sik and Ikeda, Masao , journal =. Stability analysis and control design of

work page

[16] [16]

Data-based controllability and observability analysis of linear discrete-time systems , volume =

Wang, Zhuo and Liu, Derong , journal =. Data-based controllability and observability analysis of linear discrete-time systems , volume =

work page

[17] [17]

Data-based analysis of discrete-time linear systems in noisy environment: Controllability and observability , volume =

Liu, Derong and Yan, Pengfei and Wei, Qinglai , journal =. Data-based analysis of discrete-time linear systems in noisy environment: Controllability and observability , volume =

work page

[18] [18]

Data-driven analysis methods for controllability and observability of a class of discrete LTI systems with delays , year =

Zhou, Binquan and Wang, Zhuo and Zhai, Yueyang and Yuan, Heng , booktitle =. Data-driven analysis methods for controllability and observability of a class of discrete LTI systems with delays , year =

work page

[19] [19]

Behavioral systems theory in data-driven analysis, signal processing, and control , volume =

Markovsky, Ivan and D. Behavioral systems theory in data-driven analysis, signal processing, and control , volume =. Annual Reviews in Control , pages =

work page

[20] [20]

Maupong and J.C

T.M. Maupong and J.C. Mayo-Maldonado and P. Rapisarda , issn =. On Lyapunov functions and data-driven dissipativity , volume =. IFAC-PapersOnLine , number =

work page

[21] [21]

Determining optimal input--output properties: A data-driven approach , volume =

Koch, Anne and Berberich, Julian and K. Determining optimal input--output properties: A data-driven approach , volume =. Automatica , pages =

work page

[22] [22]

Data-driven inference on optimal input-output properties of polynomial systems with focus on nonlinearity measures , year=

Martin, Tim and Allgöwer, Frank , journal=. Data-driven inference on optimal input-output properties of polynomial systems with focus on nonlinearity measures , year=

work page

[23] [23]

Data-driven tests for controllability , volume =

Mishra, Vikas Kumar and Markovsky, Ivan and Grossmann, Ben , journal =. Data-driven tests for controllability , volume =

work page

[24] [24]

ArXiv preprint arXiv:2109.02090 , title =

van Waarde, Henk J and Camlibel, M Kanat and Rapisarda, Paolo and Trentelman, Harry L , date-modified =. ArXiv preprint arXiv:2109.02090 , title =

work page arXiv

[25] [25]

, journal=

van Waarde, Henk J. , journal=. Beyond Persistent Excitation: Online Experiment Design for Data-Driven Modeling and Control , year=

work page

[26] [26]

A note on persistency of excitation , volume =

Willems, Jan C and Rapisarda, Paolo and Markovsky, Ivan and De Moor, Bart LM , journal =. A note on persistency of excitation , volume =

work page

[27] [27]

Data-driven model predictive control with stability and robustness guarantees , volume =

Berberich, Julian and K. Data-driven model predictive control with stability and robustness guarantees , volume =. IEEE Transactions on Automatic Control , number =

work page

[28] [28]

1985 , issn =

Persistency of excitation, sufficient richness and parameter convergence in discrete time adaptive control , journal =. 1985 , issn =. doi:https://doi.org/10.1016/0167-6911(85)90035-0 , author =

work page doi:10.1016/0167-6911(85)90035-0 1985

[29] [29]

From model-based control to data-driven control: Survey, classification and perspective , volume =

Zhong-Sheng Hou and Zhuo Wang , date-modified =. From model-based control to data-driven control: Survey, classification and perspective , volume =. Information Sciences , pages =

work page

[30] [30]

Bridging direct & indirect data-driven control formulations via regularizations and relaxations , year =

D. Bridging direct & indirect data-driven control formulations via regularizations and relaxations , year =. IEEE Transactions on Automatic Control , publisher =

work page

[31] [31]

A Tour of Reinforcement Learning: The View from Continuous Control , volume =

Recht, Benjamin , doi =. A Tour of Reinforcement Learning: The View from Continuous Control , volume =. Annual Review of Control, Robotics, and Autonomous Systems , number =. 2019 , bdsk-url-1 =

work page 2019

[32] [32]

Data-driven control of complex networks , volume =

Baggio, Giacomo and Bassett, Danielle S and Pasqualetti, Fabio , journal =. Data-driven control of complex networks , volume =

work page

[33] [33]

Human-level control through deep reinforcement learning , volume =

Mnih, Volodymyr and Kavukcuoglu, Koray and Silver, David and Rusu, Andrei A and Veness, Joel and Bellemare, Marc G and Graves, Alex and Riedmiller, Martin and Fidjeland, Andreas K and Ostrovski, Georg and others , journal =. Human-level control through deep reinforcement learning , volume =

work page

[34] [34]

Mastering the game of Go with deep neural networks and tree search , volume =

Silver, David and Huang, Aja and Maddison, Chris J and Guez, Arthur and Sifre, Laurent and Van Den Driessche, George and Schrittwieser, Julian and Antonoglou, Ioannis and Panneershelvam, Veda and Lanctot, Marc and others , journal =. Mastering the game of Go with deep neural networks and tree search , volume =

work page

[35] [35]

The gap between model-based and model-free methods on the linear quadratic regulator: An asymptotic viewpoint , year =

Tu, Stephen and Recht, Benjamin , booktitle =. The gap between model-based and model-free methods on the linear quadratic regulator: An asymptotic viewpoint , year =

work page

[36] [36]

Convergence and Sample Complexity of Gradient Methods for the Model-Free Linear--Quadratic Regulator Problem , volume =

Mohammadi, Hesameddin and Zare, Armin and Soltanolkotabi, Mahdi and Jovanovi. Convergence and Sample Complexity of Gradient Methods for the Model-Free Linear--Quadratic Regulator Problem , volume =. IEEE Transactions on Automatic Control , number =

work page

[37] [37]

From time series to linear system---Part I

Willems, Jan C , journal =. From time series to linear system---Part I. Finite dimensional linear time invariant systems , volume =

work page

[38] [38]

Distributionally robust chance constrained data-enabled predictive control , volume =

Coulson, Jeremy and Lygeros, John and D. Distributionally robust chance constrained data-enabled predictive control , volume =. IEEE Transactions on Automatic Control , number =

work page

[39] [39]

A trajectory-based framework for data-driven system analysis and control , year =

Berberich, Julian and Allg. A trajectory-based framework for data-driven system analysis and control , year =. European Control Conference (ECC) , organization =

work page

[40] [40]

Data-driven stabilization of nonlinear polynomial systems with noisy data , volume =

Guo, Meichen and De Persis, Claudio and Tesi, Pietro , date-modified =. Data-driven stabilization of nonlinear polynomial systems with noisy data , volume =. IEEE Transactions on Automatic Control , number =

work page

[41] [41]

Data-driven control of dynamic event-triggered systems with delays , year =

Wang, Xin and Sun, Jian and Berberich, Julian and Wang, Gang and Allgower, Frank and Chen, Jie , journal =. Data-driven control of dynamic event-triggered systems with delays , year =

work page

[42] [42]

Control theory for linear systems , year =

Trentelman, Harry L and Stoorvogel, Anton A and Hautus, Malo , publisher =. Control theory for linear systems , year =

work page

[43] [43]

Robust and optimal control , year =

Zhou, Kemin and Doyle, John Comstock and Glover, Keith , publisher =. Robust and optimal control , year =

work page

[44] [44]

1994 , publisher=

Adaptive control , author=. 1994 , publisher=

work page 1994

[45] [45]

2017 , publisher=

Model Predictive Control: Theory, Computation, and Design , author=. 2017 , publisher=

work page 2017

[46] [46]

2003 , publisher=

Process control: modeling, design, and simulation , author=. 2003 , publisher=

work page 2003

[47] [47]

2012 , publisher=

Robust and Adaptive Control: With Aerospace Applications , author=. 2012 , publisher=

work page 2012

[48] [48]

导弹控制原理 , year =

陈坚 , publisher=. 导弹控制原理 , year =

work page

[49] [49]

IFAC-PapersOnLine , volume=

Experiment design for impulse response identification with signal matrix models , author=. IFAC-PapersOnLine , volume=. 2021 , publisher=

work page 2021

[50] [50]

IFAC Proceedings Volumes , volume=

Numerical identification of linear dynamic systems from normal operating records , author=. IFAC Proceedings Volumes , volume=. 1965 , publisher=

work page 1965

[51] [51]

Fast identification and stabilization of unknown linear systems , year =

Dennis Gramlich, Christian Ebenbauer , journal =. Fast identification and stabilization of unknown linear systems , year =

work page

[52] [52]

IEEE Transactions on Industrial Electronics , volume=

A failure-detection strategy for IGBT based on gate-voltage behavior applied to a motor drive system , author=. IEEE Transactions on Industrial Electronics , volume=. 2010 , publisher=

work page 2010

[53] [53]

1996 , publisher=

Subspace identification for linear systems: Theory, Implementation, Applications , author=. 1996 , publisher=

work page 1996

[54] [54]

2023 , title =

Kang, Shubo and You, Keyou , journal=. 2023 , title =

work page 2023

[55] [55]

2012 , publisher=

Optimal control , author=. 2012 , publisher=

work page 2012

[56] [56]

Automatica , volume=

Minimum input design for direct data-driven property identification of unknown linear systems , author=. Automatica , volume=. 2023 , publisher=

work page 2023

[57] [57]

2012 , publisher=

Dynamic programming and optimal control , author=. 2012 , publisher=

work page 2012

[58] [58]

On the certainty-equivalence approach to direct data-driven

D. On the certainty-equivalence approach to direct data-driven. IEEE Transactions on Automatic Control , volume=. 2023 , publisher=

work page 2023

[59] [59]

IEEE Control Systems Magazine , volume=

Control for societal-scale challenges: Road map 2030 , author=. IEEE Control Systems Magazine , volume=. 2024 , publisher=

work page 2030

[60] [60]

Proceedings of the 24th Annual Conference on Learning Theory , pages=

Regret bounds for the adaptive control of linear quadratic systems , author=. Proceedings of the 24th Annual Conference on Learning Theory , pages=. 2011 , organization=

work page 2011

[61] [61]

Foundations of Computational Mathematics , volume=

On the sample complexity of the linear quadratic regulator , author=. Foundations of Computational Mathematics , volume=. 2020 , publisher=

work page 2020

[62] [62]

Almost Surely

Lu, Yiwen and Mo, Yilin , journal=. Almost Surely. 2025 , publisher=

work page 2025

[63] [63]

2023 62nd IEEE Conference on Decision and Control (CDC) , pages=

Data-enabled policy optimization for the linear quadratic regulator , author=. 2023 62nd IEEE Conference on Decision and Control (CDC) , pages=. 2023 , organization=

work page 2023

[64] [64]

IEEE Transactions on Automatic Control , year=

Convergence and sample complexity of policy gradient methods for stabilizing linear systems , author=. IEEE Transactions on Automatic Control , year=

work page

[65] [65]

Annual Review of Control, Robotics, and Autonomous Systems , volume=

Toward a theoretical foundation of policy optimization for learning control policies , author=. Annual Review of Control, Robotics, and Autonomous Systems , volume=. 2023 , publisher=

work page 2023

[66] [66]

2019 , publisher=

Reinforcement learning and optimal control , author=. 2019 , publisher=

work page 2019

[67] [67]

IEEE Control Systems Magazine , volume=

Data-driven control based on the behavioral approach: From theory to applications in power systems , author=. IEEE Control Systems Magazine , volume=. 2023 , publisher=

work page 2023

[68] [68]

arXiv preprint arXiv:2312.14788 , year=

Harnessing the final control error for optimal data-driven predictive control , author=. arXiv preprint arXiv:2312.14788 , year=

work page arXiv

[69] [69]

Automatica , volume=

Low-complexity learning of linear quadratic regulators from noisy data , author=. Automatica , volume=. 2021 , publisher=

work page 2021

[70] [70]

Advances in Neural Information Processing Systems , volume=

Certainty equivalence is efficient for linear quadratic control , author=. Advances in Neural Information Processing Systems , volume=

work page

[71] [71]

2025 American Control Conference (ACC) , pages=

Linear convergence of data-enabled policy optimization for linear quadratic tracking , author=. 2025 American Control Conference (ACC) , pages=. 2025 , organization=

work page 2025

[72] [72]

IEEE Control Systems Letters , volume=

Data-driven design of explicit predictive controllers with structural priors , author=. IEEE Control Systems Letters , volume=. 2023 , publisher=

work page 2023

[73] [73]

Data-Enabled Policy Optimization for Direct Adaptive Learning of the

Zhao, Feiran and Dörfler, Florian and Chiuso, Alessandro and You, Keyou , journal=. Data-Enabled Policy Optimization for Direct Adaptive Learning of the. 2025 , volume=

work page 2025

[74] [74]

Regularization for Covariance Parameterization of Direct Data-Driven LQR Control , year=

Zhao, Feiran and Chiuso, Alessandro and Dörfler, Florian , journal=. Regularization for Covariance Parameterization of Direct Data-Driven LQR Control , year=

work page

[75] [75]

On the Role of Regularization in Direct Data-Driven

Dörfler, Florian and Tesi, Pietro and De Persis, Claudio , booktitle=. On the Role of Regularization in Direct Data-Driven. 2022 , volume=

work page 2022

[76] [76]

van and Lygeros, John and Dörfler, Florian , journal=

Coulson, Jeremy and Waarde, Henk J. van and Lygeros, John and Dörfler, Florian , journal=. A Quantitative Notion of Persistency of Excitation and the Robust Fundamental Lemma , year=

work page

[77] [77]

2023 , publisher=

Topics in random matrix theory , author=. 2023 , publisher=

work page 2023

[78] [78]

Policy Gradient Adaptive Control for the

Zhao, Feiran and Chiuso, Alessandro and D. Policy Gradient Adaptive Control for the. arXiv preprint arXiv:2505.03706 , year=

work page arXiv

[79] [79]

Mathematics of Control, Signals and Systems , volume=

Small-gain theorem for ISS systems and applications , author=. Mathematics of Control, Signals and Systems , volume=. 1994 , publisher=

work page 1994

[80] [80]

2002 , publisher=

Nonlinear systems , author=. 2002 , publisher=

work page 2002