Convergence and sample complexity of gradient methods for the model-free linear–quadratic regulator problem · 2021

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Data-Driven Linear Quadratic Control Using Output-Feedback via Non-Minimal Realization

math.OC · 2026-05-16 · unverdicted · novelty 7.0

Presents a data-driven value iteration algorithm for output-feedback LQR that recovers the optimal state-feedback gain via a non-minimal realization constructed from Kreisselmeier's adaptive filter.

On the Optimization Landscape of Observer-based Dynamic Linear Quadratic Control

eess.SY · 2026-04-12 · unverdicted · novelty 6.0

The stationary point of observer-based dynamic LQR is characterized by a pair of symmetric discrete-time Sylvester equations, and the usual separated LQR-plus-minimum-trace-observer design is not optimal.

Multitask LQG Control: Performance and Generalization Bounds

math.OC · 2026-04-17 · unverdicted · novelty 5.0

Multitask LQG control via history-dependent lifting to LQR yields generalization bounds tied to bisimulation heterogeneity and reduces policy gradient variance proportionally to the number of training tasks.

citing papers explorer

Data-Driven Linear Quadratic Control Using Output-Feedback via Non-Minimal Realization

math.OC · 2026-05-16 · unverdicted · none · ref 15

Presents a data-driven value iteration algorithm for output-feedback LQR that recovers the optimal state-feedback gain via a non-minimal realization constructed from Kreisselmeier's adaptive filter.

On the Optimization Landscape of Observer-based Dynamic Linear Quadratic Control

eess.SY · 2026-04-12 · unverdicted · none · ref 5

The stationary point of observer-based dynamic LQR is characterized by a pair of symmetric discrete-time Sylvester equations, and the usual separated LQR-plus-minimum-trace-observer design is not optimal.

Multitask LQG Control: Performance and Generalization Bounds

math.OC · 2026-04-17 · unverdicted · none · ref 11

Multitask LQG control via history-dependent lifting to LQR yields generalization bounds tied to bisimulation heterogeneity and reduces policy gradient variance proportionally to the number of training tasks.
