Policy gradient adaptive control for the LQR: Indirect and direct approaches
4 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 4
citing papers explorer
- Direct Data-Driven Linear Quadratic Tracking via Policy Optimization
  A reference-decoupled reformulation makes direct data-driven LQT equivalent to certainty-equivalence solutions and supports convergent offline and online DeePO algorithms.
- Global Convergence of Policy Gradient Methods for ReLU Controllers in Linear Quadratic Regulation
  Model-based policy gradient converges globally to the optimal scalar LQR gain for discounted LQR using overparameterized ReLU networks by reducing the controller to two effective gains on positive and negative half-lines.
- Sample-Efficient Model-Free Policy Gradient Methods for Stochastic LQR via Robust Linear Regression
  Primal-dual robust linear regression enables O(1/epsilon) sample complexity for model-free policy gradient methods on stochastic LQR.
- Stability of Certainty-Equivalent Adaptive LQR for Linear Systems with Unknown Time-Varying Parameters
  LMS estimation paired with certainty-equivalent LQR delivers finite-gain ℓ²-stability for linear systems with unknown time-varying parameters and disturbances.