For discounted LQR with overparameterized ReLU network controllers, model-based policy gradient converges globally to the optimal scalar LQR gain. The argument reduces the controller to two effective gains, one on the positive half-line and one on the negative half-line.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: math.OC (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Global Convergence of Policy Gradient Methods for ReLU Controllers in Linear Quadratic Regulation