Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: eess.SY (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Bridging Continuous-time LQR and Reinforcement Learning via Gradient Flow of the Bellman Error
A gradient flow on a continuous-time Bellman error parametrized by feedback gain converges to the optimal LQR controller and stays inside the stabilizing region.