New epoch-based direct MRAC algorithm for adaptive discrete-time LQR achieves high-probability regret bounds without requiring an initial stabilizing controller or exploration.
… $\sum_{t=0}^{T} \|\xi_t\|^2 \Big] + O(1)$. (35) Additionally, using an argument similar to that in the proof of Lemma 1 in [15], in the limit as $T \to \infty$, we have $\sum_{t=0}^{T-1} w_{t+1}^\top B_1^\top P_{k,\mathrm{lyap}} B_1 B_m \widetilde{\Theta}_t \phi_t = o$ …
1 Pith paper cites this work (field: eess.SY; year: 2025; verdict: UNVERDICTED).
Representative citing papers:
Adapt and Stabilize, Then Learn and Optimize: A New Approach to Adaptive LQR