Periodic and soft target updates guarantee convergence in linear Q-learning to the exact projected Q-Bellman solution under spectral and step-size conditions via joint spectral radius analysis of switched linear systems.
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts: UNVERDICTED 4
roles: background 1
polarities: background 1
representative citing papers
Derives an exact linear switched model for the mean dynamics of Q-learning with linear function approximation and relates convergence to joint spectral radius stability of the switched system, extending the view to stochastic and regularized cases.
Introduces and analyzes the λ-target update for linear Q-learning via geometric averaging of periodic target maps, studied with a switching-system model in the deterministic case.
Derives optimal strategies for a partially observed Stackelberg SDE game with asymmetric information and extends deterministic multi-agent formation control to the stochastic case.
citing papers explorer
- Target Updates May Stabilize Linear Q-Learning: Periodic and Soft Dynamics
  Periodic and soft target updates guarantee convergence in linear Q-learning to the exact projected Q-Bellman solution under spectral and step-size conditions via joint spectral radius analysis of switched linear systems.
- A Switching System Theory of Q-Learning with Linear Function Approximation
  Derives an exact linear switched model for the mean dynamics of Q-learning with linear function approximation and relates convergence to joint spectral radius stability of the switched system, extending the view to stochastic and regularized cases.
- Geometrically Averaged Hard Target Updates for Linear Q-Learning
  Introduces and analyzes the λ-target update for linear Q-learning via geometric averaging of periodic target maps, studied with a switching-system model in the deterministic case.
- A linear-quadratic partially observed Stackelberg stochastic differential game with multiple followers and its application to multi-agent formation control
  Derives optimal strategies for a partially observed Stackelberg SDE game with asymmetric information and extends deterministic multi-agent formation control to the stochastic case.