Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background: 1
citation-polarity summary
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
representative citing papers
Derives optimal strategies for a partially observed Stackelberg SDE game with asymmetric information and extends deterministic multi-agent formation control to the stochastic case.
citing papers explorer
- A Switching System Theory of Q-Learning with Linear Function Approximation
  Derives an exact linear switched model for the mean dynamics of Q-learning with linear function approximation and relates convergence to joint spectral radius stability of the switched system, extending the view to stochastic and regularized cases.
- A linear-quadratic partially observed Stackelberg stochastic differential game with multiple followers and its application to multi-agent formation control
  Derives optimal strategies for a partially observed Stackelberg SDE game with asymmetric information and extends deterministic multi-agent formation control to the stochastic case.