Q-learning. Machine Learning, 8(3):279–292, 1992.
1 Pith paper cites this work.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Sign-Separated Finite-Time Error Analysis of Q-Learning
Sign-separated analysis decomposes Q-learning errors into negative parts dominated by an optimal-policy LTI system and positive parts controlled by a switching system, yielding finite-time bounds for deterministic and stochastic cases.