Establishes Õ(1/k) mean-square last-iterate convergence for asynchronous average-reward Q-learning with adaptive stepsizes and proves adaptivity is necessary.
U., Khodadadian, S., and Maguluri, S
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2025 (2). Verdicts: unverdicted (2).
Citing papers explorer
- From Set Convergence to Pointwise Convergence: Finite-Time Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes
  Establishes Õ(1/k) mean-square last-iterate convergence for asynchronous average-reward Q-learning with adaptive stepsizes and proves adaptivity is necessary.
- Non-Expansive Mappings in Two-Time-Scale Stochastic Approximation: Finite-Time Analysis
  Proves O(1/k^{1/4-ε}) last-iterate mean-square residual decay and almost-sure convergence for two-time-scale SA with non-expansive slow mappings, viewed as stochastic inexact Krasnoselskii-Mann iterations.