Central Limit Theorems for Asynchronous Averaged Q-Learning
classification
cs.LG, math.OC, stat.ML
keywords
central, limit, asynchronous, averaged, q-learning, theorem, theorems, addition
abstract
This paper establishes central limit theorems for Polyak-Ruppert averaged Q-learning under asynchronous updates. We prove a non-asymptotic central limit theorem, where the convergence rate in Wasserstein distance explicitly reflects the dependence on the number of iterations, state-action space size, the discount factor, and the quality of exploration. In addition, we derive a functional central limit theorem, showing that the partial-sum process converges weakly to a Brownian motion.
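The object studied in the abstract, the Polyak-Ruppert average of asynchronous Q-learning iterates, can be illustrated with a minimal tabular sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `averaged_async_q_learning`, the uniform behaviour policy, the polynomial step size `1 / k**step_exponent`, and the deterministic reward table `R`.

```python
import random


def averaged_async_q_learning(P, R, gamma, n_steps, seed=0, step_exponent=0.75):
    """Tabular asynchronous Q-learning with a Polyak-Ruppert running average.

    P[s][a] is a list of next-state probabilities and R[s][a] a deterministic
    reward (both hypothetical inputs chosen for this sketch).
    Returns the last iterate q and the averaged iterate q_bar.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    q_bar = [[0.0] * n_actions for _ in range(n_states)]
    s = 0  # start of the single trajectory
    for k in range(1, n_steps + 1):
        # Assumed exploration scheme: actions drawn uniformly at random.
        a = rng.randrange(n_actions)
        s_next = rng.choices(range(n_states), weights=P[s][a])[0]
        # Assumed polynomially decaying step size.
        alpha = 1.0 / k ** step_exponent
        target = R[s][a] + gamma * max(q[s_next])
        # Asynchronous update: only the visited (s, a) entry changes.
        q[s][a] += alpha * (target - q[s][a])
        # Running mean of all iterates q_1, ..., q_k (Polyak-Ruppert average).
        for i in range(n_states):
            for j in range(n_actions):
                q_bar[i][j] += (q[i][j] - q_bar[i][j]) / k
        s = s_next
    return q, q_bar
```

The central limit theorems in the abstract concern the fluctuations of an averaged iterate such as `q_bar` around the optimal Q-function. How well the behaviour policy covers the state-action space, which the abstract calls the quality of exploration, is fixed here by the uniform action choice.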
Forward citations
Cited by 1 Pith paper
- Gaussian Approximation for Asynchronous Q-learning
Derives rates of order up to n^{-1/6} log^4(nSA) for the high-dimensional CLT of averaged asynchronous Q-learning iterates, plus a general martingale-difference CLT.