Sample complexity of asynchronous Q-learning: sharper analysis and variance reduction

2020 · arXiv 2006.03041

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Finite-Time Analysis of Q-Value Iteration for General-Sum Stackelberg Games

cs.LG · 2026-04-06 · unverdicted · novelty 7.0

Provides the first finite-time convergence guarantees for Q-value iteration in general-sum Stackelberg Markov games.

Toward a Unified Lyapunov-Certified ODE Convergence Analysis of Smooth Q-Learning with p-Norms

cs.LG · 2024-04-20 · unverdicted · novelty 5.0

Unified ODE convergence analysis for smooth Q-learning variants via p-norm Lyapunov functions, valid even when the Bellman operator is not a contraction.

citing papers explorer

Showing 2 of 2 citing papers.

Finite-Time Analysis of Q-Value Iteration for General-Sum Stackelberg Games

cs.LG · 2026-04-06 · unverdicted · none · ref 3

Provides the first finite-time convergence guarantees for Q-value iteration in general-sum Stackelberg Markov games.

Toward a Unified Lyapunov-Certified ODE Convergence Analysis of Smooth Q-Learning with p-Norms

cs.LG · 2024-04-20 · unverdicted · none · ref 14

Unified ODE convergence analysis for smooth Q-learning variants via p-norm Lyapunov functions, valid even when the Bellman operator is not a contraction.
