A finite time analysis of temporal difference learning with linear function approximation

2 Pith papers cite this work. Polarity classification is still indexing.

fields: cs.LG (2)

years: 2026 (1), 2025 (1)

verdicts: UNVERDICTED (2)

representative citing papers

Bridging the Gap Between Average and Discounted TD Learning

cs.LG · 2026-05-03 · unverdicted · novelty 6.0

A new two-trajectory sampling algorithm for average-reward TD learning guarantees convergence with quadratic sample complexity and no explicit dimension dependence in both tabular and linear approximation settings.

Corruption-Tolerant Asynchronous Q-Learning with Near-Optimal Rates

cs.LG · 2025-09-10 · unverdicted · novelty 6.0

A novel robust asynchronous Q-learning algorithm achieves finite-time convergence rates that match clean-data bounds up to an additive term proportional to the corruption fraction, with a matching information-theoretic lower bound.

citing papers explorer

  • Bridging the Gap Between Average and Discounted TD Learning · cs.LG · 2026-05-03 · unverdicted · none · ref 6

  • Corruption-Tolerant Asynchronous Q-Learning with Near-Optimal Rates · cs.LG · 2025-09-10 · unverdicted · none · ref 50
