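$t_{\mathrm{mix}}\,\eta_{t-t_{\mathrm{mix}}}\,(2\|\theta^\star\|_2 + 1) + 2\,\eta_{t-t_{\mathrm{mix}}} \sum_{i=t-t_{\mathrm{mix}}}^{t-2} \mathbb{E}\|\Delta_i\|_2 \Bigg] = -\eta_{t-t_{\mathrm{mix}}}\,(2\|\theta^\star\|_2 + 1)$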

In combination, we obtain $\mathbb{P}\big(\|S_n - \widetilde{S}_N\|_2 \gtrsim \sqrt{2 d \kappa n \log n}\big) \le 3 n^{-1/2}$.

representative citing papers

Uncertainty quantification for Markov chain induced martingales with application to temporal difference learning

stat.ML · 2025-02-19 · unverdicted · novelty 7.0

Derives novel high-dimensional concentration inequalities for vector-valued martingales induced by Markov chains and applies them to TD learning, obtaining consistency guarantees that match the asymptotic variance up to logarithmic factors and a Gaussian approximation rate of O(T^{-1/4} log T).
