Meanwhile, since $(1-\lambda)e^x - 1 > -1 > -e^x$, it can be guaranteed that
$$[(1-\lambda)e^x - 1]\Big[\sqrt{((1-\lambda)e^x - x)^2 + 4(e^x-1)^2} - ((1-\lambda)e^x - x)\Big] + 4(e^x-1)e^x > -e^x[2(e^x-1)] + 4(e^x-1)e^x > 0.$$

If $(1-\lambda)e^x - 1 < 0$ and $(1-\lambda)e^x - x \ge 0$, then by the triangle inequality,
$$\sqrt{((1-\lambda)e^x - x)^2 + 4(e^x-1)^2} - ((1-\lambda)e^x - x) \le 2(e^x-1).$$
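
The two inequalities in this case can be sanity-checked numerically. The sketch below is only an illustration, not part of the original argument. It assumes $x > 0$ (which the step $-1 > -e^x$ requires) and, as a further assumption not stated in the excerpt, $\lambda \in [0, 1]$. The case hypotheses give $x \le (1-\lambda)e^x < 1$, so scanning $x \in (0, 1)$ is enough. The function names `bracket`, `full_expression`, and `check_grid` are hypothetical.

```python
import math


def bracket(lam: float, x: float) -> float:
    """sqrt(a^2 + 4(e^x - 1)^2) - a, with a = (1 - lam) e^x - x."""
    a = (1.0 - lam) * math.exp(x) - x
    return math.hypot(a, 2.0 * (math.exp(x) - 1.0)) - a


def full_expression(lam: float, x: float) -> float:
    """[(1 - lam) e^x - 1] * bracket + 4 (e^x - 1) e^x."""
    ex = math.exp(x)
    return ((1.0 - lam) * ex - 1.0) * bracket(lam, x) + 4.0 * (ex - 1.0) * ex


def check_grid(n: int = 200, tol: float = 1e-12):
    """Scan a grid and return (points_checked, violations).

    Assumptions (not stated in the text): lam in [0, 1], x in (0, 1).
    """
    checked, violations = 0, 0
    for i in range(1, n):
        x = i / n
        ex = math.exp(x)
        for j in range(n + 1):
            lam = j / n
            c = (1.0 - lam) * ex - 1.0  # first factor, must be negative
            a = (1.0 - lam) * ex - x    # must be nonnegative
            if not (c < 0.0 and a >= 0.0):
                continue  # outside the case under consideration
            checked += 1
            # Middle term of the chain: -e^x [2(e^x - 1)] + 4(e^x - 1) e^x
            lower = -ex * 2.0 * (ex - 1.0) + 4.0 * (ex - 1.0) * ex
            ok = (
                bracket(lam, x) <= 2.0 * (ex - 1.0) + tol  # triangle bound
                and full_expression(lam, x) > lower        # first strict step
                and lower > 0.0                            # final positivity
            )
            violations += 0 if ok else 1
    return checked, violations


if __name__ == "__main__":
    print(check_grid())
```

A grid scan of this kind can only expose a counterexample. It does not replace the proof.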


representative citing papers

Uncertainty quantification for Markov chain induced martingales with application to temporal difference learning

stat.ML · 2025-02-19 · unverdicted · novelty 7.0

Derives novel high-dimensional concentration inequalities for vector-valued Markov chain martingales and applies them to TD learning for consistency guarantees matching asymptotic variance up to logs and O(T^{-1/4} log T) Gaussian approximation rate.
