Finite-sample analysis of contractive stochastic approximation using smooth convex envelopes. Advances in Neural Information Processing Systems, 33:8223–8234
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1, method 1
fields: cs.LG 2
years: 2026 2
verdicts: UNVERDICTED 2
representative citing papers
Finite-iteration guarantees are established for asynchronous scalar categorical TD in Cramér geometry and multivariate signed-categorical TD in MMD geometry under i.i.d., Markovian, and episodic sampling.
citing papers explorer
- Quotient-Categorical Representations for Bellman-Compatible Average-Reward Distributional Reinforcement Learning
Develops quotient-categorical representations that render the average-reward distributional Bellman operator well-defined, non-expansive, and convergent under i.i.d. and Markovian sampling.
- A Finite-Iteration Theory for Asynchronous Categorical Distributional Temporal-Difference Learning
Finite-iteration guarantees are established for asynchronous scalar categorical TD in Cramér geometry and multivariate signed-categorical TD in MMD geometry under i.i.d., Markovian, and episodic sampling.