Asynchronous stochastic approximations. SIAM Journal on Control and Optimization, 36(3):840–851
2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: background (2)
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Roles: background (2)
Polarities: background (2)
citing papers explorer
- Quotient-Categorical Representations for Bellman-Compatible Average-Reward Distributional Reinforcement Learning
  Develops quotient-categorical representations that render the average-reward distributional Bellman operator well-defined, non-expansive, and convergent under i.i.d. and Markovian sampling.
- A Finite-Iteration Theory for Asynchronous Categorical Distributional Temporal-Difference Learning
  Finite-iteration guarantees are established for asynchronous scalar categorical TD in Cramér geometry and multivariate signed-categorical TD in MMD geometry under i.i.d., Markovian, and episodic sampling.