Distributional reinforcement learning with quantile regression
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
background 2
citation-polarity summary
fields
cs.LG 2
years
2026 2
verdicts
UNVERDICTED 2
roles
background 2
polarities
background 2
representative citing papers
RQIQN introduces a Wasserstein DRO-based correction to Bellman quantile targets that enlarges distributional spread without altering risk-neutral averages.
citing papers explorer
- A Finite-Iteration Theory for Asynchronous Categorical Distributional Temporal-Difference Learning
  Finite-iteration guarantees are established for asynchronous scalar categorical TD in Cramér geometry and multivariate signed-categorical TD in MMD geometry under i.i.d., Markovian, and episodic sampling.
- Quantile Geometry Regularization for Distributional Reinforcement Learning
  RQIQN introduces a Wasserstein DRO-based correction to Bellman quantile targets that enlarges distributional spread without altering risk-neutral averages.