SDRL applies sliced projections of one-dimensional divergences (Wasserstein, Cramér, MMD) to multivariate return distributions in RL, with Bellman contraction proofs for scalar and general matrix discounting.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
- Multivariate Distributional Reinforcement Learning Using Sliced Divergences
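The sliced projection idea named in the summary above can be illustrated with a minimal sketch, here for the Wasserstein case only. It is not the SDRL implementation: the function names (`random_direction`, `wasserstein_1d`, `sliced_wasserstein`), the equal-sample-size restriction, the uniform random directions, and the Monte Carlo averaging are all assumptions made for illustration. The Bellman operators and matrix discounting are not shown.

```python
import math
import random


def random_direction(dim, rng):
    """Draw a direction uniformly from the unit sphere by normalizing a Gaussian vector."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1-D empirical samples."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    # For equal-size samples, the optimal coupling matches sorted values in order.
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def sliced_wasserstein(samples_p, samples_q, num_projections=64, seed=0):
    """Monte Carlo estimate: average the 1-D distance over random projections."""
    rng = random.Random(seed)
    dim = len(samples_p[0])
    total = 0.0
    for _ in range(num_projections):
        theta = random_direction(dim, rng)
        # Project each multivariate sample onto the direction theta.
        proj_p = [sum(t * x for t, x in zip(theta, s)) for s in samples_p]
        proj_q = [sum(t * x for t, x in zip(theta, s)) for s in samples_q]
        total += wasserstein_1d(proj_p, proj_q)
    return total / num_projections
```

Each multivariate sample (for example, a vector-valued return) is reduced to a scalar by a dot product with a random unit vector, so only a one-dimensional divergence ever needs to be evaluated. Swapping `wasserstein_1d` for another one-dimensional divergence would follow the same pattern under these assumptions.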