A diffusion model plans by iteratively denoising trajectories, turning sampling into a flexible planning strategy for model-based reinforcement learning.
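To make "iteratively denoising trajectories" concrete, here is a minimal sketch of a DDPM-style reverse loop applied to a whole trajectory at once. It is an illustration under stated assumptions, not the paper's implementation: `toy_noise_model` is a hypothetical stand-in for a trained noise-prediction network, the linear noise schedule and its default values are assumed, and pinning the first and last states to `start` and `goal` is one assumed way of imposing planning constraints during sampling.

```python
import numpy as np


def make_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear variance schedule (an assumed choice, not taken from the source)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def toy_noise_model(traj, step):
    """Hypothetical placeholder for a learned noise predictor.

    A real planner would call a trained network here; this stub predicts
    zero noise so that the loop runs end to end.
    """
    return np.zeros_like(traj)


def plan(noise_model, start, goal, horizon=16, num_steps=50, seed=0):
    """Sample a (horizon, state_dim) trajectory by iterative denoising."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)

    # Begin from pure Gaussian noise over the entire trajectory.
    traj = rng.standard_normal((horizon, start.shape[0]))

    for t in reversed(range(num_steps)):
        # Assumed conditioning: pin the endpoints before every denoising step.
        traj[0], traj[-1] = start, goal

        # Standard DDPM posterior mean computed from the predicted noise.
        eps = noise_model(traj, t)
        mean = (traj - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])

        # Inject fresh noise on every step except the final one.
        noise = rng.standard_normal(traj.shape) if t > 0 else 0.0
        traj = mean + np.sqrt(betas[t]) * noise

    # Re-apply the constraints to the final sample.
    traj[0], traj[-1] = start, goal
    return traj
```

Calling `plan(toy_noise_model, np.array([0.0, 0.0]), np.array([1.0, 1.0]))` returns an array of shape `(16, 2)` whose first and last rows equal the given start and goal. With the zero-noise stub the interior states carry no meaning; a trained model would be needed to shape them into a usable plan.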
We only evaluated IQL on the Multi2D environments because it is the strongest baseline in the single-task Maze2D environments by a sizeable margin.
Planning with Diffusion for Flexible Behavior Synthesis