DreamerV3 uses world models and robustness techniques to solve over 150 tasks across domains with a single configuration, including Minecraft diamond collection from scratch.
Recurrent experience replay in distributed reinforcement learning
3 Pith papers cite this work. Polarity classification is still indexing.
verdicts
UNVERDICTED 3representative citing papers
A method trains discrete diffusion policies for combinatorial RL by matching to a PMD-regularized target distribution, reporting SOTA performance and sample efficiency on DNA generation, macro-action, and multi-agent benchmarks.
Q-chunking improves offline-to-online RL sample efficiency on long-horizon sparse-reward manipulation tasks by applying action chunking to TD learning.
citing papers explorer
-
Mastering Diverse Domains through World Models
DreamerV3 uses world models and robustness techniques to solve over 150 tasks across domains with a single configuration, including Minecraft diamond collection from scratch.
-
Reinforcement Learning with Discrete Diffusion Policies for Combinatorial Action Spaces
A method trains discrete diffusion policies for combinatorial RL by matching to a PMD-regularized target distribution, reporting SOTA performance and sample efficiency on DNA generation, macro-action, and multi-agent benchmarks.
-
Reinforcement Learning with Action Chunking
Q-chunking improves offline-to-online RL sample efficiency on long-horizon sparse-reward manipulation tasks by applying action chunking to TD learning.