Deep reinforcement learning at the edge of the statistical precipice

Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron C Courville, Marc Bellemare · 2021

2 Pith papers cite this work. Polarity classification is still being indexed.

representative citing papers

Switching Successor Measures for Hierarchical Zero-shot Reinforcement Learning

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

Switching successor measures extend classical successor measures to enable hierarchical zero-shot RL via the FB π-Switch algorithm, which extracts subgoal-selection and control policies from forward-backward representations.

JEDI: Joint Embedding Diffusion World Model for Online Model-Based Reinforcement Learning

cs.LG · 2026-05-13 · unverdicted · novelty 7.0

JEDI is the first online end-to-end latent diffusion world model that trains its latents from a denoising loss rather than a reconstruction loss, achieving competitive Atari 100k results with 43% less VRAM and over 3x faster sampling than pixel diffusion baselines.

citing papers explorer

Switching Successor Measures for Hierarchical Zero-shot Reinforcement Learning · cs.LG · 2026-05-13 · unverdicted · none · ref 1
JEDI: Joint Embedding Diffusion World Model for Online Model-Based Reinforcement Learning · cs.LG · 2026-05-13 · unverdicted · none · ref 44
