Imagination-Augmented Agents for Deep Reinforcement Learning

Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adrià Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra

classification cs.LG cs.AI stat.ML

keywords deep, i2as, learning, model, reinforcement, agents, imagination-augmented, model-based

Abstract

We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in arbitrary ways, by using the predictions as additional context in deep policy networks. I2As show improved data efficiency, performance, and robustness to model misspecification compared to several baselines.
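The idea of feeding model predictions to a policy as additional context can be illustrated with a toy sketch. This is not the authors' architecture: `env_model`, `imagine_rollout`, `encode_rollout`, `i2a_policy`, and `make_weights` are hypothetical names, the "learned" model is a hand-written stand-in, the rollout encoding is a simple average, and the policy is a single linear layer with random, untrained weights. It only shows the data flow, assuming one imagined rollout per candidate first action.

```python
import math
import random


def env_model(state, action):
    """Hypothetical stand-in for a learned environment model.

    Returns a predicted next state and a predicted reward.
    """
    next_state = [0.9 * s + 0.1 * (action + 1) for s in state]
    reward = -sum(abs(s) for s in next_state)
    return next_state, reward


def imagine_rollout(state, first_action, depth, n_actions):
    """Roll the model forward `depth` steps, starting with `first_action`."""
    trajectory = []
    action = first_action
    for _ in range(depth):
        state, reward = env_model(state, action)
        trajectory.append((state, reward))
        action = (action + 1) % n_actions  # placeholder rollout policy (assumption)
    return trajectory


def encode_rollout(trajectory):
    """Summarize a rollout as a fixed-size vector: mean state plus total reward."""
    dim = len(trajectory[0][0])
    mean_state = [sum(s[i] for s, _ in trajectory) / len(trajectory) for i in range(dim)]
    total_reward = sum(r for _, r in trajectory)
    return mean_state + [total_reward]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def i2a_policy(state, weights, n_actions, depth=3):
    """Action probabilities from the observation plus encoded imagined rollouts."""
    context = list(state)  # model-free features: here just the raw state
    for a in range(n_actions):
        context += encode_rollout(imagine_rollout(state, a, depth, n_actions))
    logits = [sum(w * c for w, c in zip(row, context)) for row in weights]
    return softmax(logits)


def make_weights(n_actions, state_dim, seed=0):
    """Random (untrained) weights for the linear policy head."""
    rng = random.Random(seed)
    context_dim = state_dim + n_actions * (state_dim + 1)
    return [[rng.uniform(-0.1, 0.1) for _ in range(context_dim)] for _ in range(n_actions)]


if __name__ == "__main__":
    state = [0.5, -0.2]
    weights = make_weights(n_actions=3, state_dim=2)
    print(i2a_policy(state, weights, n_actions=3))
```

In this sketch the policy is free to weight the imagined rollouts however its parameters dictate, rather than following a fixed planning rule; in a trained agent those parameters would be learned rather than sampled at random.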

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Transformer Feed-Forward Layers Are Key-Value Memories
cs.CL 2020-12 conditional novelty 8.0

Transformer feed-forward layers act as key-value memories storing textual patterns and their associated output distributions.

Mastering Atari with Discrete World Models
cs.LG 2020-10 accept novelty 7.0

DreamerV2 reaches human-level performance on 55 Atari games by learning behaviors inside a separately trained discrete-latent world model.

Dream to Control: Learning Behaviors by Latent Imagination
cs.LG 2019-12 accept novelty 7.0

Dreamer learns to control from images by imagining and optimizing behaviors in a learned latent world model, outperforming prior methods on 20 visual tasks in data efficiency and final performance.