pith. machine review for the scientific record.

arXiv: 1707.06203 · v2 · submitted 2017-07-19 · cs.LG · cs.AI · stat.ML

Recognition: unknown

Imagination-Augmented Agents for Deep Reinforcement Learning

keywords: deep · i2as · learning · model · reinforcement · agents · imagination-augmented · model-based
Abstract

We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in arbitrary ways, by using the predictions as additional context in deep policy networks. I2As show improved data efficiency, performance, and robustness to model misspecification compared to several baselines.
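
The core idea in the abstract, feeding model predictions to the policy as additional context rather than prescribing how they are used, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: `env_model` is a hand-written stand-in for a learned environment model, `encode_rollout` is a trivial summary in place of a learned encoder, and `WEIGHTS` is a fixed, untrained linear policy head.

```python
import math


def env_model(state, action):
    """Hypothetical stand-in for a learned environment model.

    States are integers on a line; action 1 moves right, action 0 moves left.
    The model predicts a reward of 1.0 on reaching state 3.
    """
    next_state = state + (1 if action == 1 else -1)
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward


def imagine_rollout(state, first_action, depth, rollout_policy):
    """Roll the model forward `depth` steps, starting with `first_action`."""
    trajectory = []
    action = first_action
    for _ in range(depth):
        state, reward = env_model(state, action)
        trajectory.append((state, reward))
        action = rollout_policy(state)
    return trajectory


def encode_rollout(trajectory, discount=0.9):
    """Toy encoder: summarize a rollout as [discounted return, final state]."""
    ret = sum((discount ** t) * r for t, (_, r) in enumerate(trajectory))
    final_state = trajectory[-1][0]
    return [ret, float(final_state)]


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def i2a_policy(state, weights, depth=3, n_actions=2):
    """Combine a model-free feature with encoded imagined rollouts.

    One rollout is imagined per candidate first action; the encodings are
    concatenated with the raw state and passed to a linear policy head.
    """
    rollout_policy = lambda s: 1  # hypothetical fixed rollout policy
    context = []
    for a in range(n_actions):
        rollout = imagine_rollout(state, a, depth, rollout_policy)
        context.extend(encode_rollout(rollout))
    features = [float(state)] + context  # model-free path + imagination path
    logits = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return softmax(logits), features


# Illustrative, untrained weights: action 1 attends to its own imagined return.
WEIGHTS = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]

probs, features = i2a_policy(0, WEIGHTS)
print(features)
print(probs)
```

In a real agent the weights and the encoder would be trained end to end, so the policy can learn how much to trust the imagined rollouts, which is one plausible reading of the robustness to model misspecification that the abstract reports.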

This paper has not been read by Pith yet.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Transformer Feed-Forward Layers Are Key-Value Memories

    cs.CL · 2020-12 · conditional · novelty 8.0

    Transformer feed-forward layers act as key-value memories storing textual patterns and their associated output distributions.

  2. Mastering Atari with Discrete World Models

    cs.LG · 2020-10 · accept · novelty 7.0

    DreamerV2 reaches human-level performance on 55 Atari games by learning behaviors inside a separately trained discrete-latent world model.

  3. Dream to Control: Learning Behaviors by Latent Imagination

    cs.LG · 2019-12 · accept · novelty 7.0

    Dreamer learns to control from images by imagining and optimizing behaviors in a learned latent world model, outperforming prior methods on 20 visual tasks in data efficiency and final performance.