pith. machine review for the scientific record.

arXiv: 1705.08997 · v1 · submitted 2017-05-24 · 💻 cs.AI · cs.LG · stat.ML

Recognition: unknown

State Space Decomposition and Subgoal Creation for Transfer in Deep Reinforcement Learning

classification 💻 cs.AI · cs.LG · stat.ML
keywords attention · agent · deep · domains · generalize · goal · learn · learning

Typical reinforcement learning (RL) agents learn to complete tasks specified by reward functions tailored to their domain. As such, the policies they learn do not generalize even to similar domains. To address this issue, we develop a framework through which a deep RL agent learns to generalize policies from smaller, simpler domains to more complex ones using a recurrent attention mechanism. The task is presented to the agent as an image and an instruction specifying the goal. This meta-controller guides the agent towards its goal by designing a sequence of smaller subtasks on the part of the state space within the attention, effectively decomposing it. As a baseline, we consider a setup without attention as well. Our experiments show that the meta-controller learns to create subgoals within the attention.

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Modular Reinforcement Learning For Cooperative Swarms

    cs.RO · 2026-05 · unverdicted · novelty 5.0

    Modular decomposition of interaction states allows distributed RL for cooperative robot swarms to scale without combinatorial memory explosion in foraging simulations.