Auxiliary Tasks Speed Up Learning PointGoal Navigation

Abhishek Das; Dhruv Batra; Erik Wijmans; Joel Ye

arxiv: 2007.04561 · v2 · pith:BLKEMQAPnew · submitted 2020-07-09 · 💻 cs.CV · cs.LG · cs.RO


classification 💻 cs.CV cs.LG cs.RO

keywords auxiliary tasks frames dd-ppo efficiency improves learning method


PointGoal Navigation is an embodied task that requires agents to navigate to a specified point in an unseen environment. Wijmans et al. showed that this task is solvable, but their method is computationally prohibitive, requiring 2.5 billion frames and 180 GPU-days. In this work, we develop a method to significantly increase sample and time efficiency in learning PointNav using self-supervised auxiliary tasks (e.g., predicting the action taken between two egocentric observations, predicting the distance between two observations from a trajectory, etc.). We find that naively combining multiple auxiliary tasks improves sample efficiency, but only provides marginal gains beyond a point. To overcome this, we use attention to combine representations learnt from individual auxiliary tasks. Our best agent is 5.5x faster to reach the performance of the previous state-of-the-art, DD-PPO, at 40M frames, and improves on DD-PPO's performance at 40M frames by 0.16 SPL. Our code is publicly available at https://github.com/joel99/habitat-pointnav-aux.
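To make the idea of using attention to combine representations from individual auxiliary tasks more concrete, here is a minimal sketch. It is not taken from the paper or its repository. It assumes each auxiliary task yields one fixed-size representation vector and that these are fused with scaled dot-product attention against a query vector; the abstract does not state the attention form, and the names `attention_fuse`, `query`, and `beliefs` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a sequence of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_fuse(
    query: Vector, beliefs: Sequence[Vector]
) -> Tuple[List[float], Vector]:
    """Fuse per-task representations (hypothetical helper).

    Assumption: scaled dot-product attention, one vector per auxiliary task.
    Returns the attention weights and the weighted sum of the vectors.
    """
    if not beliefs:
        raise ValueError("need at least one per-task representation")
    dim = len(query)
    if any(len(b) != dim for b in beliefs):
        raise ValueError("all representations must match the query size")

    scale = math.sqrt(dim)
    weights = softmax([dot(query, b) / scale for b in beliefs])
    fused = [
        sum(w * b[i] for w, b in zip(weights, beliefs)) for i in range(dim)
    ]
    return weights, fused


# Toy example: three tasks, two-dimensional representations.
beliefs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, fused = attention_fuse([0.0, 0.0], beliefs)
# A zero query scores every task equally, so the weights are uniform.
```

In a trained agent the query and the per-task vectors would come from learned networks, so the weights would vary per step instead of being uniform as in this toy input.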


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Behavior-Constrained Reinforcement Learning with Receding-Horizon Credit Assignment for High-Performance Control
cs.RO 2026-04 unverdicted novelty 6.0

A behavior-constrained RL framework with receding-horizon credit assignment learns high-performance control policies that stay aligned with expert behavior in race car simulation.