Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning

1 Pith paper cites this work. Polarity classification is still indexing.
abstract

Learning is an inherently continuous phenomenon. When humans learn a new task, there is no explicit distinction between training and inference. As we learn a task, we keep learning about it while performing the task. What we learn and how we learn it vary during different stages of learning. Learning how to learn and adapt is a key property that enables us to generalize effortlessly to new settings. This is in contrast with conventional settings in machine learning, where a trained model is frozen during inference. In this paper we study the problem of learning to learn at both training and test time in the context of visual navigation. A fundamental challenge in navigation is generalization to unseen scenes. We propose a self-adaptive visual navigation method (SAVN) which learns to adapt to new environments without any explicit supervision. Our solution is a meta-reinforcement learning approach where an agent learns a self-supervised interaction loss that encourages effective navigation. Our experiments, performed in the AI2-THOR framework, show major improvements in both success rate and SPL for visual navigation in novel scenes. Our code and data are available at: https://github.com/allenai/savn.
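
The abstract reports success rate and SPL without defining them. As a reading aid, here is a minimal sketch of both metrics, assuming the commonly used definition of SPL (Success weighted by Path Length): each successful episode contributes the ratio of the shortest path length to the longer of the taken and shortest path lengths, and failed episodes contribute zero. The function and parameter names are hypothetical and are not taken from the SAVN repository; the episode values are made up for illustration.

```python
from typing import Sequence


def success_rate(successes: Sequence[bool]) -> float:
    """Fraction of episodes in which the agent reached the target."""
    if not successes:
        raise ValueError("need at least one episode")
    return sum(successes) / len(successes)


def spl(
    successes: Sequence[bool],
    shortest_lengths: Sequence[float],
    taken_lengths: Sequence[float],
) -> float:
    """Success weighted by Path Length, averaged over all episodes.

    Assumption: shortest_lengths are strictly positive, so the ratio
    below is always well defined.
    """
    if not (len(successes) == len(shortest_lengths) == len(taken_lengths)):
        raise ValueError("all inputs must have the same length")
    if not successes:
        raise ValueError("need at least one episode")

    total = 0.0
    for ok, shortest, taken in zip(successes, shortest_lengths, taken_lengths):
        if shortest <= 0:
            raise ValueError("shortest path lengths must be positive")
        if ok:
            # Penalize detours: an optimal path scores 1, a path twice as
            # long as the optimum scores 0.5. Failures add nothing.
            total += shortest / max(taken, shortest)
    return total / len(successes)


# Three illustrative episodes: optimal success, success with a detour, failure.
episode_success = [True, True, False]
episode_shortest = [2.0, 4.0, 3.0]
episode_taken = [2.0, 8.0, 5.0]

print(success_rate(episode_success))
print(spl(episode_success, episode_shortest, episode_taken))
```

Because SPL discounts successful episodes by path efficiency, it can never exceed the success rate computed on the same episodes.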

fields

cs.RO 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

MVP-Nav: Multi-layer Value Map Planner Navigator

cs.RO · 2026-06-30 · unverdicted · novelty 5.0

MVP-Nav reconstructs explicit 3D physical occupancy from monocular RGB using foundation models and integrates it with semantic priorities via a Multi-layer Value Map for grounded planning in zero-shot object navigation.

citing papers explorer

Showing 1 of 1 citing paper.

  • MVP-Nav: Multi-layer Value Map Planner Navigator cs.RO · 2026-06-30 · unverdicted · none · ref 34 · internal anchor