EgoCogNav: Cognition-aware Human Egocentric Navigation
Modeling the cognitive and experiential factors of human navigation is central to deepening our understanding of human-environment interaction and to enabling safe social navigation and effective assistive wayfinding. Most existing methods focus on forecasting motion in fully observed scenes and often neglect the human factors that capture how people feel about and respond to space. To address this gap, we propose EgoCogNav, a multimodal egocentric navigation framework that jointly forecasts perceived path uncertainty, trajectories, and head motion from egocentric video, gaze, and motion history. To facilitate research in the field, we introduce the Cognition-aware Egocentric Navigation (CEN) dataset, consisting of 6 hours of real-world egocentric recordings that capture diverse navigation behaviors. Experiments show that EgoCogNav learns a perceived uncertainty that correlates strongly with human-like behaviors such as scanning, hesitation, and backtracking, while improving trajectory and head-motion forecasting on held-out navigation recordings.
Forward citations
Cited by 3 Pith papers
- EgoTraj: Real-World Egocentric Human Trajectory Dataset for Multimodal Prediction
  EgoTraj is a new open multimodal dataset of 75 long-horizon egocentric human navigation sequences in urban environments with head pose, gaze, and scene data, plus benchmarks of trajectory prediction methods.
- Beyond Isolation: A Unified Benchmark for General-Purpose Navigation
  OmniNavBench is a unified benchmark for general-purpose navigation featuring composite multi-skill instructions, support for humanoid, quadrupedal and wheeled robots, and 1779 human teleoperated trajectories across 17...
- Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models
  Fine-tuned VLMs guided by eye gaze and ego motion achieve a 14.5% accuracy improvement over a transformer baseline for egocentric pedestrian intent decoding.