arxiv: 2511.17581 · v3 · pith:UN5G5DBZnew · submitted 2025-11-15 · 💻 cs.LG · cs.CV

EgoCogNav: Cognition-aware Human Egocentric Navigation

keywords navigation · egocentric · egocognav · human · behaviors · cognition-aware · factors · forecasting

Modeling the cognitive and experiential factors of human navigation is central to deepening our understanding of human-environment interaction and to enabling safe social navigation and effective assistive wayfinding. Most existing methods focus on forecasting motions in fully observed scenes and often neglect human factors that capture how people feel and respond to space. To address this gap, we propose EgoCogNav, a multimodal egocentric navigation framework that jointly forecasts perceived path uncertainty, trajectories, and head motion from egocentric video, gaze, and motion history. To facilitate research in the field, we introduce the Cognition-aware Egocentric Navigation (CEN) dataset, consisting of 6 hours of real-world egocentric recordings that capture diverse navigation behaviors in real-world scenarios. Experiments show that EgoCogNav learns a perceived uncertainty that strongly correlates with human-like behaviors such as scanning, hesitation, and backtracking, while improving trajectory and head-motion forecasting on held-out navigation recordings.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. EgoTraj: Real-World Egocentric Human Trajectory Dataset for Multimodal Prediction

    cs.CV 2026-05 accept novelty 7.0

    EgoTraj is a new open multimodal dataset of 75 long-horizon egocentric human navigation sequences in urban environments with head pose, gaze, and scene data, plus benchmarks of trajectory prediction methods.

  2. Beyond Isolation: A Unified Benchmark for General-Purpose Navigation

    cs.RO 2026-05 unverdicted novelty 7.0

    OmniNavBench is a unified benchmark for general-purpose navigation featuring composite multi-skill instructions, support for humanoid, quadrupedal and wheeled robots, and 1779 human teleoperated trajectories across 17...

  3. Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models

    cs.CV 2026-06 unverdicted novelty 5.0

    Fine-tuned VLMs guided by eye gaze and ego motion achieve a 14.5% accuracy improvement over a transformer baseline for egocentric pedestrian intent decoding.