Fine-tuned VLMs guided by eye gaze and ego motion achieve 14.5% accuracy improvement over a transformer baseline for egocentric pedestrian intent decoding.
Heads-up: Head-mounted egocentric dataset for trajectory prediction in blind assistance systems
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Decoding Pedestrian Crossing Intention from Egocentric Vision via Vision Language Models