Look hear: Gaze prediction for speech-directed human attention
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Reading Recognition in the Wild
Introduces the Reading in the Wild dataset and a flexible transformer model using egocentric RGB, eye gaze, and head pose modalities to recognize reading activity in diverse real-world scenarios.