DeepGaze3.5-VL treats visual scanpaths as discrete token sequences predicted autoregressively by vision-language models, achieving 2.18 bits IG on MIT1003 and outperforming prior specialized models even with matched encoders.
In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology (UIST ’24)
4 Pith papers cite this work. Polarity classification is still indexing.
4
Pith papers citing it
citation-role summary
background 2
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
Omakase monitors project documents to infer timely queries and distills research reports into actionable suggestions that users rated significantly more useful than raw reports.