SUPERGLASSES is the first VQA benchmark built from actual smart glasses data, and SUPERLENS is an agent using automatic object detection, query decoupling, and multimodal search that outperforms GPT-4o by 2.19% on it.
Yihao Xue, Kristjan Greenewald, Youssef Mroueh, and Baharan Mirzasoleiman
2 Pith papers cite this work.
Fields: cs.CV (2)
Years: 2026 (2)
Representative citing papers
EnsemHalDet improves VLM hallucination detection by ensembling independent detectors trained on diverse internal states, yielding higher AUC than single-detector baselines across VQA datasets.
citing papers explorer
- SUPERGLASSES: Benchmarking Vision Language Models as Intelligent Agents for AI Smart Glasses
- EnsemHalDet: Robust VLM Hallucination Detection via Ensemble of Internal State Detectors