EPIC-Bench is a new fine-grained benchmark that shows leading VLMs struggle with multi-target counting, part-whole relations, and affordance detection in real-world embodied visual grounding tasks.
arXiv preprint arXiv:2503.11117 , year=
3 Pith papers cite this work. Polarity classification is still indexing.
3
Pith papers citing it
years
2026 3representative citing papers
SAGE trains agents in physics-grounded semantic abstractions via RL with asymmetric clipping, achieving 53.21% LLM-Match Success on A-EQA (+9.7% over baseline) and encouraging physical robot transfer.
citing papers explorer
-
EPIC-Bench: A Perception-Centric Benchmark for Fine-Grained Embodied Visual Grounding in Vision-Language Models
EPIC-Bench is a new fine-grained benchmark that shows leading VLMs struggle with multi-target counting, part-whole relations, and affordance detection in real-world embodied visual grounding tasks.
-
Plan in Sandbox, Navigate in Open Worlds: Learning Physics-Grounded Abstracted Experience for Embodied Navigation
SAGE trains agents in physics-grounded semantic abstractions via RL with asymmetric clipping, achieving 53.21% LLM-Match Success on A-EQA (+9.7% over baseline) and encouraging physical robot transfer.
- ESI-Bench: Towards Embodied Spatial Intelligence that Closes the Perception-Action Loop