EgoBench is a new benchmark with 1,045 tasks and a simulated user environment. It shows that the best state-of-the-art video-MLLM agents reach only 19.43% average accuracy on interactive multimodal tool-using tasks.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- EgoBench: An Interactive Egocentric Multimodal Benchmark for Tool-Using Agents