EgoCoT-Bench provides 3,172 verifiable QA pairs across perception, anticipation, and reasoning tasks on egocentric videos, revealing that many MLLMs give answer-correct but evidence-inconsistent explanations.
In Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS)
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
EgoCoT-Bench: Benchmarking Grounded and Verifiable Operation-Centric Chain of Thought Reasoning for MLLMs
EgoCoT-Bench provides 3,172 verifiable QA pairs across perception, anticipation, and reasoning tasks on egocentric videos, revealing that many MLLMs give answer-correct but evidence-inconsistent explanations.