VirtueBench: Evaluating Trustworthiness under Uncertainty in Long Video Understanding. arXiv preprint arXiv:2603.07071, 2026

Xueqing Yu, Bohan Li, Yan Li, Zhenheng Yang · 2026 · arXiv 2603.07071

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

representative citing papers

When No Answer Is Correct: Diagnosing Absent Answer Detection for MLLMs in Video Understanding

cs.AI · 2026-06-06 · accept · novelty 7.0

MLLMs fail to detect absent correct answers in video QA tasks across three evaluation settings, defaulting to distractors even with chain-of-thought prompting.

citing papers explorer

Showing 1 of 1 citing paper after filters.

When No Answer Is Correct: Diagnosing Absent Answer Detection for MLLMs in Video Understanding

cs.AI · 2026-06-06 · accept · none · ref 17

MLLMs fail to detect absent correct answers in video QA tasks across three evaluation settings, defaulting to distractors even with chain-of-thought prompting.
