hub Baseline reference

Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis

Baseline reference. 60% of citing Pith papers use this work as a benchmark or comparison.

19 Pith papers citing it
Baseline 60% of classified citations

citation-role summary

dataset: 5 · background: 4 · baseline: 1

fields

cs.CV: 17 · cs.AI: 2

years

2026: 19

representative citing papers

ViMU: Benchmarking Video Metaphorical Understanding

cs.CV · 2026-05-14 · unverdicted · novelty 8.0

ViMU is the first benchmark for evaluating video models on metaphorical and subtextual understanding using hint-free questions grounded in multimodal evidence.

Omni-DuplexEval: Evaluating Real-time Duplex Omni-modal Interaction

cs.CV · 2026-05-17 · conditional · novelty 7.0

Omni-DuplexEval creates a new benchmark and LLM-as-a-Judge framework for real-time duplex omni-modal interaction, revealing that current models score below 40% overall and struggle especially with proactive responses.
