pith. sign in

Movqa: A benchmark of versatile question-answering for long-form movie understanding

6 Pith papers cite this work. Polarity classification is still indexing.

6 Pith papers citing it

fields

cs.CV 6

representative citing papers

LVBench: An Extreme Long Video Understanding Benchmark

cs.CV · 2024-06-12 · accept · novelty 7.0

LVBench is a new benchmark for extreme long video understanding that evaluates multimodal large language models on hour-scale videos using tasks designed to probe extended memory and comprehension.

InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling

cs.CV · 2025-01-21 · unverdicted · novelty 5.0

InternVideo2.5 improves video MLLMs by incorporating dense vision task annotations via direct preference optimization and compact spatiotemporal representations via adaptive hierarchical token compression, yielding better benchmark performance, 6x longer video memory, and new capabilities likeobject

citing papers explorer

Showing 6 of 6 citing papers.