MSR-VTT: A large video description dataset for bridging video and language

Xu, J · 2016 · DOI 10.1109/cvpr.2016.571

3 Pith papers cite this work. Polarity classification is still in progress.

citation-role summary

background: 1 · dataset: 1

citation-polarity summary

background: 1 · use dataset: 1

representative citing papers

Seizure-Semiology-Suite (S3): A Clinically Multimodal Dataset, Benchmark, and Models for Seizure Semiology Understanding

cs.CV · 2026-05-21 · unverdicted · novelty 7.0

Seizure-Semiology-Suite provides a new clinically annotated video dataset and hierarchical benchmark that exposes weaknesses in current MLLMs for seizure semiology and demonstrates gains from fine-tuning and a neuro-symbolic classifier reaching 0.96 F1.

Perception Without Engagement: Dissecting the Causal Discovery Deficit in LMMs

cs.CL · 2026-05-10 · unverdicted · novelty 7.0

LMMs perceive videos but underexploit visual content for causal reasoning due to textual shortcuts; ProCauEval diagnoses this and ADPO training reduces reliance on priors.

ReCoVR: Closing the Loop in Interactive Composed Video Retrieval

cs.IR · 2026-05-11 · unverdicted · novelty 6.0

ReCoVR introduces a reflexive dual-pathway architecture for interactive composed video retrieval that outperforms baselines by combining intent routing with trajectory-level reflection on retrieval history.

citing papers explorer

Showing 3 of 3 citing papers.

Seizure-Semiology-Suite (S3): A Clinically Multimodal Dataset, Benchmark, and Models for Seizure Semiology Understanding

cs.CV · 2026-05-21 · unverdicted · none · ref 128

Perception Without Engagement: Dissecting the Causal Discovery Deficit in LMMs

cs.CL · 2026-05-10 · unverdicted · none · ref 6

ReCoVR: Closing the Loop in Interactive Composed Video Retrieval

cs.IR · 2026-05-11 · unverdicted · none · ref 43
