Vidvec: Unlocking video MLLM embeddings for video-text retrieval. arXiv preprint arXiv:2602.08099, 2026.
2 Pith papers cite this work. Polarity classification is still indexing.
Pith papers citing it: 2
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
MMEmb-R1 adaptively applies chain-of-thought reasoning to multimodal embeddings via pair-aware counterfactual selection and RL, reaching 71.2 on MMEB-V2 with a 4B model and lower latency.
citing papers explorer
- VidMsg: A Benchmark for Implicit Message Inference in Short Videos
  VidMsg is a new benchmark dataset with QA and retrieval tasks for implicit message inference in short videos, on which current models perform poorly.
- MMEmb-R1: Reasoning-Enhanced Multimodal Embedding with Pair-Aware Selection and Adaptive Control
  MMEmb-R1 adaptively applies chain-of-thought reasoning to multimodal embeddings via pair-aware counterfactual selection and RL, reaching 71.2 on MMEB-V2 with a 4B model and lower latency.