M-LLM Based Video Frame Selection for Efficient Video Understanding, March 2025

Kai Hu, Feng Gao, Xiaohan Nie, Peng Zhou, Son Tran, Tal Neiman, Lingyun Wang, Mubarak Shah, Raffay Hamid, Bing Yin, Trishul Chilimbi · 2025 · arXiv 2502.19680

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

PEEK: Picking Essential frames via Efficient Knowledge distillation

cs.CV · 2026-05-29 · unverdicted · novelty 6.0

PEEK distills caption-conditioned frame relevance into a lightweight visual model, outperforming adaptive baselines on ActivityNet Captions and MSR-VTT especially at 1-2 frame budgets while adding only 5.2% overhead.

citing papers explorer

PEEK: Picking Essential frames via Efficient Knowledge distillation cs.CV · 2026-05-29 · unverdicted · none · ref 10