Moment Sampling in Video LLMs for Long-Form Video QA

Mustafa Chasmai, Gauri Jagatap, Gouthaman KV, Grant Van Horn, Subhransu Maji, Andrea Fanelli · 2025 · arXiv 2507.00033

4 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

VIDEOP2R: Video Understanding from Perception to Reasoning

cs.CV · 2025-11-14 · conditional · novelty 7.0

VideoP2R separates perception and reasoning in a process-aware RFT pipeline with a new CoT dataset and PA-GRPO rewards, reaching SOTA on six of seven video benchmarks.

Rethinking RAG in Long Videos: What to Retrieve and How to Use It?

cs.AI · 2026-06-11 · unverdicted · novelty 6.0

Introduces V-RAGBench benchmark and CARVE method that selects per-chunk retrieval configurations via parallel retrievers and adaptive reranking, outperforming eight VideoRAG baselines.

MemoryCard: Topic-Aware Multi-Modal Clue Compression for Long-Video Question Answering

cs.CV · 2026-06-04 · unverdicted · novelty 6.0

MemoryCard organizes long videos into self-contained topic-aware Memory Cards that improve long-video QA accuracy by up to 21.8% relative under fixed visual-token budgets.

Answer Self-Consistency with Margin-Triggered Question Re-Arbitration for the CVPR 2026 VidLLMs Challenge

cs.CV · 2026-06-03 · unverdicted · novelty 3.0

ASC-MQRA applies answer self-consistency across stochastic video QA runs and optional margin-triggered re-arbitration to achieve 81.16% average accuracy on the CVPR 2026 VidLLMs Challenge Track 2 test set.

citing papers explorer

Rethinking RAG in Long Videos: What to Retrieve and How to Use It? cs.AI · 2026-06-11 · unverdicted · none · ref 7
MemoryCard: Topic-Aware Multi-Modal Clue Compression for Long-Video Question Answering cs.CV · 2026-06-04 · unverdicted · none · ref 48
Answer Self-Consistency with Margin-Triggered Question Re-Arbitration for the CVPR 2026 VidLLMs Challenge cs.CV · 2026-06-03 · unverdicted · none · ref 1
