VideoP2R separates perception and reasoning in a process-aware RFT pipeline with a new CoT dataset and PA-GRPO rewards, reaching SOTA on six of seven video benchmarks.
Moment sampling in video llms for long-form video qa
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.CV 2representative citing papers
MemoryCard organizes long videos into self-contained topic-aware Memory Cards that improve long-video QA accuracy by up to 21.8% relative under fixed visual-token budgets.
citing papers explorer
-
VIDEOP2R: Video Understanding from Perception to Reasoning
VideoP2R separates perception and reasoning in a process-aware RFT pipeline with a new CoT dataset and PA-GRPO rewards, reaching SOTA on six of seven video benchmarks.