Retake: Reducing temporal and knowledge redundancy for long video understanding

Xiao Wang, Qingyi Si, Jianlong Wu, Shiyu Zhu, Li Cao, Liqiang Nie · 2024 · arXiv 2412.20504

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding

cs.CV · 2026-02-24 · unverdicted · novelty 7.0

LongVideo-R1 trains a reasoning agent on 33K trajectories to intelligently select informative video clips via iterative refinement and RL, achieving better accuracy-efficiency tradeoffs on long video QA benchmarks.

Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models

cs.CV · 2026-02-24 · unverdicted · novelty 6.0

MMHNet enables video-to-audio models trained on short clips to generalize and generate audio for videos over 5 minutes long.

citing papers explorer

Showing 2 of 2 citing papers.

LongVideo-R1: Smart Navigation for Low-cost Long Video Understanding · cs.CV · 2026-02-24 · unverdicted · none · ref 51
Echoes Over Time: Unlocking Length Generalization in Video-to-Audio Generation Models · cs.CV · 2026-02-24 · unverdicted · none · ref 45
