In: International Conference on Learning Representations (ICLR) (2021)
2 Pith papers cite this work. Polarity classification is still indexing.
Citation-role summary: method (1)
Fields: cs.CV (2)
Years: 2026 (2)
Roles: method (1)
Polarities: use method (1)
Representative citing papers:
- Multimodal Abstractive Summarization of Instructional Videos with Vision-Language Models
  ClipSum shows that frozen CLIP features outperform traditional CNN features and fine-tuned CLIP for instructional video summarization on YouCook2.
- Anatomy-Slot: Unsupervised Anatomical Factorization for Homologous Bilateral Reasoning in Retinal Diagnosis