ID-Selection combines importance scoring with iterative diversity suppression to prune 97.2% of visual tokens in LVLMs while retaining 91.8% performance and cutting FLOPs by over 97% without retraining.
Msr-vtt: A large video description dataset for bridging video and language
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CV 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
ID-Selection: Importance-Diversity Based Visual Token Selection for Efficient LVLM Inference
ID-Selection combines importance scoring with iterative diversity suppression to prune 97.2% of visual tokens in LVLMs while retaining 91.8% performance and cutting FLOPs by over 97% without retraining.