CLASP reduces visual tokens in MLLMs through class-adaptive multi-layer fusion and dual-stage pruning of salient and completion tokens, outperforming prior single-layer static methods across benchmarks and architectures.
(26)). Computing $\{\bar{z}_{n,t}\}_{t \in V_n}$ costs $O(L M_n d_v)$: a single weighted sum over cached layer outputs
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
CLASP: Class-Adaptive Layer Fusion and Dual-Stage Pruning for Multimodal Large Language Models