CLASP reduces visual tokens in MLLMs through class-adaptive multi-layer fusion and dual-stage pruning of salient and completion tokens, outperforming prior single-layer static methods across benchmarks and architectures.
(39)–(40)). Each iteration costs $O((M_n - K_1) K_2 d)$ for similarity evaluation/assignment plus $O((M_n - K_1) d)$ for accumulating cluster sums and normalization
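The excerpt above is truncated, so the exact clustering procedure is not visible here. As a minimal sketch of where the two cost terms could come from, assume one pass of cosine-similarity clustering in which N tokens (standing in for $M_n - K_1$) are assigned to K centers (standing in for $K_2$) in dimension d. The function names and the toy data are hypothetical and are not taken from the paper.

```python
import math


def normalize(v):
    """Return v scaled to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cluster_iteration(tokens, centers):
    """One assignment + update pass of cosine-similarity clustering (illustrative).

    tokens:  N unit-norm vectors of dimension d (N plays the role of M_n - K_1)
    centers: K unit-norm vectors of dimension d (K plays the role of K_2)
    """
    d = len(tokens[0])

    # Assignment: N * K dot products, each of length d -> O(N K d).
    assignments = []
    for t in tokens:
        sims = [sum(a * b for a, b in zip(t, c)) for c in centers]
        assignments.append(max(range(len(centers)), key=sims.__getitem__))

    # Accumulation: every token is added to exactly one cluster sum -> O(N d).
    sums = [[0.0] * d for _ in centers]
    counts = [0] * len(centers)
    for t, k in zip(tokens, assignments):
        counts[k] += 1
        for j in range(d):
            sums[k][j] += t[j]

    # Normalization: O(K d). An empty cluster keeps its previous center.
    new_centers = [
        normalize(s) if n > 0 else list(c)
        for s, n, c in zip(sums, counts, centers)
    ]
    return assignments, new_centers


# Toy usage: four 2-d tokens, two centers.
tokens = [normalize(v) for v in [[1.0, 0.1], [0.9, 0.0], [0.0, 1.0], [0.1, 0.8]]]
centers = [[1.0, 0.0], [0.0, 1.0]]
assignments, new_centers = cluster_iteration(tokens, centers)
```

In this sketch the assignment loop dominates, since it touches every token-center pair, while the accumulation step touches each token only once. That matches the shape of the two terms quoted above under the stated assumptions.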
Cited by 1 Pith paper.
Fields: cs.CV (1). Years: 2026 (1). Verdicts: UNVERDICTED (1).
Representative citing papers:
CLASP: Class-Adaptive Layer Fusion and Dual-Stage Pruning for Multimodal Large Language Models