In: ICLR (2023)
1 Pith paper cites this work. Polarity classification is still indexing.
citation-role summary
method 1
citation-polarity summary
fields: cs.CV 1
years: 2026 1
verdicts: UNVERDICTED 1
roles: method 1
polarities: use method 1
representative citing papers
QAPruner: Quantization-Aware Vision Token Pruning for Multimodal Large Language Models
QAPruner introduces a hybrid sensitivity metric that combines group-wise quantization-error simulation and outlier intensity with semantic scores to prune visual tokens. On LLaVA models at 12.5% token retention, it yields 2.24% higher accuracy than naive baselines while surpassing dense low-bit models.