HalfV disentangles MLLM visual redundancy into universal IVR and architecture-dependent SSR via a three-stage lifecycle, delivering a 4.1x FLOPs speedup with 96.8% performance retention on Qwen2.5-VL.
Preprint, arXiv:2412.13180
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.CV (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
From Inheritance to Saturation: Disentangling the Evolution of Visual Redundancy for Architecture-Aware MLLM Inference Acceleration