A ConvNet for the 2020s
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary
fields: cs.CV 2
years: 2026 2
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
representative citing papers
SAS adds semantic scoring with CLIP and a two-stage filter-then-diversity selection process to make generative dataset distillation produce more class-discriminative and diverse compact datasets.
citing papers explorer
- Q-Zoom: Query-Aware Adaptive Perception for Efficient Multimodal Large Language Models
Q-Zoom achieves up to 4.39x inference speedup in high-resolution MLLM scenarios via query-aware gating and region localization, matching or exceeding baseline accuracy on document and high-res benchmarks.
- SAS: Semantic-aware Sampling for Generative Dataset Distillation
SAS adds semantic scoring with CLIP and a two-stage filter-then-diversity selection process to make generative dataset distillation produce more class-discriminative and diverse compact datasets.