DyMU: Dynamic Merging and Virtual Unmerging for Efficient VLMs. arXiv preprint arXiv:2504.17040, 2025

Zhenhailong Wang, Senthil Purushwalkam, Caiming Xiong, Silvio Savarese, Heng Ji, Ran Xu · 2025 · arXiv 2504.17040

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

RegimeVGGT: Layer-Wise Spatially Preserving Redundancy Removal for Visual Geometry Grounded Transformer

cs.CV · 2026-06-16 · unverdicted · novelty 6.0

RegimeVGGT applies layer-wise U-shaped compression via saliency-guided banded merging and selectively protected K/V downsampling to deliver 6.7x speedup on VGGT at matched reconstruction quality.

CIVIC: End-to-End Sequence Compactness for Efficient Vision-Language Models

cs.AI · 2026-05-27 · unverdicted · novelty 6.0

CIVIC is a path-consistent compact visual inference framework that reduces KV-cache memory to approximately one-third and end-to-end latency in VLMs while preserving accuracy via text-aligned KL distillation and adaptive spatial retention.

citing papers explorer

RegimeVGGT: Layer-Wise Spatially Preserving Redundancy Removal for Visual Geometry Grounded Transformer
cs.CV · 2026-06-16 · unverdicted · none · ref 38
CIVIC: End-to-End Sequence Compactness for Efficient Vision-Language Models
cs.AI · 2026-05-27 · unverdicted · none · ref 6
