ActDistill: General Action-Guided Self-Derived Distillation for Efficient Vision-Language-Action Models
ActDistill transfers action knowledge from heavy vision-language-action (VLA) teacher models to lightweight students via graph-encapsulated hierarchies and action-guided dynamic routing, delivering over 50% computation reduction and a 1.67× speedup with comparable or better performance on embodied tasks.
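The abstract does not spell out ActDistill's training objective. As a rough illustration only, teacher-to-student distillation is commonly framed as minimizing a temperature-scaled KL divergence between teacher and student output distributions; the sketch below shows that standard loss (a generic assumption, not ActDistill's published formulation, and `teacher_logits`/`student_logits` are hypothetical names).

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def distill_loss(teacher_logits, student_logits, temperature=2.0):
    """Generic knowledge-distillation loss: KL(teacher || student) on
    temperature-softened distributions, scaled by T^2 as in standard
    distillation practice. Illustrative only, not ActDistill's objective."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # KL divergence; scaled by T^2 to keep gradient magnitudes comparable
    return float(np.sum(p * (np.log(p) - np.log(q)))) * temperature**2

# A student that matches the teacher incurs zero loss; a mismatched one does not.
matched = distill_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mismatched = distill_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In this framing, the student is trained to reproduce the teacher's softened action distribution rather than only hard labels, which is what lets a much smaller model recover most of the teacher's behavior.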