arXiv preprint arXiv:2507.11181
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Representative citing papers
MACS improves inference speed in multimodal MoE models by entropy-weighted balancing of visual tokens and real-time modality-adaptive expert capacity allocation.
Citing papers explorer
- Offline Semantic Guidance for Efficient Vision-Language-Action Policy Distillation
  VLA-AD distills 7B VLA teachers into 158M students using offline VLM semantic guidance on task phases and directions, matching teacher performance on LIBERO with 44x size reduction and 3.28x speedup.
- MACS: Modality-Aware Capacity Scaling for Efficient Multimodal MoE Inference
  MACS improves inference speed in multimodal MoE models by entropy-weighted balancing of visual tokens and real-time modality-adaptive expert capacity allocation.