Fabian Mentzer, David Minnen, Eirikur Agustsson, and Michael Tschannen
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background (1)
citation-polarity summary
fields: cs.RO (2)
years: 2026 (2)
roles: background (1)
polarities: background (1)
representative citing papers
- The Compression Gap: Why Discrete Tokenization Limits Vision-Language-Action Model Scaling
  Discrete action tokenization in VLA models creates an information bottleneck that prevents vision encoder scaling from improving performance, unlike continuous policies, as validated on the LIBERO benchmark.
- DiscreteRTC: Discrete Diffusion Policies are Natural Asynchronous Executors