Accelerated diffusion models via speculative sampling
2 Pith papers cite this work. Polarity classification is still indexing.
citation summary
verdicts: UNVERDICTED 2
roles: background 1
polarities: background 1
citing papers explorer
- Realtime-VLA FLASH: Speculative Inference Framework for Diffusion-based VLAs
A new speculative inference system speeds up diffusion VLAs to 19.1 ms average latency (3.04x faster) on LIBERO by replacing most full 58 ms inferences with 7.8 ms draft rounds while preserving task performance.
- Lipschitz-Guided Design of Interpolation Schedules in Generative Models
Minimizing averaged squared Lipschitzness of the drift produces interpolation schedules that improve numerical accuracy and mitigate mode collapse in generative models, with closed-form optima for Gaussians and validation on stochastic PDEs.
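
The latency figures quoted for Realtime-VLA FLASH above (58 ms full inference, 7.8 ms draft round, 19.1 ms average) can be sanity-checked with a little arithmetic. The sketch below is illustrative only. It assumes every round costs exactly one of the two quoted latencies and that the average is a plain weighted mean, which the summary does not confirm; any verification overhead is ignored. The variable names are hypothetical.

```python
# Latencies quoted in the summary, in milliseconds.
FULL_MS = 58.0   # one full diffusion VLA inference
DRAFT_MS = 7.8   # one draft round
AVG_MS = 19.1    # reported average latency

# Speedup relative to always running the full inference.
speedup = FULL_MS / AVG_MS

# Assumption: AVG_MS = p * DRAFT_MS + (1 - p) * FULL_MS, solved for p,
# the share of rounds served by the draft path under this simple model.
draft_fraction = (FULL_MS - AVG_MS) / (FULL_MS - DRAFT_MS)

print(f"speedup: {speedup:.2f}x")
print(f"implied draft-round share: {draft_fraction:.1%}")
```

Under this two-point model, the quoted speedup is reproduced and roughly three quarters of rounds would have to be draft rounds, consistent with the summary's wording that most full inferences are replaced.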