Fast-dVLM converts an autoregressive VLM to block diffusion in one stage, matches quality on 11 multimodal benchmarks, and delivers over 6x end-to-end inference speedup with KV-cache-compatible parallel decoding and FP8 quantization.
Implications of the data include:
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Fast-dVLM: Efficient Block-Diffusion VLM via Direct Conversion from Autoregressive VLM