
arxiv: 2411.11925 · v3 · pith:VHQY7K7Qnew · submitted 2024-11-18 · 💻 cs.CV

Continuous Speculative Decoding for Autoregressive Image Generation

classification 💻 cs.CV
keywords models, continuous, decoding, generation, image, speculative, acceptance, acceptance-rejection

Continuous visual autoregressive (AR) models have demonstrated promising performance in image generation, but their inherently sequential nature results in slow inference. Speculative decoding, a successful acceleration technique for large language models (LLMs), has effectively accelerated discrete visual AR models. However, the absence of an analogous theory for continuous distributions precludes its use in accelerating continuous AR models. To fill this gap, this work presents continuous speculative decoding and addresses two challenges: 1) a low acceptance rate, caused by inconsistent output distributions modeled by the target and draft models, and 2) a modified distribution without an analytic expression, caused by a complex integral. For challenge 1), we raise the acceptance rate through an approximated criterion, a novel denoising trajectory alignment strategy based on reparameterization proximity, and token pre-filling. For challenge 2), we introduce an acceptance-rejection sampling algorithm with an appropriate upper bound, thereby avoiding explicit calculation of the integral. Furthermore, our denoising trajectory alignment is reused in acceptance-rejection sampling, avoiding repeated diffusion model inference. Extensive experiments on various models at 256x256 and 512x512 resolutions demonstrate that our approach achieves over 2x wall-time speedup while preserving image generation quality. Code is available at: https://github.com/MarkXCloud/CSpD
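To make challenge 2) concrete, the following is a minimal toy sketch of speculative sampling with continuous densities. It is not the authors' implementation: it assumes one-dimensional Gaussian draft and target distributions with closed-form densities, whereas the paper works with diffusion-based token distributions. The names `gaussian_pdf` and `speculative_sample` are hypothetical. A draft sample x is accepted with probability min(1, p(x)/q(x)); on rejection, a sample is drawn from the modified distribution proportional to max(0, p(x) - q(x)) by acceptance-rejection sampling with the target p as the proposal. Since max(0, p - q) <= p, the bound is 1 and the normalizing integral never has to be computed.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1D Gaussian (toy stand-in for a model's output density)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def speculative_sample(rng, draft, target):
    """Draw one sample distributed as `target` using a `draft` proposal.

    `draft` and `target` are (mean, std) tuples; this parameterization is an
    assumption made for illustration only.
    Returns (sample, accepted_draft).
    """
    q_mean, q_std = draft
    p_mean, p_std = target

    # Step 1: the draft proposes x ~ q; accept with probability min(1, p/q).
    x = rng.gauss(q_mean, q_std)
    p_x = gaussian_pdf(x, p_mean, p_std)
    q_x = gaussian_pdf(x, q_mean, q_std)
    if rng.random() < min(1.0, p_x / q_x):
        return x, True

    # Step 2: on rejection, sample from the modified distribution
    # proportional to max(0, p - q). Propose y ~ p and accept with
    # probability max(0, p(y) - q(y)) / p(y), which is always <= 1.
    while True:
        y = rng.gauss(p_mean, p_std)
        p_y = gaussian_pdf(y, p_mean, p_std)
        q_y = gaussian_pdf(y, q_mean, q_std)
        if rng.random() < max(0.0, p_y - q_y) / p_y:
            return y, False


if __name__ == "__main__":
    rng = random.Random(0)
    results = [speculative_sample(rng, (0.0, 1.2), (1.0, 1.0)) for _ in range(20000)]
    samples = [s for s, _ in results]
    rate = sum(1 for _, ok in results if ok) / len(results)
    print("sample mean:", sum(samples) / len(samples))
    print("draft acceptance rate:", rate)
```

In this toy setting the output samples follow the target distribution regardless of how well the draft matches it; a poorer draft only lowers the acceptance rate and increases the work done in the fallback loop, which is the cost the abstract's alignment and pre-filling strategies aim to reduce.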



Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Drift-AR: Single-Step Visual Autoregressive Generation via Anti-Symmetric Drifting

    cs.CV 2026-03 unverdicted novelty 7.0

    Drift-AR achieves 3.8-5.5x speedup in AR-diffusion image models by combining entropy-informed speculative decoding with single-step (1-NFE) anti-symmetric drifting decoding.

  2. CASCADE: Context-Aware Relaxation for Speculative Image Decoding

    cs.CV 2026-05 unverdicted novelty 6.0

    CASCADE formalizes semantic interchangeability and convergence in target model representations to enable context-aware acceptance relaxation in tree-based speculative decoding, delivering up to 3.6x speedup on text-to...