Variational Autoencoding Discrete Diffusion with Enhanced Dimensional Correlations Modeling

5 Pith papers cite this work. Polarity classification is still indexing.

abstract

Discrete diffusion models have recently shown great promise for modeling complex discrete data, with masked diffusion models (MDMs) offering a compelling trade-off between quality and generation speed. MDMs denoise by progressively unmasking multiple dimensions from an all-masked input, but their performance can degrade when using few denoising steps due to limited modeling of inter-dimensional dependencies. In this paper, we propose Variational Autoencoding Discrete Diffusion (VADD), a novel framework that enhances discrete diffusion with latent variable modeling to implicitly capture correlations among dimensions. By introducing an auxiliary recognition model, VADD enables stable training via variational lower bounds maximization and amortized inference over the training set. Our approach retains the efficiency of traditional MDMs while significantly improving sample quality, especially when the number of denoising steps is small. Empirical results on 2D toy data, pixel-level image generation, and text generation demonstrate that VADD consistently outperforms MDM baselines in sample quality with few denoising steps.

fields

cs.LG (4) · cs.CL (1)

years

2026 (5)

verdicts

unverdicted (5)

representative citing papers

Infinite Mask Diffusion for Few-Step Distillation

cs.CL · 2026-05-11 · unverdicted · novelty 7.0

Infinite Mask Diffusion Models use stochastic infinite-state masks to overcome the factorization error lower bound in standard masked diffusion, achieving superior few-step performance on language tasks via distillation.

Unifying Masked Diffusion Models with Various Generation Orders and Beyond

cs.LG · 2026-02-02 · unverdicted · novelty 7.0

OeMDM unifies masked diffusion, autoregressive, and block diffusion models under various generation orders; LoMDM jointly optimizes ordering and diffusion backbone from scratch and outperforms prior discrete diffusion models on language benchmarks.

Accelerating Discrete Diffusion Models with Parallel-In-Time Sampling

cs.LG · 2026-07-01 · unverdicted · novelty 6.0

A parallel-in-time τ-leaping sampler for absorbing discrete diffusion models is introduced, with an exponential-factorial convergence proof and empirical speedups of 7-9× on synthetic tasks and 1.45-1.86× on image/text tasks while using 50% fewer NFE.

Learned Relay Representations for Forward-Thinking Discrete Diffusion Models

cs.LG · 2026-05-21 · unverdicted · novelty 6.0 · 2 refs

Learned Relay Representations add a differentiable per-token channel to masked diffusion models so they can propagate latent information across iterative denoising steps, yielding better coding performance and up to 32% lower latency on Fast-dLLM v2 than standard supervised finetuning.

citing papers explorer

Showing 5 of 5 citing papers after filters.

  • Infinite Mask Diffusion for Few-Step Distillation cs.CL · 2026-05-11 · unverdicted · none · ref 11 · internal anchor

    Infinite Mask Diffusion Models use stochastic infinite-state masks to overcome the factorization error lower bound in standard masked diffusion, achieving superior few-step performance on language tasks via distillation.

  • Unifying Masked Diffusion Models with Various Generation Orders and Beyond cs.LG · 2026-02-02 · unverdicted · none · ref 6 · internal anchor

    OeMDM unifies masked diffusion, autoregressive, and block diffusion models under various generation orders; LoMDM jointly optimizes ordering and diffusion backbone from scratch and outperforms prior discrete diffusion models on language benchmarks.

  • Accelerating Discrete Diffusion Models with Parallel-In-Time Sampling cs.LG · 2026-07-01 · unverdicted · none · ref 21 · internal anchor

    A parallel-in-time τ-leaping sampler for absorbing discrete diffusion models is introduced, with an exponential-factorial convergence proof and empirical speedups of 7-9× on synthetic tasks and 1.45-1.86× on image/text tasks while using 50% fewer NFE.

  • Plug-and-Play Guidance for Discrete Diffusion Models via Gradient-Informed Logit Correction cs.LG · 2026-06-04 · unverdicted · none · ref 20 · internal anchor

    Introduces GILC, a training-free plug-and-play guidance framework for discrete diffusion models that uses Jacobian-free logit correction to achieve SOTA results on DNA, protein, and molecular generation tasks.

  • Learned Relay Representations for Forward-Thinking Discrete Diffusion Models cs.LG · 2026-05-21 · unverdicted · none · ref 31 · 2 links · internal anchor

    Learned Relay Representations add a differentiable per-token channel to masked diffusion models so they can propagate latent information across iterative denoising steps, yielding better coding performance and up to 32% lower latency on Fast-dLLM v2 than standard supervised finetuning.