Beyond hard masks: Progressive token evolution for diffusion language models

Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models , author= · 2026 · arXiv 2601.07351

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

DMax: Aggressive Parallel Decoding for dLLMs

cs.LG · 2026-04-09 · conditional · novelty 7.0 · 2 refs

DMax uses On-Policy Uniform Training and Soft Parallel Decoding to enable aggressive parallelism in dLLMs, raising TPF on GSM8K from 2.04 to 5.47 and on MBPP from 2.71 to 5.86 while preserving accuracy.

DSL-LLaDA: Scaling Continuous Denoising to 8B Masked Diffusion LMs

cs.CL · 2026-05-31 · unverdicted · novelty 6.0

Adapting LLaDA-8B-Instruct via Discrete Stochastic Localization with continuous per-token Gaussian noise yields continuous denoising that achieves top ROUGE-1 on zero-shot summarization at low step budgets and adds selective noisy-state robustness.

Edit-Based Refinement for Parallel Masked Diffusion Language Models

cs.CL · 2026-05-10 · unverdicted · novelty 6.0

ME-DLM augments parallel masked diffusion models with edit-distance-supervised refinements to raise quality on coding and math benchmarks while using far fewer diffusion steps.

citing papers explorer

Showing 3 of 3 citing papers.

DMax: Aggressive Parallel Decoding for dLLMs cs.LG · 2026-04-09 · conditional · none · ref 107 · 2 links
DMax uses On-Policy Uniform Training and Soft Parallel Decoding to enable aggressive parallelism in dLLMs, raising TPF on GSM8K from 2.04 to 5.47 and on MBPP from 2.71 to 5.86 while preserving accuracy.
DSL-LLaDA: Scaling Continuous Denoising to 8B Masked Diffusion LMs cs.CL · 2026-05-31 · unverdicted · none · ref 6
Adapting LLaDA-8B-Instruct via Discrete Stochastic Localization with continuous per-token Gaussian noise yields continuous denoising that achieves top ROUGE-1 on zero-shot summarization at low step budgets and adds selective noisy-state robustness.
Edit-Based Refinement for Parallel Masked Diffusion Language Models cs.CL · 2026-05-10 · unverdicted · none · ref 32
ME-DLM augments parallel masked diffusion models with edit-distance-supervised refinements to raise quality on coding and math benchmarks while using far fewer diffusion steps.

Beyond hard masks: Progressive token evolution for diffusion language models

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer