DSB: Dynamic Sliding Block Scheduling for Diffusion LLMs

Lizhuo Luo; Shenggui Li; Tianwei Zhang; Yonggang Wen

arXiv: 2602.05992 · v3 · pith:FMAXY73Cnew · submitted 2026-02-05 · cs.CL

keywords: block, dynamic, efficiency, inference, naive, quality, scheduling, sliding

Diffusion large language models (dLLMs) have emerged as a promising alternative for text generation, distinguished by their native support for parallel decoding. In practice, block inference is crucial for avoiding order misalignment in global bidirectional decoding and for improving output quality. However, the widely used schedule of fixed, predefined blocks (the naive schedule) is agnostic to semantic difficulty, which makes it suboptimal for both quality and efficiency: it can force premature commitments to uncertain positions while delaying easy positions near block boundaries. In this work, we analyze the limitations of naive block scheduling and show why dynamically adapting the schedule to semantic difficulty matters for reliable and efficient inference. Motivated by this analysis, we propose Dynamic Sliding Block (DSB), a training-free block scheduling method that uses a sliding block with a dynamic size to overcome the rigidity of the naive block. To further improve efficiency, we introduce DSB Cache, a training-free KV-cache mechanism tailored to DSB. Extensive experiments across multiple models and benchmarks demonstrate that DSB, together with DSB Cache, consistently improves both generation quality and inference efficiency for dLLMs. Code is released at https://github.com/lizhuo-luo/DSB.
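
The contrast the abstract draws between a fixed block and a sliding block of dynamic size can be illustrated with a small toy scheduler. This is a sketch under stated assumptions, not the DSB algorithm from the paper or its repository: the `decode` function, the `ready` list (the first step at which a position is assumed to become confident, standing in for semantic difficulty), and the commit rules are all hypothetical and chosen only to make the two failure modes visible.

```python
from typing import List, Tuple


def decode(ready: List[int], block_size: int, sliding: bool) -> Tuple[int, int]:
    """Toy block scheduler (hypothetical, not the paper's method).

    ready[i] is the first step at which position i is assumed confident.
    Returns (steps taken, number of forced low-confidence commits).
    """
    n = len(ready)
    done = [False] * n
    step = 0
    forced = 0
    start = 0
    while not all(done):
        step += 1
        if sliding:
            # Sliding block: the left edge follows the first masked position.
            start = done.index(False)
        else:
            # Fixed block: advance only once the whole block is committed.
            while all(done[start:start + block_size]):
                start += block_size
        end = min(start + block_size, n)
        if sliding:
            # Dynamic size: grow the right edge over already-confident positions.
            while end < n and ready[end] <= step:
                end += 1
        window = [i for i in range(start, end) if not done[i]]
        confident = [i for i in window if ready[i] <= step]
        if confident:
            # Commit every confident position in the window in parallel.
            for i in confident:
                done[i] = True
        else:
            # Nothing is confident: commit the least uncertain position anyway.
            done[min(window, key=lambda j: ready[j])] = True
            forced += 1
    return step, forced


# Position 2 is hard; positions 4 and 5 become easy one step later.
ready_a = [1, 1, 3, 1, 2, 2, 3, 3]
print(decode(ready_a, block_size=4, sliding=False))  # (3, 1)
print(decode(ready_a, block_size=4, sliding=True))   # (3, 0)

# Position 2 is very hard; everything past the block boundary is easy.
ready_b = [1, 1, 5, 1, 1, 1, 1, 1]
print(decode(ready_b, block_size=4, sliding=False))  # (3, 1)
print(decode(ready_b, block_size=4, sliding=True))   # (2, 1)
```

In the first toy input the fixed block is forced to commit the hard position early, while the sliding window works on easier positions to its right until the hard one is ready. In the second, the fixed block delays the easy positions beyond its boundary, which costs an extra step. Both inputs are made up; they say nothing about the gains reported in the paper.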

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Focus on the Core: Empowering Diffusion Large Language Models by Self-Contrast
cs.CL 2026-05 unverdicted novelty 7.0

FoCore uses self-contrast on early-converging high-density tokens to boost diffusion LLM quality on reasoning benchmarks while cutting decoding steps by over 2x.

DepCap: Adaptive Block-Wise Parallel Decoding for Efficient Diffusion LM Inference
cs.LG 2026-04 unverdicted novelty 7.0

DepCap accelerates diffusion LM inference up to 5.63x by using last-block influence for adaptive block boundaries and conflict-free token selection for parallel decoding, with negligible quality loss.

VoidPadding: Let [VOID] Handle Padding in Masked Diffusion Language Models so that [EOS] Can Focus on Semantic Termination
cs.CL 2026-06 unverdicted novelty 6.0

VoidPadding decouples padding from termination in MDLMs via a new [VOID] token, delivering +17.84 average benchmark points and 55.7% fewer decoding steps on Dream-7B-Instruct.

SemBlock: Semantic Boundary Dynamic Blocks for Diffusion LLMs
cs.CL 2026-06 unverdicted novelty 6.0

SemBlock adds semantic-boundary prediction to enable dynamic block decoding in diffusion LLMs and reports gains over fixed-block and AdaBlock baselines on GSM8K, IFEval, MATH, and HumanEval.