Any-order flexible length masked diffusion. arXiv preprint arXiv:2509.01025
4 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
- Discrete Tilt Matching
  Discrete Tilt Matching recasts dLLM fine-tuning as state-level matching of tilted local unmasking posteriors, producing a stable weighted cross-entropy loss that improves Sudoku and Countdown performance when applied to LLaDA-8B-Instruct.
- Generative Modeling from Black-box Corruptions via Self-Consistent Stochastic Interpolants
  SCSI iteratively refines a self-consistent transport map to invert black-box corruptions and enable generative modeling of clean data.
- Edit-Based Refinement for Parallel Masked Diffusion Language Models
  ME-DLM augments parallel masked diffusion models with edit-distance-supervised refinements to raise quality on coding and math benchmarks while using far fewer diffusion steps.
- CreditDecoding: Accelerating Parallel Decoding in Diffusion Large Language Models with Trace Credit