Non-autoregressive diffusion language models have an inherent proximity bias in token unmasking that causes spatial error propagation, which a minimal planner and annealing strategy can mitigate for better reasoning performance.
The final score for the sampled embeddings is obtained as the average of these values.
Cited by 1 Pith paper (field: cs.CL; year: 2026; verdict: unverdicted). Polarity classification is still indexing.
Representative citing paper:
Early Decisions Matter: Proximity Bias and Initial Trajectory Shaping in Non-Autoregressive Diffusion Language Models