Layer-wise representation alignment lets diffusion language models reuse semantic structures from frozen autoregressive models, accelerating training by up to 4x without architectural changes beyond the attention mask.
International Conference on Machine Learning
3 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Roles: background (1)
Polarities: background (1)
Citing papers
- Don't Retrain, Align: Adapting Autoregressive LMs to Diffusion LMs via Representation Alignment
Layer-wise representation alignment lets diffusion language models reuse semantic structures from frozen autoregressive models, accelerating training by up to 4x without architectural changes beyond the attention mask.
- Constant-Target Energy Matching: A Unified Framework for Continuous and Discrete Density Estimation
CTEM unifies density estimation via a bounded energy-difference transform that yields a sample-only objective with constant target 1, recovering log p without partition functions or unbounded ratio regression.
- ReflectDrive-2: Reinforcement-Learning-Aligned Self-Editing for Discrete Diffusion Driving
ReflectDrive-2 combines masked discrete diffusion with RL-aligned self-editing to generate and refine driving trajectories, reaching 91.0 PDMS on NAVSIM camera-only and 94.8 in best-of-6.