Causal autoregressive diffusion language model. arXiv preprint arXiv:2601.22031,
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.IR (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Bifocal Diffusion Language Models: Asymmetric Bidirectional Context for Parallel Generation
R2LM combines causal attention with a reverse Mamba SSM sidecar to supply right-side context in dLLMs, claiming 2.4x-12.9x throughput gains over bidirectional dLLMs and 1.9x-2.9x over AR baselines while matching or exceeding quality.