A 130M-parameter continuous bitstream diffusion model with entropy-gated Langevin sampling achieves GenPPL 59.76 on LM1B and 27.06 on OWT, closing the gap to autoregressive models at matched entropy with 256 NFEs.
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
Categorical flow matching models scale to 1.7B parameters on 2.1T tokens, enabling 4-step text generation with competitive quality and benchmark performance.
citing papers explorer
- Towards Closing the Autoregressive Gap in Language Modeling via Entropy-Gated Continuous Bitstream Diffusion
A 130M-parameter continuous bitstream diffusion model with entropy-gated Langevin sampling achieves GenPPL 59.76 on LM1B and 27.06 on OWT, closing the gap to autoregressive models at matched entropy with 256 NFEs.
- Scaling Categorical Flow Maps
Categorical flow matching models scale to 1.7B parameters on 2.1T tokens, enabling 4-step text generation with competitive quality and benchmark performance.