Mercury Coder diffusion LLMs achieve throughputs of 1109 and 737 tokens per second on H100 GPUs, up to 10x faster than frontier models with comparable quality.
Likelihood-based diffusion language models
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2025 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Mercury: Ultra-Fast Language Models Based on Diffusion
Mercury Coder diffusion LLMs achieve throughputs of 1109 and 737 tokens per second on H100 GPUs, up to 10x faster than frontier models with comparable quality.