Pith Number

pith:36DOMX76

pith:2025:36DOMX76H2RUANZKM4FBMUQPNQ

not attested · not anchored · not stored · refs resolved

LLaDA2.0: Scaling Up Diffusion Language Models to 100B

Chengxi Li, Chongxuan Li, Da Zheng, Guoshan Lu, Huabin Liu, Jianfeng Tan, Jianguo Li, Jiaqi Hu, Ji-Rong Wen, Junbo Zhao, Junlin Zhou, Jun Zhou, Kun Chen, Lanning Wei, Lin Liu, Liwang Zhu, Lun Du, Maosong Cao, Mingliang Gong, Tiwei Bie, Xiaocheng Lu, Xiaolu Zhang, Yanmei Gu, Yihong Zhuang, Yipeng Xing, Yuxin Ma, Zehuan Li, Zenan Huang, Zhanchao Zhou, Zhenzhong Lan, Zhuochen Gong

LLaDA2.0 converts pre-trained auto-regressive LLMs into discrete diffusion models at 100B scale using a three-phase block-level training scheme.

arxiv:2512.15745 v2 · 2025-12-10 · cs.LG · cs.AI · cs.CL


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{36DOMX76H2RUANZKM4FBMUQPNQ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

LLaDA2.0 establishes a new paradigm for frontier-scale deployment of discrete diffusion LLMs by systematic conversion from AR models through a novel 3-phase block-level WSD training scheme, delivering superior performance and efficiency at 100B scale.

C2 · weakest assumption

That the 3-phase progressive block-size WSD training scheme successfully transfers knowledge from the original AR model while preserving parallel decoding advantages without introducing performance degradation at 100B scale.

C3 · one-line summary

LLaDA2.0 scales discrete diffusion language models to 100B parameters via systematic conversion from autoregressive models using a 3-phase WSD training scheme and releases open-source 16B and 100B MoE variants.

References

43 extracted · 43 resolved · 25 Pith anchors

[1] Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models · arXiv:2503.09573

[2] Program Synthesis with Large Language Models · arXiv:2108.07732

[3] Evaluating Large Language Models Trained on Code · arXiv:2107.03374

[4] DPad: Efficient Diffusion Language Models with Suffix Dropout

[5] Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge · arXiv:1803.05457

Formal links

2 machine-checked theorem links

Cited by

31 papers in Pith

Elastic-dLLM: Position Preserving Context Compression and Augmentation of Diffusion LLMs

TIDE: Efficient and Lossless MoE Diffusion LLM Inference with I/O-aware Expert Offload

DMax: Aggressive Parallel Decoding for dLLMs

A Comparative Analysis of Layer-wise Representational Capacity in AR and Diffusion LLMs

Factorization-Error-Free Discrete Diffusion Language Model via Speculative Decoding

Receipt and verification

First computed	2026-05-17T23:39:22.139494Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

df86e65ffe3ea340372a670a16520f6c20e6da4f5d44d77bc018f94f16709442

Aliases

arxiv: 2512.15745 · arxiv_version: 2512.15745v2 · doi: 10.48550/arxiv.2512.15745 · pith_short_12: 36DOMX76H2RU · pith_short_16: 36DOMX76H2RUANZK · pith_short_8: 36DOMX76
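
The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the unpadded Base32 encoding of the first 16 bytes of the canonical hash, and each `pith_short_N` alias is its first N characters. This is inferred from the values shown in this record, not from a published specification, so the sketch below is only a consistency check. The helper name `pith_from_hash` is hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "df86e65ffe3ea340372a670a16520f6c20e6da4f5d44d77bc018f94f16709442"
PITH_NUMBER = "36DOMX76H2RUANZKM4FBMUQPNQ"


def pith_from_hash(hex_digest: str, n_bytes: int = 16) -> str:
    """Base32-encode the first n_bytes of a hex digest, dropping '=' padding.

    Assumption: this rule is inferred from this record's values,
    not taken from a documented Pith specification.
    """
    prefix = bytes.fromhex(hex_digest)[:n_bytes]
    return base64.b32encode(prefix).decode("ascii").rstrip("=")


derived = pith_from_hash(CANONICAL_HASH)
print(derived == PITH_NUMBER)  # True for the values on this page

# The short aliases are plain prefixes of the full identifier.
aliases = {n: derived[:n] for n in (8, 12, 16)}
print(aliases)
```

If the relationship holds in general, a short alias can be checked against a full Pith Number with a simple prefix comparison, without any network access.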


Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/36DOMX76H2RUANZKM4FBMUQPNQ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: df86e65ffe3ea340372a670a16520f6c20e6da4f5d44d77bc018f94f16709442

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "b5599606fe0297b25d755d01b81235c4ce8d24568647ceeb70fe01372922fac2",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-12-10T09:26:18Z",
    "title_canon_sha256": "7111e055cdf6f5466d217444139bf7dd16339070ce0b798b52a7d85a3ac7c50b"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2512.15745",
    "kind": "arxiv",
    "version": 2
  }
}
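
The verification command above re-serializes the canonical record with sorted keys, compact separators, and non-ASCII characters left unescaped, then hashes the UTF-8 bytes with SHA-256. A minimal offline sketch of that step follows. It assumes the record printed here is exactly what the resolver returns under `canonical_record`; the function name `canonical_sha256` is hypothetical. Whether the digest matches the published hash is left for the reader to confirm by running it.

```python
import hashlib
import json

# Published canonical hash, copied from this record.
EXPECTED = "df86e65ffe3ea340372a670a16520f6c20e6da4f5d44d77bc018f94f16709442"

# Canonical record JSON, copied from this record.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "b5599606fe0297b25d755d01b81235c4ce8d24568647ceeb70fe01372922fac2",
    "cross_cats_sorted": ["cs.AI", "cs.CL"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-12-10T09:26:18Z",
    "title_canon_sha256": "7111e055cdf6f5466d217444139bf7dd16339070ce0b798b52a7d85a3ac7c50b"
  },
  "schema_version": "1.0",
  "source": {"id": "2512.15745", "kind": "arxiv", "version": 2}
}
"""


def canonical_sha256(record: dict) -> str:
    """Serialize as in the shell pipeline (sorted keys, compact, UTF-8), then hash."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the key order or whitespace of the JSON as displayed, only on its content.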