Pith Number

pith:UPZLLDUD

pith:2026:UPZLLDUDH7UTJV2O7W7U74YNJH
not attested · not anchored · not stored · refs resolved

A Few GPUs, A Whole Lotta Scale: Faithful LLM Training Emulation with PrismLLM

Boyi Jia, Brian Sutioso, ChonLam Lao, Ennan Zhai, Erci Xu, Jiamin Cao, Jiaqi Gao, Jingren Zhou, Kui Ren, Minlan Yu, Shaoke Xi, Yong Li, Zhengping Qian, Zhipeng Zhang

PrismLLM emulates 8192-GPU LLM training using fewer than 1% of the GPUs with 0.58% average iteration time error.

arxiv:2605.15617 v1 · 2026-05-15 · cs.DC · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{UPZLLDUDH7UTJV2O7W7U74YNJH}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
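The record does not describe the merge algorithm itself. The following is a minimal sketch of what a deterministic merge could look like, not the actual Pith procedure. The event fields (`id`, `ts`, `field`, `value`), the last-writer-wins rule, and the function names are all hypothetical, and signature verification is omitted. It shows only the property the paragraph relies on: two mirrors that receive the same events in different orders arrive at the same state.

```python
import hashlib
import json
import random


def merge(events):
    """Fold events into a state in one canonical order (hypothetical rule)."""
    # Drop duplicate deliveries by event id, then sort by a total order
    # so the result does not depend on arrival order.
    unique = {e["id"]: e for e in events}
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    state = {}
    for e in ordered:
        state[e["field"]] = e["value"]  # assumed rule: the later event wins
    return state


def state_hash(state):
    """Hash a state using a key-sorted, whitespace-free JSON serialization."""
    data = json.dumps(state, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(data).hexdigest()


# Hypothetical events, invented purely for illustration.
events = [
    {"id": "e1", "ts": "2026-05-20T00:00:00Z", "field": "status", "value": "draft"},
    {"id": "e2", "ts": "2026-05-21T00:00:00Z", "field": "note", "value": "first"},
    {"id": "e3", "ts": "2026-05-22T00:00:00Z", "field": "status", "value": "final"},
]

# A second mirror sees the same events shuffled, with one delivered twice.
shuffled = events[:]
random.shuffle(shuffled)
mirror_a = merge(events)
mirror_b = merge(shuffled + [events[0]])

print(state_hash(mirror_a) == state_hash(mirror_b))
```

Sorting on a total order before folding is one common way to get this property; a real implementation would also need to verify each event's signature before accepting it.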

Claims

C1 · strongest claim

PrismLLM accurately reproduces performance and memory behavior, achieving only 0.58% average error in iteration time and less than 0.01% error in peak GPU memory usage. PrismLLM can emulate clusters of up to 8192 GPUs using fewer than 1% of the physical GPUs required by the original deployment.
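To put the headline numbers in concrete terms, the short sketch below works through the arithmetic. Only the 8192-GPU scale, the "fewer than 1%" bound, and the 0.58% figure come from the claim. The iteration times are hypothetical values chosen for illustration, and the relative-error formula is the conventional definition, assumed here rather than taken from the paper.

```python
TARGET_GPUS = 8192  # emulated cluster size, from the claim


def max_physical_gpus(target: int) -> int:
    """Largest integer strictly below 1% of `target` (that is, k with 100*k < target)."""
    return (target - 1) // 100


def relative_error_pct(emulated: float, measured: float) -> float:
    """Conventional relative error in percent (assumed definition)."""
    return abs(emulated - measured) / measured * 100.0


budget = max_physical_gpus(TARGET_GPUS)
print(budget)  # 81, since 1% of 8192 is 81.92

# Hypothetical iteration times in seconds, not measurements from the paper.
measured_s = 10.0
emulated_s = 10.058
print(f"{relative_error_pct(emulated_s, measured_s):.2f}%")  # 0.58%
```

Under these assumptions, a 0.58% average error corresponds to roughly 58 ms of deviation on a 10-second iteration.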

C2 · weakest assumption

The slicing-based construction of the high-fidelity execution graph fully captures computation, communication, and dependencies at the target scale, so that hybrid emulation of selected ranks reproduces large-scale behavior faithfully without missing scale-dependent effects.

C3 · one-line summary

PrismLLM constructs a sliced execution graph and uses hybrid emulation to faithfully reproduce performance and memory behavior of up to 8192-GPU LLM training runs on fewer than 1% of the original GPUs.

References

37 extracted · 37 resolved · 7 Pith anchors


Formal links

2 machine-checked theorem links

Receipt and verification
First computed: 2026-05-20T00:01:08.426613Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

a3f2b58e833fe934d74efdbf4ff30d49d263130d3cfba572056945e38e66e9f5

Aliases

arxiv: 2605.15617 · arxiv_version: 2605.15617v1 · doi: 10.48550/arxiv.2605.15617 · pith_short_12: UPZLLDUDH7UT · pith_short_16: UPZLLDUDH7UTJV2O · pith_short_8: UPZLLDUD
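The short aliases are prefixes of the full Pith Number. On this record, the full number also coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. That relationship is inferred from the values shown on this page, not from any documented rule, so treat the sketch below as an observation about one record.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "a3f2b58e833fe934d74efdbf4ff30d49d263130d3cfba572056945e38e66e9f5"
PITH_NUMBER = "UPZLLDUDH7UTJV2O7W7U74YNJH"

# Base32-encode the raw 32 hash bytes and keep as many characters
# as the Pith Number has (26).
encoded = base64.b32encode(bytes.fromhex(CANONICAL_HASH)).decode("ascii")
derived = encoded[: len(PITH_NUMBER)]
print(derived == PITH_NUMBER)

# The pith_short_* aliases are plain prefixes of the full number.
for n in (8, 12, 16):
    print(f"pith_short_{n}: {derived[:n]}")
```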
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/UPZLLDUDH7UTJV2O7W7U74YNJH \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a3f2b58e833fe934d74efdbf4ff30d49d263130d3cfba572056945e38e66e9f5
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "8e81e68e1ce152e7814a552d4c12e61b0f44f5dee4dfa4372449a2193be8c481",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.DC",
    "submitted_at": "2026-05-15T04:58:20Z",
    "title_canon_sha256": "2e2370e667ba6379fb2fd1587171bbeb379a059696531bd859c20721b31ce0f4"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15617",
    "kind": "arxiv",
    "version": 1
  }
}
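The same check can be sketched offline in Python against the record printed above, using the serialization from the one-liner (sorted keys, compact separators, UTF-8, `ensure_ascii=False`). The function name `canonical_sha256` is illustrative. The script assumes the JSON shown on this page is exactly what the API returns as `canonical_record`, so it reports whether the digest matches rather than taking a match for granted.

```python
import hashlib
import json

# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "8e81e68e1ce152e7814a552d4c12e61b0f44f5dee4dfa4372449a2193be8c481",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.DC",
        "submitted_at": "2026-05-15T04:58:20Z",
        "title_canon_sha256": "2e2370e667ba6379fb2fd1587171bbeb379a059696531bd859c20721b31ce0f4",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15617", "kind": "arxiv", "version": 1},
}

EXPECTED = "a3f2b58e833fe934d74efdbf4ff30d49d263130d3cfba572056945e38e66e9f5"


def canonical_sha256(obj) -> str:
    """SHA-256 over key-sorted, whitespace-free JSON encoded as UTF-8."""
    data = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


digest = canonical_sha256(record)
print(digest)
print("matches expected hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields were written, which is what lets any mirror recompute it.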