Pith Number

pith:AARY37WB

pith:2026:AARY37WBSXTPWMW5TVTXEYHQQX
not attested · not anchored · not stored · refs resolved

PerfCodeBench: Benchmarking LLMs for System-Level High-Performance Code Optimization

Hanyu Yang, Haochen Shi, Haoran Li, Huihao Jing, Shaojin Chen, Sirui Zhang, Wenbin Hu, Yangqiu Song

Current LLMs produce code that is functionally correct but far from expert-optimized on system-level performance tasks.

arxiv:2605.15222 v1 · 2026-05-13 · cs.SE · cs.CL · cs.PL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{AARY37WBSXTPWMW5TVTXEYHQQX}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Our evaluation on a broad set of state-of-the-art LLMs shows a clear gap between model-generated code and expert-optimized implementations. The gap is especially large on tasks involving parallelism and GPU operations.

C2 weakest assumption

The selected tasks accurately capture realistic system-level implementation choices, hardware-aware optimizations, and performance bottlenecks that matter in practice.

C3 one-line summary

PerfCodeBench reveals that state-of-the-art LLMs produce functionally correct but significantly slower code than expert-optimized versions on system-level tasks, especially those involving parallelism and GPUs.

References

52 extracted · 52 resolved · 4 Pith anchors

[1] Introducing Claude Opus 4.5 2025
[2] Claude Code Overview 2026
[3] Claude model overview 2026
[4] Understanding software engineering agents: A study of thought-action-result trajectories 2025
[5] ByteDance Seed. Seed2.0. https://seed.bytedance.com/en/seed2 2026
Receipt and verification
First computed 2026-05-20T00:00:47.041876Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

00238dfec195e6fb32dd9d677260f085f8cad29fd1b6df629cba53839b9cfbfb

Aliases

arxiv: 2605.15222 · arxiv_version: 2605.15222v1 · doi: 10.48550/arxiv.2605.15222 · pith_short_12: AARY37WBSXTP · pith_short_16: AARY37WBSXTPWMW5 · pith_short_8: AARY37WB
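The aliases above look mechanically derivable from the canonical hash and the arXiv identifier. The sketch below reflects an observation about this one record, not a documented Pith rule: the 26-character Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the SHA-256 digest, and the short forms are its 8-, 12-, and 16-character prefixes. The function name `derive_aliases` and the key `pith_full` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "00238dfec195e6fb32dd9d677260f085f8cad29fd1b6df629cba53839b9cfbfb"
ARXIV_ID = "2605.15222"
ARXIV_VERSION = 1


def derive_aliases(canonical_hash_hex: str, arxiv_id: str, version: int) -> dict:
    """Rebuild the alias table from the hash and arXiv id.

    Assumption: the Pith Number is the first 26 base32 characters of the
    digest. This is inferred from this record only.
    """
    digest = bytes.fromhex(canonical_hash_hex)
    b32 = base64.b32encode(digest).decode("ascii")  # uppercase A-Z, 2-7
    pith_full = b32[:26]
    return {
        "arxiv": arxiv_id,
        "arxiv_version": f"{arxiv_id}v{version}",
        # arXiv DOIs use the 10.48550 prefix; the record lowercases "arxiv".
        "doi": f"10.48550/arxiv.{arxiv_id}",
        "pith_full": pith_full,  # hypothetical key name
        "pith_short_8": pith_full[:8],
        "pith_short_12": pith_full[:12],
        "pith_short_16": pith_full[:16],
    }


aliases = derive_aliases(CANONICAL_HASH, ARXIV_ID, ARXIV_VERSION)
for name, value in aliases.items():
    print(f"{name}: {value}")
```

For this record the derived values agree with the alias line and with the identifier in `\pithnumber{...}`. Whether the same rule holds for other records is not confirmed here.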
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/AARY37WBSXTPWMW5TVTXEYHQQX \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 00238dfec195e6fb32dd9d677260f085f8cad29fd1b6df629cba53839b9cfbfb
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "3379dd3feb7dc3a4a4c597fdbe4e5579af83915fd86765d7a61389721213c942",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.PL"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.SE",
    "submitted_at": "2026-05-13T08:10:26Z",
    "title_canon_sha256": "73db000b4f0a474b1bde5d52111578c40706987ac468084b07681e8f9b247a65"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15222",
    "kind": "arxiv",
    "version": 1
  }
}
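The shell one-liner under "Verify this Pith Number yourself" can also be written as a small offline Python script. This is a minimal sketch that reuses the same serialization settings as that command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) and hashes the record shown above. The names `canonical_sha256` and `verify` are hypothetical, and the record is pasted in by hand rather than fetched from the API.

```python
import hashlib
import json

# Canonical record as displayed above (copied by hand, not fetched).
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "3379dd3feb7dc3a4a4c597fdbe4e5579af83915fd86765d7a61389721213c942",
        "cross_cats_sorted": ["cs.CL", "cs.PL"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.SE",
        "submitted_at": "2026-05-13T08:10:26Z",
        "title_canon_sha256": "73db000b4f0a474b1bde5d52111578c40706987ac468084b07681e8f9b247a65",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15222", "kind": "arxiv", "version": 1},
}

# Published canonical hash from this page.
EXPECTED = "00238dfec195e6fb32dd9d677260f085f8cad29fd1b6df629cba53839b9cfbfb"


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and no whitespace, then SHA-256 the bytes."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify(record: dict, expected_hex: str) -> bool:
    """Return True when the recomputed hash equals the published one."""
    return canonical_sha256(record) == expected_hex


if __name__ == "__main__":
    print("computed:", canonical_sha256(RECORD))
    print("expected:", EXPECTED)
    print("match:", verify(RECORD, EXPECTED))
```

Because keys are sorted before hashing, the result does not depend on the key order of the input dictionary. A mismatch would suggest that the pasted record differs from what the API returns, for example in a field value or in string escaping.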