Pith Number

pith:DMR7KHEB

pith:2026:DMR7KHEBBJSJOQL6E7ISWMVEK4

not attested · not anchored · not stored · refs resolved

FrontierSmith: Synthesizing Open-Ended Coding Problems at Scale

Alex Dimakis, Alvin Cheung, Bo Peng, Hanchen Li, Huanzhi Mao, Jingbo Shang, Joseph E. Gonzalez, Kaiyuan Liu, Lufeng Cheng, Qiuyang Mang, Qizheng Zhang, Runyuan He, Shang Zhou, Tianfu Fu, Wenhao Chai, Yichuan Wang, Zerui Li

An automated system evolves closed-ended competitive programming tasks into open-ended coding problems and uses the resulting data to train stronger LLM coders.

arxiv:2605.14445 v1 · 2026-05-14 · cs.LG


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{DMR7KHEBBJSJOQL6E7ISWMVEK4}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open

✓ Portable graph bundle: live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

training on our synthesized data yields substantial gains over the base models: Qwen3.5-9B improves by +8.82 score on FrontierCS and +306.36 (Elo-rating-based performance) on ALE-bench; Qwen3.5-27B improves by +12.12 and +309.12, respectively.

C2 · weakest assumption

The quantitative idea divergence metric reliably selects problems that elicit genuinely diverse solution approaches from different solvers, and the automatically generated test cases and verifiers are sufficiently robust to support training.

C3 · one-line summary

FrontierSmith automates synthesis of open-ended coding problems from closed-ended seeds and shows measurable gains on two open-ended LLM coding benchmarks.

References

43 extracted · 43 resolved · 15 Pith anchors

[1] Bengt Aspvall, Michael F Plass, and Robert Endre Tarjan 2026

[2] SWE-rebench: An automated pipeline for task collection and decontaminated evaluation of software engineering agents

[3] Scaling Self-Play with Self-Guidance · arXiv:2604.20209

[4] K-Search: LLM kernel generation via co-evolving intrinsic world model

[5] AdaEvolve: Adaptive LLM driven zeroth-order optimization

Receipt and verification

First computed	2026-05-17T23:39:06.965948Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

1b23f51c810a6497417e27d12b32a45714c44fd7a3a0067bb6c494ee1ee46572

Aliases

arxiv: 2605.14445 · arxiv_version: 2605.14445v1 · doi: 10.48550/arxiv.2605.14445 · pith_short_12: DMR7KHEBBJSJ · pith_short_16: DMR7KHEBBJSJOQL6 · pith_short_8: DMR7KHEB
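The identifiers on this page appear to be related in a simple way. For this record, the 26-character Pith Number matches the unpadded base32 encoding of the first 16 bytes of the canonical hash, and each `pith_short_N` alias is the first N characters of that number. The sketch below checks this relationship using values copied from this page. It is an observation about this one record, not a documented derivation rule, and the function names `pith_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "1b23f51c810a6497417e27d12b32a45714c44fd7a3a0067bb6c494ee1ee46572"
PITH_NUMBER = "DMR7KHEBBJSJOQL6E7ISWMVEK4"


def pith_from_hash(hex_digest: str) -> str:
    """Unpadded base32 of the first 16 bytes of a SHA-256 hex digest.

    Assumption: this derivation is inferred from the values on this page,
    not taken from a published specification.
    """
    first_16 = bytes.fromhex(hex_digest)[:16]
    return base64.b32encode(first_16).decode("ascii").rstrip("=")


def short_aliases(pith_number: str) -> dict:
    """Prefix aliases of length 8, 12, and 16, as listed under Aliases."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


derived = pith_from_hash(CANONICAL_HASH)
print(derived == PITH_NUMBER)      # True for this record
print(short_aliases(PITH_NUMBER))
```

A prefix scheme like this keeps the short forms readable while the full number stays tied to the hash. A resolver would still need to handle two records sharing a short prefix.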

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/DMR7KHEBBJSJOQL6E7ISWMVEK4 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 1b23f51c810a6497417e27d12b32a45714c44fd7a3a0067bb6c494ee1ee46572

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "90cea9fd00b568405153a47deeda6843b2d03473e5db4c23554a02208472e9fb",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T06:39:42Z",
    "title_canon_sha256": "b21fd42d8f615d8f6a2477476abf9cf330659f87a65c72cf4be33b55f6263dfa"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14445",
    "kind": "arxiv",
    "version": 1
  }
}
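The same check can be run offline against the record printed above, without `curl` or `jq`. The sketch below applies the serialization used in the verification command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) to a copy of that JSON. The helper names `canonical_bytes` and `canonical_hash` are hypothetical. If the record shown here is complete and byte-faithful, the printed digest should equal the canonical hash listed on this page, but that match is not asserted here.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "90cea9fd00b568405153a47deeda6843b2d03473e5db4c23554a02208472e9fb",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T06:39:42Z",
    "title_canon_sha256": "b21fd42d8f615d8f6a2477476abf9cf330659f87a65c72cf4be33b55f6263dfa"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14445",
    "kind": "arxiv",
    "version": 1
  }
}
"""


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no insignificant whitespace."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """SHA-256 hex digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


record = json.loads(RECORD_JSON)
print(canonical_hash(record))  # compare by eye with the canonical hash above
```

Because keys are sorted before hashing, two mirrors that store the record with different key order or indentation still compute the same digest.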