Pith Number

pith:ZUALWONY

pith:2026:ZUALWONYSBKNVUQLQCL72YDU2B
not attested · not anchored · not stored · refs pending

Toward Autonomous Long-Horizon Engineering for ML Research

Cheng Chen, Fanzhe Meng, Guoxin Chen, Jiale Zhao, Jie Chen, Ji-Rong Wen, Kai Jia, Lei Chen, Ruihua Song, Wayne Xin Zhao

AiScientist achieves higher performance on long-horizon ML research benchmarks by using hierarchical orchestration and a File-as-Bus workspace.

arxiv:2604.13018 v2 · 2026-04-14 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{ZUALWONYSBKNVUQLQCL72YDU2B}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
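The record does not describe the merge algorithm itself. As a minimal sketch of what "deterministic" means here, the Python below assumes a hypothetical event shape (`id`, `timestamp`, `field`, `value`) and a last-writer-wins fold over events sorted by `(timestamp, id)`. None of these names or rules come from Pith; they only illustrate why two mirrors holding the same events in a different order can arrive at the same state. Signature checking is omitted.

```python
def merge_events(*event_logs):
    """Recompute state from any number of event logs.

    Assumption: events are dicts with hypothetical keys
    'id', 'timestamp', 'field', and 'value'.
    """
    # Deduplicate by event id so the same event seen twice counts once.
    unique = {}
    for log in event_logs:
        for event in log:
            unique.setdefault(event["id"], event)

    # A total order that does not depend on arrival order.
    ordered = sorted(unique.values(), key=lambda e: (e["timestamp"], e["id"]))

    # Fold the events: a later event overwrites an earlier one per field.
    state = {}
    for event in ordered:
        state[event["field"]] = event["value"]
    return state


# Hypothetical events, for illustration only.
e1 = {"id": "a", "timestamp": 1, "field": "author_claim", "value": "open"}
e2 = {"id": "b", "timestamp": 2, "field": "author_claim", "value": "claimed"}
e3 = {"id": "c", "timestamp": 2, "field": "replications", "value": "open"}

mirror_one = merge_events([e1, e2], [e3])
mirror_two = merge_events([e3, e2], [e2, e1])
print(mirror_one == mirror_two)
```

The sort key is what makes the result independent of the order in which a mirror received the events.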

Claims

C1 strongest claim

AiScientist improves the PaperBench score by 10.54 points on average over the best matched baseline and achieves 81.82 Any Medal% on MLE-Bench Lite. Ablation studies further show that the File-as-Bus protocol is a key driver of performance: removing it lowers the PaperBench score by 6.41 points and the MLE-Bench Lite result by 31.82 points.

C2 weakest assumption

That the chosen benchmarks (PaperBench and MLE-Bench Lite) accurately capture real long-horizon ML research engineering capability and that the reported gains are attributable to the hierarchical orchestration plus File-as-Bus design rather than implementation details or baseline mismatches.

C3 one-line summary

AiScientist improves the PaperBench score by 10.54 points on average over the best matched baseline and reaches 81.82% Any Medal on MLE-Bench Lite, using hierarchical control plus durable file-based state instead of conversational handoffs.

Cited by

3 papers in Pith

Receipt and verification
First computed 2026-05-27T01:05:54.535357Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

cd00bb39b89054dad20b8097fd6074d0615f2fa78728c783273ad7ec23d8df99

Aliases

arxiv: 2604.13018 · arxiv_version: 2604.13018v2 · doi: 10.48550/arxiv.2604.13018 · pith_short_12: ZUALWONYSBKN · pith_short_16: ZUALWONYSBKNVUQL · pith_short_8: ZUALWONY
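
In this record, the Pith Number and its short aliases appear to be derived from the canonical hash: Base32-encoding the SHA-256 digest (standard RFC 4648 alphabet) and keeping the first 26 characters reproduces the full number, and the `pith_short_*` aliases are prefixes of it. The Python sketch below checks that relationship for this one record. It is an observation from the values shown here, not a documented derivation rule, and the function name `pith_number_from_hash` is hypothetical.

```python
import base64

# Canonical hash as listed in this record.
CANONICAL_HASH = "cd00bb39b89054dad20b8097fd6074d0615f2fa78728c783273ad7ec23d8df99"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode a hex digest and keep the first `length` characters.

    Assumption: inferred from this record only, not from a published spec.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


pith = pith_number_from_hash(CANONICAL_HASH)
aliases = {n: pith[:n] for n in (8, 12, 16)}

print(pith)     # ZUALWONYSBKNVUQLQCL72YDU2B
print(aliases)  # prefixes of length 8, 12, and 16
```

If this holds in general, a short alias can be checked against a full Pith Number with a simple prefix comparison.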
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/ZUALWONYSBKNVUQLQCL72YDU2B \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: cd00bb39b89054dad20b8097fd6074d0615f2fa78728c783273ad7ec23d8df99
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "99df1d90f1a022d32d8037f6224c8375d0ee7e19af4a719db1a76f71656c821f",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-04-14T17:55:16Z",
    "title_canon_sha256": "eda5ff3025ceba1aad9cca7f65e63e4adb969b5d4c5d1deff099ab1f62b8d9ae"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.13018",
    "kind": "arxiv",
    "version": 2
  }
}
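
The verification one-liner above can also be run offline against the canonical record shown here. The Python sketch below applies the same serialization as the inline command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8) and prints the SHA-256 digest. The helper name `canonical_sha256` is an assumption for illustration, and the sketch assumes the JSON displayed on this page is identical to what the API serves.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize a record deterministically and return its SHA-256 hex digest."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no insignificant whitespace
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "99df1d90f1a022d32d8037f6224c8375d0ee7e19af4a719db1a76f71656c821f",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-04-14T17:55:16Z",
        "title_canon_sha256": "eda5ff3025ceba1aad9cca7f65e63e4adb969b5d4c5d1deff099ab1f62b8d9ae",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.13018", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)
```

Compare the printed digest with the canonical hash listed in this record. Because keys are sorted before hashing, reordering the fields in the input does not change the result.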