pith:I5GZ6MFW
SWE-Chain: Benchmarking Coding Agents on Chained Release-Level Package Upgrades
Coding agents resolve an average of 44.8 percent of chained release-level package upgrades while preserving prior functionality.
arxiv:2605.14415 v1 · 2026-05-14 · cs.SE · cs.AI · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{I5GZ6MFWK7FFEIOWPADQLIOKE2}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Across nine frontier agent-model configurations, agents achieve an average of 44.8% resolving, 65.4% precision, and 50.2% F1 under the Build+Fix regime, with Claude-Opus-4.7 leading at 60.8% resolving. Current agents still struggle to make correct upgrades across chained package releases without breaking existing functionality.
The divide-and-conquer synthesis pipeline produces upgrade specifications that are both grounded in actual code changes and feasible for agents to implement without introducing artificial simplifications that do not occur in real maintenance.
SWE-Chain provides 155 chained version transitions and 1,660 requirements across 9 Python packages, where frontier agents resolve 44.8% of tasks on average and struggle to preserve functionality across releases.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:39:07.314178Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
474d9f30b657ca5221d6780705a1ca269b60eecb7ae13c4f9d51845fcff4cd23
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/I5GZ6MFWK7FFEIOWPADQLIOKE2 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 474d9f30b657ca5221d6780705a1ca269b60eecb7ae13c4f9d51845fcff4cd23
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "d07b6fa4a78b1bd684a2c3c1593038313b95af421cf353c5eac080d737a608cf",
"cross_cats_sorted": [
"cs.AI",
"cs.CL"
],
"license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
"primary_cat": "cs.SE",
"submitted_at": "2026-05-14T06:04:40Z",
"title_canon_sha256": "d29a8e6f072e3bfa4839c7409bdd0d6d6e603043268be74ba3cb087ad580d879"
},
"schema_version": "1.0",
"source": {
"id": "2605.14415",
"kind": "arxiv",
"version": 1
}
}
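
The verification one-liner above can be unpacked into a short Python sketch that works offline on the canonical record shown here. It applies the same serialization as the command (sorted keys, compact separators, non-ASCII left unescaped, UTF-8 encoding) and hashes the result with SHA-256. The function name `canonical_hash` and the `EXPECTED` constant are illustrative names, not part of any Pith API, and the sketch assumes the record printed above is byte-for-byte the one the service returns.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of a record's canonical JSON form.

    Hypothetical helper: mirrors the serialization used in the curl/jq/python
    one-liner (sort_keys, compact separators, ensure_ascii=False).
    """
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from the JSON shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "d07b6fa4a78b1bd684a2c3c1593038313b95af421cf353c5eac080d737a608cf",
        "cross_cats_sorted": ["cs.AI", "cs.CL"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.SE",
        "submitted_at": "2026-05-14T06:04:40Z",
        "title_canon_sha256": "d29a8e6f072e3bfa4839c7409bdd0d6d6e603043268be74ba3cb087ad580d879",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14415", "kind": "arxiv", "version": 1},
}

# Canonical hash as published on this page.
EXPECTED = "474d9f30b657ca5221d6780705a1ca269b60eecb7ae13c4f9d51845fcff4cd23"

if __name__ == "__main__":
    digest = canonical_hash(record)
    print(digest)
    print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the order in which fields appear in the source JSON does not affect the digest; only the values and the exact serialization settings do.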