Pith Number

pith:IIEAK72Y

pith:2025:IIEAK72YB67YYZTRLIIH2KDEWJ

not attested · not anchored · not stored · refs resolved

SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Daniel Fried, Gabriel Synnaeve, Jade Copet, Lingming Zhang, Olivier Duchenne, Quentin Carbonneaux, Rishabh Singh, Sida I. Wang, Yuxiang Wei

Reinforcement learning on open software evolution data enables LLMs to recover developer reasoning and solve 41.0% of the real GitHub issues in SWE-bench Verified.

arxiv:2502.18449 v2 · 2025-02-25 · cs.SE · cs.AI · cs.CL


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{IIEAK72YB67YYZTRLIIH2KDEWJ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

our resulting reasoning model, Llama3-SWE-RL-70B, achieves a 41.0% solve rate on SWE-bench Verified -- a human-verified collection of real-world GitHub issues. To our knowledge, this is the best performance reported for medium-sized (<100B) LLMs to date, even comparable to leading proprietary LLMs like GPT-4o.

C2 weakest assumption

The assumption that a lightweight rule-based similarity score between ground-truth and generated solutions serves as an effective reward for learning genuine reasoning processes rather than superficial pattern matching.
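To make the concern in C2 concrete, here is a minimal sketch of what a rule-based similarity reward could look like. The record only says "lightweight rule-based similarity score"; the use of Python's `difflib.SequenceMatcher`, the `-1.0` penalty for empty output, and the function name `similarity_reward` are assumptions made for illustration, not details taken from this page.

```python
import difflib
from typing import Optional


def similarity_reward(generated_patch: Optional[str], reference_patch: str) -> float:
    """Hypothetical reward: -1.0 for missing output, else a text-similarity ratio in [0, 1]."""
    # Assumption: an empty or missing generation is penalized with a fixed negative reward.
    if generated_patch is None or not generated_patch.strip():
        return -1.0
    # SequenceMatcher.ratio() returns 2*M/T, where M is the number of matched
    # characters and T is the combined length of both strings.
    return difflib.SequenceMatcher(None, generated_patch, reference_patch).ratio()


reference = "return a + b"
exact = similarity_reward("return a + b", reference)       # identical text
wrong_sign = similarity_reward("return a - b", reference)  # one character differs, behavior differs
missing = similarity_reward(None, reference)                # no usable output

print(exact, round(wrong_sign, 3), missing)
```

In this sketch the behaviorally wrong patch still scores above 0.9, because only one character differs. That is the gap C2 points at: a textual score can reward near-copies without checking whether the change is correct.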

C3 one line summary

SWE-RL applies RL to software evolution data to train LLMs, reaching 41% on SWE-bench Verified and generalizing to other reasoning tasks.

References

192 extracted · 192 resolved · 15 Pith anchors

[1] Claude 3.5 Sonnet model card addendum 2024

[2] Raising the bar on SWE-bench Verified with Claude 3.5 Sonnet 2024

[4] CodeT: Code generation with generated tests 2023

[5] Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harri Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Mich 2021

[6] Meta large language model compiler: Foundation models of compiler optimization 2024

Formal links

3 machine-checked theorem links

Cited by

31 papers in Pith

A Survey of Scaling in Large Language Model Reasoning

Spreadsheet-RL: Advancing Large Language Model Agents on Realistic Spreadsheet Tasks via Reinforcement Learning

Beyond Policy Optimization: A Data Curation Flywheel for Sparse-Reward Long-Horizon Planning

AstraFlow: Dataflow-Oriented Reinforcement Learning for Agentic LLMs

HydroAgent: Closing the Gap Between Frontier LLMs and Human Experts in Hydrologic Model Calibration via Simulator-Grounded RL

Receipt and verification

First computed	2026-05-17T23:38:52.780436Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

4208057f580fbf8c66715a107d2864b2548c8bef57c8e7cc713325e3ddea90f6

Aliases

arxiv: 2502.18449 · arxiv_version: 2502.18449v2 · doi: 10.48550/arxiv.2502.18449 · pith_short_12: IIEAK72YB67Y · pith_short_16: IIEAK72YB67YYZTR · pith_short_8: IIEAK72Y
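The Pith aliases above appear to be derived from the canonical hash: Base32-encoding the SHA-256 digest and keeping the first 26 characters reproduces the full identifier, and the `pith_short_*` aliases are shorter prefixes of it. This relationship is inferred from the values shown on this page, not from a published specification, and the helper name `pith_id_from_hash` is hypothetical. A minimal sketch:

```python
import base64

# Canonical hash as listed in this record.
CANONICAL_HASH = "4208057f580fbf8c66715a107d2864b2548c8bef57c8e7cc713325e3ddea90f6"


def pith_id_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes (RFC 4648 alphabet) and keep a prefix."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full_id = pith_id_from_hash(CANONICAL_HASH)
print(full_id)                                # IIEAK72YB67YYZTRLIIH2KDEWJ
print(pith_id_from_hash(CANONICAL_HASH, 8))   # IIEAK72Y
print(pith_id_from_hash(CANONICAL_HASH, 12))  # IIEAK72YB67Y
print(pith_id_from_hash(CANONICAL_HASH, 16))  # IIEAK72YB67YYZTR
```

If the inference holds for other records, a client could check an identifier against its canonical hash without any network call.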

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/IIEAK72YB67YYZTRLIIH2KDEWJ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 4208057f580fbf8c66715a107d2864b2548c8bef57c8e7cc713325e3ddea90f6
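The one-liner above hashes a canonical serialization of the record: keys sorted, no whitespace between tokens, non-ASCII characters left unescaped, then SHA-256 over the UTF-8 bytes. The same steps can be sketched as a reusable offline function. The names `canonical_bytes` and `canonical_hash` are hypothetical, and the sample dictionaries are illustrative rather than a real record.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and compact separators, mirroring the one-liner."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# Two dictionaries with the same content but different key order.
first = {"source": {"id": "example", "version": 2}, "schema_version": "1.0"}
second = {"schema_version": "1.0", "source": {"version": 2, "id": "example"}}

print(canonical_bytes(first).decode("utf-8"))
print(canonical_hash(first) == canonical_hash(second))  # True
```

Sorting keys and fixing the separators is what lets independent mirrors arrive at the same digest regardless of how their JSON parser ordered the fields.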

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "730dfc0fa178a079f7ed7e2f80a86ff8ac9f804908893c4a1853b3b8bb5d84b9",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.SE",
    "submitted_at": "2025-02-25T18:45:04Z",
    "title_canon_sha256": "a91330707567ffa0ad37268e5f68258ff99c1316fd5dd035252f3e5e1a1cc008"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2502.18449",
    "kind": "arxiv",
    "version": 2
  }
}