pith:KQ3KH2EU
DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments
End-to-end RL training on the open web lets LLM agents outperform prompt and RAG baselines by up to 28.9 points while developing planning and self-reflection.
arxiv:2504.03160 v4 · 2025-04-04 · cs.AI · cs.CL · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{KQ3KH2EUT2DCCAHZ6FTOZNVZI2}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
DeepResearcher achieves substantial improvements of up to 28.9 points over prompt engineering-based baselines and up to 7.2 points over RAG-based RL agents, with emergent cognitive behaviors including planning, cross-validation, self-reflection, and honesty.
The reported gains assume that the multi-agent browsing architecture can reliably extract information from arbitrary real-world webpage structures at scale, without introducing systematic biases or instability that would undermine them.
End-to-end RL in authentic web environments produces LLM research agents that outperform prompt-engineering and RAG-based baselines by up to 28.9 and 7.2 points respectively while exhibiting emergent planning, cross-validation, and self-reflection.
Receipt and verification
| First computed | 2026-05-17T23:38:46.762485Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
5436a3e8949e862100f9f166ecb6b94699bef612eebd8a025c2452e9a6a41bd3
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/KQ3KH2EUT2DCCAHZ6FTOZNVZI2 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 5436a3e8949e862100f9f166ecb6b94699bef612eebd8a025c2452e9a6a41bd3
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "5a6ac865085b2664dc78e29bc353e509efef8d2704b7e734cf0b98455ba9cff6",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2025-04-04T04:41:28Z",
    "title_canon_sha256": "7996dd2f6a35c010b6913abfcca9e480138c0124154c22fbbe4a9e4b99e57ace"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2504.03160",
    "kind": "arxiv",
    "version": 4
  }
}
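The verification one-liner above can also be run offline against the record as printed here. The following is a minimal sketch, assuming the canonical form is exactly what that one-liner implies: keys sorted, compact separators, non-ASCII characters left unescaped, UTF-8 bytes hashed with SHA-256. The function name `canonical_sha256` and the `EXPECTED` constant are illustrative names, not part of any Pith API.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the way the shell one-liner does (assumed scheme):
    sorted keys, compact separators, no ASCII escaping, UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# The canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "5a6ac865085b2664dc78e29bc353e509efef8d2704b7e734cf0b98455ba9cff6",
        "cross_cats_sorted": ["cs.CL", "cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2025-04-04T04:41:28Z",
        "title_canon_sha256": "7996dd2f6a35c010b6913abfcca9e480138c0124154c22fbbe4a9e4b99e57ace",
    },
    "schema_version": "1.0",
    "source": {"id": "2504.03160", "kind": "arxiv", "version": 4},
}

# Canonical hash as listed on this page.
EXPECTED = "5436a3e8949e862100f9f166ecb6b94699bef612eebd8a025c2452e9a6a41bd3"

digest = canonical_sha256(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the order in which the record is written down does not affect the digest. A mismatch would suggest either a transcription difference or that the canonicalization assumed here differs from the one the builder uses.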