Pith Number

pith:2JJKI5UQ

pith:2025:2JJKI5UQGMKXPVPXS5K573WHRC
not attested · not anchored · not stored · refs resolved

WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Guanting Dong, Hongjin Qian, Jiajie Jin, Ji-Rong Wen, Xiaoxi Li, Yongkang Wu, Yutao Zhu, Zhicheng Dou

WebThinker lets large reasoning models search the web and draft reports autonomously during reasoning.

arxiv:2504.21776 v2 · 2025-04-30 · cs.CL · cs.AI · cs.IR

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{2JJKI5UQGMKXPVPXS5K573WHRC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Extensive experiments on complex reasoning benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and scientific report generation tasks (Glaive) demonstrate that WebThinker significantly outperforms existing methods and strong proprietary systems.

C2 · weakest assumption

That the Deep Web Explorer module can reliably locate, navigate, and extract accurate information from arbitrary web pages without introducing navigation errors or factual hallucinations that propagate into the final report.

C3 · one-line summary

WebThinker equips large reasoning models with autonomous web exploration and interleaved reasoning-drafting via a Deep Web Explorer and RL-based DPO training, yielding gains on GPQA, GAIA, and report-generation benchmarks.

References

91 extracted · 91 resolved · 23 Pith anchors

[1] Self-rag: Learning to retrieve, generate, and critique through self-reflection 2024
[2] ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning 2025 · arXiv:2503.19470
[3] Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models 2025 · arXiv:2503.09567
[4] An empirical study on eliciting and improving r1-like reasoning models 2025
[5] Self-play with execution feedback: Improving instruction-following capabilities of large language models 2025

Cited by

29 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:46.873797Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

d252a47690331577d5f79755dfeec78889b87740c75130c472c25ff2b5a61c87

Aliases

arxiv: 2504.21776 · arxiv_version: 2504.21776v2 · doi: 10.48550/arxiv.2504.21776 · pith_short_12: 2JJKI5UQGMKX · pith_short_16: 2JJKI5UQGMKXPVPX · pith_short_8: 2JJKI5UQ
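The short aliases above are prefixes of the full Pith Number. The values on this page also suggest that the number itself is the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. The sketch below checks that relationship. It is an observation drawn from this record, not a documented specification, and the helper name `pith_number_from_hash` is hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "d252a47690331577d5f79755dfeec78889b87740c75130c472c25ff2b5a61c87"
PITH_NUMBER = "2JJKI5UQGMKXPVPXS5K573WHRC"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed derivation: Base32 (RFC 4648) of the raw digest bytes, truncated."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_number_from_hash(CANONICAL_HASH)
print(derived == PITH_NUMBER)

# The short aliases are plain prefixes of the same string.
for n in (8, 12, 16):
    print(f"pith_short_{n}: {derived[:n]}")
```

For this record the derived string matches the Pith Number, so the identifier can be checked against the canonical hash without a network call. Whether the same rule holds for other records is an assumption.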
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d252a47690331577d5f79755dfeec78889b87740c75130c472c25ff2b5a61c87
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "e99d68991c3e046a23e457994dc6f977a81a18aab14292ab79ad9980dc1b958f",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.IR"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2025-04-30T16:25:25Z",
    "title_canon_sha256": "d518db51547776410cc9728128f4fd47506bcf25e3bad8c45c165fe41daa4a18"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2504.21776",
    "kind": "arxiv",
    "version": 2
  }
}
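The verification one-liner can also be unpacked into a small offline script. The sketch below applies the same serialization settings (sorted keys, compact separators, `ensure_ascii=False`) to the record shown above. It assumes the JSON displayed on this page is byte-for-byte what the API returns as `canonical_record`, so the printed digest should be compared against the published canonical hash rather than taken as confirmed. The helper name `canonical_sha256` is hypothetical.

```python
import hashlib
import json

# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "e99d68991c3e046a23e457994dc6f977a81a18aab14292ab79ad9980dc1b958f",
        "cross_cats_sorted": ["cs.AI", "cs.IR"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2025-04-30T16:25:25Z",
        "title_canon_sha256": "d518db51547776410cc9728128f4fd47506bcf25e3bad8c45c165fe41daa4a18",
    },
    "schema_version": "1.0",
    "source": {"id": "2504.21776", "kind": "arxiv", "version": 2},
}


def canonical_sha256(obj) -> str:
    """Serialize with sorted keys and compact separators, then hash as UTF-8."""
    blob = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


digest = canonical_sha256(record)
print(digest)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields were written, which is what lets a mirror recompute it independently.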