{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2025:2JJKI5UQGMKXPVPXS5K573WHRC","short_pith_number":"pith:2JJKI5UQ","schema_version":"1.0","canonical_sha256":"d252a47690331577d5f79755dfeec78889b87740c75130c472c25ff2b5a61c87","source":{"kind":"arxiv","id":"2504.21776","version":2},"attestation_state":"computed","paper":{"title":"WebThinker: Empowering Large Reasoning Models with Deep Research Capability","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"WebThinker lets large reasoning models search the web and draft reports autonomously during reasoning.","cross_cats":["cs.AI","cs.IR"],"primary_cat":"cs.CL","authors_text":"Guanting Dong, Hongjin Qian, Jiajie Jin, Ji-Rong Wen, Xiaoxi Li, Yongkang Wu, Yutao Zhu, Zhicheng Dou","submitted_at":"2025-04-30T16:25:25Z","abstract_excerpt":"Large reasoning models (LRMs), such as OpenAI-o1 and DeepSeek-R1, demonstrate impressive long-horizon reasoning capabilities. However, their reliance on static internal knowledge limits their performance on complex, knowledge-intensive tasks and hinders their ability to produce comprehensive research reports requiring synthesis of diverse web information. To address this, we propose WebThinker, a deep research agent that empowers LRMs to autonomously search the web, navigate among web pages, and draft reports during the reasoning process. WebThinker integrates a Deep Web Explorer module, enabl"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":false},"canonical_record":{"source":{"id":"2504.21776","kind":"arxiv","version":2},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.CL","submitted_at":"2025-04-30T16:25:25Z","cross_cats_sorted":["cs.AI","cs.IR"],"title_canon_sha256":"d518db51547776410cc9728128f4fd47506bcf25e3bad8c45c165fe41daa4a18","abstract_canon_sha256":"e99d68991c3e046a23e457994dc6f977a81a18aab14292ab79ad9980dc1b958f"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:46.874411Z","signature_b64":"8zsvoBnKdS7mJ3FJmaxVBc5X2nT1yoorLy5NNWuTaeNIPglWkZQbfB0sQ3mA24sSU+kNJRaqT0CD3CTqpcFuAg==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"d252a47690331577d5f79755dfeec78889b87740c75130c472c25ff2b5a61c87","last_reissued_at":"2026-05-17T23:38:46.873797Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:46.873797Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"WebThinker: Empowering Large Reasoning Models with Deep Research Capability","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"WebThinker lets large reasoning models search the web and draft reports autonomously during reasoning.","cross_cats":["cs.AI","cs.IR"],"primary_cat":"cs.CL","authors_text":"Guanting Dong, Hongjin Qian, Jiajie Jin, Ji-Rong Wen, Xiaoxi Li, Yongkang Wu, Yutao Zhu, Zhicheng Dou","submitted_at":"2025-04-30T16:25:25Z","abstract_excerpt":"Large reasoning models (LRMs), such as OpenAI-o1 and DeepSeek-R1, demonstrate impressive long-horizon reasoning capabilities. However, their reliance on static internal knowledge limits their performance on complex, knowledge-intensive tasks and hinders their ability to produce comprehensive research reports requiring synthesis of diverse web information. To address this, we propose WebThinker, a deep research agent that empowers LRMs to autonomously search the web, navigate among web pages, and draft reports during the reasoning process. WebThinker integrates a Deep Web Explorer module, enabl"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Extensive experiments on complex reasoning benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and scientific report generation tasks (Glaive) demonstrate that WebThinker significantly outperforms existing methods and strong proprietary systems.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the Deep Web Explorer module can reliably locate, navigate, and extract accurate information from arbitrary web pages without introducing navigation errors or factual hallucinations that propagate into the final report.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"WebThinker equips large reasoning models with autonomous web exploration and interleaved reasoning-drafting via a Deep Web Explorer and RL-based DPO training, yielding gains on GPQA, GAIA, and report-generation benchmarks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"WebThinker lets large reasoning models search the web and draft reports autonomously during reasoning.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"3db15b16a255aa86d7c1b8f203a47221f75bfb09cac96dc1478706c5bea109bd"},"source":{"id":"2504.21776","kind":"arxiv","version":2},"verdict":{"id":"bbacda62-c9bb-4c50-932c-6f78389003ec","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T19:09:52.921915Z","strongest_claim":"Extensive experiments on complex reasoning benchmarks (GPQA, GAIA, WebWalkerQA, HLE) and scientific report generation tasks (Glaive) demonstrate that WebThinker significantly outperforms existing methods and strong proprietary systems.","one_line_summary":"WebThinker equips large reasoning models with autonomous web exploration and interleaved reasoning-drafting via a Deep Web Explorer and RL-based DPO training, yielding gains on GPQA, GAIA, and report-generation benchmarks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the Deep Web Explorer module can reliably locate, navigate, and extract accurate information from arbitrary web pages without introducing navigation errors or factual hallucinations that propagate into the final report.","pith_extraction_headline":"WebThinker lets large reasoning models search the web and draft reports autonomously during reasoning."},"references":{"count":91,"sample":[{"doi":"","year":2024,"title":"Self-rag: Learn- ing to retrieve, generate, and critique through self-reflection","work_id":"195a1a15-2803-42f1-bf5b-3ad204d104dc","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning","work_id":"cc9775d9-2fbd-4690-a641-2b50ae4a59dc","ref_index":2,"cited_arxiv_id":"2503.19470","is_internal_anchor":true},{"doi":"","year":2025,"title":"Towards Reasoning Era: A Survey of Long Chain-of-Thought for Reasoning Large Language Models","work_id":"0b361fed-cf2a-4b90-b61a-de88de4b8840","ref_index":3,"cited_arxiv_id":"2503.09567","is_internal_anchor":true},{"doi":"","year":2025,"title":"An empirical study on eliciting and improving r1-like reasoning models","work_id":"5d962724-fdfd-471c-b457-0c9fd4d5aa44","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Self-play with execution feedback: Improving instruction-following capabilities of large language models","work_id":"c7b45c2c-cd3d-468a-9501-0e7877cd44f8","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":91,"snapshot_sha256":"a461ebf681d1385ac3cba82c9c48b3191ad9face0c8748fc0eba458329c0bc6c","internal_anchors":23},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2504.21776","created_at":"2026-05-17T23:38:46.873899+00:00"},{"alias_kind":"arxiv_version","alias_value":"2504.21776v2","created_at":"2026-05-17T23:38:46.873899+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2504.21776","created_at":"2026-05-17T23:38:46.873899+00:00"},{"alias_kind":"pith_short_12","alias_value":"2JJKI5UQGMKX","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_16","alias_value":"2JJKI5UQGMKXPVPX","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_8","alias_value":"2JJKI5UQ","created_at":"2026-05-18T12:33:37.589309+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":29,"internal_anchor_count":29,"sample":[{"citing_arxiv_id":"2505.04588","citing_title":"ZeroSearch: Incentivize the Search Capability of LLMs without Searching","ref_index":23,"is_internal_anchor":true},{"citing_arxiv_id":"2506.11060","citing_title":"Code Researcher: Deep Research Agent for Large Systems Code and Commit History","ref_index":20,"is_internal_anchor":true},{"citing_arxiv_id":"2605.21463","citing_title":"Mem-$\\pi$: Adaptive Memory through Learning When and What to Generate","ref_index":21,"is_internal_anchor":true},{"citing_arxiv_id":"2605.17561","citing_title":"Automated Root-Cause Subclassification and No-Code Fix Generation for Invalid Bug Reports","ref_index":31,"is_internal_anchor":true},{"citing_arxiv_id":"2604.27859","citing_title":"Rethinking Agentic Reinforcement Learning In Large Language Models","ref_index":43,"is_internal_anchor":true},{"citing_arxiv_id":"2509.02547","citing_title":"The Landscape of Agentic Reinforcement Learning for LLMs: A Survey","ref_index":282,"is_internal_anchor":true},{"citing_arxiv_id":"2509.23330","citing_title":"Structured In-context Environment Scaling for Large Language Model Reasoning","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2509.08827","citing_title":"A Survey of Reinforcement Learning for Large Reasoning Models","ref_index":286,"is_internal_anchor":true},{"citing_arxiv_id":"2511.11793","citing_title":"MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling","ref_index":12,"is_internal_anchor":true},{"citing_arxiv_id":"2505.04588","citing_title":"ZeroSearch: Incentivize the Search Capability of LLMs without Searching","ref_index":23,"is_internal_anchor":true},{"citing_arxiv_id":"2507.02592","citing_title":"WebSailor: Navigating Super-human Reasoning for Web Agent","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2506.11763","citing_title":"DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2508.07407","citing_title":"A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems","ref_index":50,"is_internal_anchor":true},{"citing_arxiv_id":"2603.21440","citing_title":"KG-Hopper: Empowering Compact Open LLMs with Knowledge Graph Reasoning via Reinforcement Learning","ref_index":30,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12518","citing_title":"TimelineReasoner: Advancing Timeline Summarization with Large Reasoning Models","ref_index":23,"is_internal_anchor":true},{"citing_arxiv_id":"2605.13034","citing_title":"ViDR: Grounding Multimodal Deep Research Reports in Source Visual Evidence","ref_index":14,"is_internal_anchor":true},{"citing_arxiv_id":"2604.01348","citing_title":"Procedural Knowledge at Scale Improves Reasoning","ref_index":17,"is_internal_anchor":true},{"citing_arxiv_id":"2604.04017","citing_title":"GeoBrowse: A Geolocation Benchmark for Agentic Tool Use with Expert-Annotated Reasoning Traces","ref_index":26,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12004","citing_title":"Learning Agentic Policy from Action Guidance","ref_index":31,"is_internal_anchor":true},{"citing_arxiv_id":"2604.27859","citing_title":"Rethinking Agentic Reinforcement Learning In Large Language Models","ref_index":43,"is_internal_anchor":true},{"citing_arxiv_id":"2604.27221","citing_title":"Web2BigTable: A Bi-Level Multi-Agent LLM System for Internet-Scale Information Search and Extraction","ref_index":12,"is_internal_anchor":true},{"citing_arxiv_id":"2605.10530","citing_title":"Personalized Deep Research: A User-Centric Framework, Dataset, and Hybrid Evaluation for Knowledge Discovery","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2604.25256","citing_title":"AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery","ref_index":43,"is_internal_anchor":true},{"citing_arxiv_id":"2604.27859","citing_title":"Rethinking Agentic Reinforcement Learning In Large Language Models","ref_index":43,"is_internal_anchor":true},{"citing_arxiv_id":"2604.20486","citing_title":"ProMMSearchAgent: A Generalizable Multimodal Search Agent Trained with Process-Oriented Rewards","ref_index":19,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC","json":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC.json","graph_json":"https://pith.science/api/pith-number/2JJKI5UQGMKXPVPXS5K573WHRC/graph.json","events_json":"https://pith.science/api/pith-number/2JJKI5UQGMKXPVPXS5K573WHRC/events.json","paper":"https://pith.science/paper/2JJKI5UQ"},"agent_actions":{"view_html":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC","download_json":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC.json","view_paper":"https://pith.science/paper/2JJKI5UQ","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2504.21776&json=true","fetch_graph":"https://pith.science/api/pith-number/2JJKI5UQGMKXPVPXS5K573WHRC/graph.json","fetch_events":"https://pith.science/api/pith-number/2JJKI5UQGMKXPVPXS5K573WHRC/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC/action/timestamp_anchor","attest_storage":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC/action/storage_attestation","attest_author":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC/action/author_attestation","sign_citation":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC/action/citation_signature","submit_replication":"https://pith.science/pith/2JJKI5UQGMKXPVPXS5K573WHRC/action/replication_record"}},"created_at":"2026-05-17T23:38:46.873899+00:00","updated_at":"2026-05-17T23:38:46.873899+00:00"}