{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2025:LZ6OY7ZG7SCVNYQAJRQHISO6OO","short_pith_number":"pith:LZ6OY7ZG","schema_version":"1.0","canonical_sha256":"5e7cec7f26fc8556e2004c607449de739f0c8b28b1a00a7814fc440c8c89c788","source":{"kind":"arxiv","id":"2507.02592","version":1},"attestation_state":"computed","paper":{"title":"WebSailor: Navigating Super-human Reasoning for Web Agent","license":"http://creativecommons.org/publicdomain/zero/1.0/","headline":"WebSailor equips open-source models with the ability to reduce extreme uncertainty in web navigation, allowing them to match proprietary agents on complex information-seeking tasks.","cross_cats":["cs.AI"],"primary_cat":"cs.CL","authors_text":"Baixuan Li, Dingchu Zhang, Fei Huang, Huifeng Yin, Jialong Wu, Jingren Zhou, Junkai Zhang, Kuan Li, Litu Ou, Liwen Zhang, Ming Yan, Pengjun Xie, Weizhou Shen, Wenbiao Yin, Xinyu Wang, Xixi Wu, Yong Jiang, Zhengwei Tao, Zhongwang Zhang","submitted_at":"2025-07-03T12:59:07Z","abstract_excerpt":"Transcending human cognitive limitations represents a critical frontier in LLM training. Proprietary agentic systems like DeepResearch have demonstrated superhuman capabilities on extremely complex information-seeking benchmarks such as BrowseComp, a feat previously unattainable. We posit that their success hinges on a sophisticated reasoning pattern absent in open-source models: the ability to systematically reduce extreme uncertainty when navigating vast information landscapes. 
Based on this insight, we introduce WebSailor, a complete post-training methodology designed to instill this crucia"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":true,"formal_links_present":true},"canonical_record":{"source":{"id":"2507.02592","kind":"arxiv","version":1},"metadata":{"license":"http://creativecommons.org/publicdomain/zero/1.0/","primary_cat":"cs.CL","submitted_at":"2025-07-03T12:59:07Z","cross_cats_sorted":["cs.AI"],"title_canon_sha256":"79e675065d2edfb74313555e238a8c2d9770828a0d4661060612556a3673e1bf","abstract_canon_sha256":"c08830c2337f29570f837f0009ab613f6836d0547c3ee775478b5746678a4a52"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-17T23:38:13.685627Z","signature_b64":"TS86XUONVsIpAVCMr3ZZtU9NAEJ/dUm/LIlD+kXfz7PZH4grVJ2z9c6EUO6W1vk+hdR8/kqm3skyVxz+6ddUAA==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"5e7cec7f26fc8556e2004c607449de739f0c8b28b1a00a7814fc440c8c89c788","last_reissued_at":"2026-05-17T23:38:13.684893Z","signature_status":"signed_v1","first_computed_at":"2026-05-17T23:38:13.684893Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"WebSailor: Navigating Super-human Reasoning for Web Agent","license":"http://creativecommons.org/publicdomain/zero/1.0/","headline":"WebSailor equips open-source models with the ability to reduce extreme uncertainty in web navigation, allowing them to match proprietary agents on complex information-seeking tasks.","cross_cats":["cs.AI"],"primary_cat":"cs.CL","authors_text":"Baixuan Li, Dingchu Zhang, Fei Huang, Huifeng Yin, Jialong Wu, Jingren Zhou, 
Junkai Zhang, Kuan Li, Litu Ou, Liwen Zhang, Ming Yan, Pengjun Xie, Weizhou Shen, Wenbiao Yin, Xinyu Wang, Xixi Wu, Yong Jiang, Zhengwei Tao, Zhongwang Zhang","submitted_at":"2025-07-03T12:59:07Z","abstract_excerpt":"Transcending human cognitive limitations represents a critical frontier in LLM training. Proprietary agentic systems like DeepResearch have demonstrated superhuman capabilities on extremely complex information-seeking benchmarks such as BrowseComp, a feat previously unattainable. We posit that their success hinges on a sophisticated reasoning pattern absent in open-source models: the ability to systematically reduce extreme uncertainty when navigating vast information landscapes. Based on this insight, we introduce WebSailor, a complete post-training methodology designed to instill this crucia"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"With this integrated pipeline, WebSailor significantly outperforms all opensource agents in complex information-seeking tasks, matching proprietary agents' performance and closing the capability gap.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"Their success hinges on a sophisticated reasoning pattern absent in open-source models: the ability to systematically reduce extreme uncertainty when navigating vast information landscapes.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"WebSailor trains open-source web agents to match proprietary performance on complex information-seeking tasks by generating high-uncertainty scenarios and using a new RL method called DUPO.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"WebSailor equips open-source models with the ability to reduce extreme uncertainty in web navigation, 
allowing them to match proprietary agents on complex information-seeking tasks.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"dd9adb24443e3f43e6e01095010d2c4dfa43d4e5d0a6b8d501aaa004633717dc"},"source":{"id":"2507.02592","kind":"arxiv","version":1},"verdict":{"id":"3124949b-8443-42c8-8b53-449fc068e563","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-17T15:30:30.256512Z","strongest_claim":"With this integrated pipeline, WebSailor significantly outperforms all opensource agents in complex information-seeking tasks, matching proprietary agents' performance and closing the capability gap.","one_line_summary":"WebSailor trains open-source web agents to match proprietary performance on complex information-seeking tasks by generating high-uncertainty scenarios and using a new RL method called DUPO.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"Their success hinges on a sophisticated reasoning pattern absent in open-source models: the ability to systematically reduce extreme uncertainty when navigating vast information landscapes.","pith_extraction_headline":"WebSailor equips open-source models with the ability to reduce extreme uncertainty in web navigation, allowing them to match proprietary agents on complex information-seeking tasks."},"references":{"count":34,"sample":[{"doi":"","year":null,"title":"Concrete Problems in AI Safety","work_id":"c8d14fbe-6eab-464a-95b3-778aabd82fa3","ref_index":1,"cited_arxiv_id":"1606.06565","is_internal_anchor":true},{"doi":"","year":null,"title":"FireAct: Toward language agent fine-tuning.arXiv preprint arXiv:2310.05915","work_id":"382904a8-a33f-42d0-ae77-b61c9f0cb7cb","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":null,"title":"SFT or RL? 
An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models","work_id":"a521360c-8673-4d0d-a3a3-6eb9f7a71b90","ref_index":3,"cited_arxiv_id":"2504.11468","is_internal_anchor":true},{"doi":"","year":null,"title":"Evaluating Large Language Models Trained on Code","work_id":"042493e9-b26f-4b4e-bbde-382072ca9b08","ref_index":4,"cited_arxiv_id":"2107.03374","is_internal_anchor":true},{"doi":"","year":null,"title":"SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training","work_id":"258dd934-025c-47f5-b4f6-5a0c1c338cc6","ref_index":5,"cited_arxiv_id":"2501.17161","is_internal_anchor":true}],"resolved_work":34,"snapshot_sha256":"e673bf4ccaf2c5b81bcc60a4c933c791b642167d29806c992a15692a852cf8a7","internal_anchors":21},"formal_canon":{"evidence_count":3,"snapshot_sha256":"05e5473653a6dff9cc01c86aff6a6859df22eeb2a24dbfd27d334ea5e71cbcf6"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2507.02592","created_at":"2026-05-17T23:38:13.685012+00:00"},{"alias_kind":"arxiv_version","alias_value":"2507.02592v1","created_at":"2026-05-17T23:38:13.685012+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2507.02592","created_at":"2026-05-17T23:38:13.685012+00:00"},{"alias_kind":"pith_short_12","alias_value":"LZ6OY7ZG7SCV","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_16","alias_value":"LZ6OY7ZG7SCVNYQA","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_8","alias_value":"LZ6OY7ZG","created_at":"2026-05-18T12:33:37.589309+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":28,"internal_anchor_count":28,"sample":[{"citing_arxiv_id":"2605.22138","citing_title":"Efficient Agentic Reasoning Through Self-Regulated Simulative 
Planning","ref_index":110,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16217","citing_title":"Argus: Evidence Assembly for Scalable Deep Research Agents","ref_index":30,"is_internal_anchor":true},{"citing_arxiv_id":"2605.16217","citing_title":"Argus: Evidence Assembly for Scalable Deep Research Agents","ref_index":30,"is_internal_anchor":true},{"citing_arxiv_id":"2605.18181","citing_title":"Scalable Environments Drive Generalizable Agents","ref_index":19,"is_internal_anchor":true},{"citing_arxiv_id":"2605.15224","citing_title":"ICRL: Learning to Internalize Self-Critique with Reinforcement Learning","ref_index":10,"is_internal_anchor":true},{"citing_arxiv_id":"2509.02547","citing_title":"The Landscape of Agentic Reinforcement Learning for LLMs: A Survey","ref_index":113,"is_internal_anchor":true},{"citing_arxiv_id":"2509.07969","citing_title":"Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search","ref_index":19,"is_internal_anchor":true},{"citing_arxiv_id":"2511.02805","citing_title":"MemSearcher: Training LLMs to Reason, Search and Manage Memory via End-to-End Reinforcement Learning","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2509.08827","citing_title":"A Survey of Reinforcement Learning for Large Reasoning Models","ref_index":278,"is_internal_anchor":true},{"citing_arxiv_id":"2511.06101","citing_title":"SynthAgent: Adapting Web Agents with Synthetic Supervision","ref_index":3,"is_internal_anchor":true},{"citing_arxiv_id":"2511.11793","citing_title":"MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling","ref_index":13,"is_internal_anchor":true},{"citing_arxiv_id":"2601.12538","citing_title":"Agentic Reasoning for Large Language Models","ref_index":41,"is_internal_anchor":true},{"citing_arxiv_id":"2601.15808","citing_title":"Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided 
Verification","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2508.05748","citing_title":"WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent","ref_index":16,"is_internal_anchor":true},{"citing_arxiv_id":"2603.15262","citing_title":"Probe-then-Plan: Environment-Aware Planning for Industrial E-commerce Search","ref_index":11,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12882","citing_title":"CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2604.04017","citing_title":"GeoBrowse: A Geolocation Benchmark for Agentic Tool Use with Expert-Annotated Reasoning Traces","ref_index":23,"is_internal_anchor":true},{"citing_arxiv_id":"2509.02544","citing_title":"UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning","ref_index":31,"is_internal_anchor":true},{"citing_arxiv_id":"2605.12004","citing_title":"Learning Agentic Policy from Action Guidance","ref_index":29,"is_internal_anchor":true},{"citing_arxiv_id":"2604.25256","citing_title":"AutoResearchBench: Benchmarking AI Agents on Complex Scientific Literature Discovery","ref_index":19,"is_internal_anchor":true},{"citing_arxiv_id":"2604.24978","citing_title":"Don't Stop Early: Scalable Enterprise Deep Research with Controlled Information Flow and Evidence-Aware Termination","ref_index":2,"is_internal_anchor":true},{"citing_arxiv_id":"2605.05191","citing_title":"LongSeeker: Elastic Context Orchestration for Long-Horizon Search Agents","ref_index":5,"is_internal_anchor":true},{"citing_arxiv_id":"2605.01489","citing_title":"SciResearcher: Scaling Deep Research Agents for Frontier Scientific Reasoning","ref_index":18,"is_internal_anchor":true},{"citing_arxiv_id":"2604.07720","citing_title":"Towards Knowledgeable Deep Research: Framework and Benchmark","ref_index":17,"is_internal_anchor":true},{"citing_arxiv_id":"2604.06777","citing_title":"Walk the Talk: Bridging 
the Reasoning-Action Gap for Thinking with Images via Multimodal Agentic Policy Optimization","ref_index":49,"is_internal_anchor":true}]},"formal_canon":{"evidence_count":3,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO","json":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO.json","graph_json":"https://pith.science/api/pith-number/LZ6OY7ZG7SCVNYQAJRQHISO6OO/graph.json","events_json":"https://pith.science/api/pith-number/LZ6OY7ZG7SCVNYQAJRQHISO6OO/events.json","paper":"https://pith.science/paper/LZ6OY7ZG"},"agent_actions":{"view_html":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO","download_json":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO.json","view_paper":"https://pith.science/paper/LZ6OY7ZG","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2507.02592&json=true","fetch_graph":"https://pith.science/api/pith-number/LZ6OY7ZG7SCVNYQAJRQHISO6OO/graph.json","fetch_events":"https://pith.science/api/pith-number/LZ6OY7ZG7SCVNYQAJRQHISO6OO/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO/action/timestamp_anchor","attest_storage":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO/action/storage_attestation","attest_author":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO/action/author_attestation","sign_citation":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO/action/citation_signature","submit_replication":"https://pith.science/pith/LZ6OY7ZG7SCVNYQAJRQHISO6OO/action/replication_record"}},"created_at":"2026-05-17T23:38:13.685012+00:00","updated_at":"2026-05-17T23:38:13.685012+00:00"}