WebThinker equips large reasoning models with autonomous web exploration and interleaved reasoning-drafting via a Deep Web Explorer and RL-based DPO training, yielding gains on GPQA, GAIA, and report-generation benchmarks.
3 Pith papers cite this work. Polarity classification is still being indexed.
verdicts
UNVERDICTED 3
representative citing papers
LLM agents improve adaptability by first using an interaction budget for systematic exploration measured via Exploration Checkpoint Coverage before executing tasks.
Hierarchical Re-Basin merging induces stronger adversarial and perturbation robustness in combined models as more participants are added, but produces larger performance drops than previously reported.
citing papers explorer
- WebThinker: Empowering Large Reasoning Models with Deep Research Capability
  WebThinker equips large reasoning models with autonomous web exploration and interleaved reasoning-drafting via a Deep Web Explorer and RL-based DPO training, yielding gains on GPQA, GAIA, and report-generation benchmarks.
- Look Before You Leap: Autonomous Exploration for LLM Agents
  LLM agents improve adaptability by first using an interaction budget for systematic exploration measured via Exploration Checkpoint Coverage before executing tasks.
- Robustness and Regularization in Hierarchical Re-Basin
  Hierarchical Re-Basin merging induces stronger adversarial and perturbation robustness in combined models as more participants are added, but produces larger performance drops than previously reported.