DeepVerifier enables self-evolving deep research agents via rubric-guided verification at test time, delivering 8-11% accuracy gains on GAIA and XBench-DeepSearch subsets.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
fields
cs.AI 2representative citing papers
Magentic-One is a modular multi-agent system that matches state-of-the-art performance on GAIA, AssistantBench, and WebArena using an orchestrator-led team of specialized agents.
citing papers explorer
-
Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification
DeepVerifier enables self-evolving deep research agents via rubric-guided verification at test time, delivering 8-11% accuracy gains on GAIA and XBench-DeepSearch subsets.
-
Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks
Magentic-One is a modular multi-agent system that matches state-of-the-art performance on GAIA, AssistantBench, and WebArena using an orchestrator-led team of specialized agents.