Agentic Time Machine reconstructs historical web states for offline evaluation of forecasting agents, with a multi-agent framework achieving top ranks on FutureX live and past benchmarks.
2 Pith papers cite this work. Polarity classification is still indexing.
Citing papers by year: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
WorldReasoner supplies 345 resolved forecasting tasks built from 14,141 articles to score LM agents on outcome quality, evidence quality, and reasoning quality against time-bounded evidence and hindsight graphs.
citing papers explorer
- Agentic Time Machine as an Infrastructure for Future-Event Forecasting
Agentic Time Machine reconstructs historical web states for offline evaluation of forecasting agents, with a multi-agent framework achieving top ranks on FutureX live and past benchmarks.
- WorldReasoner: Evaluating Whether Language Model Agents Forecast Events with Valid Reasoning
WorldReasoner supplies 345 resolved forecasting tasks built from 14,141 articles to score LM agents on outcome quality, evidence quality, and reasoning quality against time-bounded evidence and hindsight graphs.