Inferring Events from Time Series using Language Models
1 Pith paper cites this work. Polarity classification is still indexing.
abstract
A common goal in analyzing time series data is to understand how events cause observed variations. We study whether Large Language Models (LLMs) can infer natural language events associated with time series data. We introduce an automated method for generating tasks that test a model's ability to reason about events associated with time series data based on sports data, and develop a new benchmarking method. In experiments spanning 18 LLMs, we prompt LLMs to infer unobserved events given time series data and observe surprising successes, even when providing minimal context. We then show that combining distillation with Reinforcement Learning (RL) can improve the performance of small language models to approach that of large proprietary reasoning models. All resources needed to reproduce our work are available: https://github.com/hartvigsen-group/GAMETime
fields
cs.AI (1)
years
2026 (1)
verdicts
UNVERDICTED (1)
representative citing papers
Agentic Time Machine as an Infrastructure for Future-Event Forecasting
Agentic Time Machine reconstructs historical web states for offline evaluation of forecasting agents, with a multi-agent framework achieving top ranks on FutureX live and past benchmarks.