Fin-MTM: A multi-turn multimodal benchmark for financial reasoning and agent evaluation. arXiv preprint arXiv:2602.03130, 2026.
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
TimeSage-MT: A Multi-Turn Benchmark for Evaluating Agentic Time Series Reasoning
TimeSage-MT introduces a multi-turn benchmark for agentic time series reasoning and shows frontier LLMs drop sharply on decision-oriented tasks due to memory and uncertainty failures.