Context is key: A benchmark for forecasting with essential textual information

2024 · arXiv 2410.18959

5 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1 baseline 1

citation-polarity summary

background 1 baseline 1

representative citing papers

LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics

cs.AI · 2026-04-19 · unverdicted · novelty 7.0

LLaTiSA is a vision-language model trained on a new 83k-sample hierarchical time series reasoning dataset that shows superior performance and out-of-distribution generalization on stratified TSR tasks.

TimeSeriesExamAgent: Creating Time Series Reasoning Benchmarks at Scale

cs.AI · 2026-04-11 · conditional · novelty 7.0

TimeSeriesExamAgent combines templates and LLM agents to generate scalable time series reasoning benchmarks, demonstrating that current LLMs have limited performance on both abstract and domain-specific tasks.

CaTS-Bench: Can Language Models Describe Time Series?

cs.LG · 2025-09-25 · unverdicted · novelty 7.0

Introduces CaTS-Bench with human gold-standard captions and a synthetic generation pipeline to evaluate vision-language models on time series captioning and numeric reasoning.

When Do We Need LLMs? A Diagnostic for Language-Driven Bandits

cs.AI · 2026-04-07 · unverdicted · novelty 6.0

Lightweight numerical bandits on text embeddings match or exceed LLM accuracy in contextual bandits at a fraction of the cost, with an embedding-based diagnostic to choose between them.

From Time Series Analysis to Question Answering: A Survey in the LLM Era

cs.LG · 2025-06-13 · accept · novelty 6.0

A survey proposing a taxonomy of Injective, Bridging, and Internal Alignment paradigms to evolve TSA into user-driven Time Series Question Answering with LLMs.

citing papers explorer

Showing 5 of 5 citing papers.

LLaTiSA: Towards Difficulty-Stratified Time Series Reasoning from Visual Perception to Semantics
cs.AI · 2026-04-19 · unverdicted · none · ref 10
TimeSeriesExamAgent: Creating Time Series Reasoning Benchmarks at Scale
cs.AI · 2026-04-11 · conditional · none · ref 47
CaTS-Bench: Can Language Models Describe Time Series?
cs.LG · 2025-09-25 · unverdicted · none · ref 2
When Do We Need LLMs? A Diagnostic for Language-Driven Bandits
cs.AI · 2026-04-07 · unverdicted · none · ref 52
From Time Series Analysis to Question Answering: A Survey in the LLM Era
cs.LG · 2025-06-13 · accept · none · ref 107
