Key filtering criteria:
- If the two time-series have completely non-overlapping time ranges, then the question should be filtered out.
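The overlap criterion above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's actual filtering code: the function names `ranges_overlap` and `should_filter_out`, the representation of a series as a list of `(timestamp, value)` pairs, and the choices to treat touching endpoints as overlapping and an empty series as non-overlapping are all assumptions made for the example.

```python
from datetime import datetime


def ranges_overlap(start_a, end_a, start_b, end_b):
    """Return True if the closed intervals [start_a, end_a] and
    [start_b, end_b] share at least one instant.

    Assumption: intervals are closed, so touching endpoints count as overlap.
    """
    return max(start_a, start_b) <= min(end_a, end_b)


def should_filter_out(series_a, series_b):
    """Return True if a paired question built on these two series should be dropped.

    Each series is a list of (timestamp, value) pairs (hypothetical layout).
    Assumption: an empty series has no time range, so the pair is filtered out.
    """
    if not series_a or not series_b:
        return True
    # Take min/max explicitly so the series do not need to be sorted.
    times_a = [t for t, _ in series_a]
    times_b = [t for t, _ in series_b]
    overlap = ranges_overlap(min(times_a), max(times_a), min(times_b), max(times_b))
    return not overlap


if __name__ == "__main__":
    first = [(datetime(2024, 1, 1, 0, 0), 1.0), (datetime(2024, 1, 1, 1, 0), 2.0)]
    later = [(datetime(2024, 1, 1, 2, 0), 3.0), (datetime(2024, 1, 1, 3, 0), 4.0)]
    print(should_filter_out(first, later))
```

Comparing only the outer bounds of each series is enough here because the criterion concerns completely disjoint time ranges; partial overlap leaves the question in place.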

Anomaly Indicator: The anomaly indicator question is a paired query question that asks the user to identify whether some anomaly in the first time-series is a leading or lagging ind

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

ARFBench: Benchmarking Time Series Question Answering Ability for Software Incident Response

cs.LG · 2026-04-23 · unverdicted · novelty 6.0

ARFBench shows vision-language models lead time series question answering for software incidents at 62.7% accuracy, a hybrid TSFM+VLM matches them, and a model-expert oracle reaches 87.2% accuracy.


ARFBench: Benchmarking Time Series Question Answering Ability for Software Incident Response cs.LG · 2026-04-23 · unverdicted · none · ref 26
