Wolter, SparQ – A Spatial Reasoning Toolbox, in: AAAI Spring Symposium: Benchmarking of Qualitative Spatial and Temporal Reasoning Systems, 2009, p


1 Pith paper cites this work.


representative citing papers

QSTRBench: a New Benchmark to Evaluate the Ability of Language Models to Reason with Qualitative Spatial and Temporal Calculi

cs.AI · 2026-05-18 · accept · novelty 8.0

QSTRBench is a new benchmark that evaluates LLMs on compositional reasoning, converse relations, and conceptual neighbourhoods (CNs) across QSTR calculi, including a newly published RCC-22 CN. It shows that models exceed chance but fail to achieve consistent correctness.


