RTSGameBench is a new extensible benchmark for VLMs using diverse RTS matchups, diagnostic mini-games targeting individual competencies, and a self-evolving query-to-game generator, with results showing poor VLM performance on tight coordination and large-scale tasks.
acl-long.378/ Beyond All Reason: Benchmarking Strategic Reasoning in VLMs 17
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
RTSGameBench: An RTS Benchmark for Strategic Reasoning by Vision-Language Models
RTSGameBench is a new extensible benchmark for VLMs using diverse RTS matchups, diagnostic mini-games targeting individual competencies, and a self-evolving query-to-game generator, with results showing poor VLM performance on tight coordination and large-scale tasks.