Open-Ended Wargames with Large Language Models

Open-Ended Wargames with Large Language Models · 2024 · arXiv 2404.11446

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

PokeGym: A Visually-Driven Long-Horizon Benchmark for Vision-Language Models

cs.CV · 2026-04-09 · unverdicted · novelty 7.0

PokeGym is a new benchmark that tests VLMs on long-horizon tasks in a complex 3D game using only visual observations, identifying deadlock recovery as the primary failure mode.

Multi-Agent Strategic Games with LLMs

cs.GT · 2026-05-05 · unverdicted · novelty 6.0 · 2 refs

LLMs in extended security dilemma games show increased conflict with more players, unraveling in finite games, and reduced conflict with communication, providing a new way to probe international relations (IR) mechanisms.

citing papers explorer

PokeGym: A Visually-Driven Long-Horizon Benchmark for Vision-Language Models · cs.CV · 2026-04-09 · unverdicted · none · ref 21
Multi-Agent Strategic Games with LLMs · cs.GT · 2026-05-05 · unverdicted · none · ref 14 · 2 links
