Frontier LLMs now match or exceed state-of-the-art classical planners on IPC planning tasks, with Gemini 3.1 Pro solving 245 of 360 tasks versus 234 for the best baseline.
Chen, Johannes Zenn, Tristan Cinquin, and Sheila A
1 Pith paper cites this work.
Citing papers by field: cs.AI (1); by year: 2025 (1); by verdict: CONDITIONAL (1).
Representative citing papers:
Frontier Large Language Models Rival State-of-the-Art Planners