LLMs Can’t Plan, But Can Help Planning in LLM-Modulo Frameworks

Subbarao Kambhampati, Karthik Valmeekam, Lin Guan, Kaya Stechly, Mudit Verma, Siddhant Bhambri, Lucas Saldyt, Anil Murthy · 2024 · arXiv 2402.01817

11 Pith papers cite this work. Polarity classification is still being indexed.

citation-role summary

background: 2

citation-polarity summary

background: 2

representative citing papers

Zero-Shot Goal Recognition with Large Language Models

cs.AI · 2026-05-14 · unverdicted · novelty 7.0

Frontier LLMs show uneven zero-shot performance on goal recognition in PDDL domains: some scale with accumulating evidence toward landmark-based accuracy while others stay anchored to world-knowledge priors.

Mind the Gap Between Spatial Reasoning and Acting! Step-by-Step Evaluation of Agents With Spatial-Gym

cs.AI · 2026-04-10 · unverdicted · novelty 7.0

Spatial-Gym benchmark shows the best tested model solves only 16% of pathfinding tasks versus 98% for humans, with step-by-step and backtracking formats producing mixed effects across model strengths.

Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

cs.CL · 2025-11-25 · unverdicted · novelty 6.0

Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and reasoning datasets.

Novelty-based Tree-of-Thought Search for LLM Reasoning and Planning

cs.AI · 2026-05-07 · unverdicted · novelty 5.0

Novelty estimation via LLM prompts enables pruning in Tree-of-Thought search, reducing overall token usage on language planning benchmarks.

Do LLMs have core beliefs?

cs.LG · 2026-05-05 · unverdicted · novelty 5.0

LLMs generally fail to maintain stable worldviews under adversarial conversational pressure, indicating they lack core beliefs akin to those in human cognition.

U-Define: Designing User Workflows for Hard and Soft Constraints in LLM-Based Planning

cs.AI · 2026-05-04 · unverdicted · novelty 5.0

U-Define improves user control in LLM planning by letting people define hard rules and soft preferences in natural language with matching verification methods, raising usefulness and satisfaction scores.

Bridging Values and Behavior: A Hierarchical Framework for Proactive Embodied Agents

cs.AI · 2026-04-30 · unverdicted · novelty 5.0

ValuePlanner is a hierarchical architecture that uses LLMs to generate value-based subgoals and PDDL planners to produce executable actions, enabling self-directed behavior in embodied agents.

Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelligent Systems

cs.AI · 2026-04-22 · unverdicted · novelty 5.0

A hybrid system augments LLMs with an automated external RDF/OWL ontology layer for long-term memory, SHACL/OWL validation, and improved multi-step reasoning on tasks like Tower of Hanoi.

End-to-end PDDL Planning with Hardcoded and Dynamic Agents

cs.AI · 2025-12-10 · unverdicted · novelty 5.0

An end-to-end LLM framework refines natural language into valid PDDL domains and problems via hardcoded and dynamic agents, generates plans with standard engines, and returns readable output.

LLM-Guided Task- and Affordance-Level Exploration in Reinforcement Learning

cs.RO · 2025-09-20 · unverdicted · novelty 5.0

LLM-TALE steers RL exploration using LLM-generated plans at task and affordance levels with online suboptimality correction, improving sample efficiency and success rates on pick-and-place tasks without human supervision.

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

cs.AI · 2025-03-31 · unverdicted · novelty 2.0

This survey frames foundation agents using brain-inspired modular architectures and reviews challenges in evolution, collaboration, and safety.

citing papers explorer

Showing 11 of 11 citing papers.

Zero-Shot Goal Recognition with Large Language Models · cs.AI · 2026-05-14 · unverdicted · none · ref 5
Mind the Gap Between Spatial Reasoning and Acting! Step-by-Step Evaluation of Agents With Spatial-Gym · cs.AI · 2026-04-10 · unverdicted · none · ref 1
Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory · cs.CL · 2025-11-25 · unverdicted · none · ref 203
Novelty-based Tree-of-Thought Search for LLM Reasoning and Planning · cs.AI · 2026-05-07 · unverdicted · none · ref 26
Do LLMs have core beliefs? · cs.LG · 2026-05-05 · unverdicted · none · ref 6
U-Define: Designing User Workflows for Hard and Soft Constraints in LLM-Based Planning · cs.AI · 2026-05-04 · unverdicted · none · ref 49
Bridging Values and Behavior: A Hierarchical Framework for Proactive Embodied Agents · cs.AI · 2026-04-30 · unverdicted · none · ref 17
Automatic Ontology Construction Using LLMs as an External Layer of Memory, Verification, and Planning for Hybrid Intelligent Systems · cs.AI · 2026-04-22 · unverdicted · none · ref 18
End-to-end PDDL Planning with Hardcoded and Dynamic Agents · cs.AI · 2025-12-10 · unverdicted · none · ref 19
LLM-Guided Task- and Affordance-Level Exploration in Reinforcement Learning · cs.RO · 2025-09-20 · unverdicted · none · ref 15
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems · cs.AI · 2025-03-31 · unverdicted · none · ref 205
