Recognition: no theorem link
KGLAMP: Knowledge Graph-guided Language model for Adaptive Multi-robot Planning and Replanning
Pith reviewed 2026-05-16 08:05 UTC · model grok-4.3
The pith
A knowledge graph guides an LLM to build and update accurate PDDL plans for heterogeneous robot teams in dynamic settings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
KGLAMP maintains a structured knowledge graph encoding object relations, spatial reachability, and robot capabilities, which guides the LLM in generating accurate PDDL problem specifications. The knowledge graph serves as a persistent, dynamically updated memory that incorporates new observations and triggers replanning upon detecting inconsistencies, enabling symbolic plans to adapt to evolving world states.
What carries the argument
The knowledge graph that encodes object relations, spatial reachability, and robot capabilities to direct the LLM toward correct PDDL outputs and to detect when replanning is required.
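The mechanism above can be sketched in miniature. The following is an illustrative Python sketch, not the paper's implementation: a knowledge graph stored as (subject, relation, object) triples, where a new observation that contradicts an existing triple updates the persistent memory and signals that replanning is needed. All names (`KnowledgeGraph`, `observe`, the triples themselves) are hypothetical.

```python
# Minimal sketch (hypothetical, not KGLAMP's actual code): a triple
# store whose update step doubles as the inconsistency detector that
# would trigger replanning.
class KnowledgeGraph:
    def __init__(self):
        self.triples = set()  # (subject, relation, object)

    def add(self, subj, rel, obj):
        self.triples.add((subj, rel, obj))

    def observe(self, subj, rel, obj):
        """Incorporate a new observation; return True if it contradicts
        an existing triple for the same subject/relation (replan)."""
        conflicting = {t for t in self.triples
                       if t[0] == subj and t[1] == rel and t[2] != obj}
        self.triples -= conflicting        # update persistent memory
        self.triples.add((subj, rel, obj))
        return bool(conflicting)           # True => plan is stale

kg = KnowledgeGraph()
kg.add("cup", "at", "table")
kg.add("robot1", "can", "grasp")
needs_replan = kg.observe("cup", "at", "floor")  # location changed
print(needs_replan)  # True: the plan assumed the cup was on the table
```

In this toy version, replanning is triggered only when an observation actually conflicts with stored state, matching the claim that the graph avoids unnecessary full replans.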
If this is right
- Plans stay consistent with changing observations without requiring a human to rewrite the entire symbolic model.
- Heterogeneous teams coordinate more reliably because capability differences are explicitly represented in the shared graph.
- Replanning occurs only when the graph flags an inconsistency, avoiding unnecessary full replans.
- The same graph can be reused across multiple tasks, reducing the cost of starting each new mission from scratch.
Where Pith is reading between the lines
- In real deployments the graph could be populated directly from onboard perception pipelines rather than simulated observations.
- Extending the graph with temporal relations might allow the system to anticipate future inconsistencies before they occur.
- The framework's separation of persistent memory from the LLM could be applied to single-robot tasks that still require long-horizon symbolic reasoning.
Load-bearing premise
The knowledge graph can be kept accurate from robot observations and the LLM will reliably turn that graph into correct, consistent PDDL specifications.
What would settle it
Rerunning the MAT-THOR experiments and observing either that KGLAMP fails to improve success rate by at least 25.3% over the LLM-only and PDDL baselines, or that plans fail because the generated PDDL files contain errors.
Figures
original abstract
Heterogeneous multi-robot systems are increasingly used in long-horizon missions requiring coordinated planning across diverse capabilities. However, existing planning approaches struggle to construct accurate symbolic representations and maintain plan consistency in dynamic environments. Classical PDDL planners require manually crafted symbolic models, while LLM-based planners often ignore agent heterogeneity and environmental uncertainty. We introduce KGLAMP, a knowledge-graph-guided LLM planning framework for heterogeneous multi-robot teams. The framework maintains a structured knowledge graph encoding object relations, spatial reachability, and robot capabilities, which guides the LLM in generating accurate PDDL problem specifications. The knowledge graph serves as a persistent, dynamically updated memory that incorporates new observations and triggers replanning upon detecting inconsistencies, enabling symbolic plans to adapt to evolving world states. Experiments on the MAT-THOR benchmark show that KGLAMP improves performance by at least 25.3% over both LLM-only and PDDL-based variants.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces KGLAMP, a framework that maintains a dynamically updated knowledge graph encoding object relations, spatial reachability, and robot capabilities to guide an LLM in generating accurate PDDL problem specifications for planning and replanning in heterogeneous multi-robot teams. The KG acts as persistent memory that incorporates observations and triggers replanning on detected inconsistencies. The central claim is that KGLAMP achieves at least 25.3% performance improvement over LLM-only and classical PDDL baselines on the MAT-THOR benchmark.
Significance. If the reported gains are substantiated with controlled experiments and component-level validation, the work would offer a practical integration of symbolic representations with LLM flexibility for adaptive multi-robot planning under uncertainty and heterogeneity, addressing limitations of purely classical or neural approaches.
major comments (2)
- [Experiments] The experimental evaluation reports a 25.3% improvement on MAT-THOR but supplies no protocol details, statistical tests, error bars, baseline code or hyperparameter settings, or controls for heterogeneity/uncertainty; without these the central empirical claim cannot be verified or attributed to the KG-LLM-PDDL loop.
- [Framework and Evaluation] No separate metrics are provided for knowledge-graph triple precision/recall from raw observations or for syntactic/semantic correctness of LLM-generated PDDL; these are load-bearing assumptions for the replanning mechanism, yet only aggregate task success is reported.
minor comments (1)
- [Abstract] The abstract states the improvement percentage without defining the exact metric (success rate, completion time, etc.) or the number of trials.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We agree that additional experimental details and component-level metrics are necessary to substantiate the claims and will revise the manuscript accordingly. Below we respond to each major comment.
point-by-point responses
Referee: [Experiments] The experimental evaluation reports a 25.3% improvement on MAT-THOR but supplies no protocol details, statistical tests, error bars, baseline code or hyperparameter settings, or controls for heterogeneity/uncertainty; without these the central empirical claim cannot be verified or attributed to the KG-LLM-PDDL loop.
Authors: We acknowledge the need for greater transparency in the experimental protocol. In the revised manuscript we will add: (i) a full description of the evaluation protocol including number of independent trials per scenario, random seeds, and environment variations; (ii) statistical significance tests (paired t-tests or Wilcoxon signed-rank tests with p-values) comparing KGLAMP against baselines; (iii) error bars (standard deviation or 95% confidence intervals) on all reported success rates; (iv) explicit hyperparameter settings for the LLM (temperature, prompt templates) and classical planner; (v) links or pseudocode for baseline implementations; and (vi) dedicated ablation studies that isolate the contributions of heterogeneity handling and uncertainty detection. These additions will allow readers to reproduce the 25.3% aggregate improvement and attribute it specifically to the KG-LLM-PDDL loop. revision: yes
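Items (i) through (iii) of this planned revision amount to standard aggregate statistics. As a hedged illustration only, with invented per-trial success rates and using just the Python standard library, the aggregation might look like:

```python
# Hypothetical sketch of trial aggregation: mean, sample standard
# deviation, and a normal-approximation 95% confidence interval.
# The trial outcomes below are invented for illustration.
import math
import statistics

def summarize(success_rates):
    n = len(success_rates)
    mean = statistics.mean(success_rates)
    sd = statistics.stdev(success_rates)          # sample std. dev.
    half = 1.96 * sd / math.sqrt(n)               # normal-approx 95% CI
    return mean, sd, (mean - half, mean + half)

kglamp   = [0.82, 0.78, 0.85, 0.80, 0.79]        # invented trials
baseline = [0.55, 0.60, 0.52, 0.58, 0.56]        # invented trials
m_k, sd_k, ci_k = summarize(kglamp)
m_b, sd_b, ci_b = summarize(baseline)
print(f"KGLAMP {m_k:.3f} ± {ci_k[1] - m_k:.3f}, baseline {m_b:.3f}")
print(f"relative improvement: {(m_k - m_b) / m_b:.1%}")
```

A paired t-test or Wilcoxon signed-rank test (e.g. via `scipy.stats`) over per-scenario pairs would then supply the significance values the referee asks for.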
Referee: [Framework and Evaluation] No separate metrics are provided for knowledge-graph triple precision/recall from raw observations or for syntactic/semantic correctness of LLM-generated PDDL; these are load-bearing assumptions for the replanning mechanism, yet only aggregate task success is reported.
Authors: We agree that aggregate task success alone is insufficient to validate the core mechanisms. In the revision we will introduce and report two new evaluation sections: (1) Knowledge-graph quality metrics—precision, recall, and F1 for triples extracted from raw observations, measured against ground-truth annotations on a held-out subset of MAT-THOR episodes; (2) PDDL generation quality—syntactic validity rate (percentage of outputs accepted by a PDDL parser) and semantic correctness (percentage of generated problem files whose initial state and goal match the observed world state, verified by automated simulation or manual inspection on sampled cases). These metrics will be presented alongside the replanning frequency and overall success rates to demonstrate that the KG update and LLM-PDDL steps are reliable. revision: yes
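Metric (1) above is a set comparison between extracted and ground-truth triples. A minimal sketch, with invented triples purely for illustration:

```python
# Hypothetical example of knowledge-graph quality metrics: precision,
# recall, and F1 of extracted triples against gold annotations.
def triple_prf(extracted, gold):
    tp = len(extracted & gold)                    # true positives
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

gold = {("cup", "at", "table"), ("robot1", "can", "grasp"),
        ("table", "in", "kitchen")}
extracted = {("cup", "at", "table"), ("robot1", "can", "grasp"),
             ("cup", "in", "sink")}               # one wrong, one missed
p, r, f1 = triple_prf(extracted, gold)
print(p, r, f1)  # each ≈ 0.667 on this toy example
```

Metric (2), syntactic validity, would analogously be the fraction of generated problem files accepted by an off-the-shelf PDDL parser or validator.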
Circularity Check
No circularity: framework combines standard components without self-referential derivations or fitted predictions
full rationale
The manuscript presents KGLAMP as an engineering combination of knowledge graphs for state tracking, LLMs for PDDL generation, and classical planners for execution, with dynamic updates and replanning on inconsistency detection. No equations, parameters, or derivations appear in the provided text that reduce by construction to inputs (e.g., no fitted scale parameters renamed as predictions, no uniqueness theorems imported from self-citations, no ansatzes smuggled via prior work). The 25.3% performance delta is an empirical claim on the MAT-THOR benchmark rather than a logical tautology. The derivation chain is therefore self-contained and non-circular.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: LLMs guided by a structured knowledge graph can generate accurate and consistent PDDL problem specifications for heterogeneous robots.