
arxiv: 2505.13126 · v3 · submitted 2025-05-19 · 💻 cs.AI · cs.CL

Iterative Formalization and Planning in Partially Observable Environments

Pith reviewed 2026-05-22 14:21 UTC · model grok-4.3

classification 💻 cs.AI cs.CL
keywords PDDL · partially observable environments · LLM planning · iterative formalization · episode decomposition · planning robustness · knowledge transfer

The pith

PDDLego improves planning success in partially observable settings

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how large language models can handle planning when the environment state is only partially visible. It breaks the overall task into smaller episodes that are fully observable, formalizes each in PDDL, solves them one by one, and composes the results. This process requires no model fine-tuning, no example prompts, and no demonstration trajectories. If the approach holds, it would allow more reliable planning in realistic incomplete-information settings and let learned domain knowledge carry over to new tasks.

Core claim

PDDLego is a framework that iteratively formalizes, plans, grows, and refines PDDL representations by decomposing the environment and the goal into fully observable episodes. Without finetuning, in-context exemplars, or trajectories, PDDLego improves planning success and exhibits robustness against problem complexity compared to end-to-end approaches in partially observable environments. The domain knowledge captured after a successful trial can benefit future tasks.

What carries the argument

Iterative decomposition of the partially observable environment and goal into a sequence of fully observable episodes, each formalized in PDDL and composed into an overall plan.
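
The loop described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the toy `WORLD` map, the `grow` function (a hand-written stand-in for the LLM formalizer), and the breadth-first `solve` (a stand-in for a PDDL solver) are all assumptions made for the example, and the refine-on-error step is omitted.

```python
from collections import deque

# Hypothetical toy world: room -> {direction: neighbouring room}.
WORLD = {
    "kitchen": {"south": "backyard"},
    "backyard": {"north": "kitchen", "south": "driveway", "east": "street"},
    "driveway": {"north": "backyard"},
    "street": {"west": "backyard"},
}
COIN_ROOM = "street"  # hidden from the agent until it stands in this room


def observe(room):
    """Return only what is visible from `room`: its exits and whether the coin is here."""
    return {"room": room, "exits": dict(WORLD[room]), "coin": room == COIN_ROOM}


def grow(known, obs):
    """Stand-in for the LLM formalizer: add newly observed connections to the model."""
    for direction, neighbour in obs["exits"].items():
        known.add((obs["room"], direction, neighbour))


def solve(known, start, goal):
    """Stand-in for a PDDL solver: breadth-first search over the known connections."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        room, plan = queue.popleft()
        if room == goal:
            return plan
        for src, direction, dst in sorted(known):
            if src == room and dst not in seen:
                seen.add(dst)
                queue.append((dst, plan + [(direction, dst)]))
    return None


def run(start, max_episodes=10):
    """Explore episode by episode; return the rooms visited, or None on failure."""
    known, visited, room, trace = set(), {start}, start, [start]
    for _ in range(max_episodes):
        obs = observe(room)
        if obs["coin"]:
            return trace
        grow(known, obs)
        # Episode sub-goal: reach some known but unvisited room.
        frontier = sorted({dst for _, _, dst in known} - visited)
        if not frontier:
            return None
        plan = solve(known, room, frontier[0])
        if plan is None:
            return None
        for _, dst in plan:  # execute the episode plan
            room = dst
            visited.add(room)
            trace.append(room)
    return None


print(run("kitchen"))
```

Each pass through the loop is one fully observable episode: the model only contains facts already seen, and the sub-goal is always a place the current model can reach.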

Load-bearing premise

The language model can reliably split the partial observations and goal into fully observable episodes without dropping hidden information that later connects the solutions.

What would settle it

A test environment in which episode-level plans succeed individually yet the combined plan fails when run against the original partial observations due to missing hidden-state details.
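
A toy version of such a test is sketched below, purely as an illustration. The door-and-key world, the action names, and the `execute` helper are invented for this example and do not come from the paper. Each episode model assumes the door is unlocked because the lock status was never observed, so both episode plans pass in isolation while the composed plan fails against the true state.

```python
def execute(plan, state):
    """Apply actions to a copy of `state`; return (succeeded, final_state)."""
    state = dict(state)
    for action in plan:
        if action == "goto-door":
            state["at"] = "door"
        elif action == "take-key":
            if state["at"] != "door":
                return False, state
            state["has_key"] = True
        elif action == "pass-door":
            blocked = state["locked"] and not state["has_key"]
            if state["at"] != "door" or blocked:
                return False, state
            state["at"] = "goal"
        else:
            raise ValueError("unknown action: " + action)
    return True, state


# True world: the door is locked, which no early observation reveals.
TRUE_WORLD = {"at": "start", "locked": True, "has_key": False}

# Episode models built from partial observations (lock status assumed absent).
episode1_model = {"at": "start", "locked": False, "has_key": False}
episode2_model = {"at": "door", "locked": False, "has_key": False}
plan1, plan2 = ["goto-door"], ["pass-door"]

ok1, _ = execute(plan1, episode1_model)
ok2, _ = execute(plan2, episode2_model)
ok_combined, _ = execute(plan1 + plan2, TRUE_WORLD)
ok_repaired, final = execute(["goto-door", "take-key", "pass-door"], TRUE_WORLD)
print(ok1, ok2, ok_combined, ok_repaired)
```

A benchmark built on this pattern would separate failures of episode-level planning from failures of composition.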

Figures

Figures reproduced from arXiv: 2505.13126 by Jesse Thomason, Liancheng Gong, Li Zhang, Wang Zhu.

Figure 1. Unlike fully-observable environments (up…
Figure 2. An illustration of PDDLego+, using LLM-as-formalizer. Input environmental observations into LLM to generate PDDL representations, which are input into solver to output an action plan. The plan is executed in simulation resulting in new observations to grow PDDL. When errors occur, the LLM refines the PDDL. Unlike PDDLego, which assumes a fixed domain file, PDDLego+ revises both DF and PF throughout intera…
Figure 3. An illustration of a framework based on LLM-as-planner which we consider as a baseline. The LLM directly generates an action plan to be executed.
Figure 4. Illustration of the CoinCollector environment.
Figure 6. Success rate of two baselines PlanGen and PDDLego and our method PDDLego+ across four models. PDDLego+ shows higher success in 6 out of 8 model-simulation combinations. In the more challenging ALFWorld, PDDLego+ outperforms PlanGen for every model.
Figure 7. Average number of successful actions exe…
Figure 9. Solver (blue) and simulation (red) error counts…
Figure 11. Error breakdown for randomly selected er…
Figure 12. Ablation study of the o3-mini + PDDLego+ framework on ALFWorld, comparing four prompt variants: plain, plain + hint, plain + goal, and detailed.
Original abstract

Using LLMs not to predict plans but to formalize an environment into the Planning Domain Definition Language (PDDL) has been shown to improve performance and control. While most existing methodology only applies to fully observable environments, we adapt to the more realistic and challenging partially observable environments without sufficient information to make a complete plan. We propose PDDLego, a framework to iteratively formalize, plan, grow, and refine PDDL representations by decomposing the environment and the goal into fully observable episodes. Without finetuning, in-context exemplars, or trajectories, PDDLego improves planning success and exhibits robustness against problem complexity compared to end-to-end approaches. We also show that the domain knowledge captured after a successful trial can benefit future tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces PDDLego, a framework in which LLMs are used to iteratively formalize a partially observable environment and goal into PDDL by decomposing the problem into a sequence of fully observable episodes. Within each episode a classical planner is invoked; the resulting plan and observations are used to grow and refine the PDDL domain. The central claims are that this procedure yields higher planning success and greater robustness to increasing problem complexity than end-to-end LLM planners, requires neither fine-tuning nor in-context exemplars nor trajectories, and that the acquired domain knowledge transfers to subsequent tasks.

Significance. If the empirical claims are substantiated, the work would provide a concrete, training-free bridge between LLM-based environment modeling and symbolic planning in realistic POMDPs. The iterative decomposition-plus-refinement loop and the demonstrated knowledge reuse across tasks are the most distinctive contributions.

major comments (2)
  1. The load-bearing assumption that decomposition into fully observable episodes preserves all hidden-state dependencies necessary for later episodes is not accompanied by a formal invariant or by a systematic empirical stress test. Section 3 describes the iterative formalization and refinement loop but supplies no argument showing that variables revealed only after an action (object locations, preconditions, etc.) are correctly threaded across episode boundaries; an early omission would render subsequent PDDL domains irrecoverable and would undermine the robustness-to-complexity claim.
  2. The abstract asserts performance gains and robustness, yet the soundness assessment notes the absence of quantitative results, baselines, or error analysis in the provided summary. The experimental section must report success rates, problem-complexity scaling curves, and controlled comparisons against end-to-end LLM planners (with identical prompt engineering) so that the claimed improvements can be verified.
minor comments (2)
  1. Clarify the precise criteria used by the LLM to decide when an episode is fully observable and when refinement of the PDDL domain is triggered.
  2. Add a reproducibility checklist or pseudocode for the overall PDDLego loop, including how observations are mapped back into PDDL predicates.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to strengthen the presentation of our approach and results.

Point-by-point responses
  1. Referee: The load-bearing assumption that decomposition into fully observable episodes preserves all hidden-state dependencies necessary for later episodes is not accompanied by a formal invariant or by a systematic empirical stress test. Section 3 describes the iterative formalization and refinement loop but supplies no argument showing that variables revealed only after an action (object locations, preconditions, etc.) are correctly threaded across episode boundaries; an early omission would render subsequent PDDL domains irrecoverable and would undermine the robustness-to-complexity claim.

    Authors: We agree that a formal invariant would provide stronger theoretical grounding. The PDDLego loop is designed so that each episode's observations and planner outcomes are used to extend the domain with newly revealed predicates and objects before the next episode begins. We have expanded Section 3 with an explicit description of this threading mechanism and added a new appendix containing systematic stress tests on POMDPs engineered to expose early-omission risks. These experiments show that the refinement process recovers the necessary state information in the evaluated domains. revision: yes

  2. Referee: The abstract asserts performance gains and robustness, yet the soundness assessment notes the absence of quantitative results, baselines, or error analysis in the provided summary. The experimental section must report success rates, problem-complexity scaling curves, and controlled comparisons against end-to-end LLM planners (with identical prompt engineering) so that the claimed improvements can be verified.

    Authors: The full experimental section already reports success rates, problem-complexity scaling curves, and direct comparisons against end-to-end LLM planners that use identical prompt engineering. We have also included an error analysis of failure cases. In the revision we have reorganized the experimental section to make these quantitative results and baseline details more prominent and have added a summary table of key metrics for easier verification. revision: partial
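
The threading concern in the first exchange can be made concrete with a small sketch. It is an assumption about how facts might be carried between episodes, not the mechanism the authors describe: `next_init` and `render_init` are hypothetical helpers, and the `at` and `connected` facts are illustrative.

```python
def next_init(previous_init, observed_facts, retracted_facts=frozenset()):
    """Build the next episode's initial state as a set of ground facts.

    Facts from earlier episodes are kept unless explicitly retracted, so
    information revealed once remains available to later episodes.
    """
    kept = frozenset(previous_init) - frozenset(retracted_facts)
    return kept | frozenset(observed_facts)


def render_init(facts):
    """Render ground facts as a PDDL (:init ...) block, sorted for stable output."""
    body = " ".join("(" + " ".join(fact) + ")" for fact in sorted(facts))
    return "(:init " + body + ")"


# Episode 1: the agent starts in the kitchen and sees one exit.
episode1 = next_init(
    set(),
    {("at", "kitchen"), ("connected", "kitchen", "backyard", "south")},
)

# Episode 2: the agent has moved, so the old location fact is retracted,
# while the connection learned in episode 1 is carried forward.
episode2 = next_init(
    episode1,
    {("at", "backyard"), ("connected", "backyard", "street", "east")},
    retracted_facts={("at", "kitchen")},
)

print(render_init(episode2))
```

Making retraction explicit is one way to audit the referee's worry: any fact that disappears between episodes without appearing in `retracted_facts` is a candidate early omission.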

Circularity Check

0 steps flagged

No circularity: procedural framework with independent empirical claims

full rationale

The paper presents PDDLego as an iterative procedural method that decomposes POMDPs into fully observable PDDL episodes, formalizes them, plans, and refines without any closed-form equations, fitted parameters, or derivations. No step reduces a claimed prediction or success metric to a quantity defined by the same inputs or by self-citation chains. The central claims rest on described algorithmic steps and reported empirical robustness rather than self-referential definitions or load-bearing prior results from the same authors. The derivation chain is therefore self-contained and does not collapse to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the untested premise that LLMs can produce accurate and composable PDDL fragments from partial observations; no free parameters, mathematical axioms, or new invented entities are introduced beyond standard PDDL semantics and LLM prompting.

axioms (1)
  • domain assumption: LLMs can produce syntactically valid and semantically useful PDDL descriptions of environment dynamics and goals from natural-language or partial-state input.
    Invoked throughout the abstract as the mechanism enabling iterative formalization without finetuning.
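
The syntactic half of this assumption is cheap to check mechanically. The sketch below is a hypothetical pre-solver filter, not part of the paper's pipeline: it only verifies balanced parentheses and a single top-level `define` form, which is necessary but far from sufficient for valid PDDL, and says nothing about semantic usefulness.

```python
def tokenize(text):
    """Split PDDL text into parentheses and atoms, dropping ';' comments."""
    tokens = []
    for line in text.splitlines():
        line = line.split(";", 1)[0]
        tokens.extend(line.replace("(", " ( ").replace(")", " ) ").split())
    return tokens


def parse(tokens):
    """Parse tokens into nested lists; raise ValueError on unbalanced parentheses."""
    stack, current = [], []
    for tok in tokens:
        if tok == "(":
            stack.append(current)
            current = []
        elif tok == ")":
            if not stack:
                raise ValueError("unexpected ')'")
            parent = stack.pop()
            parent.append(current)
            current = parent
        else:
            current.append(tok)
    if stack:
        raise ValueError("missing ')'")
    return current


def looks_like_pddl(text):
    """Return True if `text` is exactly one balanced (define ...) form."""
    try:
        forms = parse(tokenize(text))
    except ValueError:
        return False
    return (
        len(forms) == 1
        and isinstance(forms[0], list)
        and forms[0][:1] == ["define"]
    )


print(looks_like_pddl("(define (domain exploration) (:requirements :strips))"))
```

A filter like this could reject malformed LLM output before the solver is invoked, leaving solver and simulation errors to drive the semantic refinement.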

pith-pipeline@v0.9.0 · 5647 in / 1243 out tokens · 28972 ms · 2026-05-22T14:21:06.874203+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. KGLAMP: Knowledge Graph-guided Language model for Adaptive Multi-robot Planning and Replanning

    cs.RO 2026-02 unverdicted novelty 6.0

    KGLAMP uses a dynamically updated knowledge graph to guide LLMs in creating and replanning PDDL specifications for heterogeneous multi-robot teams, reporting at least 25.3% better performance than LLM-only or classica...
