ProgPrompt: Generating Situated Robot Task Plans using Large Language Models
Recognition: 1 theorem link · Lean theorem
Pith reviewed 2026-05-17 04:17 UTC · model grok-4.3
The pith
Structuring LLM prompts as executable programs lets robots generate valid task plans across different situated environments, robot capabilities, and tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ProgPrompt is a prompt structure that supplies the LLM with program-like specifications of the actions and objects available in the current environment, together with example executable programs. This format produces plans that remain functional across different situated environments, robot capabilities, and tasks, without generating actions impossible in the robot's context.
What carries the argument
The ProgPrompt structure: a prompt containing code-style definitions of actions and objects plus example programs that the robot can actually execute.
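As an illustration only — the primitive names, object list, and template below are hypothetical, not the paper's exact prompt — a program-like prompt of this kind might be assembled as:

```python
# Hypothetical sketch of a ProgPrompt-style prompt: import-style action
# declarations, the scene's object list, one executable example program,
# and an incomplete function header the LLM is asked to complete.

def build_prompt(actions, objects, example, task):
    """Assemble a program-like prompt: actions, scene objects, example, query."""
    header = "from actions import " + ", ".join(actions)
    scene = "objects = [" + ", ".join(repr(o) for o in objects) + "]"
    query = f"def {task}():"  # the LLM continues from here
    return "\n\n".join([header, scene, example, query])

prompt = build_prompt(
    actions=["grab", "putin", "open", "close"],
    objects=["salmon", "fridge", "microwave"],
    example=(
        "def microwave_salmon():\n"
        "    # 1: grab the salmon\n"
        "    grab('salmon')\n"
        "    # 2: open the microwave\n"
        "    open('microwave')\n"
        "    # 3: put the salmon inside and close it\n"
        "    putin('salmon', 'microwave')\n"
        "    close('microwave')"
    ),
    task="put_salmon_in_fridge",
)
print(prompt)
```

Because the environment's actions and objects are declared explicitly, the completion is steered toward calls the robot can actually execute; swapping the scene or the action list changes the prompt without changing the template.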
If this is right
- Plans are produced without needing to enumerate every possible next action for scoring.
- The same prompt template works for robots with different capabilities and in different environments.
- State-of-the-art success rates are achieved on VirtualHome household tasks.
- The method transfers to physical robot arms performing tabletop tasks.
Where Pith is reading between the lines
- The same style of prompt could be tried for planning problems outside robotics, such as software task automation.
- Varying the number or complexity of example programs in the prompt might improve reliability on longer-horizon tasks.
- If the LLM occasionally still suggests invalid steps, a lightweight execution check could filter them without losing the benefits of the prompt structure.
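The last point could be sketched as follows; the step format and the `preconditions_met` predicate are assumptions for illustration, not part of the paper's method:

```python
# Hypothetical lightweight execution check: walk a generated plan step by
# step, dropping any step whose action is unknown to the robot or whose
# preconditions fail in the current state.

def filter_plan(steps, known_actions, preconditions_met):
    """Return the subsequence of steps the robot can actually execute."""
    executable = []
    for action, args in steps:
        if action not in known_actions:
            continue  # the LLM invented an action this robot lacks
        if not preconditions_met(action, args):
            continue  # e.g. target object unreachable in the current state
        executable.append((action, args))
    return executable

plan = [
    ("grab", ("salmon",)),
    ("teleport", ("fridge",)),  # not a primitive on this robot
    ("putin", ("salmon", "fridge")),
]
ok = filter_plan(plan, {"grab", "putin", "open"}, lambda action, args: True)
print(ok)  # the "teleport" step is dropped
```

The prompt structure still does the heavy lifting; the filter only catches the residual invalid steps the referee worries about.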
Load-bearing premise
That giving the LLM program-like lists of actions and objects plus example programs will stop it from outputting actions impossible in the robot's present context.
What would settle it
Run the prompt on a new scene containing an object the prompt declares unavailable and check whether the generated plan ever references that object or an unavailable action.
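Such a check is easy to automate. A minimal sketch, assuming generated plans are lines of calls like `action('object', ...)` (a hypothetical format):

```python
import re

# Scan a generated plan for references to actions or objects that the
# prompt declared unavailable in the current scene.

CALL = re.compile(r"(\w+)\((.*?)\)")

def invalid_references(plan_text, available_actions, available_objects):
    """Return (kind, name) pairs that violate the scene specification."""
    violations = []
    for action, arglist in CALL.findall(plan_text):
        objs = [a.strip().strip("'\"") for a in arglist.split(",") if a.strip()]
        if action not in available_actions:
            violations.append(("action", action))
        for obj in objs:
            if obj not in available_objects:
                violations.append(("object", obj))
    return violations

plan = "grab('knife')\nputin('knife', 'drawer')"
print(invalid_references(plan, {"grab", "putin"}, {"salmon", "fridge", "drawer"}))
# 'knife' was declared unavailable, so both references to it are flagged
```

A plan that never triggers a violation on scenes with deliberately withheld objects would be direct evidence for the load-bearing premise.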
read the original abstract
Task planning can require defining myriad domain knowledge about the world in which a robot needs to act. To ameliorate that effort, large language models (LLMs) can be used to score potential next actions during task planning, and even generate action sequences directly, given an instruction in natural language with no additional domain information. However, such methods either require enumerating all possible next steps for scoring, or generate free-form text that may contain actions not possible on a given robot in its current context. We present a programmatic LLM prompt structure that enables plan generation functional across situated environments, robot capabilities, and tasks. Our key insight is to prompt the LLM with program-like specifications of the available actions and objects in an environment, as well as with example programs that can be executed. We make concrete recommendations about prompt structure and generation constraints through ablation experiments, demonstrate state of the art success rates in VirtualHome household tasks, and deploy our method on a physical robot arm for tabletop tasks. Website at progprompt.github.io
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that a programmatic LLM prompt structure—specifying available actions and objects in program-like form along with executable example programs—enables generation of situated robot task plans that remain functional across environments, robot capabilities, and tasks. It reports ablation experiments on prompt structure and generation constraints, state-of-the-art success rates on VirtualHome household tasks, and successful physical deployment on a robot arm for tabletop tasks.
Significance. If the empirical results hold, the work offers a practical reduction in domain-knowledge engineering for robot planning by leveraging LLMs through structured prompts rather than free-form generation or exhaustive enumeration of next steps.
major comments (1)
- [Abstract; Method description of prompt structure] The central claim that the prompt structure reliably produces only contextually feasible plans rests on the LLM's implicit adherence to the supplied action/object specifications and examples. The abstract notes ablation experiments on prompt structure and generation constraints, yet the manuscript provides no description of an explicit runtime validity filter, precondition checker, or post-generation verification step; this leaves open the possibility that stochastic outputs can still include syntactically valid but semantically impossible actions (e.g., referencing absent objects or unmet preconditions) in novel settings or longer horizons.
minor comments (2)
- [Experiments section] In the VirtualHome results, clarify the precise success-rate numbers, number of trials, and exact baseline methods used for the SOTA comparison.
- [Ablation experiments] Specify the exact generation constraints (e.g., temperature, sampling method, or output formatting rules) applied during LLM inference.
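One concrete form such a generation constraint could take — an illustration, not the paper's actual decoding rule — is to truncate the model's output at the first line that violates the expected call grammar:

```python
import re

# Keep generated lines only while they match the allowed grammar
# (comments or calls with quoted string arguments); stop at the first
# violation, as if decoding had been halted there.

LINE = re.compile(r"^\s*(#.*|\w+\((?:'[^']*'(?:,\s*'[^']*')*)?\))\s*$")

def constrain_output(raw):
    kept = []
    for line in raw.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        if not LINE.match(line):
            break  # the model drifted into free-form text
        kept.append(line.strip())
    return kept

raw = "grab('salmon')\nputin('salmon', 'fridge')\nSure! Here is the plan..."
print(constrain_output(raw))  # the trailing chat-style text is cut off
```

Sampling parameters such as temperature would sit alongside a rule like this; the manuscript should state both explicitly.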
Simulated Author's Rebuttal
We thank the referee for their constructive feedback and for recognizing the practical value of structured prompting for reducing domain-knowledge engineering in robot task planning. We address the major comment below and will incorporate clarifications into the revised manuscript.
read point-by-point responses
-
Referee: [Abstract; Method description of prompt structure] The central claim that the prompt structure reliably produces only contextually feasible plans rests on the LLM's implicit adherence to the supplied action/object specifications and examples. The abstract notes ablation experiments on prompt structure and generation constraints, yet the manuscript provides no description of an explicit runtime validity filter, precondition checker, or post-generation verification step; this leaves open the possibility that stochastic outputs can still include syntactically valid but semantically impossible actions (e.g., referencing absent objects or unmet preconditions) in novel settings or longer horizons.
Authors: We thank the referee for this observation. Our method is deliberately designed without an explicit runtime validity filter, precondition checker, or post-generation verification step; the core idea is to leverage the LLM through a structured prompt that supplies the current environment's available actions and objects in program-like form together with executable example programs. This prompt is dynamically instantiated for each situated context, so the LLM is instructed to generate plans using only the listed primitives. Ablation experiments (reported in the manuscript) confirm that removing the action/object specifications or the examples substantially degrades success rates, supporting the value of this structure. While the stochastic nature of LLMs means invalid outputs remain theoretically possible, our VirtualHome results and physical robot deployments demonstrate that such cases are infrequent when the recommended prompt structure and generation constraints are used. We will revise the manuscript to (1) explicitly state the absence of an external verifier, (2) elaborate on how the prompt construction process enforces contextual adherence, and (3) add a short discussion of observed failure modes and behavior on longer horizons.
revision: yes
Circularity Check
No significant circularity; empirical validation on external benchmarks
full rationale
The paper presents a prompting method for LLMs to generate situated robot task plans and supports its claims through ablation experiments, success rates on the VirtualHome benchmark, and physical robot deployment. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the argument. The central technique relies on explicit prompt structure (action/object specs plus executable examples) whose effectiveness is tested against independent external environments rather than reducing to its own inputs by construction. This is the most common honest outcome for an empirical methods paper.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: LLMs possess sufficient commonsense and programming knowledge to produce valid executable plans when given environment specifications and examples
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear?
unclear: relation between the paper passage and the cited Recognition theorem.
We present a programmatic LLM prompt structure that enables plan generation functional across situated environments, robot capabilities, and tasks. Our key insight is to prompt the LLM with program-like specifications of the available actions and objects in an environment, as well as with example programs that can be executed.
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 17 Pith papers
-
Using large language models for embodied planning introduces systematic safety risks
LLM planners for robots often produce dangerous plans even when planning succeeds, with safety awareness staying flat as model scale improves planning ability.
-
ST-BiBench: Benchmarking Multi-Stream Multimodal Coordination in Bimanual Embodied Tasks for MLLMs
ST-BiBench reveals a coordination paradox in which MLLMs show strong high-level strategic reasoning yet fail at fine-grained 16-dimensional bimanual action synthesis and multi-stream fusion.
-
VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models
VoxPoser uses LLMs to compose 3D value maps via VLM interaction for model-based synthesis of robust robot trajectories on open-set language-specified manipulation tasks.
-
Voyager: An Open-Ended Embodied Agent with Large Language Models
Voyager achieves superior lifelong learning in Minecraft by combining an automatic exploration curriculum, a library of executable skills, and iterative LLM prompting with environment feedback, yielding 3.3x more uniq...
-
LLM+P: Empowering Large Language Models with Optimal Planning Proficiency
LLM+P lets LLMs solve planning problems optimally by converting them to PDDL for classical planners and back to natural language.
-
When Robots Do the Chores: A Benchmark and Agent for Long-Horizon Household Task Execution
LongAct benchmark reveals top VLMs reach only 59% goal completion and 16% full success on long-horizon household tasks, while HoloMind agent improves results via DAG planner, multimodal spatial memory, episodic memory...
-
From Reaction to Anticipation: Proactive Failure Recovery through Agentic Task Graph for Robotic Manipulation
AgentChord models manipulation tasks as directed graphs enriched with anticipatory recovery branches, using specialized agents to enable immediate, low-latency failure responses and improve success on long-horizon bim...
-
Re$^2$MoGen: Open-Vocabulary Motion Generation via LLM Reasoning and Physics-Aware Refinement
Re²MoGen generates open-vocabulary motions via MCTS-enhanced LLM keyframe planning, pose-prior optimization with dynamic temporal matching fine-tuning, and physics-aware RL post-training, claiming SOTA performance.
-
A Physical Agentic Loop for Language-Guided Grasping with Execution-State Monitoring
A physical agentic loop with execution-state monitoring improves robustness of language-guided grasping over open-loop execution by converting noisy telemetry into discrete outcome events that trigger retries or user ...
-
SoK: Agentic Skills -- Beyond Tool Use in LLM Agents
The paper systematizes agentic skills beyond tool use, providing design pattern and representation-scope taxonomies plus security analysis of malicious skill infiltration in agent marketplaces.
-
Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory
GITM uses LLMs to generate action plans from text knowledge and memory, enabling agents to complete long-horizon Minecraft tasks at much higher success rates than prior RL methods.
-
Reasoning with Language Model is Planning with World Model
RAP turns LLMs into dual world-model and planning agents via MCTS to generate better reasoning paths, outperforming CoT baselines and achieving 33% relative gains over GPT-4 CoT using LLaMA-33B on plan generation.
-
PaLM-E: An Embodied Multimodal Language Model
PaLM-E is a single 562B-parameter multimodal model that performs embodied reasoning tasks like robotic manipulation planning and visual question answering by interleaving vision, state, and text inputs with positive t...
-
Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents
DEPS combines LLM-based interactive planning with a trainable goal selector to create a zero-shot multi-task agent that completes 70+ Minecraft tasks and nearly doubles prior performance.
-
Externalization in LLM Agents: A Unified Review of Memory, Skills, Protocols and Harness Engineering
LLM agent progress depends on externalizing cognitive functions into memory, skills, protocols, and harness engineering that coordinates them reliably.
-
A Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation
HECG combines multi-dimensional metrics for strategy choice, ten-type error classification with recoverability details, and causal-context graphs to improve LLM agent reliability in complex tasks.
-
LEO-RobotAgent: A General-purpose Robotic Agent for Language-driven Embodied Operator
LEO-RobotAgent is a general-purpose framework that enables LLMs to independently plan, use tools, and collaborate with humans while operating multiple robot types for unpredictable tasks.
Reference graph
Works this paper leans on
-
[1]
Inner Monologue: Embodied Reasoning through Planning with Language Models
W. Huang, F. Xia, T. Xiao, H. Chan, J. Liang, P. Florence, A. Zeng, J. Tompson, I. Mordatch, Y. Chebotar, P. Sermanet, N. Brown, T. Jackson, L. Luu, S. Levine, K. Hausman, and B. Ichter, “Inner monologue: Embodied reasoning through planning with language models,” in arXiv preprint arXiv:2207.05608, 2022
work page 2022
-
[2]
Language models as zero-shot planners: Extracting actionable knowledge for embodied agents,
W. Huang, P. Abbeel, D. Pathak, and I. Mordatch, “Language models as zero-shot planners: Extracting actionable knowledge for embodied agents,” arXiv preprint arXiv:2201.07207, 2022
-
[3]
Socratic models: Composing zero-shot multimodal reasoning with language,
A. Zeng, M. Attarian, B. Ichter, K. Choromanski, A. Wong, S. Welker, F. Tombari, A. Purohit, M. Ryoo, V. Sindhwani, J. Lee, V. Vanhoucke, and P. Florence, “Socratic models: Composing zero-shot multimodal reasoning with language,” arXiv, 2022
work page 2022
-
[4]
Do as i can, not as i say: Grounding language in robotic affordances,
M. Ahn, A. Brohan, N. Brown, Y. Chebotar, O. Cortes, B. David, C. Finn, K. Gopalakrishnan, K. Hausman, A. Herzog, D. Ho, J. Hsu, J. Ibarz, B. Ichter, A. Irpan, E. Jang, R. J. Ruano, K. Jeffrey, S. Jesmonth, N. J. Joshi, R. Julian, D. Kalashnikov, Y. Kuang, K.-H. Lee, S. Levine, Y. Lu, L. Luu, C. Parada, P. Pastor, J. Quiambao, K. Rao, J. Rettinghouse,...
work page 2022
-
[5]
Strips: A new approach to the application of theorem proving to problem solving,
R. E. Fikes and N. J. Nilsson, “Strips: A new approach to the application of theorem proving to problem solving,” in Proceedings of the 2nd International Joint Conference on Artificial Intelligence , ser. IJCAI’71. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 1971, p. 608–620
work page 1971
-
[6]
Pddlstream: Integrating symbolic planners and blackbox samplers via optimistic adaptive planning,
C. R. Garrett, T. Lozano-Pérez, and L. P. Kaelbling, “Pddlstream: Integrating symbolic planners and blackbox samplers via optimistic adaptive planning,” Proceedings of the International Conference on Automated Planning and Scheduling, vol. 30, no. 1, pp. 440–448, Jun. 2020
work page 2020
-
[7]
Task planning in robotics: an empirical comparison of pddl-based and asp-based systems,
Y. Jiang, S. Zhang, P. Khandelwal, and P. Stone, “Task planning in robotics: an empirical comparison of pddl-based and asp-based systems,” 2018
work page 2018
-
[8]
Virtualhome: Simulating household activities via programs,
X. Puig, K. Ra, M. Boben, J. Li, T. Wang, S. Fidler, and A. Torralba, “Virtualhome: Simulating household activities via programs,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 8494–8502
work page 2018
-
[9]
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks,
M. Shridhar, J. Thomason, D. Gordon, Y. Bisk, W. Han, R. Mottaghi, L. Zettlemoyer, and D. Fox, “ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020
work page 2020
-
[10]
A heuristic search approach to planning with temporally extended preferences,
J. A. Baier, F. Bacchus, and S. A. McIlraith, “A heuristic search approach to planning with temporally extended preferences,” in Proceedings of the 20th International Joint Conference on Artificial Intelligence, ser. IJCAI’07. San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2007, p. 1808–1815
work page 2007
-
[11]
Ff: The fast-forward planning system,
J. Hoffmann, “Ff: The fast-forward planning system,” AI Magazine, vol. 22, no. 3, p. 57, Sep. 2001
work page 2001
-
[12]
The fast downward planning system,
M. Helmert, “The fast downward planning system,” J. Artif. Int. Res., vol. 26, no. 1, p. 191–246, Jul. 2006
work page 2006
-
[13]
A tutorial on planning graph based reachability heuristics,
D. Bryce and S. Kambhampati, “A tutorial on planning graph based reachability heuristics,” AI Magazine, vol. 28, no. 1, p. 47, Mar. 2007
work page 2007
-
[14]
Search on the replay buffer: Bridging planning and reinforcement learning,
B. Eysenbach, R. R. Salakhutdinov, and S. Levine, “Search on the replay buffer: Bridging planning and reinforcement learning,” in Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds., vol. 32. Curran Associates, Inc., 2019
work page 2019
-
[15]
Neural task programming: Learning to generalize across hierarchical tasks,
D. Xu, S. Nair, Y. Zhu, J. Gao, A. Garg, L. Fei-Fei, and S. Savarese, “Neural task programming: Learning to generalize across hierarchical tasks,” in 2018 IEEE International Conference on Robotics and Automation (ICRA), 2018, pp. 3795–3802
work page 2018
-
[16]
Regression planning networks,
D. Xu, R. Martín-Martín, D.-A. Huang, Y. Zhu, S. Savarese, and L. F. Fei-Fei, “Regression planning networks,” in Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds., vol. 32. Curran Associates, Inc., 2019
work page 2019
-
[17]
Inventing relational state and action abstractions for effective and efficient bilevel planning,
T. Silver, R. Chitnis, N. Kumar, W. McClinton, T. Lozano-Perez, L. P. Kaelbling, and J. Tenenbaum, “Inventing relational state and action abstractions for effective and efficient bilevel planning,” in The Multi-disciplinary Conference on Reinforcement Learning and Decision Making (RLDM), 2022
work page 2022
-
[18]
Value function spaces: Skill-centric state abstractions for long-horizon reasoning,
D. Shah, A. T. Toshev, S. Levine, and B. Ichter, “Value function spaces: Skill-centric state abstractions for long-horizon reasoning,” in International Conference on Learning Representations, 2022
work page 2022
-
[19]
Universal planning networks: Learning generalizable representations for visuomotor control,
A. Srinivas, A. Jabri, P. Abbeel, S. Levine, and C. Finn, “Universal planning networks: Learning generalizable representations for visuomotor control,” in Proceedings of the 35th International Conference on Machine Learning, ser. Proceedings of Machine Learning Research, J. Dy and A. Krause, Eds., vol. 80. PMLR, 10–15 Jul 2018, pp. 4732–4741
work page 2018
-
[20]
Learning plannable representations with causal infogan,
T. Kurutach, A. Tamar, G. Yang, S. J. Russell, and P. Abbeel, “Learning plannable representations with causal infogan,” in Advances in Neural Information Processing Systems, S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, Eds., vol. 31. Curran Associates, Inc., 2018
work page 2018
-
[21]
Grounding language to autonomously-acquired skills via goal generation,
A. Akakzia, C. Colas, P.-Y. Oudeyer, M. Chetouani, and O. Sigaud, “Grounding language to autonomously-acquired skills via goal generation,” in International Conference on Learning Representations, 2021
work page 2021
-
[22]
Hierarchical foresight: Self-supervised learning of long-horizon tasks via visual subgoal generation,
S. Nair and C. Finn, “Hierarchical foresight: Self-supervised learning of long-horizon tasks via visual subgoal generation,” in International Conference on Learning Representations, 2020
work page 2020
-
[23]
Language as an abstraction for hierarchical deep reinforcement learning,
Y. Jiang, S. S. Gu, K. P. Murphy, and C. Finn, “Language as an abstraction for hierarchical deep reinforcement learning,” in Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, Eds., vol. 32. Curran Associates, Inc., 2019
work page 2019
-
[24]
Ella: Exploration through learned language abstraction,
S. Mirchandani, S. Karamcheti, and D. Sadigh, “Ella: Exploration through learned language abstraction,” in Advances in Neural Information Processing Systems, M. Ranzato, A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, Eds., vol. 34. Curran Associates, Inc., 2021, pp. 29529–29540
work page 2021
-
[25]
Skill induction and planning with latent language,
P. Sharma, A. Torralba, and J. Andreas, “Skill induction and planning with latent language,” in Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). Dublin, Ireland: Association for Computational Linguistics, May 2022, pp. 1713–1726
work page 2022
-
[26]
Hierarchical planning for long-horizon manipulation with geometric and symbolic scene graphs,
Y. Zhu, J. Tremblay, S. Birchfield, and Y. Zhu, “Hierarchical planning for long-horizon manipulation with geometric and symbolic scene graphs,” 2020
work page 2020
-
[27]
Language models are few-shot learners,
T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, S. Agarwal, A. Herbert-Voss, G. Krueger, T. Henighan, R. Child, A. Ramesh, D. M. Ziegler, J. Wu, C. Winter, C. Hesse, M. Chen, E. Sigler, M. Litwin, S. Gray, B. Chess, J. Clark, C. Berner, S. McCandlish, A. Radford, I. Sutskever, and D. A...
work page 2020
-
[28]
Evaluating large language models trained on code,
M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. d. O. Pinto, J. Kaplan, H. Edwards, Y. Burda, N. Joseph, G. Brockman, A. Ray, R. Puri, G. Krueger, M. Petrov, H. Khlaaf, G. Sastry, P. Mishkin, B. Chan, S. Gray, N. Ryder, M. Pavlov, A. Power, L. Kaiser, M. Bavarian, C. Winter, P. Tillet, F. P. Such, D. Cummings, M. Plappert, F. Chantzis, E. Barnes, A. Herbert...
work page 2021
-
[29]
Effective approaches to attention-based neural machine translation,
T. Luong, H. Pham, and C. D. Manning, “Effective approaches to attention-based neural machine translation,” in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon, Portugal: Association for Computational Linguistics, Sept. 2015, pp. 1412–1421
work page 2015
-
[30]
Challenges in data-to-document generation,
S. Wiseman, S. Shieber, and A. Rush, “Challenges in data-to-document generation,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Copenhagen, Denmark: Association for Computational Linguistics, Sept. 2017, pp. 2253–2263
work page 2017
-
[31]
The curious case of neural text degeneration,
A. Holtzman, J. Buys, L. Du, M. Forbes, and Y. Choi, “The curious case of neural text degeneration,” in International Conference on Learning Representations, 2020
work page 2020
-
[32]
Visually-grounded planning without vision: Language models infer detailed plans from high-level instructions,
P. Jansen, “Visually-grounded planning without vision: Language models infer detailed plans from high-level instructions,” in Findings of the Association for Computational Linguistics: EMNLP 2020. Online: Association for Computational Linguistics, Nov. 2020, pp. 4412–4417
work page 2020
-
[33]
Pre-trained language models for interactive decision-making,
S. Li, X. Puig, C. Paxton, Y. Du, C. Wang, L. Fan, T. Chen, D.-A. Huang, E. Akyürek, A. Anandkumar, J. Andreas, I. Mordatch, A. Torralba, and Y. Zhu, “Pre-trained language models for interactive decision-making,” 2022
work page 2022
-
[34]
Mapping language models to grounded conceptual spaces,
R. Patel and E. Pavlick, “Mapping language models to grounded conceptual spaces,” in International Conference on Learning Representations, 2022
work page 2022
-
[35]
Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing,
P. Liu, W. Yuan, J. Fu, Z. Jiang, H. Hayashi, and G. Neubig, “Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing,” 2021
work page 2021
-
[36]
Chain of thought prompting elicits reasoning in large language models,
J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. Chi, Q. Le, and D. Zhou, “Chain of thought prompting elicits reasoning in large language models,” 2022
work page 2022
-
[37]
Object rearrangement using learned implicit collision functions,
M. Danielczuk, A. Mousavian, C. Eppner, and D. Fox, “Object rearrangement using learned implicit collision functions,” IEEE International Conference on Robotics and Automation (ICRA), 2021
work page 2021
-
[38]
Contact-graspnet: Efficient 6-dof grasp generation in cluttered scenes,
M. Sundermeyer, A. Mousavian, R. Triebel, and D. Fox, “Contact-graspnet: Efficient 6-dof grasp generation in cluttered scenes,” in 2021 IEEE International Conference on Robotics and Automation (ICRA), 2021, pp. 13438–13444
work page 2021
-
[39]
Open-vocabulary object detection via vision and language knowledge distillation,
X. Gu, T.-Y. Lin, W. Kuo, and Y. Cui, “Open-vocabulary object detection via vision and language knowledge distillation,” in International Conference on Learning Representations, 2022
work page 2022
discussion (0)