ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning
Pith reviewed 2026-05-23 23:44 UTC · model grok-4.3
The pith
A framework integrates LLMs with ROS so non-experts can program robots through natural language chat with automatic behavior extraction and feedback.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that connecting an AI agent inside ROS to open-source and commercial LLMs enables five things: automatic extraction of behaviors from LLM output; execution of those behaviors as ROS actions or services; three behavior modes (sequence, behavior tree, state machine); imitation learning to enlarge the action library; and reflection on human or environment feedback. They further claim that experiments confirm the setup handles diverse robotic scenarios.
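Of the three behavior modes named in the claim, the state machine is the least self-explanatory, so a minimal sketch may help. It assumes a transition table keyed by (state, outcome) pairs; the function `run_state_machine`, the state names, and the handlers are hypothetical stand-ins and are not taken from the ROS-LLM code.

```python
def run_state_machine(transitions, handlers, start, terminal, max_steps=20):
    """Run handlers state by state until a terminal state is reached.

    transitions maps (state, outcome) -> next state; handlers maps
    state -> callable returning an outcome string. All names are hypothetical.
    """
    state = start
    trace = [state]
    for _ in range(max_steps):
        if state in terminal:
            return trace
        outcome = handlers[state]()
        state = transitions[(state, outcome)]
        trace.append(state)
    raise RuntimeError("state machine did not terminate within max_steps")


# Stub handlers: the first grasp fails, the second succeeds.
grasp_outcomes = iter(["fail", "ok"])
handlers = {
    "approach": lambda: "ok",
    "grasp": lambda: next(grasp_outcomes),
}
transitions = {
    ("approach", "ok"): "grasp",
    ("grasp", "ok"): "done",
    ("grasp", "fail"): "approach",  # retry from the approach pose
}

trace = run_state_machine(transitions, handlers, "approach", {"done"})
print(trace)
```

In a real deployment each handler would presumably wrap a ROS action or service call; here plain callables keep the sketch runnable without ROS.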
What carries the argument
Automatic extraction of behaviors from LLM output and their direct mapping to executable ROS actions/services, which turns natural language into runnable robot programs across multiple structured modes.
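The extraction-and-mapping step can be pictured with a small sketch. Everything in it is an assumption made for illustration: the `<plan>` tag convention, the JSON step format, the `ACTION_LIBRARY` registry, and the function names are hypothetical and are not taken from the ROS-LLM code base. Plain Python callables stand in for ROS action and service clients.

```python
import json
import re

# Hypothetical action library: callables stand in for ROS action/service clients.
ACTION_LIBRARY = {
    "move_to": lambda target: f"moved to {target}",
    "grasp": lambda obj: f"grasped {obj}",
    "release": lambda: "released",
}

# Assumed convention: the LLM wraps its plan in <plan>...</plan> tags.
PLAN_PATTERN = re.compile(r"<plan>(.*?)</plan>", re.DOTALL)


def extract_plan(llm_output):
    """Return the list of steps found in the LLM reply, or raise ValueError."""
    match = PLAN_PATTERN.search(llm_output)
    if match is None:
        raise ValueError("no <plan> block found in LLM output")
    try:
        steps = json.loads(match.group(1))
    except json.JSONDecodeError as exc:
        raise ValueError(f"plan is not valid JSON: {exc}") from exc
    if not isinstance(steps, list):
        raise ValueError("plan must be a JSON list of steps")
    for step in steps:
        if not isinstance(step, dict) or step.get("action") not in ACTION_LIBRARY:
            raise ValueError(f"unknown or malformed step: {step!r}")
    return steps


def run_sequence(steps):
    """Execute steps in order (the 'sequence' mode) and collect the results."""
    return [ACTION_LIBRARY[s["action"]](**s.get("args", {})) for s in steps]


reply = (
    'Sure. <plan>[{"action": "move_to", "args": {"target": "table"}}, '
    '{"action": "grasp", "args": {"obj": "cup"}}]</plan>'
)
results = run_sequence(extract_plan(reply))
print(results)
```

Rejecting unknown actions before execution is the design choice worth noting: a plan that names an action outside the library fails at parse time rather than halfway through a run.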
If this is right
- Non-experts can specify task requirements through a chat interface without writing code.
- The same framework supports long-horizon tasks, tabletop rearrangements, and remote supervisory control.
- Imitation learning expands the library of available robot actions over time.
- LLM reflection improves execution by incorporating feedback from humans and the physical environment.
- Open-source release of the code allows others to reproduce results and extend the system.
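The reflection item in the list above amounts to a retry loop in which failure messages are fed back to the model. The sketch below is one possible shape for such a loop, not the authors' implementation: `run_with_reflection`, the stub LLM, and the stub environment are all hypothetical, and no real LLM or robot is called.

```python
def run_with_reflection(ask_llm, execute, task, max_attempts=3):
    """Request a plan, execute it, and feed failures back until it succeeds.

    ask_llm(task, feedback) -> plan and execute(plan) -> (ok, message)
    are caller-supplied callables (hypothetical interfaces).
    """
    feedback = []
    for attempt in range(1, max_attempts + 1):
        plan = ask_llm(task, feedback)
        ok, message = execute(plan)
        if ok:
            return plan, attempt
        feedback.append(message)  # environment or human feedback for the next try
    raise RuntimeError(f"no successful plan after {max_attempts} attempts: {feedback}")


def stub_llm(task, feedback):
    """Stub model: omits opening the gripper until told about the failure."""
    if any("gripper closed" in item for item in feedback):
        return ["open_gripper", "grasp"]
    return ["grasp"]


def stub_execute(plan):
    """Stub environment: grasping fails unless the gripper was opened first."""
    gripper_open = False
    for step in plan:
        if step == "open_gripper":
            gripper_open = True
        elif step == "grasp" and not gripper_open:
            return False, "grasp failed: gripper closed"
    return True, "ok"


plan, attempts = run_with_reflection(stub_llm, stub_execute, "pick up the cup")
print(plan, attempts)
```

The attempt budget matters for the review's later objections: the number of retries consumed per task is exactly the kind of quantity a quantitative evaluation could report.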
Where Pith is reading between the lines
- The approach could let non-programmers deploy robots in settings such as homes or small workshops where expert coders are unavailable.
- Similar LLM-to-action pipelines might apply to other middleware besides ROS if the extraction step generalizes.
- Long-term use could reveal whether repeated feedback loops reduce the rate of mapping errors over successive tasks.
Load-bearing premise
The automatic extraction of behaviors from LLM output can be mapped reliably to executable ROS actions/services without frequent human intervention or failure across varied prompts and environments.
What would settle it
A test set of new prompts and environments in which the system fails to produce correct ROS mappings from LLM outputs in more than a small fraction of trials.
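One way to make "more than a small fraction of trials" operational is to compare a confidence interval on the observed mapping-failure rate against a tolerated rate. The sketch below uses the Wilson score interval; the tolerated rate and the helper names `wilson_interval` and `claim_refuted` are assumptions introduced here, since the review does not fix a threshold.

```python
import math


def wilson_interval(failures, trials, z=1.96):
    """Wilson score interval for a failure proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = failures / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def claim_refuted(failures, trials, tolerated_rate):
    """True when even the lower bound of the interval exceeds the tolerated rate."""
    low, _ = wilson_interval(failures, trials)
    return low > tolerated_rate


# Hypothetical counts, chosen only to exercise the functions.
print(wilson_interval(30, 100))
print(claim_refuted(30, 100, tolerated_rate=0.05))
print(claim_refuted(2, 100, tolerated_rate=0.05))
```

Using the lower bound makes the test conservative: a handful of failures in a small test set would not by itself count as refuting the premise.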
Original abstract
We present a framework for intuitive robot programming by non-experts, leveraging natural language prompts and contextual information from the Robot Operating System (ROS). Our system integrates large language models (LLMs), enabling non-experts to articulate task requirements to the system through a chat interface. Key features of the framework include: integration of ROS with an AI agent connected to a plethora of open-source and commercial LLMs, automatic extraction of a behavior from the LLM output and execution of ROS actions/services, support for three behavior modes (sequence, behavior tree, state machine), imitation learning for adding new robot actions to the library of possible actions, and LLM reflection via human and environment feedback. Extensive experiments validate the framework, showcasing robustness, scalability, and versatility in diverse scenarios, including long-horizon tasks, tabletop rearrangements, and remote supervisory control. To facilitate the adoption of our framework and support the reproduction of our results, we have made our code open-source. You can access it at: https://github.com/huawei-noah/HEBO/tree/master/ROSLLM.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents ROS-LLM, a framework integrating large language models with ROS to enable non-experts to program robots via natural language chat. It features an AI agent supporting multiple LLMs, automatic extraction of LLM outputs into one of three behavior representations (sequence, behavior tree, state machine) for mapping to ROS actions/services, imitation learning to extend the action library, and LLM reflection using human/environment feedback. The authors claim extensive experiments demonstrate robustness, scalability, and versatility across long-horizon tasks, tabletop rearrangements, and remote supervisory control, with open-source code provided for reproduction.
Significance. If the automatic extraction and execution pipeline proves reliable with minimal intervention, the framework could lower barriers to embodied AI deployment by combining structured reasoning modes with ROS primitives. The open-source release is a clear strength for reproducibility. However, without quantitative validation, the practical significance remains difficult to assess against prior integrations of LLMs with ROS.
major comments (2)
- [Abstract] Abstract: the claim that 'extensive experiments validate the framework, showcasing robustness, scalability, and versatility' is unsupported by any reported metrics (success rates, parsing failure counts, human intervention frequency, or baselines). This directly affects the central assertion that automatic extraction of behaviors from LLM output can be mapped reliably to executable ROS actions/services without frequent human intervention.
- [Key features / Experimental validation] The description of the extraction and reflection mechanism (key features paragraph) asserts reliable mapping across prompt variations and environments, yet no quantitative evidence (e.g., extraction accuracy, retry counts, or task completion rates) is supplied for the long-horizon or tabletop scenarios. This leaves the 'intuitive for non-experts' requirement unverified.
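The metrics the referee asks for could be derived from per-trial logs with very little machinery. The sketch below shows one possible aggregation; the `Trial` record, its fields, and the sample values are placeholders invented for illustration and are not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    scenario: str        # e.g. "long-horizon" or "tabletop"
    parsed: bool         # LLM output was extracted into a valid behavior
    completed: bool      # task finished successfully
    interventions: int   # number of human corrections during the trial


def summarize(trials):
    """Aggregate parse failures, task success, and interventions over trials."""
    n = len(trials)
    if n == 0:
        raise ValueError("no trials to summarize")
    return {
        "trials": n,
        "parse_failure_rate": sum(not t.parsed for t in trials) / n,
        "task_success_rate": sum(t.completed for t in trials) / n,
        "interventions_per_trial": sum(t.interventions for t in trials) / n,
    }


# Placeholder records, not measurements.
log = [
    Trial("tabletop", parsed=True, completed=True, interventions=0),
    Trial("tabletop", parsed=True, completed=False, interventions=1),
    Trial("long-horizon", parsed=False, completed=False, interventions=0),
    Trial("long-horizon", parsed=True, completed=True, interventions=1),
]
summary = summarize(log)
print(summary)
```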
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major comment below and agree that revisions are required to ensure claims are supported by the presented material.
- Referee: [Abstract] Abstract: the claim that 'extensive experiments validate the framework, showcasing robustness, scalability, and versatility' is unsupported by any reported metrics (success rates, parsing failure counts, human intervention frequency, or baselines). This directly affects the central assertion that automatic extraction of behaviors from LLM output can be mapped reliably to executable ROS actions/services without frequent human intervention.
  Authors: We agree that the abstract claim is not supported by quantitative metrics, as the manuscript presents only qualitative demonstrations of the framework in long-horizon tasks, tabletop rearrangements, and remote control scenarios. We will revise the abstract to describe these as illustrative examples of the framework's capabilities rather than claiming validation of robustness, scalability, or versatility through metrics. The central assertion about reliable mapping will also be qualified to reflect the absence of such data.
  Revision: yes
- Referee: [Key features / Experimental validation] The description of the extraction and reflection mechanism (key features paragraph) asserts reliable mapping across prompt variations and environments, yet no quantitative evidence (e.g., extraction accuracy, retry counts, or task completion rates) is supplied for the long-horizon or tabletop scenarios. This leaves the 'intuitive for non-experts' requirement unverified.
  Authors: We acknowledge that assertions of reliable mapping and intuitiveness for non-experts in the key features and experimental sections lack supporting quantitative evidence such as accuracy rates or intervention counts. The current text relies on descriptive examples. We will revise these sections to remove or qualify claims of reliability and to clarify that the non-expert usability is a design goal illustrated by the chat interface and behavior modes, without empirical verification in the manuscript. A limitations discussion will be added if appropriate.
  Revision: yes
Circularity Check
No circularity: engineering integration validated externally
The paper describes a software framework integrating ROS with LLMs for robot task execution via natural language. No mathematical derivations, fitted parameters, or equations are present. Validation relies on external robot performance in experiments (long-horizon tasks, rearrangements, remote control), not on self-referential definitions or self-citation chains. The extraction and mapping steps are implementation details whose reliability is claimed to be measured by system behavior, not reduced to inputs by construction. This matches the default non-circular case for systems papers.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLM-generated text can be automatically parsed into valid sequence, behavior tree, or state machine structures that execute correctly on ROS.
- standard math: ROS actions and services provide a stable interface for execution and feedback collection.
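The first axiom presumes that a parsed structure can actually be executed. For the behavior tree case, a minimal sketch of what "execute" means is given below. It is a simplification under stated assumptions: only SUCCESS and FAILURE are modelled (the RUNNING status used by most behavior tree libraries is omitted), and the node classes and action names are hypothetical rather than drawn from the ROS-LLM code.

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"


class Action:
    """Leaf node: wraps a callable that returns True on success."""

    def __init__(self, name, fn):
        self.name = name
        self.fn = fn

    def tick(self, log):
        log.append(self.name)
        return SUCCESS if self.fn() else FAILURE


class Sequence:
    """Succeeds only if every child succeeds, ticking them in order."""

    def __init__(self, *children):
        self.children = children

    def tick(self, log):
        for child in self.children:
            if child.tick(log) == FAILURE:
                return FAILURE
        return SUCCESS


class Fallback:
    """Succeeds as soon as one child succeeds, ticking them in order."""

    def __init__(self, *children):
        self.children = children

    def tick(self, log):
        for child in self.children:
            if child.tick(log) == SUCCESS:
                return SUCCESS
        return FAILURE


# Hypothetical tree: try a top grasp, fall back to a side grasp, then place.
tree = Sequence(
    Action("move_to_table", lambda: True),
    Fallback(
        Action("grasp_top", lambda: False),
        Action("grasp_side", lambda: True),
    ),
    Action("place", lambda: True),
)
log = []
status = tree.tick(log)
print(status, log)
```

The fallback node is what distinguishes this mode from a flat sequence: a failed step can be recovered inside the structure without returning to the LLM.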
Forward citations
Cited by 2 Pith papers
- From Prompt to Physical Action: Structured Backdoor Attacks on LLM-Mediated Robotic Control Systems
  Backdoor attacks aligned with JSON command formats in LLM robot controllers achieve 83% attack success rate while preserving over 93% clean accuracy and sub-second latency.
- ORICF -- Open Robotics Inference and Control Framework
  ORICF is a declarative, model-agnostic robotics framework with YAML specs and edge offloading that reduces robot compute utilization by up to 83% and energy by 66% in a ROS2 demo combining ASR, LLM, and CNN.