pith. machine review for the scientific record.

arxiv: 2605.07877 · v1 · submitted 2026-05-08 · 💻 cs.RO

Recognition: no theorem link

Melding LLM and temporal logic for reliable human-swarm collaboration in complex scenarios

Pith reviewed 2026-05-11 02:25 UTC · model grok-4.3

classification 💻 cs.RO
keywords human-swarm collaboration · temporal logic · large language models · task planning · neuro-symbolic framework · robot swarms · uncertainty-aware scheduling

The pith

Temporal logic constraints guide LLMs to generate valid, scene-grounded task sequences for robot swarms

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes a framework that combines formal temporal logic formulas and task automata with large language models to create executable subtask plans for groups of robots. This conditioning on rules and current scene data aims to eliminate invalid task orderings and infeasible actions that often occur with unguided LLMs during long missions. An uncertainty-aware scheduler then distributes the subtasks across a mixed fleet of robots to increase parallel work and maintain performance when disruptions occur. The approach adds an event-triggered protocol so human operators only provide occasional high-level input rather than constant oversight. Deployment tests on physical heterogeneous robots show the plans remain workable despite variations in hardware and communication.

Core claim

The authors formalize mission goals and operational rules as temporal logic formulas, with admissible task orderings expressed as task automata. Conditioned on these formal constraints plus live perceptual context, LLMs produce executable subtask sequences that satisfy the rules and stay grounded in the observed scene. An uncertainty-aware scheduler assigns the subtasks across the heterogeneous swarm to maximize parallelism and resilience to disruptions. An event-triggered interaction protocol restricts operator involvement to sparse high-level confirmation and guidance. Tests on a heterogeneous robotic fleet produce comparable outcomes while staying robust to hardware-specific actuation and communication uncertainties.

What carries the argument

A neuro-symbolic framework that conditions LLMs on temporal logic formulas and task automata to produce rule-compliant subtask sequences, followed by an uncertainty-aware scheduler for swarm assignment.

If this is right

  • Subtask sequences obey mission rules and avoid infeasible robot actions as conditions change.
  • Task assignments maximize parallel execution across heterogeneous robots while tolerating uncertainty.
  • Operator involvement shrinks to sparse high-level confirmations via an event-triggered protocol.
  • Execution on physical fleets stays robust despite hardware and communication variations.
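
To make the assignment idea concrete, here is a minimal sketch of one way subtasks could be spread across a mixed fleet with slack for uncertain durations. It is a simple greedy heuristic chosen for illustration, not the scheduler used in the paper. It ignores ordering constraints between subtasks, and the robot names, skills, durations, and the `risk` parameter are all assumptions.

```python
def schedule(subtasks, robots, risk=1.0):
    """Greedy assignment: each subtask goes to the capable robot that frees up earliest.

    subtasks: list of (name, required_skill, mean_duration, std_duration)
    robots:   dict mapping robot name -> set of skills
    risk:     how many standard deviations of slack to budget per subtask
    Returns (assignment, makespan), both computed with padded durations.
    """
    free_at = {name: 0.0 for name in robots}
    assignment = {}
    # Place the longest padded subtasks first so shorter ones fill the gaps.
    ordered = sorted(subtasks, key=lambda t: t[2] + risk * t[3], reverse=True)
    for name, skill, mean, std in ordered:
        capable = [r for r, skills in robots.items() if skill in skills]
        if not capable:
            raise ValueError(f"no robot can perform {name}")
        chosen = min(capable, key=lambda r: free_at[r])
        free_at[chosen] += mean + risk * std
        assignment[name] = chosen
    return assignment, max(free_at.values())


# Hypothetical fleet and subtasks, for illustration only.
robots = {"uav": {"fly"}, "ugv1": {"drive"}, "ugv2": {"drive"}}
subtasks = [("survey", "fly", 10, 2), ("haul_a", "drive", 8, 1), ("haul_b", "drive", 6, 1)]
assignment, makespan = schedule(subtasks, robots)
```

Raising `risk` pads every duration more heavily, which trades a longer planned makespan for more tolerance to delays.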

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same conditioning approach could be tested on other multi-agent systems such as drone fleets or autonomous vehicle platoons.
  • Tighter integration with onboard perception might further reduce the need for any external scene description.
  • Longer autonomous runs become feasible if the reduction in invalid plans holds across varied mission types.

Load-bearing premise

Conditioning LLMs on temporal logic formulas and task automata will consistently prevent invalid task orderings and infeasible actions in dynamic scenarios without needing frequent manual adjustments.

What would settle it

An experiment in which the LLM, when given the temporal logic rules, task automata, and current scene data, outputs a subtask sequence that still violates the rules or proposes an action the robots cannot perform in the observed environment.
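
A minimal sketch of how one trial of such an experiment could be scored, assuming a hypothetical three-subtask mission (scout, extinguish, report) and a hand-written deterministic task automaton. None of these names come from the paper; the point is only that a violation can be detected mechanically once the automaton and the set of currently feasible actions are fixed.

```python
# Hypothetical task automaton: states are mission phases, edges are subtasks.
# All names here are illustrative assumptions, not taken from the paper.
TRANSITIONS = {
    ("start", "scout"): "scouted",
    ("scouted", "extinguish"): "cleared",
    ("cleared", "report"): "done",
}
ACCEPTING = {"done"}


def first_violation(sequence, feasible):
    """Return the index of the first offending subtask, or None if the plan is valid.

    A subtask offends if the scene marks it infeasible or the automaton has no
    transition for it from the current state. A plan that ends outside an
    accepting state is reported at index len(sequence).
    """
    state = "start"
    for i, subtask in enumerate(sequence):
        if subtask not in feasible or (state, subtask) not in TRANSITIONS:
            return i
        state = TRANSITIONS[(state, subtask)]
    return None if state in ACCEPTING else len(sequence)
```

Any LLM output for which `first_violation` returns something other than `None` would count as a counterexample in this toy setting.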

Figures

Figures reproduced from arXiv: 2605.07877 by An Zhuo, Guanghui Wen, Junfeng Chen, Meng Guo, Shuo Zhang, Xintong Zhang, Xiwang Dong, Yuxiao Zhu, Zhongkui Li.

Figure 1. An illustration of the proposed method for task planning
Figure 2. Framework Overview. (A) …
Figure 3. Planning reliability comparison
Figure 4. Human-swarm collaboration experiment. (A) Human–robot interaction and experimental setup of volunteer study: (i) Four kinds of human–swarm interaction interfaces. (ii) An experiment with 18 volunteers recording synchronous galvanic skin response (GSR) and heart rate (HR) via professional equipment. (B) Post-experiment: (i) NASA-Task Load Index (NASA-TLX) and System Usability Scale (SUS) questionnaires were…
Figure 5. Complex real …

Original abstract

Robot swarms promise scalable assistance in complex and hazardous environments. Task planning lies at the core of human-swarm collaboration, translating the operator's intent into coordinated swarm actions and helping determine when validation or intervention is required during execution. In long-horizon missions under dynamic scenarios, however, reliable task planning becomes difficult to maintain: emerging events and changing conditions demand continual adaptation, and sustained operator oversight imposes substantial cognitive burden. Existing LLM-based planning tools can support plan generation, yet they remain susceptible to invalid task orderings and infeasible robot actions, resulting in frequent manual adjustment. Here we introduce a neuro-symbolic framework for long-horizon human-swarm collaboration that tightly melds verifiable task planning with context-grounded LLM reasoning. We formalize mission goals and operational rules as temporal logic formulas and admissible task orderings as task automata. Conditioned on these formal constraints and live perceptual context, LLMs generate executable subtask sequences that satisfy mission rules and remain grounded in the current scene. An uncertainty-aware scheduler then assigns subtasks across the heterogeneous swarm to maximize parallelism while remaining resilient to disruptions. An event-triggered interaction protocol further limits operator involvement to sparse, high-level confirmation and guidance. Deployment on a heterogeneous robotic fleet yields similar results while remaining robust to hardware-specific actuation and communication uncertainties. Together, these results support a formal and scalable paradigm for reliable and low-overhead human-swarm collaboration in dynamic environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a neuro-symbolic framework for long-horizon human-swarm collaboration that formalizes mission goals as temporal logic formulas and task orderings as automata, then conditions LLMs on these constraints plus perceptual context to generate executable subtask sequences. An uncertainty-aware scheduler assigns subtasks across heterogeneous robots to maximize parallelism and resilience, while an event-triggered protocol limits operator input to sparse high-level guidance. The abstract claims that deployment on a real heterogeneous fleet produces similar results and remains robust to actuation and communication uncertainties.

Significance. If the central claims hold, the work would offer a concrete synthesis of LLM generation with formal verification tools that could reduce invalid plans in dynamic swarm settings and lower operator cognitive load. The explicit use of task automata and temporal logic to ground LLM outputs, together with the uncertainty-aware scheduler, provides a falsifiable structure that distinguishes it from purely prompting-based approaches; this is a strength worth preserving in revision.

major comments (2)
  1. [Abstract / framework description] Abstract and framework description: the central claim that 'Conditioned on these formal constraints and live perceptual context, LLMs generate executable subtask sequences that satisfy mission rules' is load-bearing yet rests on an unspecified conditioning procedure. No mention is made of constrained decoding, grammar-guided generation, post-generation verification against the task automaton, or rejection sampling; standard in-context prompting alone permits probabilistic violations of the automata or temporal logic. This mechanism gap directly affects the 'reliable' and 'verifiable' assertions.
  2. [Deployment results] Deployment results paragraph: the statement that 'Deployment on a heterogeneous robotic fleet yields similar results while remaining robust to hardware-specific actuation and communication uncertainties' is presented without metrics, baselines, error bars, or experimental protocol. Absence of quantitative evidence (success rate, recovery time under disruption, comparison to LLM-only or symbolic baselines) prevents assessment of whether the scheduler and protocol actually deliver the claimed resilience.
minor comments (1)
  1. [Abstract] The abstract would be strengthened by a single concrete example of a temporal logic formula and corresponding task automaton to illustrate the conditioning process.
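
For illustration only, a formula of the kind this comment asks for might look as follows. The atomic propositions scout, extinguish, and report are hypothetical and do not come from the paper; the formula is read over the sequence of completed subtasks.

```latex
\varphi \;=\; \Diamond\,\mathit{report}
\;\wedge\; \bigl(\neg\,\mathit{extinguish} \;\mathcal{U}\; \mathit{scout}\bigr)
\;\wedge\; \bigl(\neg\,\mathit{report} \;\mathcal{U}\; \mathit{extinguish}\bigr)
```

In words: a report is eventually filed, nothing is extinguished before scouting, and no report is filed before extinguishing. One task automaton compatible with this formula is the chain start → scouted → cleared → done, with edges labeled scout, extinguish, and report in that order; its single accepted sequence satisfies φ, though φ also admits other sequences.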

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help clarify key aspects of our framework. We address each major comment point by point below, indicating where revisions will be made to strengthen the manuscript.

point-by-point responses
  1. Referee: [Abstract / framework description] Abstract and framework description: the central claim that 'Conditioned on these formal constraints and live perceptual context, LLMs generate executable subtask sequences that satisfy mission rules' is load-bearing yet rests on an unspecified conditioning procedure. No mention is made of constrained decoding, grammar-guided generation, post-generation verification against the task automaton, or rejection sampling; standard in-context prompting alone permits probabilistic violations of the automata or temporal logic. This mechanism gap directly affects the 'reliable' and 'verifiable' assertions.

    Authors: We agree that the abstract and high-level framework description leave the LLM conditioning procedure underspecified, which weakens the support for the reliability and verifiability claims. The full manuscript integrates temporal logic and task automata with the LLM, but does not explicitly detail the enforcement steps in the overview sections. In the revised manuscript, we will expand both the abstract and the framework description (Section 3) to specify the conditioning mechanism: formal constraints are embedded in structured prompts, followed by post-generation verification against the task automaton with rejection sampling of non-compliant sequences. This revision will directly address the concern about potential violations.
    revision: yes

  2. Referee: [Deployment results] Deployment results paragraph: the statement that 'Deployment on a heterogeneous robotic fleet yields similar results while remaining robust to hardware-specific actuation and communication uncertainties' is presented without metrics, baselines, error bars, or experimental protocol. Absence of quantitative evidence (success rate, recovery time under disruption, comparison to LLM-only or symbolic baselines) prevents assessment of whether the scheduler and protocol actually deliver the claimed resilience.

    Authors: We acknowledge that the abstract's deployment claim is stated without accompanying quantitative evidence, making it difficult to evaluate the scheduler's and protocol's contributions to resilience. The manuscript's experimental section reports results from the real heterogeneous fleet deployment, including performance under uncertainties. We will revise the abstract to include a concise summary of key metrics (e.g., success rates, recovery behavior), baselines, and protocol references, ensuring the claims are quantitatively grounded while pointing readers to the detailed results in the main text.
    revision: yes
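
The verify-and-resample loop named in the first response (post-generation verification against the task automaton, with rejection sampling of non-compliant sequences) can be sketched roughly as below. This is an assumption-laden illustration, not the authors' implementation: the LLM is replaced by a random stub, the automaton is reduced to a successor table, and every identifier is hypothetical.

```python
import random

# Hypothetical admissible orderings, standing in for a task automaton.
ALLOWED_NEXT = {
    None: {"scout"},
    "scout": {"extinguish"},
    "extinguish": {"report"},
}
FINAL = "report"


def complies(sequence):
    """Check that each subtask is an admissible successor and the plan ends at FINAL."""
    prev = None
    for subtask in sequence:
        if subtask not in ALLOWED_NEXT.get(prev, set()):
            return False
        prev = subtask
    return prev == FINAL


def stub_llm(rng):
    """Stand-in for an LLM call: returns a random ordering of the three subtasks."""
    plan = ["scout", "extinguish", "report"]
    rng.shuffle(plan)
    return plan


def sample_compliant_plan(generate, max_attempts=50):
    """Draw candidates until one passes verification.

    Returns (plan, attempts_used), or None if the attempt budget runs out.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = generate()
        if complies(candidate):
            return candidate, attempt
    return None


rng = random.Random(0)
result = sample_compliant_plan(lambda: stub_llm(rng))
```

The `None` branch matters: rejection sampling guarantees that any returned plan passes the check, but not that a plan is returned within the budget, so a caller still needs a fallback such as escalating to the operator.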

Circularity Check

0 steps flagged

No circularity in framework synthesis

full rationale

The paper presents a neuro-symbolic framework that combines temporal logic formulas, task automata, LLM conditioning, and an uncertainty-aware scheduler as a new synthesis for human-swarm collaboration. No equations, derivations, or predictions are offered that reduce by construction to fitted parameters, self-definitions, or self-citation chains. The central claims describe an integration of existing formal methods with LLMs without any load-bearing step that equates outputs to inputs tautologically. The framework is self-contained as a proposed architecture, with no renaming of known results or smuggling of ansatzes via citation in the provided text.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The framework rests on standard assumptions about temporal logic expressiveness and automata for task ordering, plus newly introduced components whose behavior is asserted rather than derived from prior evidence.

axioms (2)
  • [domain assumption] Mission goals and operational rules can be formalized as temporal logic formulas that LLMs can be conditioned on to produce valid plans.
    Invoked in the description of how LLMs generate subtask sequences.
  • [domain assumption] Admissible task orderings can be represented as task automata that constrain LLM outputs.
    Used to ensure executable sequences without invalid orderings.
invented entities (2)
  • Uncertainty-aware scheduler (no independent evidence)
    purpose: Assign subtasks to heterogeneous robots to maximize parallelism while handling disruptions.
    New component introduced to manage swarm assignment under uncertainty.
  • Event-triggered interaction protocol (no independent evidence)
    purpose: Limit operator involvement to sparse high-level confirmations and guidance.
    New protocol for reducing cognitive burden on humans.
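
As a rough picture of what "event-triggered" could mean in practice, the sketch below routes only selected event kinds to the operator and handles the rest automatically. The event kinds, the trigger set, and the function names are assumptions made for illustration; the paper's actual trigger conditions are not described in the text above.

```python
from dataclasses import dataclass


@dataclass
class Event:
    kind: str    # e.g. "subtask_done", "robot_lost", "new_target"
    detail: str


# Hypothetical trigger set: only these event kinds interrupt the operator.
OPERATOR_TRIGGERS = {"robot_lost", "new_target"}


def run_mission(events, ask_operator):
    """Process events in order, calling ask_operator only for trigger events.

    Returns the number of operator queries and the log of decisions.
    """
    queries, log = 0, []
    for event in events:
        if event.kind in OPERATOR_TRIGGERS:
            queries += 1
            log.append((event.kind, ask_operator(event)))
        else:
            log.append((event.kind, "auto"))
    return queries, log
```

Counting `queries` against the total number of events gives a simple measure of how sparse the operator's involvement is for a given trigger set.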

pith-pipeline@v0.9.0 · 5580 in / 1426 out tokens · 32111 ms · 2026-05-11T02:25:32.029348+00:00 · methodology
