arxiv: 2605.04327 · v1 · submitted 2026-05-05 · 💻 cs.RO

From Language to Logic: A Theoretical Architecture for VLM-Grounded Safe Navigation

Pith reviewed 2026-05-08 16:58 UTC · model grok-4.3

classification 💻 cs.RO
keywords safe navigation · signal temporal logic · vision-language models · autonomous robots · natural language instructions · unstructured environments · robot planning · runtime monitoring

The pith

Natural-language safety rules translate into Signal Temporal Logic specifications to guide autonomous robot navigation via vision-language models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents an architecture for letting robots follow high-level safety rules and preferences given in natural language during outdoor navigation. Human instructions are converted into formal Signal Temporal Logic statements that direct path planning and enable runtime checks for compliance. Persistent rules and terrain preferences become part of a two-dimensional cost map, while time-varying conditions are expressed as logic formulas monitored as the robot moves. The approach assumes vision-language models can interpret scenes from images to connect words directly to real-world features and constraints without task-specific training. This setup produces a navigation model that aims to satisfy both strict logic requirements and softer operator preferences through embedded formal metrics.

Core claim

The architecture translates natural-language rules into Signal Temporal Logic specifications that guide planning and navigation during runtime. Persistent, environment-centric rules and terrain preferences are grounded into a 2D cost map, while temporally dynamic requirements are expressed as STL specifications to be monitored during runtime. Vision-Language Models enable zero-shot scene understanding that maps human instructions to semantic features and environmental constraints, supporting construction of an illustrative navigation model that satisfies the STL-encoded specifications and soft preferences through formal satisfaction metrics.
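To make the runtime-monitoring half of this claim concrete, here is a minimal sketch of quantitative (robustness) evaluation for two STL operators over a uniformly sampled signal. It is not taken from the paper: the function names, the discrete-time treatment, and the example rule and distances are illustrative assumptions.

```python
from typing import Callable, Sequence

# Hypothetical helpers: a discrete-time approximation of STL robustness.
# A predicate maps a sample to a signed margin (positive means satisfied).

def always(signal: Sequence[float], predicate: Callable[[float], float],
           start: int, end: int) -> float:
    """Robustness of G_[start,end] predicate: the worst margin in the window."""
    return min(predicate(x) for x in signal[start:end + 1])

def eventually(signal: Sequence[float], predicate: Callable[[float], float],
               start: int, end: int) -> float:
    """Robustness of F_[start,end] predicate: the best margin in the window."""
    return max(predicate(x) for x in signal[start:end + 1])

# Assumed example rule: "always stay more than 2.0 m from water".
distance_to_water = [5.0, 4.0, 3.5, 2.5, 3.0]
margin = lambda d: d - 2.0

rho = always(distance_to_water, margin, 0, 4)
print(rho)  # 0.5: positive, so the sampled trace satisfies the rule
```

A positive value says the rule held with that much slack; a negative value would report how badly it was violated, which is what lets a planner trade off margins rather than only pass or fail.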

What carries the argument

Natural-language rules are translated into Signal Temporal Logic (STL) specifications, and Vision-Language Models (VLMs) ground those instructions zero-shot into cost maps and runtime monitors.

If this is right

  • Persistent safety rules and operator preferences become encoded as costs in a 2D map used for path planning.
  • Temporally dynamic requirements can be checked continuously at runtime through STL monitoring.
  • The navigation planner can optimize paths to meet formal satisfaction metrics for both hard rules and soft preferences.
  • Zero-shot VLM grounding allows new rules to be added without retraining the system on specific environments.
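The cost-map and planning bullets above can be sketched as follows. This is an assumption-laden illustration rather than the paper's planner: the label grid stands in for VLM output, the cost table and the `plan` function are hypothetical, and Dijkstra on a 4-connected grid is swapped in for whatever planner the authors intend.

```python
import heapq
import math

# Assumed semantic labels per grid cell, as a VLM-based segmenter might emit.
LABELS = [
    ["grass", "grass", "water"],
    ["grass", "mud",   "water"],
    ["grass", "grass", "grass"],
]
# Assumed cost table: soft preferences are finite, hard rules are infinite.
COST = {"grass": 1.0, "mud": 5.0, "water": math.inf}

def plan(labels, cost, start, goal):
    """Dijkstra over a 4-connected grid; returns (total_cost, path)."""
    rows, cols = len(labels), len(labels[0])
    best = {start: 0.0}
    parent = {}
    queue = [(0.0, start)]
    while queue:
        dist, cell = heapq.heappop(queue)
        if cell == goal:
            path = [cell]
            while cell in parent:
                cell = parent[cell]
                path.append(cell)
            return dist, path[::-1]
        if dist > best.get(cell, math.inf):
            continue  # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols:
                new = dist + cost[labels[nr][nc]]  # cost of entering the cell
                if new < best.get(nxt, math.inf):  # infinite cost never passes
                    best[nxt] = new
                    parent[nxt] = cell
                    heapq.heappush(queue, (new, nxt))
    return math.inf, []

total, path = plan(LABELS, COST, (0, 0), (2, 2))
print(total, path)  # 4.0 [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
```

In this toy map the planner detours around the mud cell (a soft preference) and can never enter water (a hard rule), which is the distinction the bullets draw. Adding a new rule amounts to editing the cost table, not retraining anything.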

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Operators without programming skills could define complex safety behaviors for robots in the field by describing them in words.
  • The architecture might reduce reliance on hand-tuned navigation parameters when robots enter unfamiliar outdoor areas.
  • Uncertainty or errors from the VLM could be handled by treating its outputs as probabilistic constraints rather than fixed ones.
  • The same language-to-logic pipeline might apply to other robot tasks such as manipulation or multi-agent coordination.
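One way to read the bullet on VLM uncertainty is sketched below, purely as an editorial illustration. It assumes the VLM returns a probability distribution over labels for each cell; the cost table, the `cell_cost` function, and the 0.1 tolerance are hypothetical choices, not anything the paper specifies.

```python
import math

# Assumed per-class costs; None marks a class forbidden by a hard rule.
CLASS_COST = {"grass": 1.0, "mud": 5.0, "water": None}

def cell_cost(label_probs, max_forbidden_prob=0.1):
    """Expected traversal cost of one cell, or inf if a hard rule is too likely violated."""
    forbidden = sum(p for label, p in label_probs.items()
                    if CLASS_COST[label] is None)
    if forbidden > max_forbidden_prob:
        return math.inf  # treat the cell as blocked
    allowed = 1.0 - forbidden
    expected = sum(p * CLASS_COST[label] for label, p in label_probs.items()
                   if CLASS_COST[label] is not None)
    # Renormalise over the allowed classes so the cost stays comparable.
    return expected / allowed

print(cell_cost({"grass": 0.75, "mud": 0.25}))  # 2.0
print(cell_cost({"grass": 0.5, "water": 0.5}))  # inf
```

The tolerance turns a hard rule into a chance constraint: a cell is blocked once the estimated probability of a forbidden class exceeds it, and otherwise contributes its expected soft cost.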

Load-bearing premise

Vision-language models can reliably perform zero-shot scene understanding to map human instructions to environmental constraints and semantic features in unstructured outdoor environments.

What would settle it

A demonstration in which a vision-language model misidentifies terrain or obstacles described in a safety rule, causing the robot to violate the corresponding STL specification or cost-map preference during a real navigation run.

Figures

Figures reproduced from arXiv: 2605.04327 by Kalonji Harrington, Kristy Sakano, Mumu Xu.

Figure 1: Autonomous robot navigation under the proposed theoretical ...
Figure 2: Overall theoretical architecture of our VLM-grounded safe navigation stack. We obtain ...
Figure 3: State-dependent navigation illustrates normal versus low-battery ...

Original abstract

We propose an architecture for integrating high-level, human-provided safety rules and operator-aligned semantic preferences into autonomous robot navigation in unstructured outdoor environments. In our approach, natural-language rules are translated into Signal Temporal Logic (STL) specifications that guide planning and navigation during runtime. Persistent, environment-centric rules and terrain preferences are grounded into a 2D cost map, while temporally dynamic requirements are expressed as STL specifications to be monitored during runtime. We hypothesize the use of Vision-Language Models (VLMs) for zero-shot scene understanding, enabling mapping between human instructions, semantic features, and environmental constraints. Within this framework, we construct an illustrative navigation model that is designed to satisfy a set of STL-encoded specifications and soft operator preferences through formal satisfaction metrics embedded into environmental properties and runtime monitoring.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a theoretical architecture for safe autonomous robot navigation in unstructured outdoor environments. High-level natural-language safety rules and semantic preferences are translated into Signal Temporal Logic (STL) specifications for runtime monitoring of dynamic requirements and into 2D cost maps for persistent terrain preferences. The approach hypothesizes the use of Vision-Language Models (VLMs) for zero-shot scene understanding to ground instructions to environmental constraints, and constructs an illustrative navigation model intended to satisfy the resulting STL-encoded specifications and soft preferences via formal satisfaction metrics.

Significance. If the VLM zero-shot grounding hypothesis holds with sufficient reliability, the architecture could enable more interpretable and operator-aligned navigation with formal safety properties in complex settings where traditional methods struggle. The conceptual integration of language-to-STL translation, cost-map grounding, and runtime monitoring is a coherent framework that builds on existing STL planning techniques, though its significance remains prospective given the absence of supporting analysis or results.

major comments (2)
  1. [Abstract] Abstract: The safety and runtime satisfaction claims of the architecture rest on the unverified hypothesis that VLMs can perform reliable zero-shot mapping from natural-language rules to accurate semantic features and constraints; no error models, formal bounds on grounding accuracy, or fallback mechanisms for mis-grounding are described, leaving the formal guarantees unsubstantiated.
  2. [Illustrative navigation model] Illustrative navigation model section: The model is stated to satisfy STL specifications through embedded formal metrics, yet the manuscript supplies no derivations, simulation results, satisfaction analysis, or sensitivity study to demonstrate this property under the hypothesized VLM grounding.
minor comments (2)
  1. The distinction between persistent rules (cost maps) and temporally dynamic requirements (STL) is conceptually clear but could be reinforced with a diagram or pseudocode example of the full pipeline.
  2. Consider adding a dedicated limitations or assumptions subsection to explicitly discuss the scope of the VLM hypothesis.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the detailed and constructive review of our manuscript on the theoretical architecture for VLM-grounded safe navigation. We address the major comments point by point below, clarifying the scope of the work as a conceptual framework and outlining planned revisions where appropriate.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The safety and runtime satisfaction claims of the architecture rest on the unverified hypothesis that VLMs can perform reliable zero-shot mapping from natural-language rules to accurate semantic features and constraints; no error models, formal bounds on grounding accuracy, or fallback mechanisms for mis-grounding are described, leaving the formal guarantees unsubstantiated.

    Authors: We agree that the safety and runtime properties in the proposed architecture are conditional on reliable VLM grounding, which is presented as a hypothesis rather than an empirically verified component. The manuscript is explicitly framed as a theoretical architecture (see abstract and Section 1), with the VLM role stated as a zero-shot hypothesis to enable the language-to-logic mapping. To address this, we will revise the abstract to explicitly qualify the claims as holding under the assumption of accurate VLM-based scene understanding. We will also add a dedicated paragraph in the Discussion section outlining potential sources of grounding error, high-level considerations for error models, and fallback strategies (such as conservative default constraints or operator override), while noting these as important directions for future empirical work. revision: yes

  2. Referee: [Illustrative navigation model] Illustrative navigation model section: The model is stated to satisfy STL specifications through embedded formal metrics, yet the manuscript supplies no derivations, simulation results, satisfaction analysis, or sensitivity study to demonstrate this property under the hypothesized VLM grounding.

    Authors: The illustrative navigation model is introduced as a conceptual design that embeds formal satisfaction metrics (derived from STL robustness semantics) directly into the cost-map and planning pipeline, such that satisfaction holds by construction when the input constraints are correctly grounded. We acknowledge that the current manuscript provides only a high-level description without explicit derivations or quantitative analysis. We will expand the relevant section with a step-by-step outline of how the embedded metrics map to STL satisfaction (including a sketch of the robustness function application) and clarify the by-construction guarantee under accurate grounding. However, full simulation results, satisfaction analysis under VLM noise, or sensitivity studies are outside the scope of this theoretical paper. revision: partial
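For readers unfamiliar with the "STL robustness semantics" invoked in this response, the standard definition from the STL literature is reproduced below. The paper's own formulation is not shown on this page, so the notation here (signal x, time t, predicate function f, interval [a, b]) is an assumption and may differ from the authors'.

```latex
\begin{align*}
\rho(f(x) > 0,\, x,\, t) &= f(x(t)) \\
\rho(\lnot\varphi,\, x,\, t) &= -\rho(\varphi,\, x,\, t) \\
\rho(\varphi_1 \land \varphi_2,\, x,\, t) &= \min\bigl(\rho(\varphi_1, x, t),\ \rho(\varphi_2, x, t)\bigr) \\
\rho(\mathbf{G}_{[a,b]}\,\varphi,\, x,\, t) &= \inf_{t' \in [t+a,\, t+b]} \rho(\varphi,\, x,\, t') \\
\rho(\mathbf{F}_{[a,b]}\,\varphi,\, x,\, t) &= \sup_{t' \in [t+a,\, t+b]} \rho(\varphi,\, x,\, t') \\
\rho(\varphi_1\, \mathbf{U}_{[a,b]}\, \varphi_2,\, x,\, t) &= \sup_{t' \in [t+a,\, t+b]} \min\Bigl(\rho(\varphi_2, x, t'),\ \inf_{t'' \in [t,\, t']} \rho(\varphi_1, x, t'')\Bigr)
\end{align*}
```

These semantics are sound in the sense that a strictly positive robustness implies the signal satisfies the formula and a strictly negative one implies violation. Any "by construction" guarantee of the kind the response describes would therefore hold only for the signal actually evaluated, which is why the accuracy of the grounded inputs matters.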

standing simulated objections not resolved
  • The request for simulation results, satisfaction analysis, or sensitivity studies for the illustrative navigation model remains unmet: the work is a theoretical architecture proposal with no empirical evaluation or implementation.

Circularity Check

0 steps flagged

No circularity: proposal is self-contained conceptual architecture

The manuscript presents a high-level architecture for mapping natural-language safety rules into STL specifications and 2D cost maps, then monitoring them at runtime. It explicitly labels the VLM zero-shot grounding step as a hypothesis rather than a derived result, and the illustrative navigation model is described as 'designed to satisfy' the specifications without any equations, fitted parameters, or self-citations that reduce the claims to their own inputs. No self-definitional loops, renamed empirical patterns, or load-bearing prior-author uniqueness theorems appear; the derivation chain therefore remains non-circular and externally falsifiable via the stated VLM assumption.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on domain assumptions about VLM capabilities and STL applicability without introducing free parameters or new entities.

axioms (2)
  • Domain assumption: Vision-Language Models can perform zero-shot scene understanding to map human instructions to environmental constraints and semantic features.
    Hypothesized in the abstract as the grounding mechanism but not demonstrated or proven.
  • Domain assumption: Signal Temporal Logic specifications can be monitored in real time to guide planning and navigation while satisfying formal satisfaction metrics.
    Invoked as the core runtime mechanism for dynamic requirements.
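The second axiom concerns real-time monitoring, which for simple safety formulas needs only constant work per sample. The sketch below is an editorial illustration under stated assumptions: the `AlwaysMonitor` class, the clearance signal, and the 1.0 threshold are hypothetical and do not come from the paper.

```python
class AlwaysMonitor:
    """Streaming monitor for the unbounded safety formula G (value > threshold)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.robustness = float("inf")  # running minimum margin seen so far

    def update(self, value: float) -> bool:
        """Consume one sample; return True while the formula is still unviolated."""
        self.robustness = min(self.robustness, value - self.threshold)
        return self.robustness > 0.0

# Assumed signal: obstacle clearance in metres, sampled as the robot moves.
monitor = AlwaysMonitor(threshold=1.0)
verdicts = [monitor.update(v) for v in [3.0, 2.0, 0.5, 4.0]]
print(verdicts)  # [True, True, False, False]
```

Once a sample violates an "always" formula the verdict cannot recover, so the last reading of 4.0 still reports False. Bounded-window or nested formulas need more bookkeeping than this running minimum.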

pith-pipeline@v0.9.0 · 5430 in / 1308 out tokens · 86239 ms · 2026-05-08T16:58:26.565221+00:00 · methodology
