nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models

Caroline Trippel; Christopher Hahn; Daniel Mendoza; Frederik Schmitt; Matthias Cosler

arxiv: 2303.04864 · v1 · pith:KIHE5KKNnew · submitted 2023-03-08 · 💻 cs.LO · cs.AI· cs.LG

nl2spec: Interactively Translating Unstructured Natural Language to Temporal Logics with Large Language Models

Matthias Cosler , Christopher Hahn , Daniel Mendoza , Frederik Schmitt , Caroline Trippel This is my paper

classification 💻 cs.LO cs.AIcs.LG

keywords languagenaturalformalizationmodelsapplicationformalframeworklarge

0 comments

read the original abstract

A rigorous formalization of desired system requirements is indispensable when performing any verification task. This often limits the application of verification techniques, as writing formal specifications is an error-prone and time-consuming manual task. To facilitate this, we present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications (in temporal logics) from unstructured natural language. In particular, we introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language: we utilize LLMs to map subformulas of the formalization back to the corresponding natural language fragments of the input. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization. The framework is agnostic to specific application domains and can be extended to similar specification languages and new neural models. We perform a user study to obtain a challenging dataset, which we use to run experiments on the quality of translations. We provide an open-source implementation, including a web-based frontend.

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Evaluating the Formal Reasoning Capabilities of Large Language Models through Chomsky Hierarchy
cs.CL 2026-04 unverdicted novelty 7.0

LLMs display clear performance stratification on formal language tasks aligned with Chomsky hierarchy complexity levels, limited by severe efficiency barriers rather than absolute capability.
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
cs.CV 2025-04 unverdicted novelty 6.0

NeuS-E is a post-generation refinement method that uses neuro-symbolic analysis of a formal video representation to detect and correct semantic and temporal inconsistencies in text-to-video outputs, improving prompt a...
Evaluating LLM-Generated ACSL Annotations for Formal Verification
cs.SE 2026-02 unverdicted novelty 4.0

Rule-based annotation generation for ACSL outperforms LLM-based methods in achieving successful formal verification of C programs.
LLM-Assisted Tool for Joint Generation of Formulas and Functions in Rule-Based Verification of Map Transformations
cs.SE 2025-11 unverdicted novelty 4.0

LLM-assisted pipeline jointly generates logical formulas and executable predicates for rule-based verification of HD map transformations in CommonRoad, evaluated on synthetic bridge and slope scenarios.