NeuroNL2LTL: A Neurosymbolic Framework for Natural Language Translation of Linear Temporal Logic

Ernest Bonnah; Paapa Kwesi Quansah

arxiv: 2605.22874 · v1 · pith:YJNNH2ONnew · submitted 2026-05-20 · 💻 cs.AI · cs.LO

NeuroNL2LTL: A Neurosymbolic Framework for Natural Language Translation of Linear Temporal Logic

Paapa Kwesi Quansah , Ernest Bonnah This is my paper

Pith reviewed 2026-05-25 06:07 UTC · model grok-4.3

classification 💻 cs.AI cs.LO

keywords neurosymboliclinear temporal logicnatural language to LTLformal verificationreinforcement learningspecification translationsatisfiability checking

0 comments

The pith

A neurosymbolic framework translates natural language to Linear Temporal Logic by routing through a structure-preserving intermediate representation and using verification as a training reward.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces NeuroNL2LTL, which combines neural networks with formal verification to translate natural language requirements into Linear Temporal Logic specifications. It routes the translation through an intermediate representation that maps to LTL in a structure-preserving way, then applies satisfiability checks and repairs. The key training method uses verification outcomes as rewards in reinforcement learning so that the neural part learns to produce formally correct outputs. This matters because it allows domain experts to create reliable formal specs without deep logic expertise, as shown by results on over 200,000 requirements from multiple safety-critical domains where 86 percent of outputs pass verification.

Core claim

NeuroNL2LTL achieves translation by using a verifier-in-the-loop where verification results train the neural translator via reinforcement learning, ensuring that generated LTL specifications are both semantically close to references and formally verifiable, with 28 percent semantic equivalence and 86 percent satisfiability on a large multi-domain dataset.

What carries the argument

verifier-in-the-loop training, in which verification outcomes serve as reward signals for reinforcement learning to optimize neural components for formal correctness

If this is right

Neural components directly optimize for formal correctness rather than just fluency.
86 percent of generated specifications are verified satisfiable.
The system produces contextually grounded explanations from LTL for validation by non-experts.
Translation works across aerospace, robotics, autonomous vehicles and other domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar verifier-in-the-loop approaches could apply to translations into other formal languages like CTL or first-order logic.
Integration with existing model checkers could further automate safety verification pipelines.
Scaling the intermediate representation might allow handling more complex temporal properties.
The repair mechanism could be generalized to other near-miss correction tasks in formal methods.

Load-bearing premise

The intermediate representation has a mapping to LTL that is structure-preserving by construction.

What would settle it

A controlled experiment that trains the same neural model without using verification outcomes as RL rewards and then measures whether the percentage of verified satisfiable outputs falls significantly below 86 percent on the same dataset.

Figures

Figures reproduced from arXiv: 2605.22874 by Ernest Bonnah, Paapa Kwesi Quansah.

**Figure 1.** Figure 1: The architecture of NeuroNL2LTL framework Proposition 1(Structure Preservation) For any LTL formula φ: T −1 (T (φ)) ≡ φ. Proof sketch: The proof follows by structural induction on φ. Each LTL operator maps to a unique ITL keyword, and the recursive structure of T preserves operator scope and precedence. The round-trip property ensures that ITL functions as a lossless intermediate representation. ITL serve… view at source ↗

read the original abstract

Effectively translating between natural language (NL) and formal logics like Linear Temporal Logic (LTL) requires expertise that limits formal verification's reach in safety-critical development. Template-based approaches sacrifice expressiveness for reliability; neural methods achieve fluency but provide no correctness guarantees. We present NeuroNL2LTL, a neurosymbolic architecture unifying learned translation with formal verification. NeuroNL2LTL routes translation through an intermediate representation whose mapping to LTL is structure-preserving by construction. Generated specifications undergo satisfiability and non-triviality checking; a minimal-edit repair mechanism corrects near-miss outputs before they reach downstream tools. The central innovation is verifier-in-the-loop training: verification outcomes serve as reward signals for reinforcement learning, producing neural components that optimize directly for formal correctness. On 200,000+ requirements spanning aerospace, robotics, autonomous vehicles, and ten additional domains, NeuroNL2LTL achieves 28\% semantic equivalence with reference specifications while ensuring 86\% of outputs are verified satisfiable. The system also generates contextually grounded explanations from LTL, enabling domain experts to validate specifications without specialized training. This work demonstrates that formal verification can function as both training objective and runtime filter for neural specification systems, allowing us to build neural-based tools whose reliability derives from logical guarantees rather than statistical confidence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Verifier-in-the-loop RL lifts satisfiability to 86% but semantic equivalence stays at 28%, so the method secures valid LTL without guaranteeing it matches the input NL.

read the letter

The main thing to know is that NeuroNL2LTL routes NL through an intermediate representation, applies a verifier for satisfiability and non-triviality, uses minimal-edit repair, and feeds verification outcomes back as RL rewards. On over 200k requirements from aerospace, robotics, and other domains this produces 86% verified-satisfiable outputs while reaching only 28% semantic equivalence with references. It also generates explanations from the LTL side. That combination of IR, repair, and verifier-guided training is the concrete new piece; prior neurosymbolic translation work exists but this specific loop for LTL appears distinct based on the abstract. The large multi-domain dataset and the dual use of verification at training and runtime are practical strengths. The soft spot is exactly the one the stress-test flags: satisfiability and non-triviality checks validate intrinsic properties of the generated formula, not whether the formula captures the original NL meaning. The 28% equivalence rate shows the gap clearly, and the structure-preserving claim on the IR does not close it because the verifier never sees the NL semantics. The abstract gives no baselines, no definition of semantic equivalence, and no error analysis, so the numbers are hard to interpret. This is for groups working on AI support for formal specification in safety-critical engineering. Readers who want to see how verification signals can shape neural training will get value from the architecture even if the results are mixed. The work shows honest engagement with the limits of pure neural translation and tries to address them with external checks. It deserves peer review so the experimental details and comparisons can be examined.

Referee Report

2 major / 1 minor

Summary. The paper introduces NeuroNL2LTL, a neurosymbolic framework for translating natural language requirements into Linear Temporal Logic (LTL). It routes translation through an intermediate representation with a structure-preserving mapping to LTL by construction, uses a verifier-in-the-loop RL training regime where satisfiability and non-triviality checks provide reward signals plus a minimal-edit repair step, and reports 28% semantic equivalence to reference specifications alongside 86% verified satisfiability on a dataset of over 200,000 requirements from aerospace, robotics, autonomous vehicles and ten other domains. The system also generates LTL-based explanations.

Significance. If the central claims hold, the verifier-in-the-loop training paradigm and the large multi-domain dataset would represent a meaningful step toward reliable NL-to-formal-specification tools for safety-critical domains. The explicit separation of neural fluency from formal guarantees via an IR is a clear architectural strength, and the use of verification both at training time and runtime is a substantive contribution that could be adopted more broadly.

major comments (2)

[Abstract] Abstract: the claim that verification outcomes produce neural components that 'optimize directly for formal correctness' is contradicted by the reported metrics themselves. The verifier enforces only satisfiability and non-triviality of the generated LTL; the 28% semantic equivalence rate demonstrates that these intrinsic checks do not enforce fidelity to the input NL semantics, so the RL objective cannot be said to optimize for translation correctness.
[Abstract] Abstract: the asserted 'structure-preserving' mapping from the intermediate representation to LTL transfers a correctness guarantee only for the syntactic validity of the LTL, not for whether the generated IR (and thus LTL) matches the semantics of the original NL requirement. No evidence is given that the neural IR generator is constrained or rewarded for semantic alignment beyond what the downstream verifier already checks.

minor comments (1)

[Abstract] Abstract: performance numbers are stated without accompanying definitions of 'semantic equivalence,' measurement protocol, baselines, or error analysis; these must be supplied with concrete experimental details in the methods and results sections.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We agree that the abstract wording overstates the guarantees provided by the verifier-in-the-loop training and the structure-preserving mapping. We will revise the abstract for precision and address the points below.

read point-by-point responses

Referee: [Abstract] Abstract: the claim that verification outcomes produce neural components that 'optimize directly for formal correctness' is contradicted by the reported metrics themselves. The verifier enforces only satisfiability and non-triviality of the generated LTL; the 28% semantic equivalence rate demonstrates that these intrinsic checks do not enforce fidelity to the input NL semantics, so the RL objective cannot be said to optimize for translation correctness.

Authors: We agree with the observation. The RL objective uses rewards derived solely from satisfiability and non-triviality checks on the generated LTL; these do not enforce or measure semantic equivalence to the source NL. The reported 28% equivalence rate confirms the distinction. We will revise the abstract to state that the neural components optimize for verifiable formal properties (satisfiability and non-triviality) of the LTL output rather than claiming direct optimization for translation correctness. revision: yes
Referee: [Abstract] Abstract: the asserted 'structure-preserving' mapping from the intermediate representation to LTL transfers a correctness guarantee only for the syntactic validity of the LTL, not for whether the generated IR (and thus LTL) matches the semantics of the original NL requirement. No evidence is given that the neural IR generator is constrained or rewarded for semantic alignment beyond what the downstream verifier already checks.

Authors: The referee correctly notes that the structure-preserving mapping guarantees only that the LTL is syntactically valid and structurally faithful to the IR; it provides no semantic alignment guarantee with the original NL. The neural IR generator receives training signals exclusively from the downstream verifier. We will revise the abstract to specify that the mapping ensures syntactic validity by construction while semantic alignment depends on the learned model and verifier feedback without additional direct constraints. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The provided abstract and context describe a neurosymbolic pipeline with an explicitly external verifier supplying reward signals for RL, plus a structure-preserving IR mapping presented as a deliberate architectural choice rather than a self-referential definition. No equations, fitted parameters renamed as predictions, self-citation load-bearing premises, uniqueness theorems imported from the same authors, or ansatzes smuggled via prior work appear in the text. The 28% semantic equivalence and 86% satisfiability figures are reported as empirical outcomes, not tautological consequences of the inputs. The derivation therefore remains self-contained against external benchmarks and does not reduce any claimed result to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient detail to identify specific free parameters, axioms, or invented entities; no equations or implementation details given.

pith-pipeline@v0.9.0 · 5768 in / 1255 out tokens · 40558 ms · 2026-05-25T06:07:34.774424+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 2 internal anchors

[1]

Constitutional AI: Harmlessness from AI Feedback

Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., et al.: Constitutional ai: Harmlessness from ai feedback. arXiv preprint arXiv:2212.08073 (2022)

work page internal anchor Pith review Pith/arXiv arXiv 2022
[2]

In: International conference on tools and algo- rithms for the construction and analysis of systems

Bloem, R., Könighofer, B., Könighofer, R., Wang, C.: Shield synthesis: Runtime enforcement for reactive systems. In: International conference on tools and algo- rithms for the construction and analysis of systems. pp. 533–548. Springer (2015)

work page 2015
[3]

arXiv preprint arXiv:2102.12855 (2021)

Cai, M., Hasanbeig, M., Xiao, S., Abate, A., Kan, Z.: Modular deep reinforce- ment learning for continuous motion planning with temporal logic. arXiv preprint arXiv:2102.12855 (2021)

work page arXiv 2021
[4]

In: Proceedings of the International Conference on Automated Planning and Scheduling

Camacho, A., McIlraith, S.A.: Learning interpretable models expressed in linear temporal logic. In: Proceedings of the International Conference on Automated Planning and Scheduling. vol. 29, pp. 621–630 (2019)

work page 2019
[5]

arXiv preprint arXiv:2305.07766 (2023)

Chen, Y., Gandhi, R., Zhang, Y., Fan, C.: Nl2tl: Transforming natural languages to temporal logics using large language models. arXiv preprint arXiv:2305.07766 (2023)

work page arXiv 2023
[6]

IEEE Transactions on Software Engineering47(9), 1943–1959 (2019)

Chen, Z., Kommrusch, S., Tufano, M., Pouchet, L.N., Poshyvanyk, D., Monperrus, M.: Sequencer: Sequence-to-sequence learning for end-to-end program repair. IEEE Transactions on Software Engineering47(9), 1943–1959 (2019)

work page 1943
[7]

Advances in neural information processing systems30(2017)

Christiano, P.F., Leike, J., Brown, T., Martic, M., Legg, S., Amodei, D.: Deep reinforcement learning from human preferences. Advances in neural information processing systems30(2017)

work page 2017
[8]

Clarke, E.M., Henzinger, T.A., Veith, H., Bloem, R., et al.: Handbook of model checking, vol. 10. Springer (2018)

work page 2018
[9]

ACM Transactions on Programming Languages and Systems (TOPLAS)24(6), 698–710 (2002)

Corchuelo, R., Pérez, J.A., Ruiz, A., Toro, M.: Repairing syntax errors in lr parsers. ACM Transactions on Programming Languages and Systems (TOPLAS)24(6), 698–710 (2002)

work page 2002
[10]

In: International Conference on Computer Aided Verification

Cosler, M., Hahn, C., Mendoza, D., Schmitt, F., Trippel, C.: nl2spec: Interactively translating unstructured natural language to temporal logics with large language models. In: International Conference on Computer Aided Verification. pp. 383–396. Springer (2023)

work page 2023
[11]

In: International Sym- posium on Automated Technology for Verification and Analysis

Duret-Lutz, A., Lewkowicz, A., Fauchille, A., Michaud, T., Renault, E., Xu, L.: Spot 2.0—a framework for ltl and-automata manipulation. In: International Sym- posium on Automated Technology for Verification and Analysis. pp. 122–129. Springer (2016)

work page 2016
[12]

In: The IEEE Computer Society’s 12th Annual International Symposium on Modeling, Analysis, and Sim- ulation of Computer and Telecommunications Systems, 2004.(MASCOTS 2004)

Duret-Lutz, A., Poitrenaud, D.: Spot: an extensible model checking library using transition-based generalized bu/spl uml/chi automata. In: The IEEE Computer Society’s 12th Annual International Symposium on Modeling, Analysis, and Sim- ulation of Computer and Telecommunications Systems, 2004.(MASCOTS 2004). Proceedings. pp. 76–83. IEEE (2004)

work page 2004
[13]

In: Proceedings of the 21st international conference on Software engineering

Dwyer, M.B., Avrunin, G.S., Corbett, J.C.: Patterns in property specifications for finite-state verification. In: Proceedings of the 21st international conference on Software engineering. pp. 411–420 (1999)

work page 1999
[14]

In: International Working Conference on Requirements Engineering: Foundation for Software Quality (REFSQ-2020)

Giannakopoulou, D., Mavridou, A., Rhein, J., Pressburger, T., Schumann, J., Shi, N.: Formal requirements elicitation with fret. In: International Working Conference on Requirements Engineering: Foundation for Software Quality (REFSQ-2020). No. ARC-E-DAA-TN77785 (2020)

work page 2020
[15]

In: Pro- ceedings of the 30th international conference on Software engineering

Grunske, L.: Specification patterns for probabilistic quality properties. In: Pro- ceedings of the 30th international conference on Software engineering. pp. 31–40 (2008) From Logic to Language 17

work page 2008
[16]

arXiv preprint arXiv:2102.06203 (2021)

Han, J.M., Rute, J., Wu, Y., Ayers, E.W., Polu, S.: Proof artifact co-training for theorem proving with language models. arXiv preprint arXiv:2102.06203 (2021)

work page arXiv 2021
[17]

Holzmann, G.J.: The SPIN model checker: Primer and reference manual, vol. 1003. Addison-Wesley Reading (2004)

work page 2004
[18]

Nature pp

Hubert, T., Mehta, R., Sartran, L., Horváth, M.Z., Žužić, G., Wieser, E., Huang, A., Schrittwieser, J., Schroecker, Y., Masoom, H., et al.: Olympiad-level formal mathematical reasoning with reinforcement learning. Nature pp. 1–3 (2025)

work page 2025
[19]

Formal Methods in System Design51(2), 332–361 (2017)

Könighofer, B., Alshiekh, M., Bloem, R., Humphrey, L., Könighofer, R., Topcu, U., Wang, C.: Shield synthesis. Formal Methods in System Design51(2), 332–361 (2017)

work page 2017
[20]

In: Proceedings of the 27th international conference on Software engineering

Konrad, S., Cheng, B.H.: Real-time specification patterns. In: Proceedings of the 27th international conference on Software engineering. pp. 372–381 (2005)

work page 2005
[21]

Ad- vances in neural information processing systems35, 26337–26349 (2022)

Lample, G., Lacroix, T., Lachaux, M.A., Rodriguez, A., Hayat, A., Lavril, T., Ebner, G., Martinet, X.: Hypertree proof search for neural theorem proving. Ad- vances in neural information processing systems35, 26337–26349 (2022)

work page 2022
[22]

Advances in Neural Information Processing Systems35, 21314–21328 (2022)

Le, H., Wang, Y., Gotmare, A.D., Savarese, S., Hoi, S.C.H.: Coderl: Mastering code generation through pretrained models and deep reinforcement learning. Advances in Neural Information Processing Systems35, 21314–21328 (2022)

work page 2022
[23]

In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering

Le, X.B.D., Chu, D.H., Lo, D., Le Goues, C., Visser, W.: S3: syntax-and semantic- guided repair synthesis via programming by examples. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. pp. 593–604 (2017)

work page 2017
[24]

Lee, H., Phatale, S., Mansoor, H., Lu, K.R., Mesnard, T., Ferret, J., Bishop, C., Hall, E., Carbune, V., Rastogi, A.: Rlaif: Scaling reinforcement learning from hu- man feedback with ai feedback (2023)

work page 2023
[25]

In: 2015 30th IEEE/ACM International Conference on Automated Software Engineer- ing (ASE)

Lemieux, C., Park, D., Beschastnikh, I.: General ltl specification mining (t). In: 2015 30th IEEE/ACM International Conference on Automated Software Engineer- ing (ASE). pp. 81–92. IEEE (2015)

work page 2015
[26]

Proceedings of the ACM on Programming Languages7(PLDI), 1463–1487 (2023)

Li, Z., Huang, J., Naik, M.: Scallop: A language for neurosymbolic programming. Proceedings of the ACM on Programming Languages7(PLDI), 1463–1487 (2023)

work page 2023
[27]

In: The Twelfth International Conference on Learning Representations (2023)

Lightman, H., Kosaraju, V., Burda, Y., Edwards, H., Baker, B., Lee, T., Leike, J., Schulman, J., Sutskever, I., Cobbe, K.: Let’s verify step by step. In: The Twelfth International Conference on Learning Representations (2023)

work page 2023
[28]

In: Conference on Robot Learning

Liu, J.X., Yang, Z., Idrees, I., Liang, S., Schornstein, B., Tellex, S., Shah, A.: Grounding complex natural language commands for temporal tasks in unseen en- vironments. In: Conference on Robot Learning. pp. 1084–1110. PMLR (2023)

work page 2023
[29]

Advances in neural information processing systems31(2018)

Manhaeve, R., Dumancic, S., Kimmig, A., Demeester, T., De Raedt, L.: Deep- problog: Neural probabilistic logic programming. Advances in neural information processing systems31(2018)

work page 2018
[30]

In: 2009 17th IEEE international requirements engineering conference

Mavin, A., Wilkinson, P., Harwood, A., Novak, M.: Easy approach to require- ments syntax (ears). In: 2009 17th IEEE international requirements engineering conference. pp. 317–322. IEEE (2009)

work page 2009
[31]

In: 2018 Formal Meth- ods in Computer Aided Design (FMCAD)

Neider, D., Gavran, I.: Learning linear temporal properties. In: 2018 Formal Meth- ods in Computer Aided Design (FMCAD). pp. 1–10. IEEE (2018)

work page 2018
[32]

Advances in neural information processing sys- tems35, 27730–27744 (2022)

Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., et al.: Training language models to follow instructions with human feedback. Advances in neural information processing sys- tems35, 27730–27744 (2022)

work page 2022
[33]

In: 2023 IEEE Inter- national Conference on Robotics and Automation (ICRA)

Pan, J., Chou, G., Berenson, D.: Data-efficient learning of natural language to linear temporal logic translators for robot task specification. In: 2023 IEEE Inter- national Conference on Robotics and Automation (ICRA). pp. 11554–11561. IEEE (2023) 18 P. Quansah and E. Bonnah

work page 2023
[34]

In: 18th annual symposium on foun- dations of computer science (sfcs 1977)

Pnueli, A.: The temporal logic of programs. In: 18th annual symposium on foun- dations of computer science (sfcs 1977). pp. 46–57. ieee (1977)

work page 1977
[35]

Generative Language Modeling for Automated Theorem Proving

Polu, S., Sutskever, I.: Generative language modeling for automated theorem prov- ing. arXiv preprint arXiv:2009.03393 (2020)

work page internal anchor Pith review Pith/arXiv arXiv 2009
[36]

arXiv preprint arXiv:2002.03668 (2020)

Roy, R., Fisman, D., Neider, D.: Learning interpretable models in the property specification language. arXiv preprint arXiv:2002.03668 (2020)

work page arXiv 2002
[37]

Computer Science Review5(2), 163–203 (2011)

Rozier, K.Y.: Linear temporal logic symbolic model checking. Computer Science Review5(2), 163–203 (2011)

work page 2011
[38]

In: Proceedings of the 24th International Conference on Software Engineering

Smith, R.L., Avrunin, G.S., Clarke, L.A., Osterweil, L.J.: Propel: an approach sup- porting property elucidation. In: Proceedings of the 24th International Conference on Software Engineering. pp. 11–21 (2002)

work page 2002
[39]

In: Conference on Robot Learning

Wang, C., Ross, C., Kuo, Y.L., Katz, B., Barbu, A.: Learning a natural-language to ltl executable semantic parser for grounded robotics. In: Conference on Robot Learning. pp. 1706–1718. PMLR (2021)

work page 2021
[40]

In: 2020 Chinese Control And Decision Conference (CCDC)

Xie, G., Yin, Z., Li, J.: Temporal logic based motion planning with infeasible ltl specification. In: 2020 Chinese Control And Decision Conference (CCDC). pp. 4899–4904. IEEE (2020)

work page 2020
[41]

In: International conference on machine learning

Xu, J., Zhang, Z., Friedman, T., Liang, Y., Broeck, G.: A semantic loss function for deep learning with symbolic knowledge. In: International conference on machine learning. pp. 5502–5511. PMLR (2018)

work page 2018
[42]

In: Proceedings of the 44th international conference on software engineering

Ye, H., Martinez, M., Monperrus, M.: Neural program repair with execution-based backpropagation. In: Proceedings of the 44th international conference on software engineering. pp. 1506–1518 (2022)

work page 2022

[1] [1]

Constitutional AI: Harmlessness from AI Feedback

Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., et al.: Constitutional ai: Harmlessness from ai feedback. arXiv preprint arXiv:2212.08073 (2022)

work page internal anchor Pith review Pith/arXiv arXiv 2022

[2] [2]

In: International conference on tools and algo- rithms for the construction and analysis of systems

Bloem, R., Könighofer, B., Könighofer, R., Wang, C.: Shield synthesis: Runtime enforcement for reactive systems. In: International conference on tools and algo- rithms for the construction and analysis of systems. pp. 533–548. Springer (2015)

work page 2015

[3] [3]

arXiv preprint arXiv:2102.12855 (2021)

Cai, M., Hasanbeig, M., Xiao, S., Abate, A., Kan, Z.: Modular deep reinforce- ment learning for continuous motion planning with temporal logic. arXiv preprint arXiv:2102.12855 (2021)

work page arXiv 2021

[4] [4]

In: Proceedings of the International Conference on Automated Planning and Scheduling

Camacho, A., McIlraith, S.A.: Learning interpretable models expressed in linear temporal logic. In: Proceedings of the International Conference on Automated Planning and Scheduling. vol. 29, pp. 621–630 (2019)

work page 2019

[5] [5]

arXiv preprint arXiv:2305.07766 (2023)

Chen, Y., Gandhi, R., Zhang, Y., Fan, C.: Nl2tl: Transforming natural languages to temporal logics using large language models. arXiv preprint arXiv:2305.07766 (2023)

work page arXiv 2023

[6] [6]

IEEE Transactions on Software Engineering47(9), 1943–1959 (2019)

Chen, Z., Kommrusch, S., Tufano, M., Pouchet, L.N., Poshyvanyk, D., Monperrus, M.: Sequencer: Sequence-to-sequence learning for end-to-end program repair. IEEE Transactions on Software Engineering47(9), 1943–1959 (2019)

work page 1943

[7] [7]

Advances in neural information processing systems30(2017)

Christiano, P.F., Leike, J., Brown, T., Martic, M., Legg, S., Amodei, D.: Deep reinforcement learning from human preferences. Advances in neural information processing systems30(2017)

work page 2017

[8] [8]

Clarke, E.M., Henzinger, T.A., Veith, H., Bloem, R., et al.: Handbook of model checking, vol. 10. Springer (2018)

work page 2018

[9] [9]

ACM Transactions on Programming Languages and Systems (TOPLAS)24(6), 698–710 (2002)

Corchuelo, R., Pérez, J.A., Ruiz, A., Toro, M.: Repairing syntax errors in lr parsers. ACM Transactions on Programming Languages and Systems (TOPLAS)24(6), 698–710 (2002)

work page 2002

[10] [10]

In: International Conference on Computer Aided Verification

Cosler, M., Hahn, C., Mendoza, D., Schmitt, F., Trippel, C.: nl2spec: Interactively translating unstructured natural language to temporal logics with large language models. In: International Conference on Computer Aided Verification. pp. 383–396. Springer (2023)

work page 2023

[11] [11]

In: International Sym- posium on Automated Technology for Verification and Analysis

Duret-Lutz, A., Lewkowicz, A., Fauchille, A., Michaud, T., Renault, E., Xu, L.: Spot 2.0—a framework for ltl and-automata manipulation. In: International Sym- posium on Automated Technology for Verification and Analysis. pp. 122–129. Springer (2016)

work page 2016

[12] [12]

In: The IEEE Computer Society’s 12th Annual International Symposium on Modeling, Analysis, and Sim- ulation of Computer and Telecommunications Systems, 2004.(MASCOTS 2004)

Duret-Lutz, A., Poitrenaud, D.: Spot: an extensible model checking library using transition-based generalized bu/spl uml/chi automata. In: The IEEE Computer Society’s 12th Annual International Symposium on Modeling, Analysis, and Sim- ulation of Computer and Telecommunications Systems, 2004.(MASCOTS 2004). Proceedings. pp. 76–83. IEEE (2004)

work page 2004

[13] [13]

In: Proceedings of the 21st international conference on Software engineering

Dwyer, M.B., Avrunin, G.S., Corbett, J.C.: Patterns in property specifications for finite-state verification. In: Proceedings of the 21st international conference on Software engineering. pp. 411–420 (1999)

work page 1999

[14] [14]

In: International Working Conference on Requirements Engineering: Foundation for Software Quality (REFSQ-2020)

Giannakopoulou, D., Mavridou, A., Rhein, J., Pressburger, T., Schumann, J., Shi, N.: Formal requirements elicitation with fret. In: International Working Conference on Requirements Engineering: Foundation for Software Quality (REFSQ-2020). No. ARC-E-DAA-TN77785 (2020)

work page 2020

[15] [15]

In: Pro- ceedings of the 30th international conference on Software engineering

Grunske, L.: Specification patterns for probabilistic quality properties. In: Pro- ceedings of the 30th international conference on Software engineering. pp. 31–40 (2008) From Logic to Language 17

work page 2008

[16] [16]

arXiv preprint arXiv:2102.06203 (2021)

Han, J.M., Rute, J., Wu, Y., Ayers, E.W., Polu, S.: Proof artifact co-training for theorem proving with language models. arXiv preprint arXiv:2102.06203 (2021)

work page arXiv 2021

[17] [17]

Holzmann, G.J.: The SPIN model checker: Primer and reference manual, vol. 1003. Addison-Wesley Reading (2004)

work page 2004

[18] [18]

Nature pp

Hubert, T., Mehta, R., Sartran, L., Horváth, M.Z., Žužić, G., Wieser, E., Huang, A., Schrittwieser, J., Schroecker, Y., Masoom, H., et al.: Olympiad-level formal mathematical reasoning with reinforcement learning. Nature pp. 1–3 (2025)

work page 2025

[19] [19]

Formal Methods in System Design51(2), 332–361 (2017)

Könighofer, B., Alshiekh, M., Bloem, R., Humphrey, L., Könighofer, R., Topcu, U., Wang, C.: Shield synthesis. Formal Methods in System Design51(2), 332–361 (2017)

work page 2017

[20] [20]

In: Proceedings of the 27th international conference on Software engineering

Konrad, S., Cheng, B.H.: Real-time specification patterns. In: Proceedings of the 27th international conference on Software engineering. pp. 372–381 (2005)

work page 2005

[21] [21]

Ad- vances in neural information processing systems35, 26337–26349 (2022)

Lample, G., Lacroix, T., Lachaux, M.A., Rodriguez, A., Hayat, A., Lavril, T., Ebner, G., Martinet, X.: Hypertree proof search for neural theorem proving. Ad- vances in neural information processing systems35, 26337–26349 (2022)

work page 2022

[22] [22]

Advances in Neural Information Processing Systems35, 21314–21328 (2022)

Le, H., Wang, Y., Gotmare, A.D., Savarese, S., Hoi, S.C.H.: Coderl: Mastering code generation through pretrained models and deep reinforcement learning. Advances in Neural Information Processing Systems35, 21314–21328 (2022)

work page 2022

[23] [23]

In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering

Le, X.B.D., Chu, D.H., Lo, D., Le Goues, C., Visser, W.: S3: syntax-and semantic- guided repair synthesis via programming by examples. In: Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. pp. 593–604 (2017)

work page 2017

[24] [24]

Lee, H., Phatale, S., Mansoor, H., Lu, K.R., Mesnard, T., Ferret, J., Bishop, C., Hall, E., Carbune, V., Rastogi, A.: Rlaif: Scaling reinforcement learning from hu- man feedback with ai feedback (2023)

work page 2023

[25] [25]

In: 2015 30th IEEE/ACM International Conference on Automated Software Engineer- ing (ASE)

Lemieux, C., Park, D., Beschastnikh, I.: General ltl specification mining (t). In: 2015 30th IEEE/ACM International Conference on Automated Software Engineer- ing (ASE). pp. 81–92. IEEE (2015)

work page 2015

[26] [26]

Proceedings of the ACM on Programming Languages7(PLDI), 1463–1487 (2023)

Li, Z., Huang, J., Naik, M.: Scallop: A language for neurosymbolic programming. Proceedings of the ACM on Programming Languages7(PLDI), 1463–1487 (2023)

work page 2023

[27] [27]

In: The Twelfth International Conference on Learning Representations (2023)

Lightman, H., Kosaraju, V., Burda, Y., Edwards, H., Baker, B., Lee, T., Leike, J., Schulman, J., Sutskever, I., Cobbe, K.: Let’s verify step by step. In: The Twelfth International Conference on Learning Representations (2023)

work page 2023

[28] [28]

In: Conference on Robot Learning

Liu, J.X., Yang, Z., Idrees, I., Liang, S., Schornstein, B., Tellex, S., Shah, A.: Grounding complex natural language commands for temporal tasks in unseen en- vironments. In: Conference on Robot Learning. pp. 1084–1110. PMLR (2023)

work page 2023

[29] [29]

Advances in neural information processing systems31(2018)

Manhaeve, R., Dumancic, S., Kimmig, A., Demeester, T., De Raedt, L.: Deep- problog: Neural probabilistic logic programming. Advances in neural information processing systems31(2018)

work page 2018

[30] [30]

In: 2009 17th IEEE international requirements engineering conference

Mavin, A., Wilkinson, P., Harwood, A., Novak, M.: Easy approach to require- ments syntax (ears). In: 2009 17th IEEE international requirements engineering conference. pp. 317–322. IEEE (2009)

work page 2009

[31] [31]

In: 2018 Formal Meth- ods in Computer Aided Design (FMCAD)

Neider, D., Gavran, I.: Learning linear temporal properties. In: 2018 Formal Meth- ods in Computer Aided Design (FMCAD). pp. 1–10. IEEE (2018)

work page 2018

[32] [32]

Advances in neural information processing sys- tems35, 27730–27744 (2022)

Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., et al.: Training language models to follow instructions with human feedback. Advances in neural information processing sys- tems35, 27730–27744 (2022)

work page 2022

[33] [33]

In: 2023 IEEE Inter- national Conference on Robotics and Automation (ICRA)

Pan, J., Chou, G., Berenson, D.: Data-efficient learning of natural language to linear temporal logic translators for robot task specification. In: 2023 IEEE Inter- national Conference on Robotics and Automation (ICRA). pp. 11554–11561. IEEE (2023) 18 P. Quansah and E. Bonnah

work page 2023

[34] [34]

In: 18th annual symposium on foun- dations of computer science (sfcs 1977)

Pnueli, A.: The temporal logic of programs. In: 18th annual symposium on foun- dations of computer science (sfcs 1977). pp. 46–57. ieee (1977)

work page 1977

[35] [35]

Generative Language Modeling for Automated Theorem Proving

Polu, S., Sutskever, I.: Generative language modeling for automated theorem prov- ing. arXiv preprint arXiv:2009.03393 (2020)

work page internal anchor Pith review Pith/arXiv arXiv 2009

[36] [36]

arXiv preprint arXiv:2002.03668 (2020)

Roy, R., Fisman, D., Neider, D.: Learning interpretable models in the property specification language. arXiv preprint arXiv:2002.03668 (2020)

work page arXiv 2002

[37] [37]

Computer Science Review5(2), 163–203 (2011)

Rozier, K.Y.: Linear temporal logic symbolic model checking. Computer Science Review5(2), 163–203 (2011)

work page 2011

[38] [38]

In: Proceedings of the 24th International Conference on Software Engineering

Smith, R.L., Avrunin, G.S., Clarke, L.A., Osterweil, L.J.: Propel: an approach sup- porting property elucidation. In: Proceedings of the 24th International Conference on Software Engineering. pp. 11–21 (2002)

work page 2002

[39] [39]

In: Conference on Robot Learning

Wang, C., Ross, C., Kuo, Y.L., Katz, B., Barbu, A.: Learning a natural-language to ltl executable semantic parser for grounded robotics. In: Conference on Robot Learning. pp. 1706–1718. PMLR (2021)

work page 2021

[40] [40]

In: 2020 Chinese Control And Decision Conference (CCDC)

Xie, G., Yin, Z., Li, J.: Temporal logic based motion planning with infeasible ltl specification. In: 2020 Chinese Control And Decision Conference (CCDC). pp. 4899–4904. IEEE (2020)

work page 2020

[41] [41]

In: International conference on machine learning

Xu, J., Zhang, Z., Friedman, T., Liang, Y., Broeck, G.: A semantic loss function for deep learning with symbolic knowledge. In: International conference on machine learning. pp. 5502–5511. PMLR (2018)

work page 2018

[42] [42]

In: Proceedings of the 44th international conference on software engineering

Ye, H., Martinez, M., Monperrus, M.: Neural program repair with execution-based backpropagation. In: Proceedings of the 44th international conference on software engineering. pp. 1506–1518 (2022)

work page 2022