Can LLMs reason about program semantics? A comprehensive evaluation of LLMs on formal specification inference

Thanh Le-Cong, Bach Le, Toby Murray · 2025 · DOI 10.18653/v1/2025.acl-long.1068

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

CodeSpecBench: Benchmarking LLMs for Executable Behavioral Specification Generation

cs.SE · 2026-04-14 · accept · novelty 7.0

CodeSpecBench shows LLMs achieve at most 20.2% pass rate on repository-level executable behavioral specification generation, revealing that strong code generation does not imply deep semantic understanding.

AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification

cs.SE · 2026-05-11 · unverdicted · novelty 6.0

AutoSOUP automates component-level memory-safety verification by generating Safety-Oriented Unit Proofs via three techniques and a hybrid LLM-plus-program-synthesis architecture called LLM-As-Function-Call.

citing papers explorer

Showing 2 of 2 citing papers.

CodeSpecBench: Benchmarking LLMs for Executable Behavioral Specification Generation
cs.SE · 2026-04-14 · accept · none · ref 19
AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification
cs.SE · 2026-05-11 · unverdicted · none · ref 53
