Can LLMs reason about program se- mantics? a comprehensive evaluation of LLMs on formal specification inference

Thanh Le-Cong, Bach Le, Toby Murray · 2025 · DOI 10.18653/v1/2025.acl-long.1068

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

open at publisher browse 3 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

KBSpec: LLM-driven Formal Specification Generation with Evolving Domain Knowledge Base

cs.SE · 2026-06-19 · unverdicted · novelty 7.0

KBSpec maintains an evolving knowledge base combining external docs and internal verifier feedback to improve LLM generation of verifiable JML specifications, achieving 10-25% higher verification pass rates.

CodeSpecBench: Benchmarking LLMs for Executable Behavioral Specification Generation

cs.SE · 2026-04-14 · accept · novelty 7.0

CodeSpecBench shows LLMs achieve at most 20.2% pass rate on repository-level executable behavioral specification generation, revealing that strong code generation does not imply deep semantic understanding.

AutoSOUP: Safety-Oriented Unit Proof Generation for Component-level Memory-Safety Verification

cs.SE · 2026-05-11 · unverdicted · novelty 6.0

AutoSOUP automates component-level memory-safety verification by generating Safety-Oriented Unit Proofs via three techniques and a hybrid LLM-plus-program-synthesis architecture called LLM-As-Function-Call.

citing papers explorer

Showing 1 of 1 citing paper after filters.

CodeSpecBench: Benchmarking LLMs for Executable Behavioral Specification Generation cs.SE · 2026-04-14 · accept · none · ref 19
CodeSpecBench shows LLMs achieve at most 20.2% pass rate on repository-level executable behavioral specification generation, revealing that strong code generation does not imply deep semantic understanding.

Can LLMs reason about program se- mantics? a comprehensive evaluation of LLMs on formal specification inference

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer