pith. sign in

arxiv: 2604.26578 · v2 · pith:XID6Z2NGnew · submitted 2026-04-29 · 💻 cs.SE · cs.AI

Graph Construction and Matching for Imperative Programs using Neural and Structural Methods

Pith reviewed 2026-05-19 17:31 UTC · model grok-4.3

classification 💻 cs.SE cs.AI
keywords graph constructionimperative programsverification reuseabstract syntax treesneural embeddingsprogram similarityACSLJML
0
0 comments X

The pith

A pipeline builds consistent typed and attributed graphs from imperative programs across C, Java, and Dafny by combining abstract syntax tree parsing with neural embeddings.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method to represent programs and their specifications as graphs so that structural and semantic similarities can be identified for reuse of verification work. It demonstrates the approach on datasets from C with ACSL annotations, Java with JML, and Dafny, showing that the same pipeline produces comparable graphs despite differences in language and annotation style. The construction mixes standard syntax tree extraction with semantic vectors from models such as SentenceTransformer and CodeBERT. A sympathetic reader would care because manual reuse of verified components is time-consuming and error-prone in safety-critical software. If the graphs truly preserve both structure and meaning, later steps such as approximate matching become feasible without requiring identical code.

Core claim

The paper's central claim is that a pipeline integrating abstract syntax tree parsing with semantic embeddings from SentenceTransformer and CodeBERT can generate typed, attributed graphs that remain consistent across imperative languages and annotation styles including C with ACSL, Java with JML, and Dafny.

What carries the argument

Typed, attributed graphs produced by AST parsing augmented with neural semantic embeddings that encode both structural relationships and contextual meaning.

If this is right

  • Consistent graph representations become available for programs written in C, Java, and Dafny.
  • The graphs serve as a foundation for semantic enrichment of verification artifacts.
  • Approximate graph matching can be applied to identify reusable verified components.
  • Verification artefact reuse can scale across different languages and annotation styles.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Tools that automatically suggest verified code fragments for new projects could be built on top of these graphs.
  • The same construction method might apply to other languages once their parsers and suitable embedding models are added.
  • Integration with existing static analysis platforms could reduce redundant verification effort in large codebases.

Load-bearing premise

The graphs formed by merging syntax trees with embeddings from SentenceTransformer and CodeBERT capture enough structural and semantic information to support later approximate matching for verification reuse.

What would settle it

A test in which graphs constructed from semantically equivalent programs in two different languages produce low similarity scores under standard graph matching algorithms would show the representations are not consistent enough for reuse.

Figures

Figures reproduced from arXiv: 2604.26578 by Arshad Beg, Diarmuid O'Donoghue, Rosemary Monahan.

Figure 1
Figure 1. Figure 1: Workflow for verification artefact reuse via hybrid graph matching and view at source ↗
Figure 2
Figure 2. Figure 2: End-to-end workflow for graph construction and matching across C, view at source ↗
Figure 3
Figure 3. Figure 3: Prompt used to guide GPT-5.2 for ACSL annotation generation. view at source ↗
Figure 4
Figure 4. Figure 4: Graphs constructed for ACSL annotated C Programs of quicksort and view at source ↗
Figure 5
Figure 5. Figure 5: Graphs constructed for JML annotated Java Programs of quicksort and view at source ↗
read the original abstract

Reusing verification artefacts requires identifying structural and semantic similarities across programs and their specifications. In this paper, we focus on graph construction as a foundational step toward this goal. We present a pipeline that converts imperative programs and their annotations into typed, attributed graphs. Our experiments cover datasets including C with ACSL, Java with JML, and Dafny for C\#. The pipeline integrates abstract syntax tree parsing with semantic embeddings derived from models such as SentenceTransformer and CodeBERT. This enables the generation of graph representations that capture both structural relationships and semantic context. Our results show that consistent graph representations can be constructed across different languages and annotation styles. This work provides a practical basis for future steps in semantic enrichment and approximate graph matching for scalable verification artefact reuse.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents a pipeline for converting imperative programs and their annotations (C with ACSL, Java with JML, and Dafny) into typed, attributed graphs. The method integrates AST parsing with semantic embeddings from SentenceTransformer and CodeBERT to capture structural relationships and semantic context. The central claim is that this produces consistent graph representations across languages and annotation styles, providing a practical basis for future semantic enrichment and approximate graph matching to enable scalable verification artefact reuse.

Significance. If the consistency of the generated graphs were quantitatively validated, the work could supply a useful foundation for cross-language reuse of verification artefacts, a longstanding challenge in formal methods. The combination of structural AST methods with neural embeddings is a plausible direction. However, the manuscript is currently descriptive and supplies no empirical metrics, limiting its assessed significance.

major comments (2)
  1. [Abstract] Abstract: The assertion that 'consistent graph representations can be constructed across different languages and annotation styles' is presented without any reported quantitative support (e.g., cross-language node-embedding cosine similarities, graph-edit-distance statistics, inter-annotator agreement on labels, or comparison to a pure-AST baseline). This is load-bearing for the central claim.
  2. [Experiments] Experiments section: No metrics, validation details, error analysis, dataset sizes, or baseline comparisons are supplied to substantiate that the AST+SentenceTransformer/CodeBERT pipeline yields aligned structural and semantic attributes across C/ACSL, Java/JML, and Dafny. The consistency claim therefore remains an untested design assertion rather than an observed outcome.
minor comments (2)
  1. [Graph Construction] The description of graph attributes and typing could be clarified by including a small concrete example of a generated graph with node/edge labels.
  2. [Related Work] Consider adding citations to prior work on code graph representations and verification reuse to better situate the contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We agree that the manuscript is currently descriptive and that the central claim of cross-language consistency requires quantitative validation to strengthen its significance. We will revise the paper to incorporate empirical metrics, dataset details, and baseline comparisons as outlined below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The assertion that 'consistent graph representations can be constructed across different languages and annotation styles' is presented without any reported quantitative support (e.g., cross-language node-embedding cosine similarities, graph-edit-distance statistics, inter-annotator agreement on labels, or comparison to a pure-AST baseline). This is load-bearing for the central claim.

    Authors: We agree that the claim is load-bearing and currently lacks quantitative support. In the revised manuscript we will add explicit metrics: average cosine similarity of corresponding node embeddings across C/ACSL, Java/JML and Dafny instances of equivalent programs; graph-edit-distance statistics for matched programs; and a direct comparison against a pure-AST baseline that omits neural embeddings. The abstract will be updated to reflect these new results rather than the design assertion alone. revision: yes

  2. Referee: [Experiments] Experiments section: No metrics, validation details, error analysis, dataset sizes, or baseline comparisons are supplied to substantiate that the AST+SentenceTransformer/CodeBERT pipeline yields aligned structural and semantic attributes across C/ACSL, Java/JML, and Dafny. The consistency claim therefore remains an untested design assertion rather than an observed outcome.

    Authors: We accept this assessment. The current experiments section describes the pipeline and the three language datasets but supplies no numerical validation. We will expand the section with: exact dataset sizes (number of programs and annotations per language), quantitative alignment metrics (cosine similarities and edit distances), an error analysis of cases where embeddings diverge, and the pure-AST baseline comparison. These additions will convert the consistency statement into an empirically observed result. revision: yes

Circularity Check

0 steps flagged

No circularity detected in descriptive pipeline

full rationale

The paper describes a pipeline that converts programs in C/ACSL, Java/JML and Dafny into typed attributed graphs by combining AST parsing with semantic embeddings from SentenceTransformer and CodeBERT. No equations, fitted parameters, predictions, or derivations are presented anywhere in the manuscript. The central claim that 'consistent graph representations can be constructed across different languages and annotation styles' is an assertion about the output of the described pipeline rather than a result derived from self-citation chains, self-definitional loops, or renaming of known results. The work is therefore self-contained as a methodological description with no load-bearing step that reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The approach relies on standard AST parsing and pre-trained embedding models without introducing new free parameters, axioms, or invented entities in the abstract.

pith-pipeline@v0.9.0 · 5655 in / 1084 out tokens · 41758 ms · 2026-05-19T17:31:52.861662+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages · 1 internal anchor

  1. [1]

    In: International Conference on Computational Creativity (ICCC)

    Ajankovil KG, O’Donoghue D, Monahan R (2021) Creating new program proofs by combining abductive and deductive reasoning. In: International Conference on Computational Creativity (ICCC)

  2. [2]

    Ali SJ, Naganathan V, Bork D (2024) Establishing traceability between natural language requirements and software artifacts by combining RAG and llms. In: Maass W, Han H, Yasar H, Multari NJ (eds) Conceptual Modeling - 43rd International Conference, ER 2024, Pittsburgh, PA, USA, October 28-31, 2024, Proceedings, Springer, Lecture Notes in Computer Science,...

  3. [3]

    Beg A, O’Donoghue D, Monahan R (2026) Learning-infused formal reason- ing: From contract synthesis to artifact reuse and formal semantics. URL https://arxiv.org/abs/2602.02881, vERIF AI-2026: The Interplay between Artificial Intelligence and Software Verification, LASER Center, Villebru- mier, France, March 8–11, 2026, 2602.02881

  4. [4]

    Short Version of VERIFAI2026 Paper -- Learning Infused Formal Reasoning: Contract Synthesis, Artefact Reuse and Semantic Foundations

    Beg A, O’Donoghue D, Monahan R (2026) Short version of verifai2026 paper – learning infused formal reasoning: Contract synthesis, artefact reuse and semantic foundations. URL https://arxiv.org/abs/2604.12747, accepted at AASC-2026: ADAPT Annual Scientific Conference 2026, Dublin City Uni- versity, Ireland, May 14, 2026, 2604.12747

  5. [5]

    In: Kosmatov N, Graph Construction and Matching for Imperative Programs 19 Kovács L (eds) Integrated Formal Methods, Springer Nature Switzerland, Cham, pp 307–325

    Granberry G, Ahrendt W, Johansson M (2025) Specify what? enhanc- ing neural specification synthesis by symbolic methods. In: Kosmatov N, Graph Construction and Matching for Imperative Programs 19 Kovács L (eds) Integrated Formal Methods, Springer Nature Switzerland, Cham, pp 307–325

  6. [6]

    Lippolis AS, Klironomos A, Milon-Flores DF, Zheng H, Jouglar A, Norouzi E, Hogan A (2023) Enhancing entity alignment between wikidata and art- graph using llms. In: Bikakis A, Ferrario R, Jean S, Markhoff B, Mosca A, Asmundo MN (eds) Proceedings of the International Workshop on Se- mantic Web and Ontology Design for Cultural Heritage co-located with the I...

  7. [7]

    CoRR abs/2501.08686, DOI 10 .48550/ARXIV.2501.08686, URL https://doi.org/ 10.48550/arXiv.2501.08686, 2501.08686

    Ma C, Chakrabarti S, Khan A, Molnár B (2025) Knowledge graph- based retrieval-augmented generation for schema matching. CoRR abs/2501.08686, DOI 10 .48550/ARXIV.2501.08686, URL https://doi.org/ 10.48550/arXiv.2501.08686, 2501.08686

  8. [8]

    In: International Conference on Com- putational Creativity (ICCC), Brigham Young University, USA

    O’Donoghue D, Abgaz Y, Hurley D, Ronzano F (2015) Stimulating and sim- ulating creativity with dr inventor. In: International Conference on Com- putational Creativity (ICCC), Brigham Young University, USA

  9. [9]

    In: PICS-Publication Series of the Institute of Cognitive Science, Institute of Cognitive Science, vol 1

    O’Donoghue D, Monahan R, Grijincu D, Pitu M, Halim F, Rahman F, Abgaz Y, Hurley D (2014) Creating formal specifications with analogical reasoning. In: PICS-Publication Series of the Institute of Cognitive Science, Institute of Cognitive Science, vol 1

  10. [10]

    In: Proceedings Artificial Intelligence for Formal Methods (AI4FM), Depart- ment of Computer Science, Heriot-Watt University

    Pitu M, Grijincu D, Li P, Saleem A, Monahan R, O’Donoghue D (2013) Arís: Analogical reasoning for reuse of implementation & specification. In: Proceedings Artificial Intelligence for Formal Methods (AI4FM), Depart- ment of Computer Science, Heriot-Watt University

  11. [11]

    Strecker M (2007) Modeling and verifying graph transformations in proof assistants. In: Mackie I, Plump D (eds) Proceedings of the Fourth In- ternational Workshop on Computing with Terms and Graphs, TER- MGRAPH@ETAPS 2007, Braga, Portugal, March 31, 2007, Elsevier, Electronic Notes in Theoretical Computer Science, vol 203, pp 135– 148, DOI 10 .1016/J.ENTC...

  12. [12]

    In: International conference on applied engineering and natural sciences, vol 1, pp 1050–1056

    Topsakal O, Akinci TC (2023) Creating large language model applications utilizing langchain: A primer on developing llm apps fast. In: International conference on applied engineering and natural sciences, vol 1, pp 1050–1056

  13. [13]

    Tufek N, Thuluva AS, Just VP, Ekaputra FJ, Bandyopadhyay T, Sabou M, Hanbury A (2024) Validating semantic artifacts with large language models. In: Meroño-Peñuela A, Corcho Ó, Groth P, Simperl E, Tamma V, Nuzzolese AG, Poveda-Villalón M, Sabou M, Presutti V, Celino I, Revenko A, Raad J, Sartini B, Lisena P (eds) The Semantic Web: ESWC 2024 Satellite Event...

  14. [14]

    In: Hassan- 20 A

    Vandemoortele N, Steenwinckel B, Hoecke SV, Ongenae F (2024) Scalable table-to-knowledge graph matching from metadata using llms. In: Hassan- 20 A. Beg and D. O’Donoghue and R. Monahan zadeh O, Abdelmageed N, Cremaschi M, Cutrona V, D’Adda F, Efthymiou V, Kruit B, Lobo EA, Mihindukulasooriya N, Pham NH (eds) Proceed- ings of the Semantic Web Challenge on ...

  15. [15]

    large language models: Competitors or partners in supporting virtual museums

    Vasic I, Fill HG, Quattrini R, Pierdicca R (2025) Knowledge graphs vs. large language models: Competitors or partners in supporting virtual museums. ACM Journal on Computing and Cultural Heritage