Neuroforger: certified violation witnesses for smart contracts verification via LLMs

Enrico Lipparini; Massimo Bartoletti

arxiv: 2605.31389 · v1 · pith:TQ6PHJHXnew · submitted 2026-05-29 · 💻 cs.PL · cs.CR

Neuroforger: certified violation witnesses for smart contracts verification via LLMs

Massimo Bartoletti , Enrico Lipparini This is my paper

Pith reviewed 2026-06-28 20:02 UTC · model grok-4.3

classification 💻 cs.PL cs.CR

keywords smart contract verificationlarge language modelscounterexamplesviolation witnessesSolidityabstract typescertified witnesses

0 comments

The pith

A Solidity extension with abstract types turns LLM outputs into certified counterexamples for smart contract properties.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper aims to make LLM-based verification of smart contracts reliable by solving two problems at once: natural language properties are ambiguous, and LLM answers lack correctness guarantees. It does this by defining a new way to write specifications as Solidity tests that include variables of abstract types, which represent possible values that could cause a violation. An LLM suggests concrete values for these variables, after which type checking confirms they are valid, and running the test on the contract shows if a violation actually occurs. This produces executable counterexamples that serve as certified witnesses.

Core claim

The central claim is that a specification can be expressed as a Solidity test function containing existentially quantified variables of abstract type. The LLM provides an instantiation of these variables with concrete values of the appropriate types. Type checking and subsequent concrete execution of the test then certify whether the instantiation demonstrates a violation of the target property, yielding a reliable proof-of-concept counterexample.

What carries the argument

The Solidity test with existentially quantified abstract-type variables, whose instantiation by the LLM is validated through type checking and execution to produce a certified violation witness.

If this is right

Certified counterexamples can be directly executed to demonstrate property violations in smart contracts.
The method combines the generative power of LLMs with the reliability of type systems and runtime checking.
It applies to a range of properties expressible through test-like specifications in Solidity.
Violation witnesses become portable PoCs that auditors can use without trusting the LLM directly.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This approach might allow verification of more complex properties by leveraging LLM creativity for finding edge cases.
Similar techniques could apply to other programming languages with test frameworks.
The workflow could be iterated, using failed executions to refine LLM prompts.

Load-bearing premise

That type checking and concrete execution of the instantiated test will always detect genuine violations without being fooled by incorrect modeling of the contract or incomplete property specification.

What would settle it

A counterexample where an LLM instantiation type-checks and the test passes execution, yet the contract does not actually violate the property due to differences in how the abstract test models the real execution environment.

Figures

Figures reproduced from arXiv: 2605.31389 by Enrico Lipparini, Massimo Bartoletti.

**Figure 1.** Figure 1: GPT-5 output of [12] for the query withdraw-assets-credit-others. Answer : FALSE Explanation : withdraw decreases only credits [ msg . sender ] and then performs an external call to msg . sender . During that external call , arbitrary reentrant code can execute and invoke deposit from a different address , changing another user ’s credit . Thus , other users ’ credits are not necessarily preserved . See wi… view at source ↗

**Figure 2.** Figure 2: Workflow of the procedure. Contract(s) Spec LLM type check & forge test true? (False, ⟨CEX ⟩) ⟨CEX ⟩? True? PASS FAIL 4 Producing certified counterexamples We present in this section our method to produce certified counterexamples. The high-level workflow of our procedure is displayed in [PITH_FULL_IMAGE:figures/full_fig_p011_2.png] view at source ↗

read the original abstract

Recent large language models (LLMs) incorporate reasoning capabilities that allow them to perform well in predicting whether a smart contract respects a certain property, suggesting a complementary approach to traditional formal-methods-based techniques for smart contract verification. However, the application of LLMs in such context has two major issues: 1) properties expressed in natural language are intrinsically ambiguous, and 2) answers returned by LLMs have no guarantee of correctness. In this paper, we address both issues simultaneously by: 1) introducing a new formal specification language that extends Solidity with abstract types, and 2) designing a workflow that combines LLMs with type checking and concrete execution to generate and validate violation witnesses (i.e., counterexamples). The key idea is to represent a specification as a Solidity test with (existentially quantified) variables of abstract type; finding an instantiation of these variables to concrete values (of the correct type) concretizes the test into an executable counterexample (PoC) for the target property. We implemented our procedure in the tool Neuroforger, experimentally evaluating it on a smart-contract verification dataset drawn from literature, obtaining promising results that demonstrate its potential applicability in the wild.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Neuroforger gives a concrete workflow for turning LLM outputs into certified counterexamples via abstract types in Solidity tests, but the certification step only works if the test itself is a complete encoding of the property.

read the letter

The paper's main move is to encode a property as a Solidity test containing existentially quantified variables of abstract type. An LLM proposes concrete values for those variables; type checking plus execution then turns the result into a validated violation witness. This is presented as a way to get both the flexibility of LLMs and some formal guarantee without writing full specifications in a separate logic.

What is actually new is the specific representation choice—abstract types inside ordinary-looking Solidity tests—plus the tight loop that feeds the LLM output straight into the existing compiler and EVM runner. Earlier LLM work on contracts either produced natural-language claims or required separate formal oracles; this one tries to stay inside the language the developers already use.

The workflow is practical on paper because it re-uses the type checker and execution engine rather than inventing new proof machinery. The authors report running it on a dataset drawn from the literature and call the results promising, which at least shows the pipeline can be implemented and exercised.

The soft spot is exactly where the stress-test note points: everything rests on the test being a faithful model of the target property. If the test omits reentrancy, external calls, or ordering constraints, then a type-and-run success can still be spurious. The abstract gives no error analysis, no breakdown of false positives, and no discussion of how they chose or validated the test encodings themselves. Without those details it is difficult to know how often the certification is meaningful versus how often it just confirms that the LLM followed the test syntax.

This is for people building or using verification tools for blockchain contracts who already write tests and want to augment them with LLM assistance. A reader working on hybrid LLM-formal methods would find the concrete implementation worth examining.

I would send it to peer review. The idea is clear enough and the engineering path is short enough that referees can check the soundness claim against actual runs.

Referee Report

2 major / 1 minor

Summary. The paper introduces Neuroforger, a workflow and tool that extends Solidity with abstract types to express smart-contract properties as tests containing existentially quantified variables. LLMs are used to propose concrete instantiations of these variables; the resulting test is then type-checked and executed on the EVM to produce a certified violation witness (PoC). The approach is claimed to address both the ambiguity of natural-language properties and the lack of correctness guarantees in LLM outputs, with an experimental evaluation on a literature-derived dataset reported to yield promising results.

Significance. If the central claim holds, the work offers a practical hybrid technique that could complement traditional formal-methods tools for smart-contract verification by generating certified counterexamples. The combination of an extended specification language with external type checkers and execution engines for certification is a notable strength, and the provision of reproducible code or falsifiable predictions would further strengthen its contribution.

major comments (2)

[Abstract / workflow description] Abstract and workflow description: the manuscript states that type checking plus concrete execution certifies an LLM-proposed instantiation as a genuine violation witness, yet supplies no soundness argument addressing potential semantic gaps between the extended Solidity test and the target property (e.g., omitted reentrancy, ordering constraints on abstract types, or external-call effects). This assumption is load-bearing for the certification claim.
[Abstract] Abstract: the evaluation is described only as 'promising results' on a literature dataset; no quantitative metrics, error rates, baseline comparisons, or analysis of edge cases such as non-determinism or incomplete contract models are provided, preventing assessment of whether the central claim is supported.

minor comments (1)

[Abstract] The acronym 'PoC' is introduced without an explicit definition in the abstract.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback. We address the major comments point by point below and will revise the manuscript to strengthen the presentation of the certification claim and evaluation details.

read point-by-point responses

Referee: [Abstract / workflow description] Abstract and workflow description: the manuscript states that type checking plus concrete execution certifies an LLM-proposed instantiation as a genuine violation witness, yet supplies no soundness argument addressing potential semantic gaps between the extended Solidity test and the target property (e.g., omitted reentrancy, ordering constraints on abstract types, or external-call effects). This assumption is load-bearing for the certification claim.

Authors: We agree that an explicit soundness argument is necessary to support the certification claim. The workflow certifies that a concrete instantiation violates the test as written (via type checking and EVM execution), and the test is intended as a direct formalization of the target property using abstract types for existential quantification. However, we acknowledge that the manuscript does not currently provide a detailed argument addressing potential gaps such as reentrancy, ordering constraints, or external-call effects. In the revision we will add a dedicated soundness section that (1) formalizes the relationship between the abstract test and the property, (2) states the assumptions under which the certification holds, and (3) discusses how the test language can express the cited concerns when they are part of the property. This will make the load-bearing assumption explicit and defensible. revision: yes
Referee: [Abstract] Abstract: the evaluation is described only as 'promising results' on a literature dataset; no quantitative metrics, error rates, baseline comparisons, or analysis of edge cases such as non-determinism or incomplete contract models are provided, preventing assessment of whether the central claim is supported.

Authors: The abstract is written at a high level, using the phrase 'promising results' to summarize the experimental outcome. The full manuscript contains a dedicated evaluation section on the literature-derived dataset. To address the concern and enable direct assessment of the central claim, we will revise the abstract to report key quantitative metrics (e.g., witness-generation success rate, false-positive rate after certification) and will expand the evaluation section with explicit analysis of edge cases including LLM non-determinism, incomplete contract models, and any baseline comparisons performed. These additions will be included in the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper describes a workflow that encodes specifications as Solidity tests with abstract-typed existentially quantified variables, then uses LLMs to propose instantiations, followed by external type checking and concrete execution to produce certified counterexamples. No equations, fitted parameters, or mathematical derivations appear. The certification step relies on independent external tools (Solidity type checker and EVM execution) rather than any internal definition that loops back to the LLM output or the input property. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The method is therefore self-contained against external benchmarks and does not reduce any claim to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Reviewed from abstract only; no free parameters, axioms, or invented entities are specified in the provided text.

pith-pipeline@v0.9.1-grok · 5736 in / 1172 out tokens · 28732 ms · 2026-06-28T20:02:47.009192+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

45 extracted references · 26 canonical work pages · 3 internal anchors

[1]

Neuroforger: certified violation witnesses for smart contracts ver- ification via LLMs

Public data for “Neuroforger: certified violation witnesses for smart contracts ver- ification via LLMs”.https://github.com/elipparini/Neuroforger(2026)

2026
[2]

In: TACAS (2020)

Albert, E., Correas, J., Gordillo, P., Rom´ an-D´ ıez, G., Rubio, A.: GASOL: Gas analysis and optimization for Ethereum smart contracts. In: TACAS (2020). https://doi.org/10.1007/978-3-030-45237-7_7

work page doi:10.1007/978-3-030-45237-7_7 2020
[3]

In: CAV (2022).https://doi.org/10.1007/978-3-031-13185-1_ 16 18 Massimo Bartoletti and Enrico Lipparini

Alt, L., Blicha, M., Hyv¨ arinen, A.E.J., Sharygina, N.: SolCMC: Solidity compiler’s model checker. In: CAV (2022).https://doi.org/10.1007/978-3-031-13185-1_ 16 18 Massimo Bartoletti and Enrico Lipparini

work page doi:10.1007/978-3-031-13185-1_ 2022
[4]

PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts

Andersson, V., Bobadilla, S., Hobbelhagen, H., Monperrus, M.: Poco: Agentic proof-of-concept exploit generation for smart contracts. CoRR (2026).https: //doi.org/10.48550/ARXIV.2511.02780,https://arxiv.org/abs/2511.02780

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2511.02780 2026
[5]

In: POST (2017).https://doi.org/10.1007/978-3-662-54455-6_8

Atzei, N., Bartoletti, M., Cimoli, T.: A survey of attacks on Ethereum smart con- tracts (sok). In: POST (2017).https://doi.org/10.1007/978-3-662-54455-6_8

work page doi:10.1007/978-3-662-54455-6_8 2017
[6]

In: IEEE S&P (2023).https://doi.org/10.1109/SP46215.2023.10179435

Babel, K., Daian, P., Kelkar, M., Juels, A.: Clockwork finance: Automated analysis of economic security in smart contracts. In: IEEE S&P (2023).https://doi.org/ 10.1109/SP46215.2023.10179346

work page doi:10.1109/sp46215.2023.10179346 2023
[7]

Future Gener

Bartoletti, M., Benetollo, L., Bugliesi, M., Crafa, S., Sasso, G.D., Pettinau, R., Pinna, A., Piras, M., Rossi, S., Salis, S., Span` o, A., Tkachenko, V., Tonelli, R., Zunino, R.: Smart contract languages: A comparative analysis. Future Gener. Com- put. Syst.164(2025).https://doi.org/10.1016/J.FUTURE.2024.107563

work page doi:10.1016/j.future.2024.107563 2025
[8]

In: FMBC (2025).https://doi.org/10

Bartoletti, M., Crafa, S., Lipparini, E.: Formal Verification in Solidity and Move: Insights from a Comparative Analysis. In: FMBC (2025).https://doi.org/10. 4230/OASIcs.FMBC.2025.3

2025
[9]

In: Integrated Formal Methods (2024).https://doi

Bartoletti, M., Ferrando, A., Lipparini, E., Malvone, V.: Solvent: Liquidity veri- fication of smart contracts. In: Integrated Formal Methods (2024).https://doi. org/10.1007/978-3-031-76554-4_14

work page doi:10.1007/978-3-031-76554-4_14 2024
[10]

KindHML: formal verification of smart contracts based on Hennessy-Milner logic

Bartoletti, M., Ferrando, A., Lipparini, E., Malvone, V.: KindHML: formal verifi- cation of smart contracts based on Hennessy-Milner logic. CoRR (2026).https: //doi.org/10.48550/ARXIV.2604.14038,https://arxiv.org/abs/2604.14038

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2604.14038 2026
[11]

In: FMBC (2024).https://doi.org/ 10.4230/OASICS.FMBC.2024.6

Bartoletti, M., Fioravanti, F., Matricardi, G., Pettinau, R., Sainas, F.: Towards benchmarking of Solidity verification tools. In: FMBC (2024).https://doi.org/ 10.4230/OASICS.FMBC.2024.6

work page doi:10.4230/oasics.fmbc.2024.6 2024
[12]

In: Financial Cryptography (2026)

Bartoletti, M., Lipparini, E., Pompianu, L.: LLMs as verification oracles for Solid- ity. In: Financial Cryptography (2026)

2026
[13]

Bernardi, T., Dor, N., Fedotov, A., Grossman, S., Immerman, N., Jackson, D., Nutz, A., Oppenheim, L., Pistiner, O., Rinetzky, N., Sagiv, M., Taube, M., Toman, J.A., Wilcox, J.R.: WIP: Finding bugs automatically in smart contracts with parameterized invariants.https://groups.csail.mit.edu/sdg/pubs/2020/ sbc2020.pdf(2020)

2020
[14]

In: Conference on Artificial Intelligence and Theorem Proving (2019)

Blanchette, J.C., Ouraoui, D.E., Fontaine, P., Kaliszyk, C.: Machine Learning for Instance Selection in SMT Solving. In: Conference on Artificial Intelligence and Theorem Proving (2019)

2019
[15]

Certora verification reports.https://docs.certora.com/en/latest/docs/ prover/dashboard/report.html(2025)

2025
[16]

Certora: Hooks.https://docs.certora.com/en/latest/docs/cvl/hooks.html (2025)

2025
[17]

In: ICSE (2024).https://doi.org/10

Chaliasos, S., Charalambous, M.A., Zhou, L., Galanopoulou, R., Gervais, A., Mitropoulos, D., Livshits, B.: Smart contract and DeFi security: Insights from tool evaluations and practitioner surveys. In: ICSE (2024).https://doi.org/10. 1145/3597503.3623302

work page arXiv 2024
[18]

ACM Comput

Chen, T.Y., Kuo, F., Liu, H., Poon, P., Towey, D., Tse, T.H., Zhou, Z.Q.: Meta- morphic testing: A review of challenges and opportunities. ACM Comput. Surv. 51(1) (2018).https://doi.org/10.1145/3143561

work page doi:10.1145/3143561 2018
[19]

Cramer, M., McIntyre, L.: Verifying LLM-generated code in the context of software verification with Ada/SPARK (2025),https://arxiv.org/abs/2502.07728

work page arXiv 2025
[20]

In: Financial Cryptography (2020).https://doi.org/10

Eskandari, S., Moosavi, S., Clark, J.: SoK: Transparent Dishonesty: Front-Running Attacks on Blockchain. In: Financial Cryptography (2020).https://doi.org/10. 1007/978-3-030-43725-1_13 Certified violation witnesses for smart contracts verification via LLMs 19

2020
[21]

Foundry: A blazing fast, portable and modular toolkit for Ethereum application development written in Rust.https://github.com/foundry-rs/foundry(2025)

2025
[22]

Foundry: Forge.https://www.getfoundry.sh/forge(2026)

2026
[23]

Foundry: snapshotState cheatcodes.https://getfoundry.sh/reference/ cheatcodes/state-snapshots(2026)

2026
[24]

Foundry: startStateDiffRecording cheatcodes.https://www.getfoundry.sh/ reference/cheatcodes/start-state-diff-recording(2026)

2026
[25]

In: Financial Cryptography (2026)

Gervais, A., Zhou, L.: AI agent smart contract exploit generation. In: Financial Cryptography (2026)

2026
[26]

In: FMCAD (2018).https: //doi.org/10.23919/FMCAD.2018.8603013

Hojjat, H., R¨ ummer, P.: The ELDARICA Horn solver. In: FMCAD (2018).https: //doi.org/10.23919/FMCAD.2018.8603013

work page doi:10.23919/fmcad.2018.8603013 2018
[27]

certora.com/en/latest/docs/whitepaper/index.html(2022)

Jackson, D., Nandi, C., Sagiv, M.: Certora technology white paper.https://docs. certora.com/en/latest/docs/whitepaper/index.html(2022)

2022
[28]

International Journal of Approximate Reasoning189(2026)

Jakub˚ uv, J., Janota, M., Piepenbrock, J., Urban, J.: Machine learning for quantifier selection in cvc5. International Journal of Approximate Reasoning189(2026). https://doi.org/https://doi.org/10.1016/j.ijar.2025.109602

work page doi:10.1016/j.ijar.2025.109602 2026
[29]

In: FMCAD (2024).https://doi.org/10.34727/2024/ISBN.978-3-85448-065-5_16

Kamath, A., Senthilnathan, A., Chakraborty, S., Deligiannis, P., Lahiri, S., Lal, A., Rastogi, A., Roy, S., Sharma, R.: Leveraging LLMs for program verification. In: FMCAD (2024).https://doi.org/10.34727/2024/ISBN.978-3-85448-065-5_16

work page doi:10.34727/2024/isbn.978-3-85448-065-5_16 2024
[30]

In: NDSS (2025)

Liu, Y., Xue, Y., Wu, D., Sun, Y., Li, Y., Shi, M., Liu, Y.: PropertyGPT: LLM- driven formal verification of smart contracts through retrieval-augmented property generation. In: NDSS (2025)

2025
[31]

In: TACAS (2008)

de Moura, L.M., Bjørner, N.: Z3: An efficient SMT solver. In: TACAS (2008). https://doi.org/10.1007/978-3-540-78800-3_24

work page doi:10.1007/978-3-540-78800-3_24 2008
[32]

OpenAI: Gpt-5 docs (2025),https://platform.openai.com/docs/models/gpt-5 [Accessed: 2025-09-12]

2025
[33]

In: IEEE/ACM ASE (2024)

Pirzada, M.A.A., Reger, G., Bhayat, A., Cordeiro, L.C.: LLM-generated invariants for bounded model checking without loop unrolling. In: IEEE/ACM ASE (2024). https://doi.org/10.1145/3691620.3695512

work page doi:10.1145/3691620.3695512 2024
[34]

In: Financial Cryptography (2021).https://doi

Qin, K., Zhou, L., Livshits, B., Gervais, A.: Attacking the DeFi ecosystem with Flash Loans for fun and profit. In: Financial Cryptography (2021).https://doi. org/10.1007/978-3-662-64322-8_1

work page doi:10.1007/978-3-662-64322-8_1 2021
[35]

In: FMBC (2024).https://doi.org/10.4230/OASICS.FMBC.2024.8

Schiffl, J., Beckert, B.: A practical notion of liveness in smart contract applications. In: FMBC (2024).https://doi.org/10.4230/OASICS.FMBC.2024.8

work page doi:10.4230/oasics.fmbc.2024.8 2024
[36]

In: TACAS (2021).https:// doi.org/10.1007/978-3-030-72013-1_16

Scott, J., Niemetz, A., Preiner, M., Nejati, S., Ganesh, V.: MachSMT: A machine learning-based algorithm selector for SMT solvers. In: TACAS (2021).https:// doi.org/10.1007/978-3-030-72013-1_16

work page doi:10.1007/978-3-030-72013-1_16 2021
[37]

In: IEEE S&P (2024).https://doi.org/ 10.1109/SP54263.2024.00230

Sendner, C., Petzi, L., Stang, J., Dmitrienko, A.: Large-scale study of vulnerability scanners for Ethereum smart contracts. In: IEEE S&P (2024).https://doi.org/ 10.1109/SP54263.2024.00230

work page doi:10.1109/sp54263.2024.00230 2024
[38]

The Solidity Authors: Contracts.https://docs.soliditylang.org/en/latest/ contracts.html#contracts(2025)

2025
[39]

The Solidity Authors: Receive Ether Function.https://docs.soliditylang.org/ en/latest/contracts.html#receive-ether-function(2025)

2025
[40]

In: AST (2025).https://doi.org/10.1109/AST66626.2025

Tihanyi, N., Charalambous, Y., Jain, R., Ferrag, M.A., Cordeiro, L.C.: A new era in software security: Towards self-healing software via large language models and formal verification. In: AST (2025).https://doi.org/10.1109/AST66626.2025. 00020

work page doi:10.1109/ast66626.2025 2025
[41]

In: ASE (2024).https://doi

Wu, G., Cao, W., Yao, Y., Wei, H., Chen, T., Ma, X.: LLM meets bounded model checking: Neuro-symbolic loop invariant inference. In: ASE (2024).https://doi. org/10.1145/3691620.3695014 20 Massimo Bartoletti and Enrico Lipparini

work page doi:10.1145/3691620.3695014 2024
[42]

In: ICLR (2024),https://openreview.net/ forum?id=Q3YaCghZNt

Wu, H., Barrett, C., Narodytska, N.: Lemur: Integrating large language models in automated program verification. In: ICLR (2024),https://openreview.net/ forum?id=Q3YaCghZNt

2024
[43]

Xiao, Z., Li, Y., Wang, Q., Chen, S.: Prompt to pwn: Automated exploit generation for smart contracts (2025),https://arxiv.org/abs/2508.01371

work page internal anchor Pith review Pith/arXiv arXiv 2025
[44]

In: IJCAI (2025).https://doi

Zhou, B., Wang, H., Yao, Y., Chen, T., Xu, F., Ma, X.: Simulate, refine and inte- grate: Strategy synthesis for efficient SMT solving. In: IJCAI (2025).https://doi. org/10.24963/ijcai.2025/887,https://doi.org/10.24963/ijcai.2025/887

work page doi:10.24963/ijcai.2025/887 2025
[45]

In: IEEE S&P (2023).https://doi.org/10.1109/SP46215.2023.10179435

Zhou, L., Xiong, X., Ernstberger, J., Chaliasos, S., Wang, Z., Wang, Y., Qin, K., Wattenhofer, R., Song, D., Gervais, A.: Sok: Decentralized finance (DeFi) attacks. In: IEEE S&P (2023).https://doi.org/10.1109/SP46215.2023.10179435

work page doi:10.1109/sp46215.2023.10179435 2023

[1] [1]

Neuroforger: certified violation witnesses for smart contracts ver- ification via LLMs

Public data for “Neuroforger: certified violation witnesses for smart contracts ver- ification via LLMs”.https://github.com/elipparini/Neuroforger(2026)

2026

[2] [2]

In: TACAS (2020)

Albert, E., Correas, J., Gordillo, P., Rom´ an-D´ ıez, G., Rubio, A.: GASOL: Gas analysis and optimization for Ethereum smart contracts. In: TACAS (2020). https://doi.org/10.1007/978-3-030-45237-7_7

work page doi:10.1007/978-3-030-45237-7_7 2020

[3] [3]

In: CAV (2022).https://doi.org/10.1007/978-3-031-13185-1_ 16 18 Massimo Bartoletti and Enrico Lipparini

Alt, L., Blicha, M., Hyv¨ arinen, A.E.J., Sharygina, N.: SolCMC: Solidity compiler’s model checker. In: CAV (2022).https://doi.org/10.1007/978-3-031-13185-1_ 16 18 Massimo Bartoletti and Enrico Lipparini

work page doi:10.1007/978-3-031-13185-1_ 2022

[4] [4]

PoCo: Agentic Proof-of-Concept Exploit Generation for Smart Contracts

Andersson, V., Bobadilla, S., Hobbelhagen, H., Monperrus, M.: Poco: Agentic proof-of-concept exploit generation for smart contracts. CoRR (2026).https: //doi.org/10.48550/ARXIV.2511.02780,https://arxiv.org/abs/2511.02780

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2511.02780 2026

[5] [5]

In: POST (2017).https://doi.org/10.1007/978-3-662-54455-6_8

Atzei, N., Bartoletti, M., Cimoli, T.: A survey of attacks on Ethereum smart con- tracts (sok). In: POST (2017).https://doi.org/10.1007/978-3-662-54455-6_8

work page doi:10.1007/978-3-662-54455-6_8 2017

[6] [6]

In: IEEE S&P (2023).https://doi.org/10.1109/SP46215.2023.10179435

Babel, K., Daian, P., Kelkar, M., Juels, A.: Clockwork finance: Automated analysis of economic security in smart contracts. In: IEEE S&P (2023).https://doi.org/ 10.1109/SP46215.2023.10179346

work page doi:10.1109/sp46215.2023.10179346 2023

[7] [7]

Future Gener

Bartoletti, M., Benetollo, L., Bugliesi, M., Crafa, S., Sasso, G.D., Pettinau, R., Pinna, A., Piras, M., Rossi, S., Salis, S., Span` o, A., Tkachenko, V., Tonelli, R., Zunino, R.: Smart contract languages: A comparative analysis. Future Gener. Com- put. Syst.164(2025).https://doi.org/10.1016/J.FUTURE.2024.107563

work page doi:10.1016/j.future.2024.107563 2025

[8] [8]

In: FMBC (2025).https://doi.org/10

Bartoletti, M., Crafa, S., Lipparini, E.: Formal Verification in Solidity and Move: Insights from a Comparative Analysis. In: FMBC (2025).https://doi.org/10. 4230/OASIcs.FMBC.2025.3

2025

[9] [9]

In: Integrated Formal Methods (2024).https://doi

Bartoletti, M., Ferrando, A., Lipparini, E., Malvone, V.: Solvent: Liquidity veri- fication of smart contracts. In: Integrated Formal Methods (2024).https://doi. org/10.1007/978-3-031-76554-4_14

work page doi:10.1007/978-3-031-76554-4_14 2024

[10] [10]

KindHML: formal verification of smart contracts based on Hennessy-Milner logic

Bartoletti, M., Ferrando, A., Lipparini, E., Malvone, V.: KindHML: formal verifi- cation of smart contracts based on Hennessy-Milner logic. CoRR (2026).https: //doi.org/10.48550/ARXIV.2604.14038,https://arxiv.org/abs/2604.14038

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2604.14038 2026

[11] [11]

In: FMBC (2024).https://doi.org/ 10.4230/OASICS.FMBC.2024.6

Bartoletti, M., Fioravanti, F., Matricardi, G., Pettinau, R., Sainas, F.: Towards benchmarking of Solidity verification tools. In: FMBC (2024).https://doi.org/ 10.4230/OASICS.FMBC.2024.6

work page doi:10.4230/oasics.fmbc.2024.6 2024

[12] [12]

In: Financial Cryptography (2026)

Bartoletti, M., Lipparini, E., Pompianu, L.: LLMs as verification oracles for Solid- ity. In: Financial Cryptography (2026)

2026

[13] [13]

Bernardi, T., Dor, N., Fedotov, A., Grossman, S., Immerman, N., Jackson, D., Nutz, A., Oppenheim, L., Pistiner, O., Rinetzky, N., Sagiv, M., Taube, M., Toman, J.A., Wilcox, J.R.: WIP: Finding bugs automatically in smart contracts with parameterized invariants.https://groups.csail.mit.edu/sdg/pubs/2020/ sbc2020.pdf(2020)

2020

[14] [14]

In: Conference on Artificial Intelligence and Theorem Proving (2019)

Blanchette, J.C., Ouraoui, D.E., Fontaine, P., Kaliszyk, C.: Machine Learning for Instance Selection in SMT Solving. In: Conference on Artificial Intelligence and Theorem Proving (2019)

2019

[15] [15]

Certora verification reports.https://docs.certora.com/en/latest/docs/ prover/dashboard/report.html(2025)

2025

[16] [16]

Certora: Hooks.https://docs.certora.com/en/latest/docs/cvl/hooks.html (2025)

2025

[17] [17]

In: ICSE (2024).https://doi.org/10

Chaliasos, S., Charalambous, M.A., Zhou, L., Galanopoulou, R., Gervais, A., Mitropoulos, D., Livshits, B.: Smart contract and DeFi security: Insights from tool evaluations and practitioner surveys. In: ICSE (2024).https://doi.org/10. 1145/3597503.3623302

work page arXiv 2024

[18] [18]

ACM Comput

Chen, T.Y., Kuo, F., Liu, H., Poon, P., Towey, D., Tse, T.H., Zhou, Z.Q.: Meta- morphic testing: A review of challenges and opportunities. ACM Comput. Surv. 51(1) (2018).https://doi.org/10.1145/3143561

work page doi:10.1145/3143561 2018

[19] [19]

Cramer, M., McIntyre, L.: Verifying LLM-generated code in the context of software verification with Ada/SPARK (2025),https://arxiv.org/abs/2502.07728

work page arXiv 2025

[20] [20]

In: Financial Cryptography (2020).https://doi.org/10

Eskandari, S., Moosavi, S., Clark, J.: SoK: Transparent Dishonesty: Front-Running Attacks on Blockchain. In: Financial Cryptography (2020).https://doi.org/10. 1007/978-3-030-43725-1_13 Certified violation witnesses for smart contracts verification via LLMs 19

2020

[21] [21]

Foundry: A blazing fast, portable and modular toolkit for Ethereum application development written in Rust.https://github.com/foundry-rs/foundry(2025)

2025

[22] [22]

Foundry: Forge.https://www.getfoundry.sh/forge(2026)

2026

[23] [23]

Foundry: snapshotState cheatcodes.https://getfoundry.sh/reference/ cheatcodes/state-snapshots(2026)

2026

[24] [24]

Foundry: startStateDiffRecording cheatcodes.https://www.getfoundry.sh/ reference/cheatcodes/start-state-diff-recording(2026)

2026

[25] [25]

In: Financial Cryptography (2026)

Gervais, A., Zhou, L.: AI agent smart contract exploit generation. In: Financial Cryptography (2026)

2026

[26] [26]

In: FMCAD (2018).https: //doi.org/10.23919/FMCAD.2018.8603013

Hojjat, H., R¨ ummer, P.: The ELDARICA Horn solver. In: FMCAD (2018).https: //doi.org/10.23919/FMCAD.2018.8603013

work page doi:10.23919/fmcad.2018.8603013 2018

[27] [27]

certora.com/en/latest/docs/whitepaper/index.html(2022)

Jackson, D., Nandi, C., Sagiv, M.: Certora technology white paper.https://docs. certora.com/en/latest/docs/whitepaper/index.html(2022)

2022

[28] [28]

International Journal of Approximate Reasoning189(2026)

Jakub˚ uv, J., Janota, M., Piepenbrock, J., Urban, J.: Machine learning for quantifier selection in cvc5. International Journal of Approximate Reasoning189(2026). https://doi.org/https://doi.org/10.1016/j.ijar.2025.109602

work page doi:10.1016/j.ijar.2025.109602 2026

[29] [29]

In: FMCAD (2024).https://doi.org/10.34727/2024/ISBN.978-3-85448-065-5_16

Kamath, A., Senthilnathan, A., Chakraborty, S., Deligiannis, P., Lahiri, S., Lal, A., Rastogi, A., Roy, S., Sharma, R.: Leveraging LLMs for program verification. In: FMCAD (2024).https://doi.org/10.34727/2024/ISBN.978-3-85448-065-5_16

work page doi:10.34727/2024/isbn.978-3-85448-065-5_16 2024

[30] [30]

In: NDSS (2025)

Liu, Y., Xue, Y., Wu, D., Sun, Y., Li, Y., Shi, M., Liu, Y.: PropertyGPT: LLM- driven formal verification of smart contracts through retrieval-augmented property generation. In: NDSS (2025)

2025

[31] [31]

In: TACAS (2008)

de Moura, L.M., Bjørner, N.: Z3: An efficient SMT solver. In: TACAS (2008). https://doi.org/10.1007/978-3-540-78800-3_24

work page doi:10.1007/978-3-540-78800-3_24 2008

[32] [32]

OpenAI: Gpt-5 docs (2025),https://platform.openai.com/docs/models/gpt-5 [Accessed: 2025-09-12]

2025

[33] [33]

In: IEEE/ACM ASE (2024)

Pirzada, M.A.A., Reger, G., Bhayat, A., Cordeiro, L.C.: LLM-generated invariants for bounded model checking without loop unrolling. In: IEEE/ACM ASE (2024). https://doi.org/10.1145/3691620.3695512

work page doi:10.1145/3691620.3695512 2024

[34] [34]

In: Financial Cryptography (2021).https://doi

Qin, K., Zhou, L., Livshits, B., Gervais, A.: Attacking the DeFi ecosystem with Flash Loans for fun and profit. In: Financial Cryptography (2021).https://doi. org/10.1007/978-3-662-64322-8_1

work page doi:10.1007/978-3-662-64322-8_1 2021

[35] [35]

In: FMBC (2024).https://doi.org/10.4230/OASICS.FMBC.2024.8

Schiffl, J., Beckert, B.: A practical notion of liveness in smart contract applications. In: FMBC (2024).https://doi.org/10.4230/OASICS.FMBC.2024.8

work page doi:10.4230/oasics.fmbc.2024.8 2024

[36] [36]

In: TACAS (2021).https:// doi.org/10.1007/978-3-030-72013-1_16

Scott, J., Niemetz, A., Preiner, M., Nejati, S., Ganesh, V.: MachSMT: A machine learning-based algorithm selector for SMT solvers. In: TACAS (2021).https:// doi.org/10.1007/978-3-030-72013-1_16

work page doi:10.1007/978-3-030-72013-1_16 2021

[37] [37]

In: IEEE S&P (2024).https://doi.org/ 10.1109/SP54263.2024.00230

Sendner, C., Petzi, L., Stang, J., Dmitrienko, A.: Large-scale study of vulnerability scanners for Ethereum smart contracts. In: IEEE S&P (2024).https://doi.org/ 10.1109/SP54263.2024.00230

work page doi:10.1109/sp54263.2024.00230 2024

[38] [38]

The Solidity Authors: Contracts.https://docs.soliditylang.org/en/latest/ contracts.html#contracts(2025)

2025

[39] [39]

The Solidity Authors: Receive Ether Function.https://docs.soliditylang.org/ en/latest/contracts.html#receive-ether-function(2025)

2025

[40] [40]

In: AST (2025).https://doi.org/10.1109/AST66626.2025

Tihanyi, N., Charalambous, Y., Jain, R., Ferrag, M.A., Cordeiro, L.C.: A new era in software security: Towards self-healing software via large language models and formal verification. In: AST (2025).https://doi.org/10.1109/AST66626.2025. 00020

work page doi:10.1109/ast66626.2025 2025

[41] [41]

In: ASE (2024).https://doi

Wu, G., Cao, W., Yao, Y., Wei, H., Chen, T., Ma, X.: LLM meets bounded model checking: Neuro-symbolic loop invariant inference. In: ASE (2024).https://doi. org/10.1145/3691620.3695014 20 Massimo Bartoletti and Enrico Lipparini

work page doi:10.1145/3691620.3695014 2024

[42] [42]

In: ICLR (2024),https://openreview.net/ forum?id=Q3YaCghZNt

Wu, H., Barrett, C., Narodytska, N.: Lemur: Integrating large language models in automated program verification. In: ICLR (2024),https://openreview.net/ forum?id=Q3YaCghZNt

2024

[43] [43]

Xiao, Z., Li, Y., Wang, Q., Chen, S.: Prompt to pwn: Automated exploit generation for smart contracts (2025),https://arxiv.org/abs/2508.01371

work page internal anchor Pith review Pith/arXiv arXiv 2025

[44] [44]

In: IJCAI (2025).https://doi

Zhou, B., Wang, H., Yao, Y., Chen, T., Xu, F., Ma, X.: Simulate, refine and inte- grate: Strategy synthesis for efficient SMT solving. In: IJCAI (2025).https://doi. org/10.24963/ijcai.2025/887,https://doi.org/10.24963/ijcai.2025/887

work page doi:10.24963/ijcai.2025/887 2025

[45] [45]

In: IEEE S&P (2023).https://doi.org/10.1109/SP46215.2023.10179435

Zhou, L., Xiong, X., Ernstberger, J., Chaliasos, S., Wang, Z., Wang, Y., Qin, K., Wattenhofer, R., Song, D., Gervais, A.: Sok: Decentralized finance (DeFi) attacks. In: IEEE S&P (2023).https://doi.org/10.1109/SP46215.2023.10179435

work page doi:10.1109/sp46215.2023.10179435 2023