pith. sign in

arxiv: 2606.02651 · v1 · pith:RRCJMFCEnew · submitted 2026-05-31 · 💻 cs.PL · cs.LO· cs.SE

From Rocq to Metal: A Pipeline for Formally Verified Microcontroller Firmware

Pith reviewed 2026-06-28 15:37 UTC · model grok-4.3

classification 💻 cs.PL cs.LOcs.SE
keywords formal verificationRocqmicrocontroller firmwareCPS virtual machineScheme extractionembedded systemsstate-transition functionsbare-metal execution
0
0 comments X

The pith

Rocq proofs extract to Scheme that runs on a bare-metal CPS virtual machine for microcontroller firmware.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a pipeline that extracts code verified in Rocq to Scheme and executes it on microcontrollers using a custom virtual machine. By writing firmware logic as a pure state-transition function, the core becomes provable while the surrounding host layer stays fixed and unverified. This structure supports LLM-assisted tactic synthesis to generate proofs instead of manual reviews. The result targets the gap between high-level formal methods and the limited resources of embedded devices. A reader would care because it shows verified firmware is possible on small targets without growing the untrusted base.

Core claim

We built Encore!, a bare-metal Continuation Passing Style virtual machine that runs Rocq-extracted Scheme on microcontrollers. We also show how to structure firmware as a pure state-transition function, making its core fully provable in Rocq while keeping the unverified host layer constant regardless of firmware complexity. Large Language Model assisted tactic synthesis fits naturally into this workflow, allowing formal theorem statements to replace manual code review for AI-generated firmware.

What carries the argument

Encore!, the bare-metal CPS virtual machine that executes the operational semantics of Rocq-extracted Scheme code directly on microcontrollers.

If this is right

  • The firmware core logic becomes fully provable in Rocq as a state-transition function.
  • The unverified host layer remains constant even as firmware complexity increases.
  • LLM-assisted tactic synthesis integrates directly by proving theorem statements about the extracted code.
  • High-level specification languages become usable for safety-critical embedded targets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar extraction pipelines could target other languages if equivalent bare-metal VMs are developed for them.
  • State-transition structuring may reduce the effort needed to maintain invariants across firmware updates.
  • The approach separates concerns so that hardware-specific parts can be swapped without re-proving the logic.

Load-bearing premise

The bare-metal CPS VM correctly and faithfully implements the operational semantics of the Rocq-extracted Scheme code without introducing extra behavior.

What would settle it

A test program whose Rocq proof guarantees one state transition but whose execution on the VM produces a different state or output.

Figures

Figures reproduced from arXiv: 2606.02651 by Karolina Gorna, Valentin Bergeron.

Figure 1
Figure 1. Figure 1: Formal SDK generic app loop over two consecutive steps. [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Encore! VM architecture The register file is divided between special register (Current function SELF 0x00, Current Continuation CONT 0x01 and Nullary sentinel NULL 0xff), Argument registers (0x02 -> 0x0b) and Regular registers (0x0c -> 0xfe). The heap is a flat bump-allocated arena shared by closures, constructors, and byte strings, with all objects prefixed by a GC header word. The garbage collector is ma… view at source ↗
read the original abstract

Enforcing invariants in safety-critical systems is increasingly urgent as AI-generated code becomes widespread. Unfortunately, the runtimes required to support high-level specification languages are too large for most embedded targets. In this article, we show how formally verified firmware is achievable today. We built Encore!, a bare-metal Continuation Passing Style (CPS) virtual machine (VM) that runs Rocq-extracted Scheme on microcontrollers. We also show how to structure firmware as a pure state-transition function, making its core fully provable in Rocq while keeping the unverified host layer constant regardless of firmware complexity. Large Language Model (LLM)-assisted tactic synthesis fits naturally into this workflow: formal theorem statements replace manual code review, allowing AI-generated firmware to prove itself.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents Encore!, a bare-metal Continuation Passing Style (CPS) virtual machine that executes Rocq-extracted Scheme on microcontrollers. It describes structuring firmware as a pure state-transition function so that its core becomes fully provable in Rocq while the unverified host layer remains fixed regardless of firmware complexity. The work also integrates LLM-assisted tactic synthesis, replacing manual code review with formal theorem statements that allow AI-generated firmware to prove itself.

Significance. If the VM-semantics correspondence holds and the state-transition structuring is realized without escape behaviors, the pipeline would enable verified high-level code on severely resource-constrained targets, a longstanding barrier in embedded formal methods. The constant-host-layer property and the explicit use of extraction plus CPS are concrete engineering contributions that could be reused beyond the specific microcontroller setting.

major comments (2)
  1. [Abstract, §3] Abstract and §3 (Encore! VM description): the central claim that the bare-metal CPS VM faithfully realizes the operational semantics of Rocq-extracted Scheme without introducing escape behaviors is stated but unsupported by any derivation, proof sketch, or correspondence argument. This correspondence is load-bearing for the assertion that the firmware core is 'fully provable in Rocq'.
  2. [§4] §4 (state-transition structuring): the claim that the unverified host layer remains constant 'regardless of firmware complexity' is asserted without a concrete interface definition or proof that all firmware-specific behavior is confined to the pure transition function. Without this interface, the separation cannot be verified.
minor comments (2)
  1. [Abstract] The abstract refers to 'Rocq' without an initial citation or expansion; add a reference to the Rocq prover in the first paragraph.
  2. No performance or resource-usage numbers (code size, RAM, cycles) are supplied for the Encore! VM on any target microcontroller; such data would strengthen the practicality claim.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address the two major comments point by point below.

read point-by-point responses
  1. Referee: [Abstract, §3] Abstract and §3 (Encore! VM description): the central claim that the bare-metal CPS VM faithfully realizes the operational semantics of Rocq-extracted Scheme without introducing escape behaviors is stated but unsupported by any derivation, proof sketch, or correspondence argument. This correspondence is load-bearing for the assertion that the firmware core is 'fully provable in Rocq'.

    Authors: The Encore! VM is constructed as a direct implementation of the CPS semantics obtained from Rocq extraction to Scheme followed by compilation to bare-metal C. The manuscript presents this correspondence as holding by construction of the extraction pipeline and the CPS VM design. We agree that an explicit argument would make the claim more robust. We will add a high-level correspondence sketch to the revised §3, outlining the mapping from Rocq operational steps to VM transitions and arguing the absence of escape behaviors. revision: yes

  2. Referee: [§4] §4 (state-transition structuring): the claim that the unverified host layer remains constant 'regardless of firmware complexity' is asserted without a concrete interface definition or proof that all firmware-specific behavior is confined to the pure transition function. Without this interface, the separation cannot be verified.

    Authors: The manuscript describes the host layer as providing only a fixed set of primitives (VM dispatch, memory allocation, and device I/O) while all firmware logic resides in the pure state-transition function. We acknowledge that an explicit interface definition is needed to make the separation verifiable. We will revise §4 to include a concrete host-VM interface specification and a short argument showing that firmware-specific behavior cannot escape the transition function. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper describes a construction (Encore! CPS VM and pure state-transition firmware structuring) rather than a derivation chain with equations, fitted parameters, or predictions. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the abstract or strongest_claim. The central claim is a pipeline for verified firmware; the VM-semantics correspondence is flagged as an external verification task, not an internal reduction to inputs by construction. This matches the default expectation of a non-circular construction paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 1 invented entities

Abstract-only review yields no explicit free parameters, axioms, or invented entities beyond the VM itself. The VM is presented as a new artifact whose correctness is asserted rather than derived from prior literature.

invented entities (1)
  • Encore! bare-metal CPS VM no independent evidence
    purpose: Execute Rocq-extracted Scheme on microcontrollers while preserving formal guarantees
    Introduced in the abstract as the central runtime artifact; no independent evidence of its correctness is supplied.

pith-pipeline@v0.9.1-grok · 5655 in / 1261 out tokens · 22301 ms · 2026-06-28T15:37:39.514806+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

22 extracted references · 7 canonical work pages

  1. [1]

    Appel, Greg Morrisett, Zoe Paraskevopoulou, Randy Pollack, Olivier Savary Belanger, Matthieu Sozeau, and Matthew Weaver

    Abhishek Anand, Andrew W. Appel, Greg Morrisett, Zoe Paraskevopoulou, Randy Pollack, Olivier Savary Belanger, Matthieu Sozeau, and Matthew Weaver. 2017. CertiCoq: A Verified Compiler for Coq. In Proceedings of the Workshop on Coq for Programming Languages (CoqPL), 2017. Retrieved from https://www.cs.princeton.edu/~appel/papers/certicoq-coqpl.pdf

  2. [2]

    Andrew W. Appel. 1992. Compiling with Continuations. Cambridge University Press

  3. [3]

    Valentin Bergeron. Encore!. Retrieved from https://github.com/vbergeron/encore

  4. [4]

    Guillaume Claret and contributors. 2024. rocq-of-rust: Formal Verification of Rust Programs in Rocq. Retrieved from https://github.com/formal-land/rocq-of-rust

  5. [5]

    Mark Elvers. 2025. Multi Domain OCaml on Raspberry Pi Pico 2 Microcontroller. Retrieved from https://www.tunbury.org/2025/12/31/ocaml-pico/

  6. [6]

    Joe Hendrix. 2021. System Programming Using Lean 4 (Community Discussion). Re - trieved from https://leanprover-community.github.io/archive/stream/270676-lean4/topic/System. 20programming.20using.20Lean.204.html

  7. [7]

    Andrew Kennedy. 2007. Compiling with Continuations, Continued. In Proceedings of the 12th ACM SIGPLAN International Conference on Functional Programming (ICFP 2007), 2007. ACM, 177–

  8. [8]

    https://doi.org/10.1145/1291151.1291179

  9. [9]

    Joomy Korkut, Kathrin Stark, and Andrew W. Appel. 2025. A Verified Foreign Function Interface between Coq and C. Proc. ACM Program. Lang. 9, POPL (2025). https://doi.org/10.1145/3704854

  10. [10]

    György Kurucz. 2024. Porting Lean to the ESP32-C3 RISC-V Microcontroller. Retrieved from https://kuruczgy.com/blog/2024/07/31/lean-esp32/

  11. [11]

    Lean FRO. 2025. The Lean FRO Year 3 Roadmap. Retrieved from https://lean-lang.org/fro/ roadmap/y3

  12. [12]

    Xavier Leroy. 2009. Formal Verification of a Realistic Compiler. Communications of the ACM 52, 7 (2009), 107–115. https://doi.org/10.1145/1538788.1538814

  13. [13]

    Pierre Letouzey. 2003. A New Extraction for Coq. In Types for Proofs and Programs (TYPES

  14. [14]

    & Rogaway, P

    (Lecture Notes in Computer Science) , 2003. Springer, 200–219. https://doi.org/ 10.1007/3- 540-39185-1_12

  15. [15]

    Pierre Letouzey. 2008. Extraction in Coq: An Overview. In Logic and Theory of Algorithms (CiE 2008), 2008. Springer

  16. [16]

    Leonardo de Moura and Sebastian Ullrich. 2021. The Lean 4 Theorem Prover and Programming Language. In Automated Deduction – CADE 28 (Lecture Notes in Computer Science) , 2021. Springer, 625–635. https://doi.org/10.1007/978-3-030-79876-5_37 12

  17. [17]

    Georg Neis, Chung-Kil Hur, Jan-Oliver Kaiser, Craig McLaughlin, Derek Dreyer, and Viktor Vafeiadis. 2015. Pilsner: A Compositionally Verified Compiler for a Higher-Order Imperative Language. In Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP 2015), 2015. ACM, 166–178. https://doi.org/10.1145/2784731.2784764

  18. [18]

    Chibi-Scheme

    Alex Shinn. Chibi-Scheme. Retrieved from https://github.com/ashinn/chibi-scheme

  19. [19]

    Vincent St-Amour and Marc Feeley. 2009. PICOBIT: A Compact Scheme System for Microcon - trollers. In Implementation and Application of Functional Languages (IFL 2009), 2009. Springer

  20. [20]

    Simon Varoumas, Benoît Vaugon, and Emmanuel Chailloux. 2019. WCET of OCaml Bytecode on Microcontrollers. In 19th International Workshop on Worst-Case Execution Time Analysis (WCET 2019), 2019. Schloss Dagstuhl – Leibniz-Zentrum für Informatik. Retrieved from https://drops. dagstuhl.de/opus/volltexte/2019/10770/

  21. [21]

    OMicroB: An OCaml Generic Virtual Machine for Microcontrollers

    Simon Varoumas, Benoît Vaugon, and Emmanuel Chailloux. OMicroB: An OCaml Generic Virtual Machine for Microcontrollers. Retrieved from https://github.com/stevenvar/OMicroB

  22. [22]

    Samuel Yvon and Marc Feeley. 2021. A Small Scheme VM, Compiler, and REPL in 4K. In Proceed- ings of the 13th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages (VMIL 2021), 2021. ACM, 14–24. https://doi.org/10.1145/3486606.3486783 13