Trustworthy Runtime Verification via Bisimulation (Extended Experience Report)

Alwyn E. Goodloe; Ivan Perez; Mike Dodds; Robert Dockins; Ryan G. Scott

arxiv: 2607.01363 · v1 · pith:GHPIQTJHnew · submitted 2026-07-01 · 💻 cs.PL

Trustworthy Runtime Verification via Bisimulation (Extended Experience Report)

Ryan G. Scott , Ivan Perez , Alwyn E. Goodloe , Mike Dodds , Robert Dockins This is my paper

Pith reviewed 2026-07-03 01:06 UTC · model grok-4.3

classification 💻 cs.PL

keywords runtime verificationbisimulationcompiler correctnesssymbolic executionLLVMformal assurancesafety-critical systemsCopilot

0 comments

The pith

CopilotVerifier generates bisimulation proofs that equate a Copilot monitor with its compiled form on outputs and crashes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces CopilotVerifier as a companion to the Copilot compiler that automatically produces a formal proof of correctness for each generated C monitor. The proof takes the form of a bisimulation showing that the high-level specification and the low-level code agree on every output for every input and that they crash under exactly the same conditions. This matters because safety-critical systems require convincing assurance arguments for every deployed piece of code, yet auditors cannot easily inspect compiled artifacts. The approach relies on symbolic execution of LLVM to discharge verification conditions via an SMT solver. If the method works, it raises the level of compiler assurance that can be achieved at moderate additional cost while building directly on existing verification libraries.

Core claim

CopilotVerifier runs alongside the Copilot compiler and emits a bisimulation proof establishing that a given monitor and its compiled form produce equivalent outputs on equivalent inputs and either crash in identical circumstances or cannot crash. The bisimulation is expressed as a collection of verification conditions that are checked by Crucible's symbolic executor for LLVM together with the What4 solver interface.

What carries the argument

Bisimulation relation between Copilot semantics and compiled LLVM code, decomposed into verification conditions solved by symbolic execution.

If this is right

Every compiled monitor is accompanied by a machine-checked argument of behavioral equivalence.
Crash freedom or crash equivalence is established automatically for each monitor.
Assurance cost remains moderate because the proof reuses existing symbolic execution and SMT infrastructure.
The generated proofs supply the formal content needed for human-auditable assurance arguments.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same bisimulation technique could be applied to other high-level specification languages that compile to C for embedded use.
If the approach scales to larger monitors, it reduces the portion of the runtime verification toolchain that must be trusted without formal evidence.
Auditors could eventually treat the generated bisimulation proof as the primary evidence rather than reviewing source or binary directly.

Load-bearing premise

The bisimulation relation and Crucible's symbolic execution model correctly capture the operational semantics of both the high-level Copilot language and the generated code.

What would settle it

A concrete input sequence on which the high-level monitor and the compiled version produce different outputs or differ in whether they crash.

Figures

Figures reproduced from arXiv: 2607.01363 by Alwyn E. Goodloe, Ivan Perez, Mike Dodds, Robert Dockins, Ryan G. Scott.

**Figure 1.** Figure 1: shows a complete Copilot program implementing a simple thermostat. The program monitors an external stream (temp) representing data from a temperature probe, firing a trigger whenever the temperature (avgTemp) is sufficiently above or below a fixed setpoint, each captured by a boolean stream in the spec. To guard against noisy data, the probe input data is smoothed by computing a sliding window average of … view at source ↗

**Figure 2.** Figure 2: The reified version of spec in [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗

**Figure 3.** Figure 3: C code generated for Figure [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: The architecture of CopilotVerifier. input and compiled program are represented by a collection of formulas in What4, an SMT interface library [43]. The relationship between the two semantics are then established by the bisimulation prover (3). CopilotVerifier reports back to the user whether it solved all of the generated proof goals successfully. Specifically, the verifier either produces a successful re… view at source ↗

**Figure 5.** Figure 5: Two labelled transition systems (LTSes). The upper LTS represents the stream program in [PITH_FULL_IMAGE:figures/full_fig_p010_5.png] view at source ↗

**Figure 6.** Figure 6: A sketch of a GSN diagram presenting a CopilotVerifier assurance case for the thermostat programs from Figures 1 and 3. For brevity’s sake, only a subset of goals are shown. each portion of a proof can be broken down into lower- and lower-level requirements until eventually reaching SMT, with a chain of evidence linking each intermediate step. Using the evidence that CopilotVerifier produces, one can const… view at source ↗

read the original abstract

When runtime verification is used to monitor safety-critical systems, it is essential that monitoring code behaves correctly. The Copilot runtime verification framework pursues this goal by automatically generating C monitor programs from a high-level DSL embedded in Haskell. In safety-critical domains, every piece of deployed code must be accompanied by an assurance argument that is convincing to human auditors. However, it is difficult for auditors to determine with confidence that a compiled monitor cannot crash and implements the behavior required by the Copilot semantics. In this paper we describe CopilotVerifier, which runs alongside the Copilot compiler, generating a proof of correctness for the compiled output. The proof establishes that a given Copilot monitor and its compiled form produce equivalent outputs on equivalent inputs, and that they either crash in identical circumstances or cannot crash. The proof takes the form of a bisimulation broken down into a set of verification conditions. We leverage two pieces of SMT-backed technology: the Crucible symbolic execution library for LLVM and the What4 solver interface library. Our results demonstrate that dramatically increased compiler assurance can be achieved at moderate cost by building on existing tools. This paves the way to our ultimate goal of generating formal assurance arguments that are convincing to human auditors.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This experience report wires Crucible and What4 into the Copilot pipeline to emit bisimulation VCs for monitor equivalence and crash behavior, but the soundness rests on unverified models of Copilot and LLVM semantics.

read the letter

The paper's core contribution is an integration experience: CopilotVerifier runs with the existing Copilot compiler and uses Crucible's symbolic execution on the generated LLVM plus What4 to produce verification conditions from a bisimulation relation. The relation is meant to show that the high-level monitor and its compiled form agree on outputs for matching inputs and either both crash or neither does.

They do this with off-the-shelf tools rather than building new solvers, and they report that the cost stays moderate for the monitors they tested. That is the practical takeaway for anyone who needs assurance arguments for generated runtime monitors in safety-critical code.

The main limitation is that the approach inherits whatever soundness gaps exist in Crucible's LLVM model and in how the bisimulation is encoded from Copilot semantics. The abstract gives no separate argument or cross-check that those models match the actual operational behaviors, including undefined behavior and memory layout. As an experience report this is expected, but it means the generated proofs are only as reliable as those external components.

The work is aimed at people building or auditing runtime verification tools for regulated systems who already know Crucible and What4. It is not a new theoretical result on bisimulation or symbolic execution. A serious referee should see it because it supplies concrete data on wiring these tools into a real compiler pipeline, even if the soundness story needs expansion in revision.

Referee Report

2 major / 2 minor

Summary. The paper presents CopilotVerifier, which augments the Copilot compiler to emit a bisimulation proof (as a set of VCs) establishing that a Copilot monitor and its compiled LLVM/C form produce equivalent outputs on equivalent inputs and exhibit identical crash behavior (or are crash-free). The proof is generated by encoding the high-level Copilot semantics into a bisimulation relation and using Crucible's symbolic execution of the LLVM binary together with the What4 SMT interface.

Significance. If the central claim holds, the work demonstrates a practical route to stronger machine-checked assurance for runtime monitors in safety-critical systems by composing existing, independently developed symbolic-execution and SMT tools rather than building a custom verifier from scratch. The experience-report character and reported moderate cost are useful data points for the community.

major comments (2)

[§3 (Bisimulation construction and VC generation)] The soundness of the generated bisimulation rests on the unstated assumption that Crucible's LLVM model and the encoding of Copilot semantics into the bisimulation relation faithfully capture the respective operational semantics (including memory model, undefined behavior, and calling conventions). No separate soundness argument, cross-validation against a reference semantics, or validation suite is supplied; this is load-bearing for both the output-equivalence and crash-equivalence claims.
[§3.2 (Crash equivalence)] The crash-equivalence claim requires that the symbolic execution in Crucible correctly identifies all possible crash sites in the compiled code. Because Crucible is treated as a black box, any mismatch between its model and real LLVM behavior would invalidate the claim that the monitor and compiled form 'crash in identical circumstances or cannot crash.'

minor comments (2)

[Abstract] The abstract states the ultimate goal is 'formal assurance arguments that are convincing to human auditors,' yet the paper does not discuss how the generated VCs or bisimulation relation would be presented to auditors.
[§3] Notation for the bisimulation relation and the form of the generated VCs could be made more explicit (e.g., by showing a small example VC in the main text rather than only in an appendix).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful review and the focus on soundness assumptions. We respond to each major comment below, indicating where the manuscript will be revised to improve clarity.

read point-by-point responses

Referee: [§3 (Bisimulation construction and VC generation)] The soundness of the generated bisimulation rests on the unstated assumption that Crucible's LLVM model and the encoding of Copilot semantics into the bisimulation relation faithfully capture the respective operational semantics (including memory model, undefined behavior, and calling conventions). No separate soundness argument, cross-validation against a reference semantics, or validation suite is supplied; this is load-bearing for both the output-equivalence and crash-equivalence claims.

Authors: We agree that the equivalence claims depend on the fidelity of Crucible's LLVM model and our bisimulation encoding. As an experience report on tool composition rather than a foundational verification effort, the manuscript does not supply an independent soundness argument. In the revision we will add an explicit subsection in §3 stating these assumptions, their scope, and references to Crucible's existing validation and usage in other projects. This will make the dependencies transparent without changing the core contribution. revision: yes
Referee: [§3.2 (Crash equivalence)] The crash-equivalence claim requires that the symbolic execution in Crucible correctly identifies all possible crash sites in the compiled code. Because Crucible is treated as a black box, any mismatch between its model and real LLVM behavior would invalidate the claim that the monitor and compiled form 'crash in identical circumstances or cannot crash.'

Authors: This observation is correct and closely related to the prior comment. The revised manuscript will expand §3.2 to state explicitly that crash equivalence is conditional on Crucible correctly modeling all LLVM crash behaviors. We will note the black-box reliance as a standard limitation when composing existing symbolic-execution tools and will not add a new validation suite, as that would exceed the scope of this experience report. revision: yes

Circularity Check

0 steps flagged

No significant circularity; proof relies on external Crucible/What4 tools

full rationale

The paper generates a bisimulation proof using external SMT-backed libraries Crucible (for LLVM symbolic execution) and What4. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the derivation. The central claim (equivalence and crash equivalence via VCs) is not reduced to its own inputs by construction; it depends on the (unverified here) soundness of the external tools matching Copilot/LLVM semantics, which is an external assumption rather than internal circularity. This is the normal case of a self-contained argument against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only abstract available; no explicit free parameters, axioms, or invented entities are stated.

pith-pipeline@v0.9.1-grok · 5752 in / 1022 out tokens · 20613 ms · 2026-07-03T01:06:55.136399+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

80 extracted references · 40 canonical work pages

[1]

The Core Flight System (cFS)

2026. The Core Flight System (cFS). https://github.com/nasa/cfs

2026
[2]

Lennart Augustsson and Kent Petersson. 1994. Silly Type Families* DRAFT. (1994)

1994
[3]

Haniel Barbosa, Clark W. Barrett, Martin Brain, Gereon Kremer, Hanna Lachnitt, Makai Mann, Abdalrhman Mohamed, Mudathir Mohamed, Aina Niemetz, Andres N ¨otzli, Alex Ozdemir, Mathias Preiner, Andrew Reynolds, Ying Sheng, Cesare Tinelli, and Yoni Zohar. 2022. cvc5: A Versatile and Industrial-Strength SMT Solver. InTools and Algorithms for the Construction a...

work page doi:10.1007/978-3-030-99524-9 2022
[5]

Twan Basten. 1996. Branching bisimilarity is an equivalence indeed!Inform. Process. Lett.58, 3 (1996), 141–147

1996
[6]

Sylvie Boldo and Guillaume Melquiond. 2011. Flocq: A unified library for proving floating-point algorithms in Coq. In2011 IEEE 20th Symposium on Computer Arithmetic. IEEE, 243–252. https://doi.org/10.1109/ ARITH.2011.40

2011
[7]

Brett Boston, Samuel Breese, Joey Dodds, Mike Dodds, Brian Huffman, Adam Petcher, and Andrei Stefanescu
[8]

InCAV 2021

Verified cryptographic code for everybody. InCAV 2021. https://doi.org/10.1007/978-3-030-81685-8 31

work page doi:10.1007/978-3-030-81685-8 2021
[9]

Timothy Bourke, L´elio Brun, Pierre- ´Evariste Dagand, Xavier Leroy, Marc Pouzet, and Lionel Rieg. 2017. A Formally Verified Compiler for Lustre. InProceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation(Barcelona, Spain)(PLDI 2017). Association for Computing Machinery, 586–601. https://doi.org/10.1145/3140587.3062358

work page doi:10.1145/3140587.3062358 2017
[10]

Timothy Bourke, L´elio Brun, and Marc Pouzet. 2019. Mechanized Semantics and Verified Compilation for a Dataflow Synchronous Language with Reset.Proc. ACM Program. Lang.4, POPL, Article 44 (dec 2019), 29 pages. https://doi.org/10.1145/3371112

work page doi:10.1145/3371112 2019
[11]

Timothy Bourke, Paul Jeanmaire, Basile Pesin, and Marc Pouzet. 2021. Verified Lustre Normalization with Node Subsampling.ACM Trans. Embed. Comput. Syst.20, 5s, Article 98 (sep 2021), 25 pages. https://doi.org/10.1145/3477041

work page doi:10.1145/3477041 2021
[12]

California Institute of Technology. 2026. F Prime. https://github.com/nasa/fprime

2026
[13]

Caspi, D

P. Caspi, D. Pialiud, N. Halbwachs, and J. Plaice. 1987. LUSTRE: a Declarative Language for Programming Synchronous Systems. In14th Symposium on Principles of Programming Languages. 178–188. https: //doi.org/10.1145/41625.41641

work page doi:10.1145/41625.41641 1987
[14]

Manuel M. T. Chakravarty, Gabriele Keller, and Simon Peyton Jones. 2005. Associated type synonyms. In Proceedings of the Tenth ACM SIGPLAN International Conference on Functional Programming(Tallinn, Estonia)(ICFP ’05). Association for Computing Machinery, New York, NY, USA, 241–253. https: //doi.org/10.1145/1086365.1086397

work page doi:10.1145/1086365.1086397 2005
[15]

Diatchki, Robert Dockins, Joe Hendrix, and Tristan Ravitch

David Thrane Christiansen, Iavor S. Diatchki, Robert Dockins, Joe Hendrix, and Tristan Ravitch. 2019. Dependently Typed Haskell in Industry (Experience Report).Proc. ACM Program. Lang.3, ICFP, Article 100 (July 2019), 16 pages. https://doi.org/10.1145/3341704

work page doi:10.1145/3341704 2019
[16]

Andrey Chudnov, Nathan Collins, Byron Cook, Joey Dodds, Brian Huffman, Colm MacC´arthaigh, Stephen Magill, Eric Mertens, Eric Mullen, Serdar Tasiran, Aaron Tomb, and Eddy Westbrook. 2018. Continuous Formal Verification of Amazon s2n. InComputer Aided Verification, Hana Chockler and Georg Weissenbacher (Eds.). Springer International Publishing, Cham, 430–4...

work page doi:10.1007/978-3-319-96142-2 2018
[17]

Koen Claessen and John Hughes. 2000. QuickCheck: a lightweight tool for random testing of Haskell programs. InProceedings of the fifth ACM SIGPLAN international conference on Functional programming. 268–279. https://doi.org/10.1145/351240.351266

work page doi:10.1145/351240.351266 2000
[18]

Edmund Clarke, Daniel Kroening, and Flavio Lerda. 2004. A tool for checking ANSI-C programs. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 168–176. https://doi.org/10.1007/978-3-540-24730-2 15 Trustworthy Runtime Verification via Bisimulation (Extended Experience Report) 29

work page doi:10.1007/978-3-540-24730-2 2004
[19]

Darren Cofer. 2021. Unintended Behavior in Learning-Enabled Systems: Detecting the Unknown Unknowns. In 2021 IEEE/AIAA 40th Digital Avionics Systems Conference (DASC). 1–7. https://doi.org/10.1109/DASC52595. 2021.9594406

work page doi:10.1109/dasc52595 2021
[20]

Darren Cofer, Isaac Amundson, Ramachandra Sattigeri, Arjun Passiand Christopher Boggs, Eric Smith, Limei Gilham, Taejoon Byun, and Sanjai Rayadurgam. 2020. Run-Time Assurance for Learning-Enabled Systems. In NASA Formal Methods: 12th International Symposium, NFM 2020, Moffett Field, CA, USA, May 11–15, 2020, Proceedings. Springer-Verlag, Berlin, Heidelber...

work page doi:10.1007/978-3-030-55754-6 2020
[21]

Darren Cofer and Steven Miller. 2014. DO-333 Certification Case Studies. InNASA Formal Methods, Julia M. Badger and Kristin Yvonne Rozier (Eds.). Springer International Publishing, Cham, 1–15. https: //doi.org/10.1007/978-3-319-06200-6 1

work page doi:10.1007/978-3-319-06200-6 2014
[22]

2014.Autonomy Research for Civil Aviation: Toward a New Era of Flight

National Research Council. 2014.Autonomy Research for Civil Aviation: Toward a New Era of Flight. The National Academies Press

2014
[23]

Pascal Cuoq, Florent Kirchner, Nikolai Kosmatov, Virgile Prevosto, Julien Signoles, and Boris Yakobowski
[24]

InInternational conference on software engineering and formal methods

Frama-c. InInternational conference on software engineering and formal methods. Springer, 233–247. https://doi.org/10.1007/978-3-642-33826-7 16

work page doi:10.1007/978-3-642-33826-7
[25]

D’ Angelo, S

B. D’ Angelo, S. Sankaranarayanan, C. S´anchez, W. Robinson, Zohar Manna, B. Finkbeiner, H. Spima, and S. Mehrotra. 2005. LOLA: Runtime Monitoring of Synchronous Systems. In12th International Symposium on Temporal Representation and Reasoning. IEEE, 166–174. https://doi.org/10.1109/TIME.2005.26

work page doi:10.1109/time.2005.26 2005
[26]

Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. InProceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems(Budapest, Hungary)(TACAS’08/ETAPS’08). Springer-Verlag, Berlin, Heidelberg, 337–340

2008
[27]

Robert Dockins, Adam Foltzer, Joe Hendrix, Brian Huffman, Dylan McNamee, and Aaron Tomb. 2016. Constructing Semantic Models of Programs with the Software Analysis Workbench. InVerified Software. Theories, Tools, and Experiments, Sandrine Blazy and Marsha Chechik (Eds.). Springer International Publishing, Cham, 56–72. https://doi.org/10.1007/978-3-319-48869-1 5

work page doi:10.1007/978-3-319-48869-1 2016
[28]

Bruno Dutertre. 2014. Yices 2.2. InComputer Aided Verification, Armin Biere and Roderick Bloem (Eds.). Springer International Publishing, Cham, 737–744

2014
[29]

Yli`es Falcone, Klaus Havelund, and Giles Reger. 2013. A Tutorial on Runtime Verification. InEngineering Dependable Software Systems, Manfred Broy, Doron A. Peled, and Georg Kalus (Eds.). NATO Science for Peace and Security Series, D: Information and Communication Security, Vol. 34. IOS Press, 141–175. https://doi.org/10.3233/978-1-61499-207-3-141

work page doi:10.3233/978-1-61499-207-3-141 2013
[30]

Federal Aviation Administration. 2009. Sense and Avoid (SAA) for Unmanned Aircraft Systems (UAS)

2009
[31]

George Fink and Matt Bishop. 1997. Property-Based Testing: A New Approach to Testing for Assurance. SIGSOFT Softw. Eng. Notes22, 4 (jul 1997), 74–80. https://doi.org/10.1145/263244.263267

work page doi:10.1145/263244.263267 1997
[32]

GitHub. 2025. Copilot Verifier. https://github.com/Copilot-Language/copilot/tree/master/copilot-verifier

2025
[33]

GitHub, Issue on Copilot. 2021a. Generated C code accesses stream buffer out of bounds. https: //github.com/Copilot-Language/copilot/issues/238
[34]

GitHub, Issue on Copilot. 2021b. C translation doesn’t correctly select operations based on types. https: //github.com/Copilot-Language/copilot/issues/263
[35]

GitHub, Issue on Copilot. 2021c. Basic example involving structs generates invalid C code. https: //github.com/Copilot-Language/copilot/issues/275
[36]

GitHub, Issue on Copilot. 2021d. Delaying streams of arrays or structs produces C code that fails to compile with Clang. https://github.com/Copilot-Language/copilot/issues/276
[37]

GitHub, Issue on Copilot. 2021e. Copilot-c99: C99 and interpretation of signum behave inconsistently for 0 and NaN. https://github.com/Copilot-Language/copilot/issues/278
[38]

GitHub, Issue on Copilot. 2022a. Copilot-c99: appending values to a stream of arrays produces incorrect C99 code. https://github.com/Copilot-Language/copilot/issues/314
[39]

GitHub, Issue on Copilot. 2022b. Copilot-c99: array has incomplete element type error when declaring stream with array of structs. https://github.com/Copilot-Language/copilot/issues/373
[40]

GitHub, Issue on Copilot. 2022c. Copilot-c99: Streams of arrays result in C code with undefined behavior. https://github.com/Copilot-Language/copilot/issues/386
[41]

Alwyn Goodloe. 2016. Challenges in high-assurance runtime verification. InInternational Symposium on Leveraging Applications of Formal Methods. Springer, 446–460. https://doi.org/10.1007/978-3-319-47166- 2 31 30 Ryan G. Scott, Ivan Perez, Alwyn E. Goodloe, Mike Dodds, and Robert Dockins

work page doi:10.1007/978-3-319-47166- 2016
[42]

2010.Monitoring Distributed Real-Time Systems: A Survey and Future Directions

Alwyn Goodloe and Lee Pike. 2010.Monitoring Distributed Real-Time Systems: A Survey and Future Directions. Technical Report NASA/CR-2010-216724. NASA Langley Research Center

2010
[43]

Hackage. 2025. Copilot-verifier: System for verifying the correctness of generated Copilot programs. https://hackage.haskell.org/package/copilot-verifier

2025
[44]

Klaus Havelund. 2008. Runtime verification of C programs. InTesting of Software and Communicating Systems. Springer, 7–22. https://doi.org/10.1007/978-3-540-68524-1 3

work page doi:10.1007/978-3-540-68524-1 2008
[45]

Joe Hendrix, Ben Selfridge, and Robert Dockins. 2020. What4: New Library to Help Developers Build Verification and Program Analysis Tools. https://galois.com/blog/2020/07/what4-new-library-to-help-devs- build-verification-program-tools/

2020
[46]

Tim Kelly and Rob Weaver. 2004. The goal structuring notation–a safety argument notation. InProceedings of the dependable systems and networks 2004 workshop on assurance cases, Vol. 6. Citeseer

2004
[47]

Ramana Kumar, Magnus O Myreen, Michael Norrish, and Scott Owens. 2014. CakeML: a verified implemen- tation of ML.ACM SIGPLAN Notices49, 1 (2014), 179–191. https://doi.org/10.1145/2578855.2535841

work page doi:10.1145/2578855.2535841 2014
[48]

Jonathan Laurent, Alwyn Goodloe, and Lee Pike. 2015. Assuring the guardians. InRuntime Verification. Springer, 87–101. https://doi.org/10.1007/978-3-319-23820-3 6

work page doi:10.1007/978-3-319-23820-3 2015
[49]

Xavier Leroy. 2009. Formal verification of a realistic compiler.Commun. ACM52, 7 (2009), 107–115. https://doi.org/10.1145/1538788.1538814

work page doi:10.1145/1538788.1538814 2009
[50]

Nuno P Lopes, Juneyoung Lee, Chung-Kil Hur, Zhengyang Liu, and John Regehr. 2021. Alive2: bounded translation validation for LLVM. InProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation. 65–79. https://doi.org/10.1145/3453483.3454030

work page doi:10.1145/3453483.3454030 2021
[51]

2021.Adding Runtime Verification to the Proteus Language

Brian McClelland. 2021.Adding Runtime Verification to the Proteus Language. Ph. D. Dissertation. California State University, Northridge

2021
[52]

Brian McClelland, Daniel Tellier, Meyer Millman, Kate Beatrix Go, Alice Balayan, Michael J Munje, Kyle Dewey, Nhut Ho, Klaus Havelund, and Michel Ingham. 2021. Towards a systems programming language designed for hierarchical state machines. In2021 IEEE 8th International Conference on Space Mission Challenges for Information Technology (SMC-IT). IEEE, 23–3...

work page doi:10.1109/smc- 2021
[53]

2020.AFE 87 - Machine Learning

AFE87 Project Members. 2020.AFE 87 - Machine Learning. Technical Report. Aerospace Vehicle Systems Institute

2020
[54]

Robin Milner. 1990. Operational and algebraic semantics of concurrent processes. InFormal Models and Semantics. Elsevier, 1201–1242

1990
[55]

NASA. 2021. Ogma. https://github.com/nasa/ogma Accessed 2026-09-29

2021
[56]

NASA. 2022. NPR 7150.2 NASA Software Engineering Requirements (Appendix D. Software Classifications). https://nodis3.gsfc.nasa.gov/displayDir.cfm?Internal ID=N PR 7150 002D &page name=AppendixD

2022
[57]

George C. Necula. 1997. Proof-carrying code. InProceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages(Paris, France)(POPL ’97). Association for Computing Machinery, New York, NY, USA, 106–119. https://doi.org/10.1145/263699.263712

work page doi:10.1145/263699.263712 1997
[58]

Carlos Paradis, Ivan Perez, and Misty Davies. 2023. Towards Streamlining Auditing for Compliance With Requirements in Open-Source Software at NASA.Journal of Systems and Software(2023). https: //ntrs.nasa.gov/citations/20230011257

work page arXiv 2023
[59]

David Park. 1981. Concurrency and Automata on Infinite Sequences. InProceedings of the 5th GI-Conference on Theoretical Computer Science. Springer-Verlag, Berlin, Heidelberg, 167–183

1981
[60]

2020.Copilot 3

Ivan Perez, Frank Dedden, and Alwyn Goodloe. 2020.Copilot 3. Technical Report. NASA Langley Research Center

2020
[61]

Simon Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich, and Geoffrey Washburn. 2006. Simple unification-based type inference for GADTs. InProceedings of the Eleventh ACM SIGPLAN International Conference on Functional Programming(Portland, Oregon, USA)(ICFP ’06). Association for Computing Machinery, New York, NY, USA, 50–61. https://doi.org/10.1145/11...

work page doi:10.1145/1159803.1159811 2006
[62]

Lee Pike, Alwyn Goodloe, Robin Morisset, and Sebastian Niller. 2010. Copilot: a hard real-time runtime monitor. InInternational Conference on Runtime Verification. Springer, 345–359. https://doi.org/10.1007/978- 3-642-16612-9 26

work page doi:10.1007/978- 2010
[63]

Lee Pike, Nis Wegmann, Sebastian Niller, and Alwyn Goodloe. 2012. Experience report: a do-it-yourself high-assurance compiler. InProceedings of the 17th ACM SIGPLAN international conference on Functional programming. 335–340. https://doi.org/10.1145/2398856.2364553 Trustworthy Runtime Verification via Bisimulation (Extended Experience Report) 31

work page doi:10.1145/2398856.2364553 2012
[64]

Lee Pike, Nis Wegmann, Sebastian Niller, and Alwyn Goodloe. 2013. Copilot: monitoring embedded systems. Innovations in Systems and Software Engineering9, 4 (2013), 235–255. https://doi.org/10.1007/s11334-013- 0223-x

work page doi:10.1007/s11334-013- 2013
[65]

Amir Pnueli, Michael Siegel, and Eli Singerman. 1998. Translation validation. InInternational Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 151–166. https://doi.org/10. 1007/BFb0054170

1998
[66]

Tahina Ramananandro, Paul Mountcastle, Benoˆıt Meister, and Richard Lethin. 2016. A unified Coq framework for verifying C programs with floating-point computations. InProceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs. 15–26. https://doi.org/10.1145/2854065.2854066

work page doi:10.1145/2854065.2854066 2016
[67]

Formal Methods Supplement to DO-178C and DO-278A

RTCA 2011. Formal Methods Supplement to DO-178C and DO-278A. RTCA, Inc.. RCTA/DO333

2011
[68]

RTCA. 2011. Software Considerations in Airborne Systems and Equipment Certification. RTCA, Inc.. RCTA/DO-178C

2011
[69]

Michael Ryabtsev and Ofer Strichman. 2009. Translation validation: From simulink to c. InInternational Conference on Computer Aided Verification. Springer, 696–701. https://doi.org/10.1007/978-3-642-02658- 4 57

work page doi:10.1007/978-3-642-02658- 2009
[70]

SAE International. 2010. Guidelines For Development Of Civil Aircraft and Systems. SAE International. ARP4754A

2010
[71]

Tom Schrijvers, Simon Peyton Jones, Manuel Chakravarty, and Martin Sulzmann. 2008. Type checking with open type functions. InProceedings of the 13th ACM SIGPLAN International Conference on Functional Programming(Victoria, BC, Canada)(ICFP ’08). Association for Computing Machinery, New York, NY, USA, 51–62. https://doi.org/10.1145/1411204.1411215

work page doi:10.1145/1411204.1411215 2008
[72]

Ryan Scott, Robert Dockins, Tristan Ravitch, and Aaron Tomb. 2021. Crux: Symbolic Execution Meets SMT-based Verification (Competition Contribution). https://doi.org/10.5281/zenodo.6147218

work page doi:10.5281/zenodo.6147218 2021
[73]

Scott, Mike Dodds, Ivan Perez, Alwyn E

Ryan G. Scott, Mike Dodds, Ivan Perez, Alwyn E. Goodloe, and Robert Dockins. 2023.Artifact for ”Trustworthy Runtime Verification via Bisimulation (Experience Report)”. https://doi.org/10.5281/zenodo.7978326

work page doi:10.5281/zenodo.7978326 2023
[74]

Thomas Arthur Leck Sewell, Magnus O Myreen, and Gerwin Klein. 2013. Translation validation for a verified OS kernel. InProceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation. 471–482

2013
[75]

Aaron Tomb. 2020. Crux: Introducing Our New Open-Source Tool for Software Verification. https: //galois.com/blog/2020/10/crux-introducing-our-new-open-source-tool-for-software-verification/

2020
[76]

2014.Analysis of Well-Clear Boundary Models for the Integration of UAS in the NAS

Jason M Upchurch, Cesar A Munoz, Anthony J Narkawicz, James P Chamberlain, and Maria C Consiglio. 2014.Analysis of Well-Clear Boundary Models for the Integration of UAS in the NAS. Technical Report. NASA Langley Research Center

2014
[77]

Rob J Van Glabbeek and W Peter Weijland. 1996. Branching time and abstraction in bisimulation semantics. Journal of the ACM (JACM)43, 3 (1996), 555–600

1996
[78]

Lucas Wagner, Alain Mebsout, Cesare Tinelli, Darren Cofer, and Konrad Slind. 2017. Qualification of a Model Checker for Avionics Software Verification. InNASA Formal Methods, Clark Barrett, Misty Davies, and Temesghen Kahsai (Eds.). Springer International Publishing, Cham, 404–419. https://doi.org/10.1007/978-3- 319-57288-8 29

work page doi:10.1007/978-3- 2017
[79]

Eisenberg

Stephanie Weirich, Justin Hsu, and Richard A. Eisenberg. 2013. System FC with explicit kind equality. In Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming(Boston, Massachusetts, USA)(ICFP ’13). Association for Computing Machinery, New York, NY, USA, 275–286. https://doi.org/10.1145/2500365.2500599

work page doi:10.1145/2500365.2500599 2013
[80]

Jonathan Wilmot. 2005. A core flight software system. InProceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis. 13–14

2005
[81]

Yorgey, Stephanie Weirich, Julien Cretin, Simon Peyton Jones, Dimitrios Vytiniotis, and Jos´e Pedro Magalh˜aes

Brent A. Yorgey, Stephanie Weirich, Julien Cretin, Simon Peyton Jones, Dimitrios Vytiniotis, and Jos´e Pedro Magalh˜aes. 2012. Giving Haskell a promotion. InProceedings of the 8th ACM SIGPLAN Workshop on Types in Language Design and Implementation(Philadelphia, Pennsylvania, USA)(TLDI ’12). Association for Computing Machinery, New York, NY, USA, 53–66. ht...

work page doi:10.1145/2103786.2103795 2012

[1] [1]

The Core Flight System (cFS)

2026. The Core Flight System (cFS). https://github.com/nasa/cfs

2026

[2] [2]

Lennart Augustsson and Kent Petersson. 1994. Silly Type Families* DRAFT. (1994)

1994

[3] [3]

Haniel Barbosa, Clark W. Barrett, Martin Brain, Gereon Kremer, Hanna Lachnitt, Makai Mann, Abdalrhman Mohamed, Mudathir Mohamed, Aina Niemetz, Andres N ¨otzli, Alex Ozdemir, Mathias Preiner, Andrew Reynolds, Ying Sheng, Cesare Tinelli, and Yoni Zohar. 2022. cvc5: A Versatile and Industrial-Strength SMT Solver. InTools and Algorithms for the Construction a...

work page doi:10.1007/978-3-030-99524-9 2022

[4] [5]

Twan Basten. 1996. Branching bisimilarity is an equivalence indeed!Inform. Process. Lett.58, 3 (1996), 141–147

1996

[5] [6]

Sylvie Boldo and Guillaume Melquiond. 2011. Flocq: A unified library for proving floating-point algorithms in Coq. In2011 IEEE 20th Symposium on Computer Arithmetic. IEEE, 243–252. https://doi.org/10.1109/ ARITH.2011.40

2011

[6] [7]

Brett Boston, Samuel Breese, Joey Dodds, Mike Dodds, Brian Huffman, Adam Petcher, and Andrei Stefanescu

[7] [8]

InCAV 2021

Verified cryptographic code for everybody. InCAV 2021. https://doi.org/10.1007/978-3-030-81685-8 31

work page doi:10.1007/978-3-030-81685-8 2021

[8] [9]

Timothy Bourke, L´elio Brun, Pierre- ´Evariste Dagand, Xavier Leroy, Marc Pouzet, and Lionel Rieg. 2017. A Formally Verified Compiler for Lustre. InProceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation(Barcelona, Spain)(PLDI 2017). Association for Computing Machinery, 586–601. https://doi.org/10.1145/3140587.3062358

work page doi:10.1145/3140587.3062358 2017

[9] [10]

Timothy Bourke, L´elio Brun, and Marc Pouzet. 2019. Mechanized Semantics and Verified Compilation for a Dataflow Synchronous Language with Reset.Proc. ACM Program. Lang.4, POPL, Article 44 (dec 2019), 29 pages. https://doi.org/10.1145/3371112

work page doi:10.1145/3371112 2019

[10] [11]

Timothy Bourke, Paul Jeanmaire, Basile Pesin, and Marc Pouzet. 2021. Verified Lustre Normalization with Node Subsampling.ACM Trans. Embed. Comput. Syst.20, 5s, Article 98 (sep 2021), 25 pages. https://doi.org/10.1145/3477041

work page doi:10.1145/3477041 2021

[11] [12]

California Institute of Technology. 2026. F Prime. https://github.com/nasa/fprime

2026

[12] [13]

Caspi, D

P. Caspi, D. Pialiud, N. Halbwachs, and J. Plaice. 1987. LUSTRE: a Declarative Language for Programming Synchronous Systems. In14th Symposium on Principles of Programming Languages. 178–188. https: //doi.org/10.1145/41625.41641

work page doi:10.1145/41625.41641 1987

[13] [14]

Manuel M. T. Chakravarty, Gabriele Keller, and Simon Peyton Jones. 2005. Associated type synonyms. In Proceedings of the Tenth ACM SIGPLAN International Conference on Functional Programming(Tallinn, Estonia)(ICFP ’05). Association for Computing Machinery, New York, NY, USA, 241–253. https: //doi.org/10.1145/1086365.1086397

work page doi:10.1145/1086365.1086397 2005

[14] [15]

Diatchki, Robert Dockins, Joe Hendrix, and Tristan Ravitch

David Thrane Christiansen, Iavor S. Diatchki, Robert Dockins, Joe Hendrix, and Tristan Ravitch. 2019. Dependently Typed Haskell in Industry (Experience Report).Proc. ACM Program. Lang.3, ICFP, Article 100 (July 2019), 16 pages. https://doi.org/10.1145/3341704

work page doi:10.1145/3341704 2019

[15] [16]

Andrey Chudnov, Nathan Collins, Byron Cook, Joey Dodds, Brian Huffman, Colm MacC´arthaigh, Stephen Magill, Eric Mertens, Eric Mullen, Serdar Tasiran, Aaron Tomb, and Eddy Westbrook. 2018. Continuous Formal Verification of Amazon s2n. InComputer Aided Verification, Hana Chockler and Georg Weissenbacher (Eds.). Springer International Publishing, Cham, 430–4...

work page doi:10.1007/978-3-319-96142-2 2018

[16] [17]

Koen Claessen and John Hughes. 2000. QuickCheck: a lightweight tool for random testing of Haskell programs. InProceedings of the fifth ACM SIGPLAN international conference on Functional programming. 268–279. https://doi.org/10.1145/351240.351266

work page doi:10.1145/351240.351266 2000

[17] [18]

Edmund Clarke, Daniel Kroening, and Flavio Lerda. 2004. A tool for checking ANSI-C programs. In International Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 168–176. https://doi.org/10.1007/978-3-540-24730-2 15 Trustworthy Runtime Verification via Bisimulation (Extended Experience Report) 29

work page doi:10.1007/978-3-540-24730-2 2004

[18] [19]

Darren Cofer. 2021. Unintended Behavior in Learning-Enabled Systems: Detecting the Unknown Unknowns. In 2021 IEEE/AIAA 40th Digital Avionics Systems Conference (DASC). 1–7. https://doi.org/10.1109/DASC52595. 2021.9594406

work page doi:10.1109/dasc52595 2021

[19] [20]

Darren Cofer, Isaac Amundson, Ramachandra Sattigeri, Arjun Passiand Christopher Boggs, Eric Smith, Limei Gilham, Taejoon Byun, and Sanjai Rayadurgam. 2020. Run-Time Assurance for Learning-Enabled Systems. In NASA Formal Methods: 12th International Symposium, NFM 2020, Moffett Field, CA, USA, May 11–15, 2020, Proceedings. Springer-Verlag, Berlin, Heidelber...

work page doi:10.1007/978-3-030-55754-6 2020

[20] [21]

Darren Cofer and Steven Miller. 2014. DO-333 Certification Case Studies. InNASA Formal Methods, Julia M. Badger and Kristin Yvonne Rozier (Eds.). Springer International Publishing, Cham, 1–15. https: //doi.org/10.1007/978-3-319-06200-6 1

work page doi:10.1007/978-3-319-06200-6 2014

[21] [22]

2014.Autonomy Research for Civil Aviation: Toward a New Era of Flight

National Research Council. 2014.Autonomy Research for Civil Aviation: Toward a New Era of Flight. The National Academies Press

2014

[22] [23]

Pascal Cuoq, Florent Kirchner, Nikolai Kosmatov, Virgile Prevosto, Julien Signoles, and Boris Yakobowski

[23] [24]

InInternational conference on software engineering and formal methods

Frama-c. InInternational conference on software engineering and formal methods. Springer, 233–247. https://doi.org/10.1007/978-3-642-33826-7 16

work page doi:10.1007/978-3-642-33826-7

[24] [25]

D’ Angelo, S

B. D’ Angelo, S. Sankaranarayanan, C. S´anchez, W. Robinson, Zohar Manna, B. Finkbeiner, H. Spima, and S. Mehrotra. 2005. LOLA: Runtime Monitoring of Synchronous Systems. In12th International Symposium on Temporal Representation and Reasoning. IEEE, 166–174. https://doi.org/10.1109/TIME.2005.26

work page doi:10.1109/time.2005.26 2005

[25] [26]

Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. InProceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems(Budapest, Hungary)(TACAS’08/ETAPS’08). Springer-Verlag, Berlin, Heidelberg, 337–340

2008

[26] [27]

Robert Dockins, Adam Foltzer, Joe Hendrix, Brian Huffman, Dylan McNamee, and Aaron Tomb. 2016. Constructing Semantic Models of Programs with the Software Analysis Workbench. InVerified Software. Theories, Tools, and Experiments, Sandrine Blazy and Marsha Chechik (Eds.). Springer International Publishing, Cham, 56–72. https://doi.org/10.1007/978-3-319-48869-1 5

work page doi:10.1007/978-3-319-48869-1 2016

[27] [28]

Bruno Dutertre. 2014. Yices 2.2. InComputer Aided Verification, Armin Biere and Roderick Bloem (Eds.). Springer International Publishing, Cham, 737–744

2014

[28] [29]

Yli`es Falcone, Klaus Havelund, and Giles Reger. 2013. A Tutorial on Runtime Verification. InEngineering Dependable Software Systems, Manfred Broy, Doron A. Peled, and Georg Kalus (Eds.). NATO Science for Peace and Security Series, D: Information and Communication Security, Vol. 34. IOS Press, 141–175. https://doi.org/10.3233/978-1-61499-207-3-141

work page doi:10.3233/978-1-61499-207-3-141 2013

[29] [30]

Federal Aviation Administration. 2009. Sense and Avoid (SAA) for Unmanned Aircraft Systems (UAS)

2009

[30] [31]

George Fink and Matt Bishop. 1997. Property-Based Testing: A New Approach to Testing for Assurance. SIGSOFT Softw. Eng. Notes22, 4 (jul 1997), 74–80. https://doi.org/10.1145/263244.263267

work page doi:10.1145/263244.263267 1997

[31] [32]

GitHub. 2025. Copilot Verifier. https://github.com/Copilot-Language/copilot/tree/master/copilot-verifier

2025

[32] [33]

GitHub, Issue on Copilot. 2021a. Generated C code accesses stream buffer out of bounds. https: //github.com/Copilot-Language/copilot/issues/238

[33] [34]

GitHub, Issue on Copilot. 2021b. C translation doesn’t correctly select operations based on types. https: //github.com/Copilot-Language/copilot/issues/263

[34] [35]

GitHub, Issue on Copilot. 2021c. Basic example involving structs generates invalid C code. https: //github.com/Copilot-Language/copilot/issues/275

[35] [36]

GitHub, Issue on Copilot. 2021d. Delaying streams of arrays or structs produces C code that fails to compile with Clang. https://github.com/Copilot-Language/copilot/issues/276

[36] [37]

GitHub, Issue on Copilot. 2021e. Copilot-c99: C99 and interpretation of signum behave inconsistently for 0 and NaN. https://github.com/Copilot-Language/copilot/issues/278

[37] [38]

GitHub, Issue on Copilot. 2022a. Copilot-c99: appending values to a stream of arrays produces incorrect C99 code. https://github.com/Copilot-Language/copilot/issues/314

[38] [39]

GitHub, Issue on Copilot. 2022b. Copilot-c99: array has incomplete element type error when declaring stream with array of structs. https://github.com/Copilot-Language/copilot/issues/373

[39] [40]

GitHub, Issue on Copilot. 2022c. Copilot-c99: Streams of arrays result in C code with undefined behavior. https://github.com/Copilot-Language/copilot/issues/386

[40] [41]

Alwyn Goodloe. 2016. Challenges in high-assurance runtime verification. InInternational Symposium on Leveraging Applications of Formal Methods. Springer, 446–460. https://doi.org/10.1007/978-3-319-47166- 2 31 30 Ryan G. Scott, Ivan Perez, Alwyn E. Goodloe, Mike Dodds, and Robert Dockins

work page doi:10.1007/978-3-319-47166- 2016

[41] [42]

2010.Monitoring Distributed Real-Time Systems: A Survey and Future Directions

Alwyn Goodloe and Lee Pike. 2010.Monitoring Distributed Real-Time Systems: A Survey and Future Directions. Technical Report NASA/CR-2010-216724. NASA Langley Research Center

2010

[42] [43]

Hackage. 2025. Copilot-verifier: System for verifying the correctness of generated Copilot programs. https://hackage.haskell.org/package/copilot-verifier

2025

[43] [44]

Klaus Havelund. 2008. Runtime verification of C programs. InTesting of Software and Communicating Systems. Springer, 7–22. https://doi.org/10.1007/978-3-540-68524-1 3

work page doi:10.1007/978-3-540-68524-1 2008

[44] [45]

Joe Hendrix, Ben Selfridge, and Robert Dockins. 2020. What4: New Library to Help Developers Build Verification and Program Analysis Tools. https://galois.com/blog/2020/07/what4-new-library-to-help-devs- build-verification-program-tools/

2020

[45] [46]

Tim Kelly and Rob Weaver. 2004. The goal structuring notation–a safety argument notation. InProceedings of the dependable systems and networks 2004 workshop on assurance cases, Vol. 6. Citeseer

2004

[46] [47]

Ramana Kumar, Magnus O Myreen, Michael Norrish, and Scott Owens. 2014. CakeML: a verified implemen- tation of ML.ACM SIGPLAN Notices49, 1 (2014), 179–191. https://doi.org/10.1145/2578855.2535841

work page doi:10.1145/2578855.2535841 2014

[47] [48]

Jonathan Laurent, Alwyn Goodloe, and Lee Pike. 2015. Assuring the guardians. InRuntime Verification. Springer, 87–101. https://doi.org/10.1007/978-3-319-23820-3 6

work page doi:10.1007/978-3-319-23820-3 2015

[48] [49]

Xavier Leroy. 2009. Formal verification of a realistic compiler.Commun. ACM52, 7 (2009), 107–115. https://doi.org/10.1145/1538788.1538814

work page doi:10.1145/1538788.1538814 2009

[49] [50]

Nuno P Lopes, Juneyoung Lee, Chung-Kil Hur, Zhengyang Liu, and John Regehr. 2021. Alive2: bounded translation validation for LLVM. InProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation. 65–79. https://doi.org/10.1145/3453483.3454030

work page doi:10.1145/3453483.3454030 2021

[50] [51]

2021.Adding Runtime Verification to the Proteus Language

Brian McClelland. 2021.Adding Runtime Verification to the Proteus Language. Ph. D. Dissertation. California State University, Northridge

2021

[51] [52]

Brian McClelland, Daniel Tellier, Meyer Millman, Kate Beatrix Go, Alice Balayan, Michael J Munje, Kyle Dewey, Nhut Ho, Klaus Havelund, and Michel Ingham. 2021. Towards a systems programming language designed for hierarchical state machines. In2021 IEEE 8th International Conference on Space Mission Challenges for Information Technology (SMC-IT). IEEE, 23–3...

work page doi:10.1109/smc- 2021

[52] [53]

2020.AFE 87 - Machine Learning

AFE87 Project Members. 2020.AFE 87 - Machine Learning. Technical Report. Aerospace Vehicle Systems Institute

2020

[53] [54]

Robin Milner. 1990. Operational and algebraic semantics of concurrent processes. InFormal Models and Semantics. Elsevier, 1201–1242

1990

[54] [55]

NASA. 2021. Ogma. https://github.com/nasa/ogma Accessed 2026-09-29

2021

[55] [56]

NASA. 2022. NPR 7150.2 NASA Software Engineering Requirements (Appendix D. Software Classifications). https://nodis3.gsfc.nasa.gov/displayDir.cfm?Internal ID=N PR 7150 002D &page name=AppendixD

2022

[56] [57]

George C. Necula. 1997. Proof-carrying code. InProceedings of the 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages(Paris, France)(POPL ’97). Association for Computing Machinery, New York, NY, USA, 106–119. https://doi.org/10.1145/263699.263712

work page doi:10.1145/263699.263712 1997

[57] [58]

Carlos Paradis, Ivan Perez, and Misty Davies. 2023. Towards Streamlining Auditing for Compliance With Requirements in Open-Source Software at NASA.Journal of Systems and Software(2023). https: //ntrs.nasa.gov/citations/20230011257

work page arXiv 2023

[58] [59]

David Park. 1981. Concurrency and Automata on Infinite Sequences. InProceedings of the 5th GI-Conference on Theoretical Computer Science. Springer-Verlag, Berlin, Heidelberg, 167–183

1981

[59] [60]

2020.Copilot 3

Ivan Perez, Frank Dedden, and Alwyn Goodloe. 2020.Copilot 3. Technical Report. NASA Langley Research Center

2020

[60] [61]

Simon Peyton Jones, Dimitrios Vytiniotis, Stephanie Weirich, and Geoffrey Washburn. 2006. Simple unification-based type inference for GADTs. InProceedings of the Eleventh ACM SIGPLAN International Conference on Functional Programming(Portland, Oregon, USA)(ICFP ’06). Association for Computing Machinery, New York, NY, USA, 50–61. https://doi.org/10.1145/11...

work page doi:10.1145/1159803.1159811 2006

[61] [62]

Lee Pike, Alwyn Goodloe, Robin Morisset, and Sebastian Niller. 2010. Copilot: a hard real-time runtime monitor. InInternational Conference on Runtime Verification. Springer, 345–359. https://doi.org/10.1007/978- 3-642-16612-9 26

work page doi:10.1007/978- 2010

[62] [63]

Lee Pike, Nis Wegmann, Sebastian Niller, and Alwyn Goodloe. 2012. Experience report: a do-it-yourself high-assurance compiler. InProceedings of the 17th ACM SIGPLAN international conference on Functional programming. 335–340. https://doi.org/10.1145/2398856.2364553 Trustworthy Runtime Verification via Bisimulation (Extended Experience Report) 31

work page doi:10.1145/2398856.2364553 2012

[63] [64]

Lee Pike, Nis Wegmann, Sebastian Niller, and Alwyn Goodloe. 2013. Copilot: monitoring embedded systems. Innovations in Systems and Software Engineering9, 4 (2013), 235–255. https://doi.org/10.1007/s11334-013- 0223-x

work page doi:10.1007/s11334-013- 2013

[64] [65]

Amir Pnueli, Michael Siegel, and Eli Singerman. 1998. Translation validation. InInternational Conference on Tools and Algorithms for the Construction and Analysis of Systems. Springer, 151–166. https://doi.org/10. 1007/BFb0054170

1998

[65] [66]

Tahina Ramananandro, Paul Mountcastle, Benoˆıt Meister, and Richard Lethin. 2016. A unified Coq framework for verifying C programs with floating-point computations. InProceedings of the 5th ACM SIGPLAN Conference on Certified Programs and Proofs. 15–26. https://doi.org/10.1145/2854065.2854066

work page doi:10.1145/2854065.2854066 2016

[66] [67]

Formal Methods Supplement to DO-178C and DO-278A

RTCA 2011. Formal Methods Supplement to DO-178C and DO-278A. RTCA, Inc.. RCTA/DO333

2011

[67] [68]

RTCA. 2011. Software Considerations in Airborne Systems and Equipment Certification. RTCA, Inc.. RCTA/DO-178C

2011

[68] [69]

Michael Ryabtsev and Ofer Strichman. 2009. Translation validation: From simulink to c. InInternational Conference on Computer Aided Verification. Springer, 696–701. https://doi.org/10.1007/978-3-642-02658- 4 57

work page doi:10.1007/978-3-642-02658- 2009

[69] [70]

SAE International. 2010. Guidelines For Development Of Civil Aircraft and Systems. SAE International. ARP4754A

2010

[70] [71]

Tom Schrijvers, Simon Peyton Jones, Manuel Chakravarty, and Martin Sulzmann. 2008. Type checking with open type functions. InProceedings of the 13th ACM SIGPLAN International Conference on Functional Programming(Victoria, BC, Canada)(ICFP ’08). Association for Computing Machinery, New York, NY, USA, 51–62. https://doi.org/10.1145/1411204.1411215

work page doi:10.1145/1411204.1411215 2008

[71] [72]

Ryan Scott, Robert Dockins, Tristan Ravitch, and Aaron Tomb. 2021. Crux: Symbolic Execution Meets SMT-based Verification (Competition Contribution). https://doi.org/10.5281/zenodo.6147218

work page doi:10.5281/zenodo.6147218 2021

[72] [73]

Scott, Mike Dodds, Ivan Perez, Alwyn E

Ryan G. Scott, Mike Dodds, Ivan Perez, Alwyn E. Goodloe, and Robert Dockins. 2023.Artifact for ”Trustworthy Runtime Verification via Bisimulation (Experience Report)”. https://doi.org/10.5281/zenodo.7978326

work page doi:10.5281/zenodo.7978326 2023

[73] [74]

Thomas Arthur Leck Sewell, Magnus O Myreen, and Gerwin Klein. 2013. Translation validation for a verified OS kernel. InProceedings of the 34th ACM SIGPLAN conference on Programming language design and implementation. 471–482

2013

[74] [75]

Aaron Tomb. 2020. Crux: Introducing Our New Open-Source Tool for Software Verification. https: //galois.com/blog/2020/10/crux-introducing-our-new-open-source-tool-for-software-verification/

2020

[75] [76]

2014.Analysis of Well-Clear Boundary Models for the Integration of UAS in the NAS

Jason M Upchurch, Cesar A Munoz, Anthony J Narkawicz, James P Chamberlain, and Maria C Consiglio. 2014.Analysis of Well-Clear Boundary Models for the Integration of UAS in the NAS. Technical Report. NASA Langley Research Center

2014

[76] [77]

Rob J Van Glabbeek and W Peter Weijland. 1996. Branching time and abstraction in bisimulation semantics. Journal of the ACM (JACM)43, 3 (1996), 555–600

1996

[77] [78]

Lucas Wagner, Alain Mebsout, Cesare Tinelli, Darren Cofer, and Konrad Slind. 2017. Qualification of a Model Checker for Avionics Software Verification. InNASA Formal Methods, Clark Barrett, Misty Davies, and Temesghen Kahsai (Eds.). Springer International Publishing, Cham, 404–419. https://doi.org/10.1007/978-3- 319-57288-8 29

work page doi:10.1007/978-3- 2017

[78] [79]

Eisenberg

Stephanie Weirich, Justin Hsu, and Richard A. Eisenberg. 2013. System FC with explicit kind equality. In Proceedings of the 18th ACM SIGPLAN International Conference on Functional Programming(Boston, Massachusetts, USA)(ICFP ’13). Association for Computing Machinery, New York, NY, USA, 275–286. https://doi.org/10.1145/2500365.2500599

work page doi:10.1145/2500365.2500599 2013

[79] [80]

Jonathan Wilmot. 2005. A core flight software system. InProceedings of the 3rd IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis. 13–14

2005

[80] [81]

Yorgey, Stephanie Weirich, Julien Cretin, Simon Peyton Jones, Dimitrios Vytiniotis, and Jos´e Pedro Magalh˜aes

Brent A. Yorgey, Stephanie Weirich, Julien Cretin, Simon Peyton Jones, Dimitrios Vytiniotis, and Jos´e Pedro Magalh˜aes. 2012. Giving Haskell a promotion. InProceedings of the 8th ACM SIGPLAN Workshop on Types in Language Design and Implementation(Philadelphia, Pennsylvania, USA)(TLDI ’12). Association for Computing Machinery, New York, NY, USA, 53–66. ht...

work page doi:10.1145/2103786.2103795 2012