Soteria: Efficient Symbolic Execution as a Functional Library
Pith reviewed 2026-05-17 23:53 UTC · model grok-4.3
The pith
Symbolic execution engines can be built directly for each source language without the trade-offs of intermediate languages.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Soteria enables developers to construct SE engines that operate directly over source-language semantics, offering configurability, compositional reasoning, and ease of implementation. Using Soteria, SoteriaRust is the first Rust SE engine supporting Tree Borrows and SoteriaC is a compositional SE engine for C. Both are competitive with or outperform state-of-the-art tools such as Kani, Pulse, CBMC and Gillian-C in performance and the number of bugs detected. The theoretical foundations are formalized and soundness is proven, demonstrating that sound, efficient, accurate, and expressive SE can be achieved without the compromises of ILs.
What carries the argument
Soteria, a lightweight OCaml library for writing symbolic execution engines in a functional style that operate directly over source-language semantics.
Load-bearing premise
The engineering cost of writing a fresh symbolic execution engine per language remains low enough to be practical once the Soteria combinators are provided.
What would settle it
A direct comparison in which a Soteria-based engine for Rust or C detects fewer bugs or runs slower than an established IL-based tool on the same set of benchmarks would falsify the claim of competitive or superior performance and accuracy.
Figures
read the original abstract
Symbolic execution (SE) tools often rely on intermediate languages (ILs) to support multiple programming languages, promising reusability and efficiency. In practice, this approach introduces trade-offs between performance, accuracy, and language feature support. We argue that building SE engines \emph{directly} for each source language is both simpler and more effective. We present Soteria, a lightweight OCaml library for writing SE engines in a functional style, without compromising on performance, accuracy or feature support. Soteria enables developers to construct SE engines that operate directly over source-language semantics, offering \emph{configurability}, compositional reasoning, and ease of implementation. Using Soteria, we develop Soteria$^{\text{Rust}}$, the \emph{first} Rust SE engine supporting Tree Borrows (the intricate aliasing model of Rust), and Soteria$^{\text{C}}$, a compositional SE engine for C. Both tools are competitive with or outperform state-of-the-art tools such as Kani, Pulse, CBMC and Gillian-C in performance and the number of bugs detected. We formalise the theoretical foundations of Soteria and prove its soundness, demonstrating that sound, efficient, accurate, and expressive SE can be achieved without the compromises of ILs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents Soteria, a lightweight OCaml library for implementing symbolic execution engines directly over source-language semantics in a functional style. It argues that this avoids the performance/accuracy/feature-support trade-offs of intermediate-language approaches, offering configurability and compositional reasoning. Concrete engines are built for Rust (SoteriaRust, the first to support Tree Borrows) and C (SoteriaC); both are shown competitive with or superior to Kani, Pulse, CBMC and Gillian-C on performance and bugs found. A formal soundness proof for the library is supplied.
Significance. If the claims hold, the work is significant: it supplies a reusable functional library plus two non-trivial engines and a soundness proof, demonstrating that direct source-level SE can be both practical and rigorous. The Tree Borrows support is a concrete strength that directly tests the 'no compromises' thesis for complex aliasing. The stress-test concern about per-language engineering cost does not land, because the manuscript supplies two working engines whose implementation effort is implicitly bounded by the library's combinators.
major comments (2)
- [§6] §6 (Soundness): the proof is stated for the library combinators, yet the text does not explicitly lift the result to the Tree Borrows encoding used in SoteriaRust; a short lemma or remark showing that the aliasing model satisfies the required assumptions would make the central claim load-bearing rather than implicit.
- [§7] §7 (Evaluation): the claim that both tools 'outperform state-of-the-art' on bug detection is central, but the paper reports only aggregate counts; per-benchmark tables with false-positive rates and timeout settings are needed to substantiate accuracy and efficiency without IL trade-offs.
minor comments (3)
- [Introduction] The introduction uses 'configurability' and 'compositional reasoning' without a short forward reference to the sections where these properties are demonstrated.
- Figure captions for the architecture diagrams should state the exact OCaml types of the key combinators they illustrate.
- [§5] A one-sentence comparison of lines-of-code for the Tree Borrows model versus an equivalent IL front-end would usefully quantify the 'simpler' claim.
Simulated Author's Rebuttal
We thank the referee for the positive assessment, recognition of the significance of the Tree Borrows support, and recommendation for minor revision. We address each major comment below.
read point-by-point responses
-
Referee: [§6] §6 (Soundness): the proof is stated for the library combinators, yet the text does not explicitly lift the result to the Tree Borrows encoding used in SoteriaRust; a short lemma or remark showing that the aliasing model satisfies the required assumptions would make the central claim load-bearing rather than implicit.
Authors: We agree that an explicit connection would strengthen the central claim. We will add a short remark (or brief lemma) in §6 verifying that the Tree Borrows aliasing model satisfies the assumptions required by the soundness theorem for the library combinators. This is a minor textual addition that makes the application to SoteriaRust direct without altering the existing proof. revision: yes
-
Referee: [§7] §7 (Evaluation): the claim that both tools 'outperform state-of-the-art' on bug detection is central, but the paper reports only aggregate counts; per-benchmark tables with false-positive rates and timeout settings are needed to substantiate accuracy and efficiency without IL trade-offs.
Authors: We accept that more granular data would better substantiate the accuracy and efficiency claims. In the revised version we will add per-benchmark tables (where the experimental data permit) together with explicit false-positive rates and the timeout settings used. This will provide clearer evidence for the performance and bug-detection results without relying solely on aggregates. revision: yes
Circularity Check
No circularity: claims rest on new library, implementations, and independent soundness proof.
full rationale
The paper presents Soteria as a fresh OCaml library with compositional combinators, then demonstrates it via two new engines (SoteriaRust supporting Tree Borrows and SoteriaC) plus a formal soundness proof. No equations, fitted parameters, or predictions are defined in terms of themselves. The central claim that direct source-language symbolic execution avoids IL trade-offs is substantiated by the concrete artifacts and proof rather than by self-referential definitions, load-bearing self-citations, or imported uniqueness results. The unquantified assumption about per-language engineering effort is a practical caveat but does not reduce any derivation step to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The semantics of the target languages (Rust with Tree Borrows, C) can be faithfully encoded as executable OCaml functions.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.leanLogicNat.equivNat unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We present Soteria, a lightweight OCaml library for writing SE engines in a functional style... We formalise the theoretical foundations of Soteria and prove its soundness
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanabsolute_floor_iff_bare_distinguishability unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Soteria rust... first SE engine for Rust that supports Tree Borrows
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Teaching LLMs Program Semantics via Symbolic Execution Traces
Training Qwen3-8B on symbolic execution traces from Soteria improves violation detection in C programs by over 17 points, transfers across five property types, and shows superadditive gains with chain-of-thought.
Reference graph
Works this paper leans on
-
[1]
Léo Andrès, Filipe Marques, Arthur Carcano, Pierre Chambart, José Fragoso Santos, and Jean-Christophe Filliâtre
-
[2]
Owi: Performant Parallel Symbolic Execution Made Easy, an Application to WebAssembly.The Art, Science, and Engineering of Programming9, 1 (Oct. 2024), 3:1–3:42. doi:10.22152/programming-journal.org/2025/9/3
work page doi:10.22152/programming-journal.org/2025/9/3 2024
-
[3]
Vytautas Astrauskas, Peter Müller, Federico Poli, and Alexander J. Summers. 2019. Leveraging Rust Types for Modular Specification and Verification.Proceedings of the ACM on Programming Languages3, OOPSLA (Oct. 2019), 147:1–147:30. doi:10.1145/3360573
-
[4]
Sacha-Élie Ayoun, Xavier Denis, Petar Maksimović, and Philippa Gardner. 2025. A Hybrid Approach to Semi-Automated Rust Verification. doi:10.48550/arXiv.2403.15122 arXiv:2403.15122 [cs]
-
[5]
2024.Gillian: foundations, implementation, and applications of compositional symbolic execution
Sacha-Élie Ayoun. 2024.Gillian: foundations, implementation, and applications of compositional symbolic execution. Ph. D. Dissertation. doi:10.25560/119340
-
[6]
Haniel Barbosa, Clark Barrett, Martin Brain, Gereon Kremer, Hanna Lachnitt, Makai Mann, Abdalrhman Mohamed, Mudathir Mohamed, Aina Niemetz, Andres Nötzli, Alex Ozdemir, Mathias Preiner, Andrew Reynolds, Ying Sheng, Cesare Tinelli, and Yoni Zohar. 2022. Cvc5: A Versatile and Industrial-Strength SMT Solver. InTools and Algorithms for the Construction and An...
-
[7]
Siddharth Bhat, Léo Stefanesco, Chris Hughes, and Tobias Grosser. 2025. Certified Decision Procedures for Width- Independent Bitvector Predicates.Proc. ACM Program. Lang.9, OOPSLA2, Article 370 (Oct. 2025), 23 pages. doi:10. 1145/3763148
work page 2025
-
[8]
Armin Biere, Tobias Faller, Katalin Fazekas, Mathias Fleury, Nils Froleyks, and Florian Pollitt. 2024. CaDiCaL 2.0. InComputer Aided Verification - 36th International Conference, CA V 2024, Montreal, QC, Canada, July 24-27, 2024, Proceedings, Part I (Lecture Notes in Computer Science, Vol. 14681), Arie Gurfinkel and Vijay Ganesh (Eds.). Springer, 133–152....
-
[9]
Armin Biere, Tobias Faller, Katalin Fazekas, Mathias Fleury, Nils Froleyks, and Florian Pollitt. 2024. CaDiCaL, Gimsatul, IsaSAT and Kissat Entering the SAT Competition 2024. InProc. of SAT Competition 2024 – Solver, Benchmark and Proof Checker Descriptions (Department of Computer Science Report Series B, Vol. B-2024-1), Marijn Heule, Markus Iser, Matti J...
work page 2024
-
[10]
Cristian Cadar, Daniel Dunbar, and Dawson Engler. 2008. KLEE: Unassisted and Automatic Generation of High- Coverage Tests for Complex Systems Programs. InProceedings of the 8th USENIX Conference on Operating Systems Design and Implementation (OSDI’08). USENIX Association, USA, 209–224
work page 2008
-
[11]
Cristiano Calcagno and Dino Distefano. 2011. Infer: An Automatic Program Verifier for Memory Safety of C Programs. InNASA Formal Methods, Mihaela Bobaru, Klaus Havelund, Gerard J. Holzmann, and Rajeev Joshi (Eds.). Springer, Berlin, Heidelberg, 459–465. doi:10.1007/978-3-642-20398-5_33
-
[12]
Rust Community. 2025. Miri. https://github.com/rust-lang/miri 22 Ayoun et al
work page 2025
-
[13]
Loïc Correnson. 2014. Qed. Computing What Remains to Be Proved. InNASA Formal Methods, David Hutchison, Takeo Kanade, Josef Kittler, Jon M. Kleinberg, Alfred Kobsa, Friedemann Mattern, John C. Mitchell, Moni Naor, Oscar Nierstrasz, C. Pandu Rangan, Bernhard Steffen, Demetri Terzopoulos, Doug Tygar, Gerhard Weikum, Julia M. Badger, and Kristin Yvonne Rozie...
-
[14]
Daniel Cumming. [n. d.]. Introducing KMIR: Concrete and Symbolic Execution of Rust MIR. https://runtimeverification.com/blog/introducing-kmir
-
[15]
Leonardo De Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. InProceedings of the Theory and Practice of Software, 14th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS’08/ETAPS’08). Springer-Verlag, Berlin, Heidelberg, 337–340
work page 2008
-
[16]
Xavier Denis, Jacques-Henri Jourdan, and Claude Marché. 2022. Creusot: A Foundry for the Deductive Verification of Rust Programs. InICFEM 2022 - 23th International Conference on Formal Engineering Methods. Springer Verlag
work page 2022
-
[17]
Jean-Christophe Filliâtre and Sylvain Conchon. 2006. Type-safe modular hash-consing. InProceedings of the 2006 Workshop on ML(Portland, Oregon, USA)(ML ’06). Association for Computing Machinery, New York, NY, USA, 12–19. doi:10.1145/1159876.1159880
-
[18]
Nima Rahimi Foroushaani and Bart Jacobs. 2022. Modular Formal Verification of Rust Programs with Unsafe Blocks. doi:10.48550/arXiv.2212.12976 arXiv:2212.12976 [cs]
-
[19]
José Fragoso Santos, Petar Maksimović, Sacha-Élie Ayoun, and Philippa Gardner. 2020. Gillian, Part i: A Multi- Language Platform for Symbolic Execution. InProceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2020). Association for Computing Machinery, New York, NY, USA, 927–942. doi:10.1145/3385412.3386014
-
[20]
José Fragoso Santos, Petar Maksimović, Gabriela Sampaio, and Philippa Gardner. 2019. JaVerT 2.0: Compositional Symbolic Execution for JavaScript.Proceedings of the ACM on Programming Languages3, POPL (Jan. 2019), 66:1–66:31. doi:10.1145/3290379
-
[21]
Lennard Gäher, Michael Sammler, Ralf Jung, Robbert Krebbers, and Derek Dreyer. [n. d.]. RefinedRust: A Type System for High-Assurance Verification of Rust Programs. 8 ([n. d.])
- [22]
-
[23]
Son Ho and Jonathan Protzenko. 2022. Aeneas: Rust Verification by Functional Translation.Proceedings of the ACM on Programming Languages6, ICFP (Aug. 2022), 116:711–116:741. doi:10.1145/3547647
-
[24]
Ralf Jung. 2025.MiniRust: A Precise Specification for “Rust lite / MIR plus”. https://github.com/minirust/minirust GitHub repository; Apache-2.0 / MIT dual-licensed
work page 2025
-
[25]
Steven Keuchel, Sander Huyghebaert, Georgy Lukyanov, and Dominique Devriese. 2022. Verified Symbolic Execution with Kripke Specification Monads (and No Meta-Programming).Proceedings of the ACM on Programming Languages6, ICFP (Aug. 2022), 194–224. doi:10.1145/3547628
-
[26]
Daniel Kroening and Michael Tautschnig. 2014. CBMC – C Bounded Model Checker. InTools and Algorithms for the Construction and Analysis of Systems, Erika Ábrahám and Klaus Havelund (Eds.). Springer, Berlin, Heidelberg, 389–391. doi:10.1007/978-3-642-54862-8_26
-
[27]
Andrea Lattuada, Travis Hance, Chanhee Cho, Matthias Brun, Isitha Subasinghe, Yi Zhou, Jon Howell, Bryan Parno, and Chris Hawblitzel. 2023. Verus: Verifying Rust Programs Using Linear Ghost Types.Proceedings of the ACM on Programming Languages7, OOPSLA1 (April 2023), 85:286–85:315. doi:10.1145/3586037
-
[28]
Quang Loc Le, Azalea Raad, Jules Villard, Josh Berdine, Derek Dreyer, and Peter W. O’Hearn. 2022. Finding Real Bugs in Big Programs with Incorrectness Logic.Proceedings of the ACM on Programming Languages6, OOPSLA1 (April 2022), 81:1–81:27. doi:10.1145/3527325
-
[29]
Nico Lehmann, Adam Geller, Niki Vazou, and Ranjit Jhala. 2022. Flux: Liquid Types for Rust. doi:10.48550/arXiv.2207. 04034 arXiv:2207.04034 [cs]
-
[30]
Appel, Sandrine Blazy, and Gordon Stewart
Xavier Leroy, Andrew W. Appel, Sandrine Blazy, and Gordon Stewart. 2012.The CompCert Memory Model, Version 2. Report. INRIA
work page 2012
-
[31]
Andreas Lööw, Daniele Nantes-Sobrinho, Sacha-Élie Ayoun, Caroline Cronjäger, Petar Maksimović, and Philippa Gardner. 2024. Compositional Symbolic Execution for Correctness and Incorrectness Reasoning. In38th European Conference on Object-Oriented Programming (ECOOP 2024) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 313), Jonathan Aldri...
-
[32]
Sirui Lu and Rastislav Bodík. 2023. Grisette: Symbolic Compilation as a Functional Programming Library.Proceedings of the ACM on Programming Languages7, POPL (Jan. 2023), 16:455–16:487. doi:10.1145/3571209
-
[33]
Petar Maksimović, Sacha-Élie Ayoun, José Fragoso Santos, and Philippa Gardner. 2021. Gillian, Part II: Real-World Verification for JavaScript and C. InComputer Aided Verification (Lecture Notes in Computer Science), Alexandra Silva Soteria: Efficient Symbolic Execution as a Functional Library 23 and K. Rustan M. Leino (Eds.). Springer International Publis...
-
[34]
Kayvan Memarian. 2023.The Cerberus C Semantics. Technical Report UCAM-CL-TR-981. University of Cambridge, Computer Laboratory. doi:10.48456/tr-981
-
[35]
Peter Müller, Malte Schwerhoff, and Alexander J. Summers. 2016. Viper: A Verification Infrastructure for Permission- Based Reasoning. InVerification, Model Checking, and Abstract Interpretation (Lecture Notes in Computer Science), Barbara Jobstmann and K. Rustan M. Leino (Eds.). Springer, Berlin, Heidelberg, 41–62. doi:10.1007/978-3-662-49122-5_2
-
[36]
Peter W. O’Hearn. 2019. Incorrectness Logic.Proceedings of the ACM on Programming Languages4, POPL (Dec. 2019), 10:1–10:32. doi:10.1145/3371078
-
[37]
Gaurav Parthasarathy, Thibault Dardinier, Benjamin Bonneau, Peter Müller, and Alexander J. Summers. 2024. To- wards Trustworthy Automated Program Verifiers: Formally Validating Translations into an Intermediate Verification Language.Towards Trustworthy Automated Program Verifiers: Formally Validating Translations into an Intermediate Verification Language...
-
[38]
2025.std::intrinsics — Compiler intrinsics module (Rust Standard Library)
Rust Language Project. 2025.std::intrinsics — Compiler intrinsics module (Rust Standard Library). https://doc.rust- lang.org/std/intrinsics/index.html Accessed: 2025-10-26
work page 2025
-
[39]
Azalea Raad, Josh Berdine, Hoang-Hai Dang, Derek Dreyer, Peter O’Hearn, and Jules Villard. 2020. Local Reasoning About the Presence of Bugs: Incorrectness Separation Logic. InComputer Aided Verification (Lecture Notes in Computer Science), Shuvendu K. Lahiri and Chao Wang (Eds.). Springer International Publishing, Cham, 225–252. doi:10.1007/978- 3-030-53291-8_14
-
[40]
Grigore Ros,u and Traian Florin S, erbănută. 2010. An Overview of the K Semantic Framework.The Journal of Logic and Algebraic Programming79, 6 (Aug. 2010), 397–434. doi:10.1016/j.jlap.2010.03.012
-
[41]
2025.rustc_const_eval / src/interpret — Rust Compiler (GitHub)
rust-lang/rust Contributors. 2025.rustc_const_eval / src/interpret — Rust Compiler (GitHub). https://github.com/rust- lang/rust/tree/master/compiler/rustc_const_eval/src/interpret
work page 2025
-
[42]
José Fragoso Santos, Petar Maksimović, Théotime Grohens, Julian Dolby, and Philippa Gardner. 2018. Symbolic Execution for JavaScript. InProceedings of the 20th International Symposium on Principles and Practice of Declarative Programming (PPDP ’18). Association for Computing Machinery, New York, NY, USA, 1–14. doi:10.1145/3236950. 3236956
-
[43]
The Kani Team. 2023. How Open Source Projects Are Using Kani to Write Better Software in Rust | AWS Open Source Blog. https://aws.amazon.com/blogs/opensource/how-open-source-projects-are-using-kani-to-write-better-software- in-rust/
work page 2023
-
[44]
The Racket Team. 2010. Reference: Racket. InThe Racket Language. https://racket-lang.org/
work page 2010
-
[45]
Emina Torlak and Rastislav Bodik. 2013. Growing Solver-Aided Languages with Rosette. InProceedings of the 2013 ACM International Symposium on New Ideas, New Paradigms, and Reflections on Programming & Software (Onward! 2013). Association for Computing Machinery, New York, NY, USA, 135–152. doi:10.1145/2509578.2509586
-
[46]
Emina Torlak and Rastislav Bodik. 2014. A Lightweight Symbolic Virtual Machine for Solver-Aided Host Languages. InProceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). Association for Computing Machinery, New York, NY, USA, 530–541. doi:10.1145/2594291.2594340
-
[47]
Viktor Vafeiadis, Thibaut Balabonski, Soham Chakraborty, Robin Morisset, and Francesco Zappa Nardelli. 2015. Common Compiler Optimisations are Invalid in the C11 Memory Model and what we can do about it.SIGPLAN Not. 50, 1 (Jan. 2015), 209–220. doi:10.1145/2775051.2676995
-
[48]
2025.finetime: Accurate, flexible, and efficient time keeping
Quinten van Woerkom. 2025.finetime: Accurate, flexible, and efficient time keeping. https://crates.io/crates/finetime Rust crate; MIT OR Apache-2.0 license
work page 2025
-
[49]
Alexa VanHattum, Daniel Schwartz-Narbonne, Nathan Chong, and Adrian Sampson. 2022. Verifying Dynamic Trait Objects in Rust. InProceedings of the 44th International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP ’22). Association for Computing Machinery, New York, NY, USA, 321–330. doi:10.1145/3510457. 3513031
-
[50]
Neven Villani, Johannes Hostert, Derek Dreyer, and Ralf Jung. 2025. Tree Borrows.Proc. ACM Program. Lang.9, PLDI, Article 188 (June 2025), 24 pages. doi:10.1145/3735592 24 Ayoun et al. typeconst = IntofZ . t | Boolofbool typebinop = Add | Sub | And | Div | Eq | Geq typeexpr = Constofconst | Varofstring | BinOpofbinop * expr * expr | Letofstring * expr * e...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.