Executable formal semantics for the POSIX shell
Pith reviewed 2026-05-24 22:45 UTC · model grok-4.3
The pith
Smoosh supplies an executable small-step semantics for the POSIX shell that passes more conformance tests than seven other shells.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We define Smoosh as a formal, mechanized, executable small-step semantics for the POSIX shell. Direct comparison on three test suites shows it to be the most conformant of the eight shells examined, with Modernish identifying only a single bug inherited from its parser and no quirks.
What carries the argument
Smoosh, an executable small-step semantics for the POSIX shell implemented in mechanized form.
If this is right
- Smoosh can serve as a reference implementation for checking other shells or scripts.
- The semantics supports development of new shell tooling and formal analyses.
- A symbolic stepper built on the semantics can reveal subtle execution behavior in existing scripts.
- The model provides a stable foundation for designing or verifying future shell variants.
Where Pith is reading between the lines
- Developers writing portable shell scripts could use Smoosh as an oracle to catch non-standard assumptions earlier.
- The semantics could be extended to model common extensions or environment-specific behaviors seen in real deployments.
- Formal verification of system-administration tasks that rely on shell commands becomes feasible once the model is trusted.
Load-bearing premise
The three test suites together adequately represent the full POSIX shell specification rather than only a tested subset.
What would settle it
Execution of a new test suite that covers previously untested POSIX rules and produces more failures or quirks for Smoosh than for one of the other shells.
Figures
read the original abstract
The POSIX shell is a widely deployed, powerful tool for managing computer systems. The shell is the expert's control panel, a necessary tool for configuring, compiling, installing, maintaining, and deploying systems. Even though it is powerful, critical infrastructure, the POSIX shell is maligned and misunderstood. Its power and its subtlety are a dangerous combination. We define a formal, mechanized, executable small-step semantics for the POSIX shell, which we call Smoosh. We compared Smoosh against seven other shells that aim for some measure of POSIX compliance (bash, dash, zsh, OSH, mksh, ksh93, and yash). Using three test suites---the POSIX test suite, the Modernish test suite and shell diagnosis, and a test suite of our own device---we found Smoosh's semantics to be the most conformant to the POSIX standard. Modernish judges Smoosh to have the fewest bugs (just one, from using dash's parser) and no quirks. To show that our semantics is useful beyond yielding a conformant, executable shell, we also implemented a symbolic stepper to illuminate the subtle behavior of the shell. Smoosh will serve as a foundation for formal study of the POSIX shell, supporting research on and development of new shells, new tooling for shells, and new shell designs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper defines Smoosh, a formal mechanized executable small-step semantics for the POSIX shell. It evaluates Smoosh against seven other shells (bash, dash, zsh, OSH, mksh, ksh93, yash) on the POSIX test suite, the Modernish test suite, and an author-provided suite, concluding that Smoosh is the most conformant (fewest bugs per Modernish, just one, and no quirks). The work also includes a symbolic stepper to explore shell behavior.
Significance. An executable, mechanized semantics for the POSIX shell would be a useful foundation for formal study, tooling, and new shell designs, given the shell's role in critical infrastructure. The comparison across multiple suites and the symbolic stepper are concrete strengths if the conformance results are shown to be robust.
major comments (1)
- [Abstract / Evaluation] The headline claim (abstract) that Smoosh's semantics is the most conformant to the POSIX standard rests on results from three test suites. No coverage argument, mapping of tests to IEEE 1003.1 sections, or enumeration of covered grammar productions, redirection rules, parameter expansions, or special builtins is referenced, so it is unclear whether the suites establish conformance to the full specification or only to the tested subset.
minor comments (1)
- [Abstract] The abstract refers to 'a test suite of our own device' without indicating its size, structure, or relation to the other suites.
Simulated Author's Rebuttal
We thank the referee for the thoughtful review and the opportunity to clarify the scope of our evaluation. We address the major comment below and will make targeted revisions to improve the presentation of our conformance claims.
read point-by-point responses
-
Referee: [Abstract / Evaluation] The headline claim (abstract) that Smoosh's semantics is the most conformant to the POSIX standard rests on results from three test suites. No coverage argument, mapping of tests to IEEE 1003.1 sections, or enumeration of covered grammar productions, redirection rules, parameter expansions, or special builtins is referenced, so it is unclear whether the suites establish conformance to the full specification or only to the tested subset.
Authors: We agree that the abstract's phrasing could be read as claiming absolute conformance to the entire IEEE 1003.1 specification, which the test suites do not establish. Our claim is strictly relative: among the seven shells evaluated on the same three suites (the official POSIX conformance tests, the Modernish suite used by shell implementers, and our own diagnostic suite), Smoosh exhibits the fewest deviations. These suites are the accepted practical measures of POSIX shell behavior in the community. A exhaustive mapping of every test to grammar productions, redirections, expansions, and builtins would be valuable but is outside the paper's scope and would require a separate, lengthy appendix. We will revise the abstract and add a short subsection in the evaluation (Section 5) that enumerates the major language features exercised by the suites (e.g., simple and compound commands, all redirection operators, parameter and command substitutions, special builtins) and notes the absence of coverage for certain edge cases such as job control and certain locale-specific behaviors. This will make the limits of the claim explicit while preserving the relative comparison. revision: partial
Circularity Check
No significant circularity; conformance established via external test suites
full rationale
The paper defines an executable small-step semantics for the POSIX shell and evaluates conformance by running three external test suites (POSIX, Modernish, and authors' own) against Smoosh and other shells. The central claim—that Smoosh is most conformant—rests on comparative test outcomes rather than any self-referential definition, fitted parameter renamed as prediction, or load-bearing self-citation. No derivation chain reduces a result to its own inputs by construction; the semantics is mechanized independently and then validated against independent benchmarks. This is the normal non-circular case for an executable semantics paper.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The POSIX shell specification can be captured by a small-step executable semantics.
Reference graph
Works this paper leans on
-
[1]
Sandrine Blazy and Xavier Leroy
POSIX.1 2017: The Open Group Base Specifications Issue 7 (IEEE Std 1003.1-2008). Sandrine Blazy and Xavier Leroy
work page 2017
-
[2]
https://doi.org/10.1007/s10817-009-9148-3 Andy Chu
Mechanized Semantics for the Clight Subset of the C Language.Journal of Automated Reasoning 43, 3 (01 Oct 2009), 263–288. https://doi.org/10.1007/s10817-009-9148-3 Andy Chu
-
[3]
Organised sound 8, 3 (2003), 321–330
Live coding in laptop performance. Organised sound 8, 3 (2003), 321–330. Loris D’Antoni, Rishabh Singh, and Michael Vaughn
work page 2003
-
[4]
NoFAQ: Synthesizing Command Repairs from Examples
NoFAQ: Synthesizing Command Repairs from Examples.CoRR abs/1608.08219 (2016). http://arxiv.org/abs/1608.08219 Martijn Dekker
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[5]
An Executable Formal Semantics of C with Applications. In Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’12). ACM, New York, NY, USA, 533–544. https://doi.org/10.1145/2103656.2103719 Kathleen Fisher, Nate Foster, David Walker, and Kenny Q. Zhu
-
[6]
In Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming (ICFP ’11)
Forest: A Language and Toolkit for Programming with Filestores. In Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming (ICFP ’11) . ACM, New York, NY, USA, 292–306. https://doi.org/10.1145/2034773.2034814 Simson Garfinkel, Daniel Weise (Ed.), and Steven Strassman (Ed.)
-
[7]
https://github.com/Gabriel439/Haskell-Turtle-Library Michael Greenberg
Turtle: shell programming, Haskell style. https://github.com/Gabriel439/Haskell-Turtle-Library Michael Greenberg. 2018a. The POSIX shell is an interactive DSL for concurrency. DSLDI. Michael Greenberg. 2018b. Word expansion supports POSIX shell interactivity. In Programming Companion (presented at Programming eXperience (PX)). ACM. https://doi.org/10.1145...
-
[8]
In ECOOP 2010 – Object-Oriented Programming, Theo D’Hondt (Ed.)
The Essence of JavaScript. In ECOOP 2010 – Object-Oriented Programming, Theo D’Hondt (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 126–150. William Gallard Hatch and Matthew Flatt
work page 2010
-
[9]
Rash: from reckless interactions to reliable programs. In Proceedings of the 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, GPCE 2018, Boston, MA, USA, November 5-6, 2018 , Eric Van Wyk and Tiark Rompf (Eds.). ACM, 28–39. https://doi.org/10.1145/3278122.3278129 Alec Heller and Jesse A. Tov
-
[10]
In JFLA 2017 - Vingt-huitième Journées Francophones des Langages Applicatifs
Le coquillage dans le CoLiS-mateur. In JFLA 2017 - Vingt-huitième Journées Francophones des Langages Applicatifs. Gourette, France. https://hal.archives-ouvertes.fr/hal-01432034 Nicolas Jeannerod, Claude Marché, and Ralf Treinen. 2017a. A Formally Verified Interpreter for a Shell-like Programming Language. In VSTTE 2017 - 9th Working Conference on Verifie...
work page 2017
-
[11]
RustBelt: Securing the Foundations of the Rust Programming Language. Proc. ACM Program. Lang. 2, POPL, Article 66 (Dec. 2017), 34 pages. https://doi.org/10. 1145/3158154 Idan Kamara
work page 2017
-
[12]
A Formal C Memory Model Supporting Integer-pointer Casts. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA, 326–335. https://doi.org/10.1145/2737924. 2738005 koalaman
-
[13]
In Proceedings of the 6th Asian Symposium on Programming Languages and Systems (APLAS ’08)
An Operational Semantics for JavaScript. In Proceedings of the 6th Asian Symposium on Programming Languages and Systems (APLAS ’08) . Springer-Verlag, Berlin, Heidelberg, 307–325. https://doi.org/10.1007/978-3-540-89330-1_22 Karl Mazurak and Steve Zdancewic
- [14]
-
[15]
Into the Depths of C: Elaborating the De Facto Standards. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16) . ACM, New York, NY, USA, 1–15. https: //doi.org/10.1145/2908080.2908081 Scott Moore, Christos Dimoulas, Dan King, and Stephen Chong
-
[16]
In Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP ’14)
Lem: Reusable Engineering of Real-world Semantics. In Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP ’14). ACM, New York, NY, USA, 175–188. https://doi.org/10.1145/2628136.2628143 Marius Nita, Dan Grossman, and Craig Chambers
-
[17]
A theory of platform-dependent low-level software. InProceedings of the 35th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2008, San Francisco, California, USA, January 7-12, 2008 , George C. Necula and Philip Wadler (Eds.). ACM, 209–220. https://doi.org/10.1145/1328438. 1328465 Gian Ntzik
-
[18]
In 32nd European Conference on Object-Oriented Programming (ECOOP 2018)
A Concurrent Specification of POSIX File Systems. In 32nd European Conference on Object-Oriented Programming (ECOOP 2018). https://doi.org/10.4230/LIPIcs. ECOOP.2018.4 Yann Régis-Gianas, Nicolas Jeannerod, and Ralf Treinen
-
[19]
In SLE 2018 - ACM SIGPLAN International Conference on Software Language Engineering
Morbig: A Static Parser for POSIX Shell. In SLE 2018 - ACM SIGPLAN International Conference on Software Language Engineering . Boston, United States. https://doi.org/10.1145/ 3276604.3276615 Gregor Richards, Christian Hammer, Brian Burg, and Jan Vitek
-
[20]
In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP ’15)
SibylFS: Formal Specification and Oracle-based Testing for POSIX and Real-world File Systems. In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP ’15). ACM, New York, NY, USA, 38–53. https://doi.org/10.1145/2815400.2815411 Grigore Roşu and Traian Florin Şerbănuţă
-
[21]
https://doi.org/10.1016/j.jlap.2010.03.012 Henry Schreiner, Tomer Filiba, et al
An Overview of the K Semantic Framework.Journal of Logic and Algebraic Programming 79, 6 (2010), 397–434. https://doi.org/10.1016/j.jlap.2010.03.012 Henry Schreiner, Tomer Filiba, et al
-
[22]
Rust Distilled: An Expressive Tower of Languages
Rust Distilled: An Expressive Tower of Languages. CoRR abs/1806.02693 (2018). arXiv:1806.02693 http://arxiv.org/abs/1806.02693 Aaron Weiss, Daniel Patterson, Nicholas D. Matsakis, and Amal Ahmed
work page internal anchor Pith review Pith/arXiv arXiv 2018
-
[23]
Oxide: The Essence of Rust. CoRR abs/1903.00982 (2019). arXiv:1903.00982 http://arxiv.org/abs/1903.00982
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.