pith. sign in

arxiv: 1907.05308 · v1 · pith:A5PKPB5Tnew · submitted 2019-07-11 · 💻 cs.PL · cs.OS

Executable formal semantics for the POSIX shell

Pith reviewed 2026-05-24 22:45 UTC · model grok-4.3

classification 💻 cs.PL cs.OS
keywords POSIX shellformal semanticsexecutable semanticssmall-step semanticsshell conformancemechanized semanticssymbolic execution
0
0 comments X

The pith

Smoosh supplies an executable small-step semantics for the POSIX shell that passes more conformance tests than seven other shells.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors built Smoosh, a mechanized executable semantics for the POSIX shell, then ran it against the POSIX test suite, the Modernish suite, and their own tests. Smoosh matched the standard more closely than bash, dash, zsh, OSH, mksh, ksh93, and yash, with Modernish reporting only one bug and no quirks. The shell is critical infrastructure for system configuration and deployment, yet its subtleties often produce unexpected behavior. An executable model lets researchers and tool builders work directly with the standard's rules instead of relying on any particular implementation. The work also includes a symbolic stepper that exposes otherwise opaque execution paths.

Core claim

We define Smoosh as a formal, mechanized, executable small-step semantics for the POSIX shell. Direct comparison on three test suites shows it to be the most conformant of the eight shells examined, with Modernish identifying only a single bug inherited from its parser and no quirks.

What carries the argument

Smoosh, an executable small-step semantics for the POSIX shell implemented in mechanized form.

If this is right

  • Smoosh can serve as a reference implementation for checking other shells or scripts.
  • The semantics supports development of new shell tooling and formal analyses.
  • A symbolic stepper built on the semantics can reveal subtle execution behavior in existing scripts.
  • The model provides a stable foundation for designing or verifying future shell variants.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Developers writing portable shell scripts could use Smoosh as an oracle to catch non-standard assumptions earlier.
  • The semantics could be extended to model common extensions or environment-specific behaviors seen in real deployments.
  • Formal verification of system-administration tasks that rely on shell commands becomes feasible once the model is trusted.

Load-bearing premise

The three test suites together adequately represent the full POSIX shell specification rather than only a tested subset.

What would settle it

Execution of a new test suite that covers previously untested POSIX rules and produces more failures or quirks for Smoosh than for one of the other shells.

Figures

Figures reproduced from arXiv: 1907.05308 by Austin J. Blatt, Michael Greenberg.

Figure 1
Figure 1. Figure 1: Expansion examples specification [Austin Group 2018] §2.6. The examples in Figures 1b and 1d take place in a directory with three files (a, b, and c); the example in Figure 1e takes place in a different directory, where the result of the first echo indicates which files begin with the letter a. 2.2 Smoosh: a foundational, formal interpretation of the POSIX specification Our work aims to offer (a) a formal … view at source ↗
Figure 2
Figure 2. Figure 2: The shell’s source syntax syntax (e.g., prefix# ). We write x ? for optional nonterminals. We use Kleene star x ∗ to represent lists and x + to represent non-empty lists. We use n to refer to natural numbers and b to refer to booleans, tagging such variables with a descriptive name. For example, n$? is a number representing an exit status and b$() is a boolean that indicates whether a command substitution … view at source ↗
Figure 3
Figure 3. Figure 3: The shell’s state 4.1 State The POSIX shell has to track a significant amount of state ( [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Smoosh’s runtime extensions for use in the small-step semantics [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Small-step semantics for word expansion [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Small-step semantics for initial word expansion (expand) and command substitution [PITH_FULL_IMAGE:figures/full_fig_p010_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Small-step semantics for evaluation of negation [PITH_FULL_IMAGE:figures/full_fig_p012_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Small-step semantics for simple commands (part 1: expansion) [PITH_FULL_IMAGE:figures/full_fig_p013_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Helpers for simple command execution and applied (CmdRedirDone), saving file descriptor information sfds for later unwinding. Then we expand the words in the assignments (CmdAssign, CmdAssignErr) and store the results in a fresh local environment (CmdAssignSet). At thus point, we are already in dangerous territory: the POSIX specification allows assignments to be expanded before redirections when there are… view at source ↗
Figure 10
Figure 10. Figure 10: Small-step semantics for simple commands (part 2: evaluation) [PITH_FULL_IMAGE:figures/full_fig_p015_10.png] view at source ↗
read the original abstract

The POSIX shell is a widely deployed, powerful tool for managing computer systems. The shell is the expert's control panel, a necessary tool for configuring, compiling, installing, maintaining, and deploying systems. Even though it is powerful, critical infrastructure, the POSIX shell is maligned and misunderstood. Its power and its subtlety are a dangerous combination. We define a formal, mechanized, executable small-step semantics for the POSIX shell, which we call Smoosh. We compared Smoosh against seven other shells that aim for some measure of POSIX compliance (bash, dash, zsh, OSH, mksh, ksh93, and yash). Using three test suites---the POSIX test suite, the Modernish test suite and shell diagnosis, and a test suite of our own device---we found Smoosh's semantics to be the most conformant to the POSIX standard. Modernish judges Smoosh to have the fewest bugs (just one, from using dash's parser) and no quirks. To show that our semantics is useful beyond yielding a conformant, executable shell, we also implemented a symbolic stepper to illuminate the subtle behavior of the shell. Smoosh will serve as a foundation for formal study of the POSIX shell, supporting research on and development of new shells, new tooling for shells, and new shell designs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper defines Smoosh, a formal mechanized executable small-step semantics for the POSIX shell. It evaluates Smoosh against seven other shells (bash, dash, zsh, OSH, mksh, ksh93, yash) on the POSIX test suite, the Modernish test suite, and an author-provided suite, concluding that Smoosh is the most conformant (fewest bugs per Modernish, just one, and no quirks). The work also includes a symbolic stepper to explore shell behavior.

Significance. An executable, mechanized semantics for the POSIX shell would be a useful foundation for formal study, tooling, and new shell designs, given the shell's role in critical infrastructure. The comparison across multiple suites and the symbolic stepper are concrete strengths if the conformance results are shown to be robust.

major comments (1)
  1. [Abstract / Evaluation] The headline claim (abstract) that Smoosh's semantics is the most conformant to the POSIX standard rests on results from three test suites. No coverage argument, mapping of tests to IEEE 1003.1 sections, or enumeration of covered grammar productions, redirection rules, parameter expansions, or special builtins is referenced, so it is unclear whether the suites establish conformance to the full specification or only to the tested subset.
minor comments (1)
  1. [Abstract] The abstract refers to 'a test suite of our own device' without indicating its size, structure, or relation to the other suites.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the thoughtful review and the opportunity to clarify the scope of our evaluation. We address the major comment below and will make targeted revisions to improve the presentation of our conformance claims.

read point-by-point responses
  1. Referee: [Abstract / Evaluation] The headline claim (abstract) that Smoosh's semantics is the most conformant to the POSIX standard rests on results from three test suites. No coverage argument, mapping of tests to IEEE 1003.1 sections, or enumeration of covered grammar productions, redirection rules, parameter expansions, or special builtins is referenced, so it is unclear whether the suites establish conformance to the full specification or only to the tested subset.

    Authors: We agree that the abstract's phrasing could be read as claiming absolute conformance to the entire IEEE 1003.1 specification, which the test suites do not establish. Our claim is strictly relative: among the seven shells evaluated on the same three suites (the official POSIX conformance tests, the Modernish suite used by shell implementers, and our own diagnostic suite), Smoosh exhibits the fewest deviations. These suites are the accepted practical measures of POSIX shell behavior in the community. A exhaustive mapping of every test to grammar productions, redirections, expansions, and builtins would be valuable but is outside the paper's scope and would require a separate, lengthy appendix. We will revise the abstract and add a short subsection in the evaluation (Section 5) that enumerates the major language features exercised by the suites (e.g., simple and compound commands, all redirection operators, parameter and command substitutions, special builtins) and notes the absence of coverage for certain edge cases such as job control and certain locale-specific behaviors. This will make the limits of the claim explicit while preserving the relative comparison. revision: partial

Circularity Check

0 steps flagged

No significant circularity; conformance established via external test suites

full rationale

The paper defines an executable small-step semantics for the POSIX shell and evaluates conformance by running three external test suites (POSIX, Modernish, and authors' own) against Smoosh and other shells. The central claim—that Smoosh is most conformant—rests on comparative test outcomes rather than any self-referential definition, fitted parameter renamed as prediction, or load-bearing self-citation. No derivation chain reduces a result to its own inputs by construction; the semantics is mechanized independently and then validated against independent benchmarks. This is the normal non-circular case for an executable semantics paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only; main domain assumption is that POSIX behavior admits a small-step executable formalization. No free parameters or invented entities are mentioned.

axioms (1)
  • domain assumption The POSIX shell specification can be captured by a small-step executable semantics.
    Invoked to justify defining Smoosh as the semantics.

pith-pipeline@v0.9.0 · 5765 in / 1017 out tokens · 23202 ms · 2026-05-24T22:45:31.638469+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages · 2 internal anchors

  1. [1]

    Sandrine Blazy and Xavier Leroy

    POSIX.1 2017: The Open Group Base Specifications Issue 7 (IEEE Std 1003.1-2008). Sandrine Blazy and Xavier Leroy

  2. [2]

    https://doi.org/10.1007/s10817-009-9148-3 Andy Chu

    Mechanized Semantics for the Clight Subset of the C Language.Journal of Automated Reasoning 43, 3 (01 Oct 2009), 263–288. https://doi.org/10.1007/s10817-009-9148-3 Andy Chu

  3. [3]

    Organised sound 8, 3 (2003), 321–330

    Live coding in laptop performance. Organised sound 8, 3 (2003), 321–330. Loris D’Antoni, Rishabh Singh, and Michael Vaughn

  4. [4]

    NoFAQ: Synthesizing Command Repairs from Examples

    NoFAQ: Synthesizing Command Repairs from Examples.CoRR abs/1608.08219 (2016). http://arxiv.org/abs/1608.08219 Martijn Dekker

  5. [5]

    In Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’12)

    An Executable Formal Semantics of C with Applications. In Proceedings of the 39th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL ’12). ACM, New York, NY, USA, 533–544. https://doi.org/10.1145/2103656.2103719 Kathleen Fisher, Nate Foster, David Walker, and Kenny Q. Zhu

  6. [6]

    In Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming (ICFP ’11)

    Forest: A Language and Toolkit for Programming with Filestores. In Proceedings of the 16th ACM SIGPLAN International Conference on Functional Programming (ICFP ’11) . ACM, New York, NY, USA, 292–306. https://doi.org/10.1145/2034773.2034814 Simson Garfinkel, Daniel Weise (Ed.), and Steven Strassman (Ed.)

  7. [7]

    https://github.com/Gabriel439/Haskell-Turtle-Library Michael Greenberg

    Turtle: shell programming, Haskell style. https://github.com/Gabriel439/Haskell-Turtle-Library Michael Greenberg. 2018a. The POSIX shell is an interactive DSL for concurrency. DSLDI. Michael Greenberg. 2018b. Word expansion supports POSIX shell interactivity. In Programming Companion (presented at Programming eXperience (PX)). ACM. https://doi.org/10.1145...

  8. [8]

    In ECOOP 2010 – Object-Oriented Programming, Theo D’Hondt (Ed.)

    The Essence of JavaScript. In ECOOP 2010 – Object-Oriented Programming, Theo D’Hondt (Ed.). Springer Berlin Heidelberg, Berlin, Heidelberg, 126–150. William Gallard Hatch and Matthew Flatt

  9. [9]

    Rash: from reckless interactions to reliable programs. In Proceedings of the 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, GPCE 2018, Boston, MA, USA, November 5-6, 2018 , Eric Van Wyk and Tiark Rompf (Eds.). ACM, 28–39. https://doi.org/10.1145/3278122.3278129 Alec Heller and Jesse A. Tov

  10. [10]

    In JFLA 2017 - Vingt-huitième Journées Francophones des Langages Applicatifs

    Le coquillage dans le CoLiS-mateur. In JFLA 2017 - Vingt-huitième Journées Francophones des Langages Applicatifs. Gourette, France. https://hal.archives-ouvertes.fr/hal-01432034 Nicolas Jeannerod, Claude Marché, and Ralf Treinen. 2017a. A Formally Verified Interpreter for a Shell-like Programming Language. In VSTTE 2017 - 9th Working Conference on Verifie...

  11. [11]

    RustBelt: Securing the Foundations of the Rust Programming Language. Proc. ACM Program. Lang. 2, POPL, Article 66 (Dec. 2017), 34 pages. https://doi.org/10. 1145/3158154 Idan Kamara

  12. [12]

    In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15)

    A Formal C Memory Model Supporting Integer-pointer Casts. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). ACM, New York, NY, USA, 326–335. https://doi.org/10.1145/2737924. 2738005 koalaman

  13. [13]

    In Proceedings of the 6th Asian Symposium on Programming Languages and Systems (APLAS ’08)

    An Operational Semantics for JavaScript. In Proceedings of the 6th Asian Symposium on Programming Languages and Systems (APLAS ’08) . Springer-Verlag, Berlin, Heidelberg, 307–325. https://doi.org/10.1007/978-3-540-89330-1_22 Karl Mazurak and Steve Zdancewic

  14. [14]

    ABASH: Finding Bugs in Bash Scripts. In PLAS. 105–114. https://doi.org/10.1145/ 1255329.1255347 Kayvan Memarian, Justus Matthiesen, James Lingard, Kyndylan Nienhuis, David Chisnall, Robert N. M. Watson, and Peter Sewell

  15. [15]

    In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16)

    Into the Depths of C: Elaborating the De Facto Standards. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16) . ACM, New York, NY, USA, 1–15. https: //doi.org/10.1145/2908080.2908081 Scott Moore, Christos Dimoulas, Dan King, and Stephen Chong

  16. [16]

    In Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP ’14)

    Lem: Reusable Engineering of Real-world Semantics. In Proceedings of the 19th ACM SIGPLAN International Conference on Functional Programming (ICFP ’14). ACM, New York, NY, USA, 175–188. https://doi.org/10.1145/2628136.2628143 Marius Nita, Dan Grossman, and Craig Chambers

  17. [17]

    InProceedings of the 35th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2008, San Francisco, California, USA, January 7-12, 2008 , George C

    A theory of platform-dependent low-level software. InProceedings of the 35th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2008, San Francisco, California, USA, January 7-12, 2008 , George C. Necula and Philip Wadler (Eds.). ACM, 209–220. https://doi.org/10.1145/1328438. 1328465 Gian Ntzik

  18. [18]

    In 32nd European Conference on Object-Oriented Programming (ECOOP 2018)

    A Concurrent Specification of POSIX File Systems. In 32nd European Conference on Object-Oriented Programming (ECOOP 2018). https://doi.org/10.4230/LIPIcs. ECOOP.2018.4 Yann Régis-Gianas, Nicolas Jeannerod, and Ralf Treinen

  19. [19]

    In SLE 2018 - ACM SIGPLAN International Conference on Software Language Engineering

    Morbig: A Static Parser for POSIX Shell. In SLE 2018 - ACM SIGPLAN International Conference on Software Language Engineering . Boston, United States. https://doi.org/10.1145/ 3276604.3276615 Gregor Richards, Christian Hammer, Brian Burg, and Jan Vitek

  20. [20]

    In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP ’15)

    SibylFS: Formal Specification and Oracle-based Testing for POSIX and Real-world File Systems. In Proceedings of the 25th Symposium on Operating Systems Principles (SOSP ’15). ACM, New York, NY, USA, 38–53. https://doi.org/10.1145/2815400.2815411 Grigore Roşu and Traian Florin Şerbănuţă

  21. [21]

    https://doi.org/10.1016/j.jlap.2010.03.012 Henry Schreiner, Tomer Filiba, et al

    An Overview of the K Semantic Framework.Journal of Logic and Algebraic Programming 79, 6 (2010), 397–434. https://doi.org/10.1016/j.jlap.2010.03.012 Henry Schreiner, Tomer Filiba, et al

  22. [22]

    Rust Distilled: An Expressive Tower of Languages

    Rust Distilled: An Expressive Tower of Languages. CoRR abs/1806.02693 (2018). arXiv:1806.02693 http://arxiv.org/abs/1806.02693 Aaron Weiss, Daniel Patterson, Nicholas D. Matsakis, and Amal Ahmed

  23. [23]

    CoRR abs/1903.00982 (2019)

    Oxide: The Essence of Rust. CoRR abs/1903.00982 (2019). arXiv:1903.00982 http://arxiv.org/abs/1903.00982