OxyMake: A Formally-Specified, Content-Addressable Workflow Engine
Pith reviewed 2026-06-26 15:49 UTC · model grok-4.3
The pith
OxyMake decides workflow re-runs with a BLAKE3 content hash of rule source, inputs, parameters, environment and platform instead of file timestamps.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
OxyMake replaces the mtime proxy with a content-addressed cache key: a BLAKE3 hash of rule source, input content, parameters, environment, and platform. Because the key is a pure function of these declared inputs, the caching decision survives mtime churn and travels across same-platform machines and shared caches. Phantom re-runs vanish for declared inputs. The spec stays declarative and statically parseable, keeping the Make rule model so Snakemake pipelines port directly. DAG resolution is an order of magnitude faster than Snakemake's on large graphs, but a cold end-to-end run is slower, the price of content-addressed bookkeeping repaid several-fold on the warm re-run. Execution is daemon
What carries the argument
The content-addressed cache key formed by a BLAKE3 hash over rule source, input content, parameters, environment and platform; this pure function replaces mtime as the sole determinant of whether a job must re-execute.
If this is right
- Caching decisions remain correct after git checkouts or tree copies that alter mtimes but leave content unchanged.
- The same cache key can be shared across different machines running the identical platform.
- Snakemake pipelines can be used directly because the rule model stays declarative and statically parseable.
- Concurrent sessions execute safely without a daemon by using the claim/reclaim protocol.
- DAG resolution completes an order of magnitude faster than Snakemake on large graphs while the bookkeeping cost is recovered on repeated warm runs.
Where Pith is reading between the lines
- Teams could maintain a shared content-addressed cache for identical workflows because the key depends only on declared inputs rather than local timestamps.
- Any input that is not declared in a rule remains invisible to the cache key, so changes to it will not trigger re-execution.
- The TLA+ model could be extended to more than three sessions once the implementation confirms that atomic commits hold under higher concurrency.
Load-bearing premise
The TLA+ model-checking of cross-session safety assumes atomic state commits in the claim/reclaim protocol.
What would settle it
Run two concurrent sessions on the same workflow and observe whether both execute the same job when the protocol should prevent duplication, or alter mtimes on inputs without changing their content and check whether the workflow still skips re-execution.
Figures
read the original abstract
Make-lineage workflow runners decide whether a job must re-run from file-modification time (mtime, a timestamp) -- a broken proxy for the question that matters: did the content change? A git checkout, a tree copy, or a backup restore rewrites mtimes without touching content, forcing spurious re-execution; and in the reverse case -- when an output looks newer than its inputs but its content is stale -- the stale output is silently reused. (Snakemake 7's per-output provenance survives this churn, as local bookkeeping; GNU Make and pure-mtime fast paths are where it bites.) OxyMake, a single-binary Rust workflow engine, replaces the proxy with a content-addressed cache key: a BLAKE3 hash of rule source, input content, parameters, environment, and platform. Because the key is a pure function of these declared inputs, the caching decision survives mtime churn and travels across same-platform machines and shared caches. Phantom re-runs vanish for declared inputs (no sandbox: an undeclared input is invisible to the key). The spec stays declarative and statically parseable, keeping the Make rule model so Snakemake pipelines port directly. DAG resolution is an order of magnitude faster than Snakemake's on large graphs, but a cold end-to-end run is slower -- the price of content-addressed bookkeeping -- repaid several-fold on the warm re-run that caching exists to serve (exact figures, hardware, and a bundled reproducer are in the evaluation). Execution is daemon-free via a cooperative claim/reclaim protocol (sessions claim jobs, reclaiming stalled ones); today two sessions duplicate work safely rather than coordinate, and wiring the protocol as a hard execution gate is staged, not yet done. Cross-session safety is specified in TLA+ and model-checked over all interleavings for 2-3 sessions, assuming atomic state commits. A plan-of-record lockfile and NDJSON event stream record exactly what ran.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. OxyMake is a single-binary Rust workflow engine that replaces mtime-based re-run decisions with a content-addressed BLAKE3 hash key computed from rule source, input content, parameters, environment, and platform. The declarative spec remains Make-compatible for direct Snakemake porting. DAG resolution is claimed to be an order of magnitude faster than Snakemake on large graphs, with cold runs slower but warm re-runs faster due to portable caching. A cooperative claim/reclaim protocol enables cross-session safety (two sessions may duplicate work safely today), specified and model-checked in TLA+ over all interleavings for 2-3 sessions under the assumption of atomic state commits; execution is recorded in a plan-of-record lockfile and NDJSON stream.
Significance. If the claims hold, the work directly addresses a well-known limitation of mtime proxies in workflow systems (spurious re-runs after git checkouts or restores, and silent reuse of stale outputs), while preserving the familiar declarative rule model. The machine-checked TLA+ verification of the claim/reclaim protocol is a concrete strength that provides rigorous evidence for the safety invariants under the stated atomicity assumption. Portable content-addressed caching across machines and shared caches could improve reproducibility and efficiency in software engineering and scientific pipelines.
major comments (2)
- [Abstract] Abstract (TLA+ safety paragraph): The cross-session safety claim rests on TLA+ model-checking of the claim/reclaim protocol over all interleavings for 2-3 sessions, but only under the explicit assumption of atomic state commits. The manuscript provides no argument or evidence that the Rust implementation's shared-state updates (via filesystem or lockfile) are atomic; non-atomic operations would mean the verified invariants do not transfer, allowing the possibility that two sessions both execute the same job.
- [Evaluation] Evaluation (performance claims): The manuscript asserts that 'DAG resolution is an order of magnitude faster than Snakemake's on large graphs' and that the cold-run overhead 'is repaid several-fold on the warm re-run', yet supplies no quantitative data, tables, figures, hardware specifications, or error analysis despite referencing an evaluation section and a bundled reproducer.
minor comments (1)
- The abstract states that 'exact figures, hardware, and a bundled reproducer are in the evaluation' but these elements are absent from the manuscript text, which weakens the ability to assess the empirical claims.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We address each major comment below.
read point-by-point responses
-
Referee: [Abstract] Abstract (TLA+ safety paragraph): The cross-session safety claim rests on TLA+ model-checking of the claim/reclaim protocol over all interleavings for 2-3 sessions, but only under the explicit assumption of atomic state commits. The manuscript provides no argument or evidence that the Rust implementation's shared-state updates (via filesystem or lockfile) are atomic; non-atomic operations would mean the verified invariants do not transfer, allowing the possibility that two sessions both execute the same job.
Authors: We agree that the TLA+ invariants transfer to the implementation only under the stated atomicity assumption, which the manuscript already notes but does not further justify at the code level. In revision we will add a dedicated implementation subsection that describes the filesystem primitives used (atomic rename for the plan-of-record lockfile, fsync on the NDJSON stream, and single-writer semantics) and explicitly qualifies the safety claim to rest on those primitives approximating atomic commits. The abstract will be updated to foreground the assumption. revision: yes
-
Referee: [Evaluation] Evaluation (performance claims): The manuscript asserts that 'DAG resolution is an order of magnitude faster than Snakemake's on large graphs' and that the cold-run overhead 'is repaid several-fold on the warm re-run', yet supplies no quantitative data, tables, figures, hardware specifications, or error analysis despite referencing an evaluation section and a bundled reproducer.
Authors: The referee correctly observes that the submitted version references quantitative claims and an evaluation section with reproducer yet omits the supporting data, tables, figures, hardware details, and error analysis. This omission will be corrected by inserting the benchmark results (DAG resolution timings on large graphs, cold vs. warm run comparisons, hardware specification, and statistical summary) directly into the evaluation section of the revised manuscript. revision: yes
Circularity Check
No circularity: implementation and TLA+ model are self-contained
full rationale
The paper describes an implementation of a content-addressed workflow engine and its TLA+ specification for cross-session safety. No mathematical derivations, equations, or predictions are present that reduce to fitted parameters or self-referential definitions. The TLA+ verification is explicitly conditioned on atomic state commits, with no load-bearing self-citations or ansatzes smuggled in. The central claims rest on the described protocol and model rather than any circular reduction to inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Atomic state commits in the claim/reclaim protocol
Reference graph
Works this paper leans on
-
[1]
A survey on schedul- ing strategies for workflows in cloud environment and emerging trends.ACM Computing Surveys, 52(4):1–36, 2019
Mainak Adhikari, Tarachand Amgoth, and Satish Narayana Srirama. A survey on schedul- ing strategies for workflows in cloud environment and emerging trends.ACM Computing Surveys, 52(4):1–36, 2019. 34
2019
-
[2]
Grüning, et al
Enis Afgan, Dannon Baker, Bérénice Batut, Marius van den Beek, Dave Bouvier, Martin Čech, John Chilton, Dave Clements, Nate Coraor, Björn A. Grüning, et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update. Nucleic Acids Research, 46(W1):W537–W544, 2018
2018
-
[3]
Argo Workflows: The workflow engine for Kubernetes
Argoproj contributors. Argo Workflows: The workflow engine for Kubernetes. https: //github.com/argoproj/argo-workflows, 2018. CNCF graduated project. Accessed: 2026-04-01
2018
-
[4]
Starlark language specification
Bazel contributors. Starlark language specification. https://github.com/ bazelbuild/starlark/blob/master/spec.md, 2018. Configuration lan- guage for Bazel, formerly Skylark. Accessed: 2026-04-01
2018
-
[5]
Apache Airflow: A platform to programmatically author, schedule, and monitor workflows.https://airflow
Maxime Beauchemin and Apache Software Foundation. Apache Airflow: A platform to programmatically author, schedule, and monitor workflows.https://airflow. apache.org, 2015. Originally developed at Airbnb, Apache top-level project 2019. Accessed: 2026-04-01
2015
-
[6]
Chue Hong, Daniel S
Neil P. Chue Hong, Daniel S. Katz, Michelle Barker, Anna-Lena Lamprecht, Carlos Mar- tinez, Fotis E. Psomopoulos, Jen Harrow, Leyla Jael Castro, Morane Gruenpeter, Paula An- drea Martinez, and Tom Honeyman. FAIR Principles for Research Software (FAIR4RS Principles). Recommendation, Research Data Alliance, 2022. Extends the FAIR Data Principles (Wilkinson ...
2022
-
[7]
Reproducible and user-controlled software en- vironments in HPC with Guix
Ludovic Courtès and Ricardo Wurmus. Reproducible and user-controlled software en- vironments in HPC with Guix. InEuro-Par 2015: Parallel Processing Workshops, pages 579–591. Springer, 2015. GNU Guix as a from-source, content-addressed package manager for scientific computing. Establishes Guix’s bit-reproducibility guarantee from input hash to binary output
2015
-
[8]
Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Bogdan Gavrilović, Carole Goble, and The CWL Community
Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reyes, Bogdan Gavrilović, Carole Goble, and The CWL Community. Methods included: Standardizing computational reuse and portability with the Common Workflow Language.Communications of the ACM, 65(6):54–63, 2022
2022
-
[9]
Maechling, Rajiv Mayani, Weiwei Chen, Rafael Ferreira da Silva, Miron Livny, and Kent Wenger
Ewa Deelman, Karan Vahi, Gideon Juve, Mats Rynge, Scott Callaghan, Philip J. Maechling, Rajiv Mayani, Weiwei Chen, Rafael Ferreira da Silva, Miron Livny, and Kent Wenger. Pegasus, a workflow management system for science automation.Future Generation Computer Systems, 46:17–35, 2015
2015
-
[10]
Floden, Pablo Prieto Barja, Emilio Palumbo, and Cedric Notredame
Paolo Di Tommaso, Maria Chatzou, Evan W. Floden, Pablo Prieto Barja, Emilio Palumbo, and Cedric Notredame. Nextflow enables reproducible computational workflows.Nature Biotechnology, 35(4):316–319, 2017
2017
-
[11]
PhD thesis, Utrecht University, 2006
Eelco Dolstra.The Purely Functional Software Deployment Model. PhD thesis, Utrecht University, 2006
2006
-
[12]
Stuart I. Feldman. Make – a program for maintaining computer programs.Software: Practice and Experience, 9(4):255–265, 1979. 35
1979
-
[13]
Crusoe, Kristian Peters, and Daniel Schober
Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, and Daniel Schober. FAIR Computational Workflows. Data Intelligence, 2(1-2):108–121, 2020
2020
-
[14]
Dhall: A programmable configuration language, 2024
Gabriel Gonzalez. Dhall: A programmable configuration language, 2024. A total functional language that guarantees termination
2024
-
[15]
Bazel: A fast, scalable, multi-language and extensible build system
Google. Bazel: A fast, scalable, multi-language and extensible build system. https: //bazel.build, 2015. Open-source release of Google’s internal Blaze build system. Accessed: 2026-04-01
2015
-
[16]
Snakemake—a scalable bioinformatics workflow engine.Bioinformatics, 28(19):2520–2522, 2012
Johannes Köster and Sven Rahmann. Snakemake—a scalable bioinformatics workflow engine.Bioinformatics, 28(19):2520–2522, 2012
2012
-
[17]
Addison-Wesley, 2002
Leslie Lamport.Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley, 2002
2002
-
[18]
Build faster with Buck2: Our open source build sys- tem
Meta Engineering. Build faster with Buck2: Our open source build sys- tem. https://engineering.fb.com/2023/04/06/open-source/ buck2-open-source-large-scale-build-system/ , 2023. Meta Engi- neering Blog. Accessed: 2026-04-01
2023
-
[19]
Shake before building: Replacing Make with Haskell
Neil Mitchell. Shake before building: Replacing Make with Haskell. InProceedings of the 17th ACM SIGPLAN International Conference on Functional Programming, pages 55–66, 2012
2012
-
[20]
Build Systems à la Carte
Andrey Mokhov, Neil Mitchell, and Simon Peyton Jones. Build Systems à la Carte. In Proceedings of the ACM on Programming Languages, volume 2, pages 1–29, 2018
2018
-
[21]
Build systems à la carte: Theory and practice.Journal of Functional Programming, 30:e11, 2020
Andrey Mokhov, Neil Mitchell, and Simon Peyton Jones. Build systems à la carte: Theory and practice.Journal of Functional Programming, 30:e11, 2020
2020
-
[22]
Hall, Christopher H
Felix Mölder, Kim Philipp Jablonski, Brice Letcher, Michael B. Hall, Christopher H. Tomkins-Tinch, Vanessa Sochat, Jan Forster, Soohyun Lee, Sven O. Twardziok, Alexander Kanitz, Andreas Wilm, Manuel Holtgrewe, Sven Rahmann, Sven Nahnsen, and Johannes Köster. Sustainable data analysis with Snakemake.F1000Research, 10:33, 2021
2021
-
[23]
Jordan, and Ion Stoica
Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I. Jordan, and Ion Stoica. Ray: A distributed framework for emerging AI applications. InProceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18), pages 561–577, 2018
2018
-
[24]
How Amazon Web Services uses formal methods.Communications of the ACM, 58(4):66–73, 2015
Chris Newcombe, Tim Rath, Fan Zhang, Bogdan Munteanu, Marc Brooker, and Michael Deardeuff. How Amazon Web Services uses formal methods.Communications of the ACM, 58(4):66–73, 2015
2015
-
[25]
BLAKE3: One function, fast everywhere
Jack O’Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox- O’Hearn. BLAKE3: One function, fast everywhere. https://github.com/ BLAKE3-team/BLAKE3, 2020. Accessed: 2026-03-24. 36
2020
-
[26]
BLAKE3 specification
Jack O’Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox- O’Hearn. BLAKE3 specification. https://github.com/BLAKE3-team/ BLAKE3-specs/blob/master/blake3.pdf, 2020. Benchmarks: 6.9 GiB/s single-threaded on Cascade Lake-SP 8275CL (16 KiB input); AVX-512 exceeds 8 GB/s. Accessed: 2026-04-01
2020
-
[27]
Roger D. Peng. Reproducible research in computational science.Science, 334(6060):1226– 1227, 2011
2011
-
[28]
petgraph: Graph data structure library for Rust
petgraph contributors. petgraph: Graph data structure library for Rust. https: //github.com/petgraph/petgraph, 2024. O(|V|+|E|) topological sort via Kahn’s algorithm. Accessed: 2026-03-24
2024
-
[29]
CWL-workflows: Guix-managed Common Workflow Language ref- erence workflows
Pjotr Prins. CWL-workflows: Guix-managed Common Workflow Language ref- erence workflows. Software repository, https://github.com/pjotrp/ CWL-workflows, 2020. Software artifact (fork of hacchy1983/CWL-workflows). Ref- erence workflows illustrating how a CWL runner composes with Guix-managed environ- ments: orchestration (CWL/cwltool) and substrate (Guix st...
2020
-
[30]
Dask: Parallel computation with blocked algorithms and task scheduling
Matthew Rocklin. Dask: Parallel computation with blocked algorithms and task scheduling. InProceedings of the 14th Python in Science Conference (SciPy 2015), pages 126–132, 2015
2015
-
[31]
Fernández, Daniel Garijo, Björn Grüning, Lorraine La Rosa, Stefano Leo, Eoghan Ó Carragáin, Marc Portier, Ana Trisovic, RO-Crate Community, Paul Groth, and Carole Goble
Stian Soiland-Reyes, Peter Sefton, Mercè Crosas, Leyla Jael Castro, Frederik Coppens, José M. Fernández, Daniel Garijo, Björn Grüning, Lorraine La Rosa, Stefano Leo, Eoghan Ó Carragáin, Marc Portier, Ana Trisovic, RO-Crate Community, Paul Groth, and Carole Goble. Packaging research artefacts with RO-Crate.Data Science, 5(2):97–138, 2022. Specifies RO-Crat...
2022
-
[32]
Bailey, Ewa Deelman, Yolanda Gil, Brooks Hanson, Michael A
Victoria Stodden, Marcia McNutt, David H. Bailey, Ewa Deelman, Yolanda Gil, Brooks Hanson, Michael A. Heroux, John P. A. Ioannidis, and Michela Taufer. Enhancing repro- ducibility for computational methods.Science, 354(6317):1240–1241, 2016
2016
-
[33]
Performance-effective and low- complexity task scheduling for heterogeneous computing.IEEE Transactions on Parallel and Distributed Systems, 13(3):260–274, 2002
Haluk Topcuoglu, Salim Hariri, and Min-You Wu. Performance-effective and low- complexity task scheduling for heterogeneous computing.IEEE Transactions on Parallel and Distributed Systems, 13(3):260–274, 2002
2002
-
[34]
CUE: Validate, define, and use dynamic and text-based data, 2024
Marcel van Lohuizen. CUE: Validate, define, and use dynamic and text-based data, 2024. A constraint-based configuration language with types and validation
2024
-
[35]
Full-stack genomics pipelining with GATK4 + WDL + Cromwell
Kate Voss, Geraldine Van der Auwera, and Jeff Gentry. Full-stack genomics pipelining with GATK4 + WDL + Cromwell. F1000Research 6(ISCB Comm J):1381, 2017
2017
-
[36]
Wilkinson, Meznah Aloqalaa, Khalid Belhajjame, Michael R
Sean R. Wilkinson, Meznah Aloqalaa, Khalid Belhajjame, Michael R. Crusoe, Bruno de Paula Kinoshita, Luiz Gadelha, Daniel Garijo, Stian Soiland-Reyes, et al. Applying the FAIR Principles to computational workflows.Scientific Data, 12:328, 2025
2025
-
[37]
PiGx: Reproducible genomics analysis pipelines with GNU Guix.GigaScience, 7(12):giy123, 2018
Ricardo Wurmus, Bora Uyar, Brendan Osberg, Vedran Franke, Alexander Gosdschan, Katarzyna Wreçzycka, Jonathan Ronen, and Altuna Akalin. PiGx: Reproducible genomics analysis pipelines with GNU Guix.GigaScience, 7(12):giy123, 2018. Demonstrates Guix as 37 the substrate layer for bioinformatics pipeline reproducibility; orchestration layer remains Snakemake-like. 38
2018
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.