pith. sign in

arxiv: 2503.19797 · v2 · submitted 2025-03-25 · 💻 cs.PL

Fail Faster: Staging and Fast Randomness for High-Performance PBT

Pith reviewed 2026-05-22 22:08 UTC · model grok-4.3

classification 💻 cs.PL
keywords property-based testingmulti-stage programmingtest generatorsrandomness sourcesperformance optimizationOCamlScala
0
0 comments X

The pith

Multi-stage programming and optimized randomness make property-based testing find bugs up to 13 times faster.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that current PBT generator libraries suffer from abstraction overhead in their combinator style and from slow randomness sources. It introduces a multi-stage programming technique called Allegro that compiles away those overheads in OCaml and Scala 3 libraries. A separate experiment replaces the randomness source in the OCaml library with an optimized version. Both changes are shown to preserve generator semantics exactly, so any speed gains can be compared pointwise. Together the two changes locate bugs up to thirteen times faster.

Core claim

By applying multi-stage transformations to eliminate combinator overhead and by swapping in a faster randomness source, the authors show that the same generators can produce test cases much more quickly while still generating exactly the same distributions of inputs.

What carries the argument

Allegro, a multi-stage programming technique that rewrites high-level generator combinators into efficient low-level code.

If this is right

  • PBT libraries can execute far more tests in the same wall-clock time.
  • Users obtain the same test distributions without rewriting their generators.
  • Randomness source quality becomes a first-class performance concern in PBT.
  • The technique applies to existing combinator libraries in OCaml and Scala 3.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar staging could be applied to generator libraries in other languages that support multi-stage compilation.
  • Faster generators might make property-based testing practical for larger state spaces that are currently too slow to explore.
  • The controlled experiment isolates randomness as an independent variable that future PBT work can optimize separately.

Load-bearing premise

The multi-stage transformations and randomness replacement exactly preserve the semantics of the original generators.

What would settle it

Execute the original and staged generators on the same sequence of random seeds and observe whether the produced test cases or the bugs discovered differ.

Figures

Figures reproduced from arXiv: 2503.19797 by Benjamin C. Pierce, Cynthia Richey, Harrison Goldstein, Joseph W. Cutler.

Figure 1
Figure 1. Figure 1: Some functions from the API of the Base_quickcheck generator DSL. The standard design for a generator DSL, introduced in Haskell’s QuickCheck library [13] and copied in dozens of other frameworks, is via an embedded monadic language [44]. We use the syntax and types from OCaml’s Base_quickcheck library, which is presented in [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Simple monadic generators for a pair of ordered integers. [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: Internals of a Monadic Generator eDSL one: it makes a weighted choice between different generators, allowing the developer to combine different sub-generators into a single program and tune the data distribution. Also important are functions like size and with_size that are used to control the sizes of generated values and fixed_point that is used to define recursive generators. let tree_of g = fixed_point… view at source ↗
Figure 3
Figure 3. Figure 3: A generator using a variety of convenience [PITH_FULL_IMAGE:figures/full_fig_p004_3.png] view at source ↗
Figure 5
Figure 5. Figure 5: Microbenchmarks of generators in Base_quickcheck and ScalaCheck. Average over 10,000 generations with random seeds and a fixed size. The performance impact of this simplification is large ( [PITH_FULL_IMAGE:figures/full_fig_p005_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Simplifying Generators , Vol. 1, No. 1, Article . Publication date: March 2025 [PITH_FULL_IMAGE:figures/full_fig_p005_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Basic Staged Generator Library The constant generator return runs at compile time. Given cx : ’a code, the code for an ’a, it returns the generator that ignores its size and random arguments and simply returns cx. Similarly, bind g k sequences generators by passing the result of running the generator g to the continuation k. However, instead of getting access to the particular value generated by g, the con… view at source ↗
Figure 8
Figure 8. Figure 8: Pairs of Ints, Staged 3.3 Staging Combinators In Section 2, we noted that generator combinators like weighted_union allocate lists in the hot path of the generator. Even though these lists are usually small—at most a few dozen elements—each allocation takes us closer to the next garbage collection, which is bad for performance. This is an ideal opportunity to exercise another feature of staging: compile-ti… view at source ↗
Figure 9
Figure 9. Figure 9: Staged Weighted Union let grades : char Bq.t = Gen.to_bq ( Gen.bind size (fun n -> Gen.weighted_union [ (n, Gen.return .<'a'>.); (.<2>., Gen.return .<'b'>.); (.<1>., Gen.return .<'c'>.); ] ) ) (* .< fun size random -> let sum = size + 2 + 1 + 0 in let r = SR.int random 0 sum in if r <= size then 'a' else let r' = r - size in if r' <= 2 then 'b' else let r'' = r' - 2 in if r'' <= 1 then 'c' else failwith "E… view at source ↗
Figure 10
Figure 10. Figure 10: Use of Staged Weighted Union , Vol. 1, No. 1, Article . Publication date: March 2025 [PITH_FULL_IMAGE:figures/full_fig_p011_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: bind, two ways In essence, the behavior of splice .˜cx in a staged function f(cx : ’a code) = ... is to copy the entire block of code, effects and all. To ensure that the randomness effects of the first generator are executed only once but that the value can be used in the continuation multiple times, the correct bind let-binds the result of generation to a variable and then passes it to the continuation.… view at source ↗
Figure 12
Figure 12. Figure 12: CodeCPS and the Final Gen Monad type. In prior work, this type is often referred to as the “code generation” monad. This is because a value of type (’a code) CodeCps.t is like an “action” that generates code: CodeCps.run passes the continuation transformer the identity continuation to produce a ’a code. To avoid confusion with random data generators, we refer to this type as CodeCps.t. Most importantly, t… view at source ↗
Figure 13
Figure 13. Figure 13: Staged Recursive Generator Combinator API [PITH_FULL_IMAGE:figures/full_fig_p014_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Time to generate values of varying sizes using each AllegrOCaml strategy. Lower is better. Both axes [PITH_FULL_IMAGE:figures/full_fig_p017_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: Left: Speedup from staging (compared to Base_quickcheck) versus the number of bind calls. Right: Speedup from CSplitMix (compared to AllegrOCaml alone) versus the number of random samples [PITH_FULL_IMAGE:figures/full_fig_p018_15.png] view at source ↗
Figure 16
Figure 16. Figure 16: Time to generate values of varying sizes using each ScAllegro strategy. Lower is better. Both axes are [PITH_FULL_IMAGE:figures/full_fig_p019_16.png] view at source ↗
Figure 17
Figure 17. Figure 17: Geometric average of all speedups—relative to [PITH_FULL_IMAGE:figures/full_fig_p020_17.png] view at source ↗
Figure 18
Figure 18. Figure 18: Distribution of individual-task speedups across strategies and benchmarks. Gray dots represent tasks. [PITH_FULL_IMAGE:figures/full_fig_p020_18.png] view at source ↗
read the original abstract

Property-based testing (PBT) relies on generators for random test cases, often constructed using embedded domain specific languages, which provide expressive combinators for building and composing generators. The effectiveness of PBT depends critically on the speed of these generators. However, careful measurements show that the generator performance of widely used PBT libraries falls well short of what is possible, due principally to (1) the abstraction overhead of their combinator-heavy style and (2) suboptimal sources of randomness. We characterize, quantify, and address these bottlenecks. To eliminate abstraction overheads, we propose a technique based on multi-stage programming, dubbed Allegro. We apply this technique to leading generator libraries in OCaml and Scala 3, significantly improving performance. To quantify the performance impact of the randomness source, we carry out a controlled experiment, replacing the randomness in the OCaml PBT library with an optimized version. Both interventions exactly preserve the semantics of generators, enabling precise, pointwise comparisons. Together, these improvements find bugs up to $13\times$ faster.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces Allegro, a multi-stage programming technique applied to PBT generator libraries in OCaml and Scala 3 to eliminate combinator abstraction overhead, and reports a controlled experiment replacing the randomness source in the OCaml library with an optimized version. Both changes are stated to exactly preserve generator semantics, enabling pointwise comparisons, and together they are claimed to find bugs up to 13× faster.

Significance. If the semantics-preservation claims and experimental controls hold, the work quantifies and mitigates two concrete bottlenecks (abstraction overhead and randomness quality) that limit PBT effectiveness. The staging approach offers a reusable method for specializing generator DSLs, and the randomness replacement provides a measurable baseline; both could inform generator design in other languages and libraries.

major comments (2)
  1. [Abstract] Abstract and §1: the central performance and bug-finding claims rest on the assertion that Allegro transformations and the randomness replacement 'exactly preserve the semantics of generators' (distribution and per-seed observable behavior). No machine-checked equivalence, bisimulation argument, or even informal proof sketch is referenced; if staging alters thunk order, laziness, or bit consumption, or if the new RNG changes the effective distribution, the 13× speedup could reflect different test cases rather than equivalent faster generation. This precondition must be substantiated with concrete evidence before the pointwise comparisons are credible.
  2. [Abstract] The controlled experiment replacing the randomness source is described only at abstract level; the manuscript must supply the measurement methodology, data-exclusion rules, statistical analysis, and seed-handling protocol (as flagged by the low soundness score) to allow reproduction and to confirm that timing differences are not confounded by distribution changes.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive review. The comments highlight important points regarding the substantiation of our semantics-preservation claims and the level of detail provided for the controlled experiment. We respond to each major comment below and will revise the manuscript to address them.

read point-by-point responses
  1. Referee: [Abstract] Abstract and §1: the central performance and bug-finding claims rest on the assertion that Allegro transformations and the randomness replacement 'exactly preserve the semantics of generators' (distribution and per-seed observable behavior). No machine-checked equivalence, bisimulation argument, or even informal proof sketch is referenced; if staging alters thunk order, laziness, or bit consumption, or if the new RNG changes the effective distribution, the 13× speedup could reflect different test cases rather than equivalent faster generation. This precondition must be substantiated with concrete evidence before the pointwise comparisons are credible.

    Authors: We agree that an explicit justification strengthens the claims. Allegro is constructed so that the staged generators consume exactly the same sequence of random bits and produce identical observable outputs for any given seed, because staging only removes indirection in the combinator implementations without reordering effects or altering laziness. The RNG replacement is a drop-in substitute that yields identical bit streams for the same seed. In the revision we will add a dedicated subsection with an informal argument (based on the structure of the staged code and the RNG interface) together with a small set of seed-by-seed equality checks on generated values. This will make the precondition for the pointwise comparisons explicit. revision: yes

  2. Referee: [Abstract] The controlled experiment replacing the randomness source is described only at abstract level; the manuscript must supply the measurement methodology, data-exclusion rules, statistical analysis, and seed-handling protocol (as flagged by the low soundness score) to allow reproduction and to confirm that timing differences are not confounded by distribution changes.

    Authors: We accept that the current description is insufficient for reproduction. The revised manuscript will contain a new subsection that details the timing harness, the precise rules used to exclude outlier runs, the statistical tests applied to the speed-up figures, and the seed-initialization and state-management protocol. These additions will confirm that the measured differences are attributable to generator speed rather than any change in the induced distribution. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical performance claims rest on external measurements

full rationale

The paper's core claims concern measured speedups from multi-stage transformations (Allegro) and randomness replacement in PBT generators. These are presented as experimental interventions that preserve semantics, with results quantified via timing and bug-finding comparisons. No equations, fitted parameters, or derivations appear in the abstract or description; the 13× claim is tied directly to controlled experiments rather than any self-referential definition or self-citation chain. The semantics-preservation precondition is stated as an assumption enabling the comparisons but is not derived from the paper's own inputs. This is a standard empirical paper structure with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The approach rests on the domain assumption that multi-stage programming can be applied to generator combinators without semantic change and that the chosen randomness source is a drop-in replacement; no free parameters or invented entities are mentioned in the abstract.

axioms (1)
  • domain assumption Multi-stage programming transformations preserve the observable behavior of the original generator combinators.
    Stated as the basis for exact pointwise comparisons in the abstract.

pith-pipeline@v0.9.0 · 5717 in / 1101 out tokens · 33732 ms · 2026-05-22T22:08:25.205565+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

71 extracted references · 71 canonical work pages

  1. [1]

    [n. d.]. Macros. https://dotty.epfl.ch/docs/reference/metaprogramming/macros.html. Accessed: 2025-03-17

  2. [2]

    [n. d.]. ScalaCheck. https://scalacheck.org/. Accessed: 2025-03-24

  3. [3]

    [n. d.]. A small noncryptographic pseudorandom number generator. https://burtleburtle.net/bob/rand/smallprng.html. Accessed: 2025-03-24

  4. [4]

    Supun Abeysinghe and Tiark Rompf. 2023. Rhyme: A Data-Centric Expressive Query Language for Nested Data Structures. In Practical Aspects of Declarative Languages , Martin Gebser and Ilya Sergey (Eds.). Springer Nature Switzerland, Cham, 64–81

  5. [5]

    Stefan Ackermann, Vojin Jovanovic, Tiark Rompf, and Martin Odersky. 2012. Jet: An Embedded DSL for High Performance Big Data Processing. https://infoscience.epfl.ch/handle/20.500.14299/85985

  6. [6]

    Alan Bawden. 1999. Quasiquotation in Lisp. In Partial Evaluation and Semantic-Based Program Manipulation . 4–12. citeseer.ist.psu.edu/bawden99quasiquotation.html

  7. [7]

    David Blackman and Sebastiano Vigna. 2022. Scrambled Linear Pseudorandom Number Generators. arXiv:1805.01407 [cs.DS] https://arxiv.org/abs/1805.01407

  8. [8]

    Anders Bondorf. 1992. Improving binding times without explicit CPS-conversion. In Proceedings of the 1992 ACM Conference on LISP and Functional Programming (San Francisco, California, USA) (LFP ’92). Association for Computing Machinery, New York, NY, USA, 1–10. https://doi.org/10.1145/141471.141483

  9. [9]

    Rudy Braquehais, Michael Walker, José Manuel Calderón Trilla, and Colin Runciman. 2017. A simple incremental development of a property-based testing tool (Functional Pearl). (2017)

  10. [10]

    Bytheway, Cameron. 2025. Bolero. https://camshaft.github.io/bolero/. Accessed: 2025-03-24

  11. [11]

    Jacques Carette and Oleg Kiselyov. 2005. Multi-stage Programming with Functors and Monads: Eliminating Abstraction Overhead from Generic Code. In Generative Programming and Component Engineering , Robert Glück and Michael Lowry (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 256–274

  12. [12]

    Koen Claessen, Jonas Duregård, and Michal H. Palka. 2015. Generating Constrained Random Data with Uniform Distribution. J. Funct. Program. 25 (2015). https://doi.org/10.1017/S0956796815000143

  13. [14]

    Koen Claessen and John Hughes. 2000. QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs. In Proceedings of the Fifth ACM SIGPLAN International Conference on Functional Programming (ICFP ’00), Montreal, Canada, September 18-21, 2000 , Martin Odersky and Philip Wadler (Eds.). ACM, Montreal, Canada, 268–279. https: //doi.org/10.1145/351240.351266

  14. [15]

    Rowan Davies. 2017. A Temporal Logic Approach to Binding-Time Analysis. J. ACM 64, 1, Article 1 (March 2017), 45 pages. https://doi.org/10.1145/3011069

  15. [16]

    Rowan Davies and Frank Pfenning. 2001. A modal analysis of staged computation. J. ACM 48, 3 (May 2001), 555–604. https://doi.org/10.1145/382780.382785

  16. [17]

    Eisenberg, Stephen Dolan, and Leo White

    Richard A. Eisenberg, Stephen Dolan, and Leo White. 2022. Unboxed types for OCaml. https://www.youtube.com/ watch?v=Vevld4cXSYk

  17. [18]

    Matthew Flatt. 2012. Creating languages in Racket. Commun. ACM 55, 1 (Jan. 2012), 48–56. https://doi.org/10.1145/ 2063176.2063195

  18. [19]

    FsCheck Team. 2025. FsCheck: Random Testing for .NET. https://fscheck.github.io/FsCheck/. Accessed: 2025-03-24

  19. [20]

    Peyton Jones

    Andrew Gill, John Launchbury, and Simon L. Peyton Jones. 1993. A short cut to deforestation. In Proceedings of the Conference on Functional Programming Languages and Computer Architecture (Copenhagen, Denmark) (FPCA ’93). Association for Computing Machinery, New York, NY, USA, 223–232. https://doi.org/10.1145/165180.165214

  20. [21]

    Cutler, Daniel Dickstein, Benjamin C

    Harrison Goldstein, Joseph W. Cutler, Daniel Dickstein, Benjamin C. Pierce, and Andrew Head. 2024. Property- Based Testing in Practice. In Proceedings of the IEEE/ACM 46th International Conference on Software Engineering (Lisbon, Portugal) (ICSE ’24). Association for Computing Machinery, New York, NY, USA, Article 187, 13 pages. https://doi.org/10.1145/35...

  21. [22]

    Harrison Goldstein and Benjamin C. Pierce. 2022. Parsing Randomness. Proceedings of the ACM on Programming Languages 6, OOPSLA2 (Oct. 2022), 128:89–128:113. https://doi.org/10.1145/3563291

  22. [23]

    John H. Holland. 1992. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control and Artificial Intelligence. MIT Press, Cambridge, MA, USA

  23. [24]

    Hypothesis Team. 2025. Hypothesis. https://hypothesis.works/. Accessed: 2025-03-24

  24. [25]

    Intel Corporation. 2025. Intel ® Processor Trace. https://edc.intel.com/content/www/us/en/design/products/platforms/ processor-and-core-i3-n-series-datasheet-volume-1-of-2/001/intel-processor-trace/. Accessed: 2025-03-24

  25. [26]

    Neil D Jones, Carsten K Gomard, and Peter Sestoft. 1993. Partial evaluation and automatic program generation . Peter Sestoft. , Vol. 1, No. 1, Article . Publication date: March 2025. 24 Richey and Cutler et al

  26. [27]

    Manohar Jonnalagedda, Thierry Coppey, Sandro Stucki, Tiark Rompf, and Martin Odersky. 2014. Staged parser combinators for efficient data processing. In Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages &amp; Applications (Portland, Oregon, USA)(OOPSLA ’14). Association for Computing Machinery, New York, ...

  27. [28]

    Oleg Kiselyov. 2014. The Design and Implementation of BER MetaOCaml. In Functional and Logic Programming , Michael Codish and Eijiro Sumii (Eds.). Springer International Publishing, Cham, 86–102

  28. [29]

    Oleg Kiselyov. 2023. MetaOCaml Theory and Implementation. arXiv:2309.08207 [cs.PL] https://arxiv.org/abs/2309. 08207

  29. [30]

    Oleg Kiselyov, Aggelos Biboudis, Nick Palladinos, and Yannis Smaragdakis. 2017. Stream fusion, to completeness. SIGPLAN Not. 52, 1 (Jan. 2017), 285–299. https://doi.org/10.1145/3093333.3009880

  30. [31]

    Oleg Kiselyov and Jeremy Yallop. 2022. let (rec) insertion without Effects, Lights or Magic. arXiv:2201.00495 [cs.PL] https://arxiv.org/abs/2201.00495

  31. [32]

    András Kovács. 2024. Closure-Free Functional Programming in a Two-Level Type Theory. Proc. ACM Program. Lang. 8, ICFP, Article 259 (Aug. 2024), 34 pages. https://doi.org/10.1145/3674648

  32. [33]

    Shriram Krishnamurthi and Matthias Felleisen. 2001. Linguistic reuse. Ph. D. Dissertation. AAI3021152

  33. [34]

    Krishnaswami and Jeremy Yallop

    Neelakantan R. Krishnaswami and Jeremy Yallop. 2019. A typed, algebraic approach to parsing. In Proceedings of the 40th ACM SIGPLAN Conference on Programming Language Design and Implementation (Phoenix, AZ, USA) (PLDI 2019). Association for Computing Machinery, New York, NY, USA, 379–393. https://doi.org/10.1145/3314221.3314625

  34. [35]

    Robert Krook, Nicholas Smallbone, Bo Joel Svensson, and Koen Claessen. 2024. QuickerCheck: Implementing and Evaluating a Parallel Run-Time for QuickCheck. InProceedings of the 35th Symposium on Implementation and Application of Functional Languages (IFL ’23). Association for Computing Machinery, New York, NY, USA, 1–12. https://doi.org/ 10.1145/3652561.3652570

  35. [36]

    Pierce, and Li-yao Xia

    Leonidas Lampropoulos, Diane Gallois-Wong, Catalin Hritcu, John Hughes, Benjamin C. Pierce, and Li-yao Xia. 2017. Beginner’s Luck: A Language for Property-Based Generators. Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages, POPL 2017, Paris, France, January 18-20, 2017 (2017), 114–129

  36. [37]

    Leonidas Lampropoulos, Michael Hicks, and Benjamin C. Pierce. 2019. Coverage Guided, Property Based Testing. PACMPL 3, OOPSLA (2019), 181:1–181:29. https://doi.org/10.1145/3360607

  37. [38]

    Leonidas Lampropoulos, Zoe Paraskevopoulou, and Benjamin C. Pierce. 2017. Generating Good Generators for Inductive Relations. Proceedings of the ACM on Programming Languages 2, POPL (2017), 1–30

  38. [39]

    Daniel Lemire. 2023. testingRNG. https://github.com/lemire/testingRNG

  39. [40]

    Andreas Löscher and Konstantinos Sagonas. 2017. Targeted Property-Based Testing. In Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2017). Association for Computing Machinery, New York, NY, USA, 46–56. https://doi.org/10.1145/3092703.3092711

  40. [41]

    William M McKeeman. 1998. Differential testing for software. Digital Technical Journal 10, 1 (1998), 100–107

  41. [42]

    Microsoft. 2025. Code Quotations — F# | Microsoft Learn. https://learn.microsoft.com/en-us/dotnet/fsharp/language- reference/code-quotations. Accessed: 2025-03-24

  42. [43]

    O’Reilly Media, Inc

    Yaron Minsky and Anil Madhavapeddy. 2022. Real World OCaml: Functional Programming for the Masses . " O’Reilly Media, Inc. "

  43. [44]

    Eugenio Moggi. 1991. Notions of computation and monads. Information and computation 93, 1 (1991), 55–92

  44. [45]

    Anders Møller and Oskar Haarklou Veileborg. 2020. Eliminating abstraction overhead of Java stream pipelines using ahead-of-time program optimization. Proc. ACM Program. Lang. 4, OOPSLA, Article 168 (Nov. 2020), 29 pages. https://doi.org/10.1145/3428236

  45. [46]

    Aleksandar Nanevski, Frank Pfenning, and Brigitte Pientka. 2008. Contextual modal type theory. ACM Trans. Comput. Logic 9, 3, Article 23 (June 2008), 49 pages. https://doi.org/10.1145/1352582.1352591

  46. [47]

    Flemming Nielson and Hanne Riis Nielson. 1992. Two-level functional languages. Cambridge university press

  47. [48]

    openjdk. 2023. Java Microbenchmarking Harness. https://github.com/openjdk/jmh

  48. [49]

    Lionel Parreaux, Amir Shaikhha, and Christoph E. Koch. 2017. Quoted staged rewriting: a practical approach to library- defined optimizations. In Proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (Vancouver, BC, Canada) (GPCE 2017). Association for Computing Machinery, New York, NY, USA, 131–14...

  49. [50]

    Lionel Emile Vincent Parreaux. 2020. Type-Safe Metaprogramming and Compilation Techniques For Designing Efficient Systems in High-Level Languages . Ph. D. Dissertation. EPFL, Lausanne. https://doi.org/10.5075/epfl-thesis-10285

  50. [51]

    W. H. Payne, J. R. Rabung, and T. P. Bogyo. 1969. Coding the Lehmer pseudo-random number generator. Commun. ACM 12, 2 (Feb. 1969), 85–86. https://doi.org/10.1145/362848.362860

  51. [52]

    Proptest Contributors. 2025. The Proptest Book. https://altsysrq.github.io/proptest-book/. Accessed: 2025-03-24

  52. [53]

    Tiark Rompf and Martin Odersky. 2012. Lightweight modular staging: a pragmatic approach to runtime code generation and compiled DSLs. Commun. ACM 55, 6 (June 2012), 121–130. https://doi.org/10.1145/2184319.2184345 , Vol. 1, No. 1, Article . Publication date: March 2025. Fail Faster 25

  53. [54]

    Colin Runciman, Matthew Naylor, and Fredrik Lindblad. 2008. Smallcheck and Lazy Smallcheck: Automatic Exhaustive Testing for Small Values. ACM SIGPLAN Notices 44, 2 (Sept. 2008), 37–48. https://doi.org/10.1145/1543134.1411292

  54. [55]

    In: Proc

    Eric L. Seidel, Niki Vazou, and Ranjit Jhala. 2015. Type Targeted Testing. In Programming Languages and Systems (Lecture Notes in Computer Science) , Jan Vitek (Ed.). Springer, Berlin, Heidelberg, 812–836. https://doi.org/10.1007/978- 3-662-46669-8_33

  55. [56]

    Tim Sheard, Zine el-abidine Benaissa, and Emir Pasalic. 1999. DSL Implementation Using Staging and Monads. In 2nd Conference on Domain-Specific Languages (DSL 99) . USENIX Association, Austin, TX. https://www.usenix.org/ conference/dsl-99/dsl-implementation-using-staging-and-monads

  56. [57]

    Tim Sheard and Simon Peyton Jones. 2002. Template meta-programming for Haskell. In Proceedings of the 2002 ACM SIGPLAN Workshop on Haskell (Pittsburgh, Pennsylvania) (Haskell ’02). Association for Computing Machinery, New York, NY, USA, 1–16. https://doi.org/10.1145/581690.581691

  57. [58]

    Pierce, and Leonidas Lampropoulos

    Jessica Shi, Alperen Keles, Harrison Goldstein, Benjamin C. Pierce, and Leonidas Lampropoulos. 2023. Etna: An Evaluation Platform for Property-Based Testing (Experience Report). Proc. ACM Program. Lang. 7, ICFP, Article 218 (Aug. 2023), 17 pages. https://doi.org/10.1145/3607860

  58. [59]

    Steele, Doug Lea, and Christine H

    Guy L. Steele, Doug Lea, and Christine H. Flood. 2014. Fast splittable pseudorandom number generators. InProceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications (Portland, Oregon, USA) (OOPSLA ’14). Association for Computing Machinery, New York, NY, USA, 453–472. https://doi.org/10. 1145/2660193.2660195

  59. [60]

    Dominic Steinhöfel and Andreas Zeller. 2022. Input Invariants. In Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022) . Association for Computing Machinery, New York, NY, USA, 583–594. https://doi.org/10.1145/3540250.3549139

  60. [61]

    Jane Street. 2021. Base Quickcheck. https://github.com/janestreet/base_quickcheck

  61. [62]

    Jane Street. 2023. Core bench. https://github.com/janestreet/core_bench

  62. [63]

    Walid Taha and Tim Sheard. 2000. MetaML and multi-stage programming with explicit annotations. Theoretical computer science 248, 1-2 (2000), 211–242

  63. [64]

    Sam Tobin-Hochstadt, Vincent St-Amour, Ryan Culpepper, Matthew Flatt, and Matthias Felleisen. 2011. Languages as libraries. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (San Jose, California, USA) (PLDI ’11). Association for Computing Machinery, New York, NY, USA, 132–141. https: //doi.org/10.1145/199...

  64. [65]

    Laurence Tratt. 2008. Domain specific language implementation via compile-time meta-programming. ACM Trans. Program. Lang. Syst. 30, 6, Article 31 (Oct. 2008), 40 pages. https://doi.org/10.1145/1391956.1391958

  65. [66]

    Janis Voigtländer. 2008. Asymptotic Improvement of Computations over Free Monads. In Mathematics of Program Construction, Philippe Audebaud and Christine Paulin-Mohring (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 388–403

  66. [67]

    John Von Neumann. 1951. Various Techniques Used in Connection with Random Digits. Appl. Math Ser 12, 36-38 (1951), 3

  67. [68]

    Edwin Westbrook, Mathias Ricken, Jun Inoue, Yilong Yao, Tamer Abdelatif, and Walid Taha. 2010. Mint: Java multi-stage programming using weak separability. In Proceedings of the 31st ACM SIGPLAN Conference on Programming Language Design and Implementation (Toronto, Ontario, Canada) (PLDI ’10). Association for Computing Machinery, New York, NY, USA, 400–411...

  68. [69]

    Jamie Willis, Nicolas Wu, and Matthew Pickering. 2020. Staged selective parser combinators. Proc. ACM Program. Lang. 4, ICFP, Article 120 (Aug. 2020), 30 pages. https://doi.org/10.1145/3409002

  69. [70]

    Ningning Xie, Leo White, Olivier Nicole, and Jeremy Yallop. 2023. MacoCaml: Staging Composable and Compilable Macros. Proc. ACM Program. Lang. 7, ICFP, Article 209 (Aug. 2023), 45 pages. https://doi.org/10.1145/3607851

  70. [71]

    Jeremy Yallop, Ningning Xie, and Neel Krishnaswami. 2023. flap: A Deterministic Parser with Fused Lexing. Proc. ACM Program. Lang. 7, PLDI, Article 155 (June 2023), 24 pages. https://doi.org/10.1145/3591269

  71. [72]

    Wang Yi. 2021. wyhash. https://github.com/wangyi-fudan/wyhash. Received –; revised –; accepted – , Vol. 1, No. 1, Article . Publication date: March 2025