
arxiv: 1907.05947 · v1 · pith:E5KX5UH3new · submitted 2019-07-11 · 🧮 math.HO · stat.OT

Truth, Proof, and Reproducibility: There's no counter-attack for the codeless

Pith reviewed 2026-05-24 22:55 UTC · model grok-4.3

classification 🧮 math.HO stat.OT
keywords reproducibility · mathematical proofs · computational mathematics · Euclid · Boyle · Lakatos · open source code · statistical inquiry

The pith

Mathematical proofs are inherently reproducible by design, but modern computational math now requires open code to sustain that standard.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies two historical lines of reproducibility in science. One begins with Euclid and the production of mathematical proofs, which carry a built-in guarantee of reproducibility. The other begins with experimental practice, illustrated by Boyle's detailed accounts of his vacuum-pump apparatus. Contemporary mathematical work has moved from blackboard proofs to computational workflows that resemble experimental science. The authors therefore conclude that code written in open-source languages must now be supplied so that others can reproduce and extend computational mathematical results. They call for a reorientation of mathematical science that makes its reproducibility assessable in the same manner as experimental work.

Core claim

Proofs have a distinctive quality of being necessarily reproducible, and are the cornerstone of mathematical science. However, the task of the modern mathematical scientist has drifted from that of blackboard rhetorician to a scientific workflow that now more closely resembles that of an experimental scientist. The computer is an analog for Boyle's pump, another kind of scientific instrument that needs detailed descriptions of how it generates results. In place of Boyle's hand-written notes, code must be available to enable reproduction of computational experiments.

What carries the argument

The computer-as-instrument analogy to Boyle's vacuum pump, which requires explicit descriptions (now code) so that results can be reproduced.

If this is right

  • Open-source code becomes a necessary component of due scientific diligence for computational mathematical inquiry.
  • Reproducibility standards applied to experimental science must now be applied to computational mathematics.
  • The meaning of proof itself shifts when the workflow is computational rather than purely rhetorical.
  • Mathematical science requires reorientation so that its reproducibility can be readily assessed.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Treating code as an extension of the proof process could change how computational claims are taught and reviewed.
  • Fields outside mathematics that rely on computation might adopt similar expectations for sharing the programs that generate results.
  • Lakatos-style conjectural dialogue could be used more broadly to examine truth under uncertainty in any computational setting.

Load-bearing premise

The computer functions as an analog for Boyle's pump, another kind of scientific instrument that needs detailed descriptions of how it generates results.

What would settle it

A collection of modern mathematical results shown to be fully verifiable and reproducible by readers who have no access to any code or computational artifacts.

Figures

Figures reproduced from arXiv: 1907.05947 by Ben Marwick, Charles T. Gray.

Figure 1
Figure 1: We propose updating this spectrum of reproducibility [23] with unit tests for data analysis. In addition to the advertising, the formal scientific argument put forward, many informal and traditionally hidden scientific outputs comprise the compendium of research that produces the results. Given the underutilised nature of unit tests, we suggest there is further work to be done to facilitate the adoption …
Figure 2
Figure 2: On the left, Hal might begin to verify his understanding of + by first considering the case where both numbers are negative, x, y < 0. In this case, we might think of + as combining x steps to the left with y steps to the left. The halfway point, (x + y)/2, falls in the middle of the two arrows laid side by side. On the right, Hal considers the case where x < 0, y > 0 and |x| < |y|. Here x + y can be thought …
Figure 3
Figure 3: This panel shows some basic details of tests in R packages listed in CRAN task views [42]. The measure of interest, test size ratio, was calculated by dividing the test file size by the overall package source file size from the unofficial CRAN mirror on GitHub. This is a rough indicator of test coverage; future work should consider more precise metrics such as those produced by the covr:: package. a) the…
Figure 4
Figure 4: a) shows the change in the number of packages in each CRAN task view over time. b) shows the proportion of packages in each CRAN task view that have tests.
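
The test size ratio named in the Figure 3 caption (test file size divided by overall package source file size) can be illustrated with a minimal sketch. This is not the authors' code: the function names `dir_size` and `compute_test_size_ratio`, the assumption that tests live in a `tests` subdirectory, and the choice to measure size in bytes are all hypothetical choices made for illustration.

```python
import os
import tempfile


def dir_size(path):
    """Return the total size in bytes of all files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def compute_test_size_ratio(package_dir, tests_subdir="tests"):
    """Divide the size of the tests directory by the size of the whole package.

    Assumption (not from the paper): tests sit in `tests_subdir` directly
    under `package_dir`. A package with no files or no tests yields 0.0.
    """
    package_bytes = dir_size(package_dir)
    if package_bytes == 0:
        return 0.0
    tests_path = os.path.join(package_dir, tests_subdir)
    tests_bytes = dir_size(tests_path) if os.path.isdir(tests_path) else 0
    return tests_bytes / package_bytes
```

As the caption itself notes, a ratio of this kind is only a rough indicator of test coverage; it says nothing about which lines of the package the tests actually exercise.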
Original abstract

Current concerns about reproducibility in many research communities can be traced back to a high value placed on empirical reproducibility of the physical details of scientific experiments and observations. For example, the detailed descriptions by 17th century scientist Robert Boyle of his vacuum pump experiments are often held to be the ideal of reproducibility as a cornerstone of scientific practice. Victoria Stodden has claimed that the computer is an analog for Boyle's pump -- another kind of scientific instrument that needs detailed descriptions of how it generates results. In the place of Boyle's hand-written notes, we now expect code in open source programming languages to be available to enable others to reproduce and extend computational experiments. In this paper we show that there is another genealogy for reproducibility, starting at least from Euclid, in the production of proofs in mathematics. Proofs have a distinctive quality of being necessarily reproducible, and are the cornerstone of mathematical science. However, the task of the modern mathematical scientist has drifted from that of blackboard rhetorician, where the craft of proof reigned, to a scientific workflow that now more closely resembles that of an experimental scientist. So, what is proof in modern mathematics? And, if proof is unattainable in other fields, what is due scientific diligence in a computational experimental environment? How do we measure truth in the context of uncertainty? Adopting a manner of Lakatosian conversant conjecture between two mathematicians, we examine how proof informs our practice of computational statistical inquiry. We propose that a reorientation of mathematical science is necessary so that its reproducibility can be readily assessed.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper traces concerns about reproducibility to empirical traditions, citing Robert Boyle's vacuum pump experiments as an ideal, and notes Victoria Stodden's analogy of the computer to such instruments. It contrasts this with a separate genealogy of reproducibility rooted in Euclidean mathematical proofs, which it describes as necessarily reproducible and foundational to mathematical science. The manuscript argues that modern mathematical practice has shifted from blackboard-based proof craft toward computational workflows resembling experimental science, and proposes a reorientation of mathematical science—via a Lakatosian style of conversant conjecture—to better assess reproducibility, truth, and due diligence in computational statistical inquiry under uncertainty.

Significance. If the interpretive genealogy and normative proposal hold, the paper could enrich discussions in the history and philosophy of mathematics by offering an alternative framing for reproducibility that emphasizes proof traditions over experimental ones, potentially informing standards for computational mathematics and statistics. As a conceptual essay without derivations, data, or formal checks, its value lies in stimulating reflection on practice rather than providing testable claims or tools.

major comments (2)
  1. [Abstract] The central distinction between the 'necessarily reproducible' quality of proofs and empirical reproducibility (Boyle/Stodden) is asserted without a definition of reproducibility in the mathematical context or explicit criteria for the claimed distinction; this interpretive claim is load-bearing for the alternative genealogy and the subsequent call for reorientation.
  2. [Abstract] The assertion that 'the task of the modern mathematical scientist has drifted' from blackboard rhetorician to experimental-like workflow is presented as factual grounding for the reorientation proposal, yet no specific examples, citations, or periodization of this drift are supplied in the text; without such support the normative recommendation rests on an unevaluated historical premise.
minor comments (2)
  1. The title phrase 'There's no counter-attack for the codeless' is not explained or connected to the argument in the abstract.
  2. The manuscript would benefit from explicit section headings and a clearer roadmap to distinguish the historical genealogy from the Lakatosian examination and the final proposal.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments and recommendation for minor revision. We address each major comment point by point below, providing clarifications where appropriate.

  1. Referee: [Abstract] Abstract: the central distinction between the 'necessarily reproducible' quality of proofs and empirical reproducibility (Boyle/Stodden) is asserted without a definition of reproducibility in the mathematical context or explicit criteria for the claimed distinction; this interpretive claim is load-bearing for the alternative genealogy and the subsequent call for reorientation.

    Authors: The manuscript does provide an implicit definition through the contrast with Boyle's experiments and the Euclidean tradition, where proofs are reproducible via logical deduction alone. However, to strengthen the abstract as suggested, we will incorporate an explicit definition of mathematical reproducibility as the independent verification of deductive steps by any reader familiar with the axioms and rules of inference. This revision will be made in the next version. revision: yes

  2. Referee: [Abstract] Abstract: the assertion that 'the task of the modern mathematical scientist has drifted' from blackboard rhetorician to experimental-like workflow is presented as factual grounding for the reorientation proposal, yet no specific examples, citations, or periodization of this drift are supplied in the text; without such support the normative recommendation rests on an unevaluated historical premise.

    Authors: This assertion is presented as a contemporary observation rather than a detailed historical analysis requiring periodization or specific citations. The paper's argument proceeds from this premise to discuss implications for computational inquiry using a Lakatosian approach. We maintain that the normative proposal does not rest on an unevaluated premise but on the logical consequences for assessing truth under uncertainty. No revision is planned for this point. revision: no

Circularity Check

0 steps flagged

No significant circularity identified


The paper is a philosophical/historical essay (math.HO) with no equations, formal derivations, fitted parameters, predictions, or load-bearing self-citations. It traces ideas from Euclid and Boyle to modern practice and proposes a normative reorientation, but contains no steps that reduce a result to its own inputs by construction. All claims are interpretive and self-contained against external historical benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The argument rests on interpretive claims about historical figures and the nature of proof; no quantitative parameters or new entities are introduced.

axioms (2)
  • domain assumption: Proofs in mathematics are necessarily reproducible by design.
    Presented as the cornerstone of mathematical science in the abstract.
  • domain assumption: Modern mathematical and statistical workflows now resemble experimental science more than pure proof construction.
    Stated directly as the basis for applying experimental reproducibility standards.

pith-pipeline@v0.9.0 · 5813 in / 1290 out tokens · 20085 ms · 2026-05-24T22:55:57.163941+00:00 · methodology


Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 1 internal anchor

  [1] Amrhein, V., Greenland, S., McShane, B.: Scientists rise up against statistical significance. Nature 567(7748), 305 (Mar 2019). https://doi.org/10.1038/d41586-019-00857-9, http://www.nature.com/articles/d41586-019-00857-9
  [2] Auburn, D.: Proof: A Play. Farrar, Straus and Giroux (Mar 2001), Google Books ID: 6AUtQVhrY90C
  [3] Bertot, Y.: A short presentation of Coq. In: International Conference on Theorem Proving in Higher Order Logics. pp. 12–16. Springer (2008)
  [4] Brown, S.: Partial unpacking and indirect proofs: A study of students productive use of the symbolic proof scheme. In: Proceedings of the 16th Annual Conference on Research in Undergraduate Mathematics Education. vol. 2, pp. 47–54 (2013)
  [5] Bryan, J.: Excuse Me, Do You Have a Moment to Talk About Version Control? The American Statistician 72(1), 20–27 (2018). https://doi.org/10.1080/00031305.2017.1399928
  [6] Camerer, C.F., Dreber, A., Forsell, E., Ho, T.H., Huber, J., Johannesson, M., Kirchler, M., Almenberg, J., Altmejd, A., Chan, T., Heikensten, E., Holzmeister, F., Imai, T., Isaksson, S., Nave, G., Pfeiffer, T., Razen, M., Wu, H.: Evaluating replicability of laboratory experiments in economics. Science 351(6280), 1433–1436 (Mar 2016). https://doi.org/10.11...
  [7] Carroll, L.: The Annotated Alice: The Definitive Edition. W. W. Norton & Company, New York, updated, subsequent edition edn. (Nov 1999)
  [8] Davey, B.A., Priestley, H.A.: Introduction to Lattices and Order. Cambridge University Press (Apr 2002), Google Books ID: vVVTxeuiyvQC
  [9] Davey, B.A.: When is a Proof? La Trobe University, 2 edn. (2009)
  [10] Davey, B.A., Gray, C.T., Pitkethly, J.G.: The Homomorphism Lattice Induced by a Finite Algebra. Order 35(2), 193–214 (Jul 2018). https://doi.org/10.1007/s11083-017-9426-3
  [11] Donoho, D.L.: An invitation to reproducible computational research. Biostatistics 11(3), 385–388 (Jul 2010). https://doi.org/10.1093/biostatistics/kxq028
  [12] Fidler, F., Wilcox, J.: Reproducibility of Scientific Results. In: Zalta, E.N. (ed.) The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, winter 2018 edn. (2018)
  [13] Fraser, H., Parker, T., Nakagawa, S., Barnett, A., Fidler, F.: Questionable research practices in ecology and evolution. PLOS ONE 13(7), e0200303 (Jul 2018). https://doi.org/10.1371/journal.pone.0200303, https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0200303
  [14] Haack, S.: Defending Science - within Reason: Between Scientism And Cynicism. Prometheus Books (Mar 2011), Google Books ID: RhXxaPTc EYC
  [15] Wickham, H.: Tidy Data. Journal of Statistical Software 59(1), 1–23 (Sep 2014). https://doi.org/10.18637/jss.v059.i10, https://www.jstatsoft.org/index.php/jss/article/view/v059i10
  [16] Hayes, A.: testing statistical software - aleatoric (Jul 2019), https://www.alexpghayes.com/blog/testing-statistical-software/
  [17] Head, M.L., Holman, L., Lanfear, R., Kahn, A.T., Jennions, M.D.: The Extent and Consequences of P-Hacking in Science. PLOS Biology 13(3), e1002106 (Mar 2015). https://doi.org/10.1371/journal.pbio.1002106, https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002106
  [18] Hester, J.: covr: Bringing test coverage to R (Jan 2016), https://www.rstudio.com/resources/webinars/covr-bringing-test-coverage-to-r/
  [19] Hester, J.: covr: Test Coverage for Packages (2018), https://CRAN.R-project.org/package=covr
  [20] Lakatos, I.: Proofs and Refutations: The Logic of Mathematical Discovery. Cambridge University Press, Cambridge, reissue edition edn. (Oct 2015)
  [21] LeVeque, R.J., Mitchell, I.M., Stodden, V.: Reproducible research for scientific computing: Tools and strategies for changing the culture. Comput Sci Eng 14 (2012). https://doi.org/10.1109/mcse.2012.38
  [22] Marwick, B.: rrtools: Creates a reproducible research compendium (2018), https://github.com/benmarwick/rrtools
  [23] Marwick, B., Boettiger, C., Mullen, L.: Packaging data analytical work reproducibly using R (and friends). Tech. Rep. e3192v2, PeerJ Inc. (Mar 2018). https://doi.org/10.7287/peerj.preprints.3192v2, https://peerj.com/preprints/3192
  [24] Merton, R.K.: On Social Structure and Science. University of Chicago Press (Sep 1996), Google Books ID: j94XiVDwAZEC
  [25] Murray, C.: How to Accuse the Other Guy of Lying with Statistics. Statistical Science 20(3), 239–241 (2005), https://www.jstor.org/stable/20061179
  [26] Nowogrodzki, A.: How to support open-source software and stay sane. Nature 571, 133 (Jul 2019). https://doi.org/10.1038/d41586-019-02046-0, http://www.nature.com/articles/d41586-019-02046-0
  [27] Parker, H.: Opinionated analysis development. preprint (2017). https://doi.org/10.7287/peerj.preprints.3210v1
  [28] Peng, R.D.: Reproducible Research in Computational Science. Science 334(6060), 1226–1227 (Dec 2011). https://doi.org/10.1126/science.1213847
  [29] Pickering, A.: The mangle of practice: Time, agency, and science. University of Chicago Press (2010)
  [30] Robinson, D., Hayes, A.: broom: Convert Statistical Analysis Objects into Tidy Tibbles (2019), https://CRAN.R-project.org/package=broom
  [31] Shapin, S., Schaffer, S.: Leviathan and the air-pump: Hobbes, Boyle, and the experimental life (New in paper), vol. 32. Princeton University Press (2011)
  [32] Stodden, V.: What scientific idea is ready for retirement? (2014), https://www.edge.org/response-detail/25340.%202014
  [33] Stodden, V., Borwein, J., Bailey, D.H.: "Setting the Default to Reproducible" in Computational Science Research. SIAM News 46(5) (2013)
  [34] Sørensen, M.H., Urzyczyn, P.: Lectures on the Curry-Howard isomorphism, vol. 149. Elsevier (2006)
  [35] Wallach, J.D., Boyack, K.W., Ioannidis, J.P.A.: Reproducible research practices, transparency, and open access data in the biomedical literature, 2015–2017. PLOS Biology 16(11), e2006930 (Nov 2018). https://doi.org/10.1371/journal.pbio.2006930, https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.2006930
  [36] Westgate, M., Barrett, M., Grames, E., Gray, C., Hamilton, W.K., Kothe, E., McGuinness, L., O’Dea, R., Sanchez-Tojar, A., Schermann, M.: metaverse: Workflows for evidence synthesis projects (2019), https://github.com/rmetaverse/metaverse, R package version 0.0.1
  [37] Wickham, H.: R Packages: Organize, Test, Document, and Share Your Code. O’Reilly Media (2015), https://books.google.com.au/books?id=DqSxBwAAQBAJ
  [38] Wickham, H.: testthat: Get Started with Testing (2011)
  [39] Wickham, H.: tidyverse: Easily Install and Load the ’Tidyverse’ (2017), https://CRAN.R-project.org/package=tidyverse
  [40] Wilson, G., Aruliah, D.A., Brown, C.T., Chue Hong, N.P., Davis, M., Guy, R.T., Haddock, S.H.D., Huff, K.D., Mitchell, I.M., Plumbley, M.D., Waugh, B., White, E.P., Wilson, P.: Best Practices for Scientific Computing. PLoS Biology 12(1), e1001745 (Jan 2014). https://doi.org/10.1371/journal.pbio.1001745, https://dx.plos.org/10.1371/journal.pbio.1001745
  [41] Wilson, G., Bryan, J., Cranston, K., Kitzes, J., Nederbragt, L., Teal, T.K.: Good enough practices in scientific computing. PLOS Computational Biology 13(6), e1005510 (Jun 2017). https://doi.org/10.1371/journal.pcbi.1005510, http://dx.plos.org/10.1371/journal.pcbi.1005510
  [42] Zeileis, A.: CRAN Task Views. R News 5(1), 39–40 (2005), https://CRAN.R-project.org/doc/Rnews/