pith. sign in

arxiv: 2411.16088 · v4 · submitted 2024-11-25 · ❄️ cond-mat.stat-mech

Entropy production bounds for systems running computer programs

Pith reviewed 2026-05-23 17:03 UTC · model grok-4.3

classification ❄️ cond-mat.stat-mech
keywords mismatch costentropy productionthermodynamic costsorting algorithmscomputer programsheat flowsubroutinesdigital computers
0
0 comments X

The pith

Mismatch cost provides a lower bound on entropy production that scales at least linearly with heat flow and increases when time intervals are subdivided.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that mismatch cost, a universal lower bound on entropy production for any physical process, scales at least linearly with total heat flow in the worst case over initial distributions. It further shows that this bound does not decrease, and often increases, when a time interval is split into smaller subintervals. Building on this, the authors develop a framework to compute the minimal entropy production required to run a specific computer program on any physical system that implements a digital computer. They demonstrate the framework by comparing the costs of bubble sort and bucket sort, revealing how costs vary with input size and the presence of repeated entries. The framework also extends to handle programs that invoke subroutines.

Core claim

Mismatch cost is a universally applicable lower bound on the entropy production of any fixed physical process across a given time interval. For physical systems that implement modern digital computers, this bound equals the minimal entropy production associated with running a given program and can be computed via a general framework that maps program steps to thermodynamic costs. The framework is applied to show that bubble sort and bucket sort have different mismatch costs that depend on input size and structure, and it extends to programs calling subroutines.

What carries the argument

Mismatch cost (MMC), a universally applicable lower bound on entropy production for any fixed physical process across a given time interval.

If this is right

  • Mismatch cost scales at least linearly with the total heat flow in the worst case over initial distributions.
  • The mismatch cost lower bound over a given time interval never decreases if the interval is subdivided into sub-intervals, and often increases.
  • The minimal entropy production for running a program depends on input size and structure such as the presence of repeated entries.
  • Different algorithms incur different mismatch costs, as shown by direct comparison of bubble sort and bucket sort.
  • The framework extends directly to programs that call subroutines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Finer time resolution in the analysis of program execution would generally produce stricter lower bounds on thermodynamic cost.
  • The framework could be used to rank algorithms by their minimal energy requirements for large inputs without running them on hardware.
  • If the mapping from digital programs to physical systems holds, the same approach might quantify costs for programs that include conditional branches or loops.

Load-bearing premise

Any physical system that implements a modern digital computer can be mapped to the general framework for computing minimal entropy production associated with running a given program.

What would settle it

A measurement on a physical computer running bubble sort on a known input where total entropy production falls below the mismatch cost value computed by the framework for that program and input.

Figures

Figures reproduced from arXiv: 2411.16088 by Abhishek Yadav, David Wolpert, Francesco Caravelli.

Figure 1
Figure 1. Figure 1: Large mismatch cost contribution in the high-EP regime: Starting with arbitrary chosen values [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Stored program architecture: The control unit has access to multiple registers, including the instruction register and program counter. [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Two C programs are translated into their corresponding lower-level RASP representations. The RASP code explicitly shows how [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: State space of the heaviside program execution defined in [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Mismatch cost of the heaviside program. The input variable x follows a binomial distribution pin(x = k) = n k  α k (1 − α) n−k , parameterized by a Bernoulli parameter α, while the variable c is fixed at 5. This input distribution induces a distribution over all program variables, thereby determining the initial distribution pX0 over the program’s state space. (a) Mismatch cost incurred in each iteration … view at source ↗
Figure 6
Figure 6. Figure 6: (a) High-level source code for the bubble sort algorithm written in C. (b) Corresponding lower-level representation of the bubble [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: State space of the BubbleSort program for input arrays of length n = 3 (top) and n = 4 (bottom). The input arrays are per￾mutations of {1, . . . , n}. Each node represents a unique state of the program, and directed edges correspond to state transitions caused by the execution of individual instructions. Leaf nodes represent possible initial states of the program, determined by different input permutations… view at source ↗
Figure 8
Figure 8. Figure 8: Mismatch cost incurred at each iteration of the bubble sort algorithm, denoted by [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗
Figure 10
Figure 10. Figure 10: PMC trajectories for bubble sort for both permutations [PITH_FULL_IMAGE:figures/full_fig_p011_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: A source code for bucket sort written in MATLAP [PITH_FULL_IMAGE:figures/full_fig_p012_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Cumulative mismatch cost (MMC) at each iteration of the [PITH_FULL_IMAGE:figures/full_fig_p013_12.png] view at source ↗
read the original abstract

Mismatch cost (MMC) is a universally applicable lower bound on the entropy production (EP) of any fixed physical process across a given time interval. In the first part of the paper, we establish results concerning MMC to prove that it scales at least linearly with the total heat flow in the worst case over initial distributions. We also prove that the MMC lower bound over a given time interval never decreases if the time interval is subdivided into a sequence of sub-intervals, and that the bound often increases. In the second part of the paper, we introduce a general framework for computing the minimal EP (i.e., the MMC) associated with running a computer program on any physical system that implements a modern digital computer. We apply this general framework to compare MMC of running two canonical sorting algorithms, bubble sort and bucket sort. The framework enables us to investigate how thermodynamic cost depends on features like input size and structure (e.g., with or without repeated entries). Finally, we extend the framework to programs that call subroutines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that mismatch cost (MMC) is a lower bound on entropy production that scales at least linearly with total heat flow in the worst case over initial distributions, and that the MMC bound over an interval is non-decreasing (and often strictly increasing) under subdivision into sub-intervals. It then introduces a general framework for computing MMC when running a computer program on any physical system implementing a modern digital computer, applies the framework to compare the MMC of bubble sort versus bucket sort (including dependence on input size and repeated entries), and extends the framework to programs that call subroutines.

Significance. The MMC scaling and subdivision results, if rigorously derived, are general mathematical statements about a lower bound on entropy production that could apply across non-equilibrium thermodynamics. The computational framework, if the required mapping from arbitrary physical dynamics to the MMC stochastic process is established, would enable algorithm-specific thermodynamic bounds that depend on program structure rather than implementation details, offering a route to compare the minimal EP of different algorithms such as sorting routines.

major comments (2)
  1. [Second part (framework introduction and sorting applications)] The central claim of the second part—that MMC can be computed for running programs like bubble sort versus bucket sort on any physical implementation—rests on the existence of a general mapping from arbitrary physical dynamics (including those with continuous degrees of freedom or non-Markovian effects) to the MMC framework that preserves initial distributions, heat flows, and the stochastic process over computational states. The abstract asserts this mapping without indicating an explicit construction or proof of validity and tightness, which is load-bearing for all applications to concrete algorithms.
  2. [First part (MMC properties) and its linkage to second part] The MMC scaling result (linear in worst-case heat flow) and the subdivision monotonicity result are presented as independent of the computational mapping, but their utility for the paper's overall conclusions on program thermodynamics requires that the mapping preserve the quantities appearing in those bounds; no verification is indicated that the mapping does so for the sorting examples.
minor comments (1)
  1. [Abstract] The abstract refers to 'the MMC lower bound' without a forward reference to the precise definition or equation that will be used in the framework; adding an explicit pointer would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and for highlighting the need for greater clarity on the mapping from physical dynamics to the MMC framework. We address each major comment below, indicating revisions where appropriate to strengthen the manuscript.

read point-by-point responses
  1. Referee: The central claim of the second part—that MMC can be computed for running programs like bubble sort versus bucket sort on any physical implementation—rests on the existence of a general mapping from arbitrary physical dynamics (including those with continuous degrees of freedom or non-Markovian effects) to the MMC framework that preserves initial distributions, heat flows, and the stochastic process over computational states. The abstract asserts this mapping without indicating an explicit construction or proof of validity and tightness, which is load-bearing for all applications to concrete algorithms.

    Authors: We agree that the manuscript introduces the framework at a high level without a fully explicit construction or proof of the mapping for arbitrary physical systems (including continuous degrees of freedom or non-Markovian dynamics). The framework relies on coarse-graining any physical implementation of a digital computer to the discrete computational states while preserving the relevant thermodynamic quantities by definition of MMC. In the revision we will add an explicit section constructing the mapping, discussing its validity under the stated assumptions, and addressing tightness for the sorting applications. revision: yes

  2. Referee: The MMC scaling result (linear in worst-case heat flow) and the subdivision monotonicity result are presented as independent of the computational mapping, but their utility for the paper's overall conclusions on program thermodynamics requires that the mapping preserve the quantities appearing in those bounds; no verification is indicated that the mapping does so for the sorting examples.

    Authors: The MMC scaling and subdivision results are general mathematical statements derived independently of any specific mapping. In the computational framework, the mapping is constructed precisely so that initial distributions, heat flows, and the stochastic process over computational states are preserved; thus the bounds apply directly. We will add explicit verification of this preservation for the bubble-sort and bucket-sort examples in the revised manuscript to make the linkage clear. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper's first part establishes new mathematical results on MMC scaling linearly with heat flow and non-decreasing under subdivision; these are presented as independent proofs without reference to fitted parameters or self-referential definitions. The second part introduces a new general framework for mapping physical computer implementations to MMC calculations and applies it to sorting algorithms and subroutines. No equations, self-citations, or ansatzes are quoted that reduce claims to inputs by construction. The framework and applications constitute original content rather than renaming or self-definition. The mapping assertion is a modeling choice, not a circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only; no explicit free parameters, axioms, or invented entities are stated beyond the given definition of mismatch cost.

axioms (1)
  • domain assumption Mismatch cost is a universally applicable lower bound on the entropy production of any fixed physical process across a given time interval.
    Stated directly in the opening sentence of the abstract as the starting point for all subsequent results.

pith-pipeline@v0.9.0 · 5704 in / 1149 out tokens · 28492 ms · 2026-05-23T17:03:32.601465+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Thermodynamic Costs of Simple Linear Regression

    cond-mat.stat-mech 2026-05 unverdicted novelty 5.0

    Thermodynamic lower bounds are approximated for exact and SGD linear regression, producing energy-aware scaling laws for optimal training dataset size given a target generalization error.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · cited by 1 Pith paper

  1. [1]

    Heaviside program Consider the heaviside program introduced in Fig.3, whosestatespaceisdepictedinFig.4,generatedthroughsim- ulation of the corresponding low-level RASP code. The state of the program is given by the joint value of variablesx, c, a, pc , where x and c are the input variables,a is the con- ditional flag variable, andpc is the program counter...

  2. [2]

    Bubble sort program We now turn to a sorting programs for integer arrays of length n. Fig. 6 depicts the source code forBubbleSortpro- gram and the corresponding low level RASP like code. The state of the program consists of an arrayarr of length n, boolean flagswapped, loop counteri, and a temporary vari- able temp. The full state of the program at any s...

  3. [3]

    Stochastic thermodynamics, fluctuation theorems and molecular machines

    Udo Seifert. Stochastic thermodynamics, fluctuation theorems and molecular machines. Reports on Progress in Physics , 75(12):126001, 2012

  4. [4]

    Thermodynamics of information

    Juan MR Parrondo, Jordan M Horowitz, and Takahiro Sagawa. Thermodynamics of information. Nature Physics, 11(2):131– 139, 2015

  5. [5]

    The stochastic thermodynamics of compu- tation

    David H Wolpert. The stochastic thermodynamics of compu- tation. Journal of Physics A: Mathematical and Theoretical , 52(19):193001, 2019. See arXiv:1905.05669 for updated ver- sion

  6. [6]

    Dependence of in- tegrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes.Physical Review E, 104(5):054107, 2021

    Artemy Kolchinsky and David H Wolpert. Dependence of in- tegrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes.Physical Review E, 104(5):054107, 2021

  7. [7]

    On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings

    Leo Szilard. On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings. Behavioral Science, 9(4):301–310, 1964

  8. [8]

    Irreversibility and heat generation in the com- puting process

    Rolf Landauer. Irreversibility and heat generation in the com- puting process. IBM Journal of Research and Development , 5(3):183–191, 1961

  9. [9]

    The thermodynamics of computation—a review

    Charles H Bennett. The thermodynamics of computation—a review. International Journal of Theoretical Physics , 21:905– 940, 1982

  10. [10]

    Thermodynamic and logical reversibilities revisited

    Takahiro Sagawa. Thermodynamic and logical reversibilities revisited. Journal of Statistical Mechanics: Theory and Exper- iment, 2014(3):P03025, 2014

  11. [11]

    ChristianVandenBroeckandMassimilianoEsposito.Ensemble and trajectory thermodynamics: a brief introduction.Physica A, 418:6–16, 2015

  12. [12]

    Minimal entropy production rate of interact- ing systems.New Journal of Physics, 22(11):113013, 2020

    David H Wolpert. Minimal entropy production rate of interact- ing systems.New Journal of Physics, 22(11):113013, 2020

  13. [13]

    Dependenceofdissi- pationontheinitialdistributionoverstates

    ArtemyKolchinskyandDavidHWolpert. Dependenceofdissi- pationontheinitialdistributionoverstates. Journal of Statistical Mechanics: Theory and Experiment , 2017(8):083202, 2017

  14. [14]

    Thermodynamics of computingwithcircuits

    David H Wolpert and Artemy Kolchinsky. Thermodynamics of computingwithcircuits. New Journal of Physics,22(6):063047, 2020

  15. [15]

    Uncertainty relations and fluctuation theo- rems for bayes nets.Physical Review Letters, 125(20):200602, 2020

    David H Wolpert. Uncertainty relations and fluctuation theo- rems for bayes nets.Physical Review Letters, 125(20):200602, 2020

  16. [16]

    New Journal of Physics, 25(12):123013, 2023

    ThomasEOuldridgeandDavidHWolpert.Thermodynamicsof deterministicfiniteautomataoperatinglocallyandperiodically. New Journal of Physics, 25(12):123013, 2023

  17. [17]

    Minimum entropy production, detailed bal- ance and wasserstein distance for continuous-time markov pro- cesses

    Andreas Dechant. Minimum entropy production, detailed bal- ance and wasserstein distance for continuous-time markov pro- cesses. Journal of Physics A: Mathematical and Theoretical , 55(9):094001, 2022

  18. [18]

    Thermodynamic uncertainty relation for biomolecular processes

    Andre C Barato and Udo Seifert. Thermodynamic uncertainty relation for biomolecular processes. Physical review letters , 114(15):158101, 2015

  19. [19]

    Dissipation bounds all steady-state current fluctuations

    Todd R Gingrich, Jordan M Horowitz, Nikolay Perunov, and Jeremy L England. Dissipation bounds all steady-state current fluctuations. Physical review letters, 116(12):120601, 2016

  20. [20]

    Thermodynamic un- certainty relations constrain non-equilibrium fluctuations.Na- ture Physics, 16(1):15–20, 2020

    Jordan M Horowitz and Todd R Gingrich. Thermodynamic un- certainty relations constrain non-equilibrium fluctuations.Na- ture Physics, 16(1):15–20, 2020

  21. [21]

    A stochastic model of mathematicsandscience

    David H Wolpert and David B Kinney. A stochastic model of mathematicsandscience. Foundations of Physics,2024. Invited Contribution

  22. [22]

    Michael sipser

    Lance Fortnow. Michael sipser. introduction to the theory of computation. pws publishing company, boston etc. 1997, xv + 396 pp. Journal of Symbolic Logic , 64(1):403–403, March 15 1999

  23. [23]

    Cambridge University Press, Cambridge, 2009

    Sanjeev Arora and Boaz Barak.Computational Complexity: A Modern Approach. Cambridge University Press, Cambridge, 2009

  24. [24]

    Lecturenotesoninformation theory

    YuryPolyanskiyandYihongWu. Lecturenotesoninformation theory. Lecture Notes for ECE563 (UIUC) and,6(2012-2016):7, 2014

  25. [25]

    arXiv preprint arXiv:2208.06895, 2022

    ThomasEOuldridgeandDavidHWolpert.Thermodynamicsof deterministicfiniteautomataoperatinglocallyandperiodically. arXiv preprint arXiv:2208.06895, 2022

  26. [26]

    Speed limit for classical stochastic processes

    Naoto Shiraishi, Ken Funo, and Keiji Saito. Speed limit for classical stochastic processes. Physical review letters , 121(7):070601, 2018

  27. [27]

    Physical Review E, 102(6):062132, 2020

    VanTuanVo,TanVanVu,andYoshihikoHasegawa.Unifiedap- proach to classical speed limit and thermodynamic uncertainty relation. Physical Review E, 102(6):062132, 2020

  28. [28]

    Unified thermodynamic–kineticuncertaintyrelation

    Tan Van Vu, Yoshihiko Hasegawa, et al. Unified thermodynamic–kineticuncertaintyrelation. Journal of Physics A: Mathematical and Theoretical, 55(40):405004, 2022

  29. [29]

    Kinetic uncertainty relation on first-passage time for accumulated current.Physical Review E, 103(5):L050103, 2021

    Ken Hiura and Shin-ichi Sasa. Kinetic uncertainty relation on first-passage time for accumulated current.Physical Review E, 103(5):L050103, 2021. Appendix A: Methods of estimation of prior distribution Considerasystemwithdiscretespace X andlet ∆X denotethesimplexonthestatespace. Supposethataninitialdistribution pX0 ∈ ∆X transforms to a final distributionp...

  30. [30]

    2 subject to the constrain thatP x∈X pX0(x) = 1 , we use the method of Lagrange multipliers

    By using an iterative method In order to find the minimum ofC(pX0) in Eq. 2 subject to the constrain thatP x∈X pX0(x) = 1 , we use the method of Lagrange multipliers. Evaluating the partial derivative ofC(pX0) with respect topX0(x′) for anx′ ∈ X; ∂ ∂pX0(x′) C(pX0) = f(x′) − X x∈X G(x|x′) lnGpX0(x) + lnpX0(x′) (A7) The normalization constraintg(pX0) = P x ...

  31. [31]

    Start with a randomly initialized probability vector q ∈ ∆X

    By using Monte Carlo method To minimize the cost function A2, we employ a simulated annealing algorithm that iteratively refines a candidate distribution over the simplex∆X while gradually reducing randomness to ensure convergence. Start with a randomly initialized probability vector q ∈ ∆X. Set the initial temperature T0, cooling rate α < 1, and interpol...

  32. [32]

    Sample a new random distributionr ∈ ∆X

  33. [33]

    Form a candidate distribution: q′ = (1 − s)r + sq, and normalize if necessary

  34. [34]

    Evaluate the cost: C(q′) = S(Gq′) − S(q′) + ⟨f ⟩q′

  35. [35]

    Accept q′ with probability: P = ( 1, if C(q′) < C(q), exp − C(q′)−C(q) T , otherwise. 17 Commands used in the RASP Symbol Definition LOAD Rn, value Load a constant or memory value into registerRn STORE Rn, addr Store value in registerRnto memory addressaddr READ Rn, addr Read memory value ataddrinto registerRn ADD R1, R2, R3 R3 ← R1 + R2 SUB R1, R2, R3 R3...

  36. [36]

    If accepted, setq ← q′ and update the best cost

  37. [37]

    The algorithm terminates after a fixed number of iterations or when successive updates no longer improve the cost

    Update the temperature:T ← αT, and reduces gradually. The algorithm terminates after a fixed number of iterations or when successive updates no longer improve the cost. The final distribution q approximates the minimizer ofC(pX0). This annealing strategy ensures global exploration early on and local refinement as the temperature cools, making it suitable ...

  38. [38]

    In this setting, the prior distributionqX0 is unique

    Lower bound on the worst-case mismatch cost To derive a lower bound on the MMC as a function of the state-dependent termf, we focus on the case where the mapG induces a single island decomposition. In this setting, the prior distributionqX0 is unique. The prior distributionqX0 satisfies Eq. A9, derived using the method of Lagrange multipliers: ln qX0(x) +...

  39. [39]

    output” y ∈ Y given“input

    Proof that prior is always in the interior of the simplex Consideraconditionaldistribution P (y|x)thatspecifiestheprobabilityof“output” y ∈ Y given“input”x ∈ X,where X and Y are finite. Given some Z ⊆ X , the island decompositionLZ(P ) of P, and anyp ∈ ∆X, let p(c) = P x∈c p(x) indicate the total probability within islandc, and pc(x) := ( p(x) p(c) if x ∈...

  40. [40]

    Define ˜f : Z → R via ˜f(x) = f(x) for x ∈ Z

  41. [41]

    In addition, for any distributionp ∈ ∆X with supp p ⊆ Z, let ˜p be a distribution overZ defined via˜p(x) = p(x) for x ∈ Z

    Define the conditional distribution˜P (y|x) for y ∈ Y , x ∈ Z via ˜P (y|x) = P (y|x) for ally ∈ Y , x ∈ Z. In addition, for any distributionp ∈ ∆X with supp p ⊆ Z, let ˜p be a distribution overZ defined via˜p(x) = p(x) for x ∈ Z. Now, by inspection, it can be verified that for anyp ∈ ∆X with supp p ⊆ Z, C(p) = S( ˜P ˜p) − S(˜p) + E˜p[ ˜f] =: ˜C(˜p) (B33) ...