Entropy production bounds for systems running computer programs

Abhishek Yadav; David Wolpert; Francesco Caravelli

arxiv: 2411.16088 · v4 · submitted 2024-11-25 · ❄️ cond-mat.stat-mech

Entropy production bounds for systems running computer programs

Abhishek Yadav , Francesco Caravelli , David Wolpert This is my paper

Pith reviewed 2026-05-23 17:03 UTC · model grok-4.3

classification ❄️ cond-mat.stat-mech

keywords mismatch costentropy productionthermodynamic costsorting algorithmscomputer programsheat flowsubroutinesdigital computers

0 comments

The pith

Mismatch cost provides a lower bound on entropy production that scales at least linearly with heat flow and increases when time intervals are subdivided.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that mismatch cost, a universal lower bound on entropy production for any physical process, scales at least linearly with total heat flow in the worst case over initial distributions. It further shows that this bound does not decrease, and often increases, when a time interval is split into smaller subintervals. Building on this, the authors develop a framework to compute the minimal entropy production required to run a specific computer program on any physical system that implements a digital computer. They demonstrate the framework by comparing the costs of bubble sort and bucket sort, revealing how costs vary with input size and the presence of repeated entries. The framework also extends to handle programs that invoke subroutines.

Core claim

Mismatch cost is a universally applicable lower bound on the entropy production of any fixed physical process across a given time interval. For physical systems that implement modern digital computers, this bound equals the minimal entropy production associated with running a given program and can be computed via a general framework that maps program steps to thermodynamic costs. The framework is applied to show that bubble sort and bucket sort have different mismatch costs that depend on input size and structure, and it extends to programs calling subroutines.

What carries the argument

Mismatch cost (MMC), a universally applicable lower bound on entropy production for any fixed physical process across a given time interval.

If this is right

Mismatch cost scales at least linearly with the total heat flow in the worst case over initial distributions.
The mismatch cost lower bound over a given time interval never decreases if the interval is subdivided into sub-intervals, and often increases.
The minimal entropy production for running a program depends on input size and structure such as the presence of repeated entries.
Different algorithms incur different mismatch costs, as shown by direct comparison of bubble sort and bucket sort.
The framework extends directly to programs that call subroutines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Finer time resolution in the analysis of program execution would generally produce stricter lower bounds on thermodynamic cost.
The framework could be used to rank algorithms by their minimal energy requirements for large inputs without running them on hardware.
If the mapping from digital programs to physical systems holds, the same approach might quantify costs for programs that include conditional branches or loops.

Load-bearing premise

Any physical system that implements a modern digital computer can be mapped to the general framework for computing minimal entropy production associated with running a given program.

What would settle it

A measurement on a physical computer running bubble sort on a known input where total entropy production falls below the mismatch cost value computed by the framework for that program and input.

Figures

Figures reproduced from arXiv: 2411.16088 by Abhishek Yadav, David Wolpert, Francesco Caravelli.

**Figure 2.** Figure 2: Stored program architecture: The control unit has access to multiple registers, including the instruction register and program counter. [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Two C programs are translated into their corresponding lower-level RASP representations. The RASP code explicitly shows how [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: State space of the heaviside program execution defined in [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 5.** Figure 5: Mismatch cost of the heaviside program. The input variable x follows a binomial distribution pin(x = k) = n k α k (1 − α) n−k , parameterized by a Bernoulli parameter α, while the variable c is fixed at 5. This input distribution induces a distribution over all program variables, thereby determining the initial distribution pX0 over the program’s state space. (a) Mismatch cost incurred in each iteration … view at source ↗

**Figure 6.** Figure 6: (a) High-level source code for the bubble sort algorithm written in C. (b) Corresponding lower-level representation of the bubble [PITH_FULL_IMAGE:figures/full_fig_p009_6.png] view at source ↗

**Figure 7.** Figure 7: State space of the BubbleSort program for input arrays of length n = 3 (top) and n = 4 (bottom). The input arrays are permutations of {1, . . . , n}. Each node represents a unique state of the program, and directed edges correspond to state transitions caused by the execution of individual instructions. Leaf nodes represent possible initial states of the program, determined by different input permutations… view at source ↗

**Figure 8.** Figure 8: Mismatch cost incurred at each iteration of the bubble sort algorithm, denoted by [PITH_FULL_IMAGE:figures/full_fig_p011_8.png] view at source ↗

**Figure 10.** Figure 10: PMC trajectories for bubble sort for both permutations [PITH_FULL_IMAGE:figures/full_fig_p011_10.png] view at source ↗

**Figure 11.** Figure 11: A source code for bucket sort written in MATLAP [PITH_FULL_IMAGE:figures/full_fig_p012_11.png] view at source ↗

**Figure 12.** Figure 12: Cumulative mismatch cost (MMC) at each iteration of the [PITH_FULL_IMAGE:figures/full_fig_p013_12.png] view at source ↗

read the original abstract

Mismatch cost (MMC) is a universally applicable lower bound on the entropy production (EP) of any fixed physical process across a given time interval. In the first part of the paper, we establish results concerning MMC to prove that it scales at least linearly with the total heat flow in the worst case over initial distributions. We also prove that the MMC lower bound over a given time interval never decreases if the time interval is subdivided into a sequence of sub-intervals, and that the bound often increases. In the second part of the paper, we introduce a general framework for computing the minimal EP (i.e., the MMC) associated with running a computer program on any physical system that implements a modern digital computer. We apply this general framework to compare MMC of running two canonical sorting algorithms, bubble sort and bucket sort. The framework enables us to investigate how thermodynamic cost depends on features like input size and structure (e.g., with or without repeated entries). Finally, we extend the framework to programs that call subroutines.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper proves two MMC properties and sketches a framework to bound entropy production for concrete sorting algorithms, but the claimed generality to any physical computer implementation is asserted rather than shown.

read the letter

The first part establishes that mismatch cost scales at least linearly with total heat flow in the worst case over initial distributions, and that subdividing a time interval never lowers the MMC bound and often raises it. These are clean mathematical statements on an existing quantity. The second part supplies a framework that maps program structure—input size, repeats, and subroutine calls—onto MMC calculations for bubble sort versus bucket sort. That explicit mapping for real algorithms is the new element; earlier stochastic thermodynamics work did not do this for specific code. The sorting comparisons and the subroutine extension are the parts that could interest people who want to quantify thermodynamic cost of computation. The framework itself is presented at a level that lets one see how input structure enters the bound. The main soft spot is the assertion that the framework applies to any physical system implementing a modern digital computer. The abstract states the mapping exists but gives no construction or proof that it preserves the initial distributions, heat flows, and Markov property needed for the MMC bound to remain valid across continuous or non-Markovian realizations. If that step is only sketched, the concrete sorting results rest on an unverified generality. The MMC scaling and subdivision results stand on their own and do not depend on the mapping. This paper is for readers already working on thermodynamic bounds for computation who want to see how MMC can be computed for small programs. It is not yet a finished tool for arbitrary hardware. The MMC properties are worth referee time even if the application section needs tightening on the physical embedding. I would send it to review rather than desk reject.

Referee Report

2 major / 1 minor

Summary. The paper claims that mismatch cost (MMC) is a lower bound on entropy production that scales at least linearly with total heat flow in the worst case over initial distributions, and that the MMC bound over an interval is non-decreasing (and often strictly increasing) under subdivision into sub-intervals. It then introduces a general framework for computing MMC when running a computer program on any physical system implementing a modern digital computer, applies the framework to compare the MMC of bubble sort versus bucket sort (including dependence on input size and repeated entries), and extends the framework to programs that call subroutines.

Significance. The MMC scaling and subdivision results, if rigorously derived, are general mathematical statements about a lower bound on entropy production that could apply across non-equilibrium thermodynamics. The computational framework, if the required mapping from arbitrary physical dynamics to the MMC stochastic process is established, would enable algorithm-specific thermodynamic bounds that depend on program structure rather than implementation details, offering a route to compare the minimal EP of different algorithms such as sorting routines.

major comments (2)

[Second part (framework introduction and sorting applications)] The central claim of the second part—that MMC can be computed for running programs like bubble sort versus bucket sort on any physical implementation—rests on the existence of a general mapping from arbitrary physical dynamics (including those with continuous degrees of freedom or non-Markovian effects) to the MMC framework that preserves initial distributions, heat flows, and the stochastic process over computational states. The abstract asserts this mapping without indicating an explicit construction or proof of validity and tightness, which is load-bearing for all applications to concrete algorithms.
[First part (MMC properties) and its linkage to second part] The MMC scaling result (linear in worst-case heat flow) and the subdivision monotonicity result are presented as independent of the computational mapping, but their utility for the paper's overall conclusions on program thermodynamics requires that the mapping preserve the quantities appearing in those bounds; no verification is indicated that the mapping does so for the sorting examples.

minor comments (1)

[Abstract] The abstract refers to 'the MMC lower bound' without a forward reference to the precise definition or equation that will be used in the framework; adding an explicit pointer would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and for highlighting the need for greater clarity on the mapping from physical dynamics to the MMC framework. We address each major comment below, indicating revisions where appropriate to strengthen the manuscript.

read point-by-point responses

Referee: The central claim of the second part—that MMC can be computed for running programs like bubble sort versus bucket sort on any physical implementation—rests on the existence of a general mapping from arbitrary physical dynamics (including those with continuous degrees of freedom or non-Markovian effects) to the MMC framework that preserves initial distributions, heat flows, and the stochastic process over computational states. The abstract asserts this mapping without indicating an explicit construction or proof of validity and tightness, which is load-bearing for all applications to concrete algorithms.

Authors: We agree that the manuscript introduces the framework at a high level without a fully explicit construction or proof of the mapping for arbitrary physical systems (including continuous degrees of freedom or non-Markovian dynamics). The framework relies on coarse-graining any physical implementation of a digital computer to the discrete computational states while preserving the relevant thermodynamic quantities by definition of MMC. In the revision we will add an explicit section constructing the mapping, discussing its validity under the stated assumptions, and addressing tightness for the sorting applications. revision: yes
Referee: The MMC scaling result (linear in worst-case heat flow) and the subdivision monotonicity result are presented as independent of the computational mapping, but their utility for the paper's overall conclusions on program thermodynamics requires that the mapping preserve the quantities appearing in those bounds; no verification is indicated that the mapping does so for the sorting examples.

Authors: The MMC scaling and subdivision results are general mathematical statements derived independently of any specific mapping. In the computational framework, the mapping is constructed precisely so that initial distributions, heat flows, and the stochastic process over computational states are preserved; thus the bounds apply directly. We will add explicit verification of this preservation for the bubble-sort and bucket-sort examples in the revised manuscript to make the linkage clear. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected in derivation chain

full rationale

The paper's first part establishes new mathematical results on MMC scaling linearly with heat flow and non-decreasing under subdivision; these are presented as independent proofs without reference to fitted parameters or self-referential definitions. The second part introduces a new general framework for mapping physical computer implementations to MMC calculations and applies it to sorting algorithms and subroutines. No equations, self-citations, or ansatzes are quoted that reduce claims to inputs by construction. The framework and applications constitute original content rather than renaming or self-definition. The mapping assertion is a modeling choice, not a circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only; no explicit free parameters, axioms, or invented entities are stated beyond the given definition of mismatch cost.

axioms (1)

domain assumption Mismatch cost is a universally applicable lower bound on the entropy production of any fixed physical process across a given time interval.
Stated directly in the opening sentence of the abstract as the starting point for all subsequent results.

pith-pipeline@v0.9.0 · 5704 in / 1149 out tokens · 28492 ms · 2026-05-23T17:03:32.601465+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Mismatch cost (MMC) … MC(pX0) = D(pX0 ∥ qX0) − D(GpX0 ∥ GqX0)
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

RASP model … adjacency matrix G … pXi+1 = G pXi

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

The Thermodynamic Costs of Simple Linear Regression
cond-mat.stat-mech 2026-05 unverdicted novelty 5.0

Thermodynamic lower bounds are approximated for exact and SGD linear regression, producing energy-aware scaling laws for optimal training dataset size given a target generalization error.

Reference graph

Works this paper leans on

41 extracted references · 41 canonical work pages · cited by 1 Pith paper

[1]

Heaviside program Consider the heaviside program introduced in Fig.3, whosestatespaceisdepictedinFig.4,generatedthroughsim- ulation of the corresponding low-level RASP code. The state of the program is given by the joint value of variablesx, c, a, pc , where x and c are the input variables,a is the con- ditional flag variable, andpc is the program counter...

work page
[2]

Bubble sort program We now turn to a sorting programs for integer arrays of length n. Fig. 6 depicts the source code forBubbleSortpro- gram and the corresponding low level RASP like code. The state of the program consists of an arrayarr of length n, boolean flagswapped, loop counteri, and a temporary vari- able temp. The full state of the program at any s...

work page
[3]

Stochastic thermodynamics, fluctuation theorems and molecular machines

Udo Seifert. Stochastic thermodynamics, fluctuation theorems and molecular machines. Reports on Progress in Physics , 75(12):126001, 2012

work page 2012
[4]

Thermodynamics of information

Juan MR Parrondo, Jordan M Horowitz, and Takahiro Sagawa. Thermodynamics of information. Nature Physics, 11(2):131– 139, 2015

work page 2015
[5]

The stochastic thermodynamics of compu- tation

David H Wolpert. The stochastic thermodynamics of compu- tation. Journal of Physics A: Mathematical and Theoretical , 52(19):193001, 2019. See arXiv:1905.05669 for updated ver- sion

work page arXiv 2019
[6]

Dependence of in- tegrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes.Physical Review E, 104(5):054107, 2021

Artemy Kolchinsky and David H Wolpert. Dependence of in- tegrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes.Physical Review E, 104(5):054107, 2021

work page 2021
[7]

On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings

Leo Szilard. On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings. Behavioral Science, 9(4):301–310, 1964

work page 1964
[8]

Irreversibility and heat generation in the com- puting process

Rolf Landauer. Irreversibility and heat generation in the com- puting process. IBM Journal of Research and Development , 5(3):183–191, 1961

work page 1961
[9]

The thermodynamics of computation—a review

Charles H Bennett. The thermodynamics of computation—a review. International Journal of Theoretical Physics , 21:905– 940, 1982

work page 1982
[10]

Thermodynamic and logical reversibilities revisited

Takahiro Sagawa. Thermodynamic and logical reversibilities revisited. Journal of Statistical Mechanics: Theory and Exper- iment, 2014(3):P03025, 2014

work page 2014
[11]

ChristianVandenBroeckandMassimilianoEsposito.Ensemble and trajectory thermodynamics: a brief introduction.Physica A, 418:6–16, 2015

work page 2015
[12]

Minimal entropy production rate of interact- ing systems.New Journal of Physics, 22(11):113013, 2020

David H Wolpert. Minimal entropy production rate of interact- ing systems.New Journal of Physics, 22(11):113013, 2020

work page 2020
[13]

Dependenceofdissi- pationontheinitialdistributionoverstates

ArtemyKolchinskyandDavidHWolpert. Dependenceofdissi- pationontheinitialdistributionoverstates. Journal of Statistical Mechanics: Theory and Experiment , 2017(8):083202, 2017

work page 2017
[14]

Thermodynamics of computingwithcircuits

David H Wolpert and Artemy Kolchinsky. Thermodynamics of computingwithcircuits. New Journal of Physics,22(6):063047, 2020

work page 2020
[15]

Uncertainty relations and fluctuation theo- rems for bayes nets.Physical Review Letters, 125(20):200602, 2020

David H Wolpert. Uncertainty relations and fluctuation theo- rems for bayes nets.Physical Review Letters, 125(20):200602, 2020

work page 2020
[16]

New Journal of Physics, 25(12):123013, 2023

ThomasEOuldridgeandDavidHWolpert.Thermodynamicsof deterministicfiniteautomataoperatinglocallyandperiodically. New Journal of Physics, 25(12):123013, 2023

work page 2023
[17]

Minimum entropy production, detailed bal- ance and wasserstein distance for continuous-time markov pro- cesses

Andreas Dechant. Minimum entropy production, detailed bal- ance and wasserstein distance for continuous-time markov pro- cesses. Journal of Physics A: Mathematical and Theoretical , 55(9):094001, 2022

work page 2022
[18]

Thermodynamic uncertainty relation for biomolecular processes

Andre C Barato and Udo Seifert. Thermodynamic uncertainty relation for biomolecular processes. Physical review letters , 114(15):158101, 2015

work page 2015
[19]

Dissipation bounds all steady-state current fluctuations

Todd R Gingrich, Jordan M Horowitz, Nikolay Perunov, and Jeremy L England. Dissipation bounds all steady-state current fluctuations. Physical review letters, 116(12):120601, 2016

work page 2016
[20]

Thermodynamic un- certainty relations constrain non-equilibrium fluctuations.Na- ture Physics, 16(1):15–20, 2020

Jordan M Horowitz and Todd R Gingrich. Thermodynamic un- certainty relations constrain non-equilibrium fluctuations.Na- ture Physics, 16(1):15–20, 2020

work page 2020
[21]

A stochastic model of mathematicsandscience

David H Wolpert and David B Kinney. A stochastic model of mathematicsandscience. Foundations of Physics,2024. Invited Contribution

work page 2024
[22]

Michael sipser

Lance Fortnow. Michael sipser. introduction to the theory of computation. pws publishing company, boston etc. 1997, xv + 396 pp. Journal of Symbolic Logic , 64(1):403–403, March 15 1999

work page 1997
[23]

Cambridge University Press, Cambridge, 2009

Sanjeev Arora and Boaz Barak.Computational Complexity: A Modern Approach. Cambridge University Press, Cambridge, 2009

work page 2009
[24]

Lecturenotesoninformation theory

YuryPolyanskiyandYihongWu. Lecturenotesoninformation theory. Lecture Notes for ECE563 (UIUC) and,6(2012-2016):7, 2014

work page 2012
[25]

arXiv preprint arXiv:2208.06895, 2022

ThomasEOuldridgeandDavidHWolpert.Thermodynamicsof deterministicfiniteautomataoperatinglocallyandperiodically. arXiv preprint arXiv:2208.06895, 2022

work page arXiv 2022
[26]

Speed limit for classical stochastic processes

Naoto Shiraishi, Ken Funo, and Keĳi Saito. Speed limit for classical stochastic processes. Physical review letters , 121(7):070601, 2018

work page 2018
[27]

Physical Review E, 102(6):062132, 2020

VanTuanVo,TanVanVu,andYoshihikoHasegawa.Unifiedap- proach to classical speed limit and thermodynamic uncertainty relation. Physical Review E, 102(6):062132, 2020

work page 2020
[28]

Unified thermodynamic–kineticuncertaintyrelation

Tan Van Vu, Yoshihiko Hasegawa, et al. Unified thermodynamic–kineticuncertaintyrelation. Journal of Physics A: Mathematical and Theoretical, 55(40):405004, 2022

work page 2022
[29]

Kinetic uncertainty relation on first-passage time for accumulated current.Physical Review E, 103(5):L050103, 2021

Ken Hiura and Shin-ichi Sasa. Kinetic uncertainty relation on first-passage time for accumulated current.Physical Review E, 103(5):L050103, 2021. Appendix A: Methods of estimation of prior distribution Considerasystemwithdiscretespace X andlet ∆X denotethesimplexonthestatespace. Supposethataninitialdistribution pX0 ∈ ∆X transforms to a final distributionp...

work page 2021
[30]

2 subject to the constrain thatP x∈X pX0(x) = 1 , we use the method of Lagrange multipliers

By using an iterative method In order to find the minimum ofC(pX0) in Eq. 2 subject to the constrain thatP x∈X pX0(x) = 1 , we use the method of Lagrange multipliers. Evaluating the partial derivative ofC(pX0) with respect topX0(x′) for anx′ ∈ X; ∂ ∂pX0(x′) C(pX0) = f(x′) − X x∈X G(x|x′) lnGpX0(x) + lnpX0(x′) (A7) The normalization constraintg(pX0) = P x ...

work page
[31]

Start with a randomly initialized probability vector q ∈ ∆X

By using Monte Carlo method To minimize the cost function A2, we employ a simulated annealing algorithm that iteratively refines a candidate distribution over the simplex∆X while gradually reducing randomness to ensure convergence. Start with a randomly initialized probability vector q ∈ ∆X. Set the initial temperature T0, cooling rate α < 1, and interpol...

work page
[32]

Sample a new random distributionr ∈ ∆X

work page
[33]

Form a candidate distribution: q′ = (1 − s)r + sq, and normalize if necessary

work page
[34]

Evaluate the cost: C(q′) = S(Gq′) − S(q′) + ⟨f ⟩q′

work page
[35]

Accept q′ with probability: P = ( 1, if C(q′) < C(q), exp − C(q′)−C(q) T , otherwise. 17 Commands used in the RASP Symbol Definition LOAD Rn, value Load a constant or memory value into registerRn STORE Rn, addr Store value in registerRnto memory addressaddr READ Rn, addr Read memory value ataddrinto registerRn ADD R1, R2, R3 R3 ← R1 + R2 SUB R1, R2, R3 R3...

work page
[36]

If accepted, setq ← q′ and update the best cost

work page
[37]

The algorithm terminates after a fixed number of iterations or when successive updates no longer improve the cost

Update the temperature:T ← αT, and reduces gradually. The algorithm terminates after a fixed number of iterations or when successive updates no longer improve the cost. The final distribution q approximates the minimizer ofC(pX0). This annealing strategy ensures global exploration early on and local refinement as the temperature cools, making it suitable ...

work page
[38]

In this setting, the prior distributionqX0 is unique

Lower bound on the worst-case mismatch cost To derive a lower bound on the MMC as a function of the state-dependent termf, we focus on the case where the mapG induces a single island decomposition. In this setting, the prior distributionqX0 is unique. The prior distributionqX0 satisfies Eq. A9, derived using the method of Lagrange multipliers: ln qX0(x) +...

work page
[39]

output” y ∈ Y given“input

Proof that prior is always in the interior of the simplex Consideraconditionaldistribution P (y|x)thatspecifiestheprobabilityof“output” y ∈ Y given“input”x ∈ X,where X and Y are finite. Given some Z ⊆ X , the island decompositionLZ(P ) of P, and anyp ∈ ∆X, let p(c) = P x∈c p(x) indicate the total probability within islandc, and pc(x) := ( p(x) p(c) if x ∈...

work page
[40]

Define ˜f : Z → R via ˜f(x) = f(x) for x ∈ Z

work page
[41]

In addition, for any distributionp ∈ ∆X with supp p ⊆ Z, let ˜p be a distribution overZ defined via˜p(x) = p(x) for x ∈ Z

Define the conditional distribution˜P (y|x) for y ∈ Y , x ∈ Z via ˜P (y|x) = P (y|x) for ally ∈ Y , x ∈ Z. In addition, for any distributionp ∈ ∆X with supp p ⊆ Z, let ˜p be a distribution overZ defined via˜p(x) = p(x) for x ∈ Z. Now, by inspection, it can be verified that for anyp ∈ ∆X with supp p ⊆ Z, C(p) = S( ˜P ˜p) − S(˜p) + E˜p[ ˜f] =: ˜C(˜p) (B33) ...

work page

[1] [1]

Heaviside program Consider the heaviside program introduced in Fig.3, whosestatespaceisdepictedinFig.4,generatedthroughsim- ulation of the corresponding low-level RASP code. The state of the program is given by the joint value of variablesx, c, a, pc , where x and c are the input variables,a is the con- ditional flag variable, andpc is the program counter...

work page

[2] [2]

Bubble sort program We now turn to a sorting programs for integer arrays of length n. Fig. 6 depicts the source code forBubbleSortpro- gram and the corresponding low level RASP like code. The state of the program consists of an arrayarr of length n, boolean flagswapped, loop counteri, and a temporary vari- able temp. The full state of the program at any s...

work page

[3] [3]

Stochastic thermodynamics, fluctuation theorems and molecular machines

Udo Seifert. Stochastic thermodynamics, fluctuation theorems and molecular machines. Reports on Progress in Physics , 75(12):126001, 2012

work page 2012

[4] [4]

Thermodynamics of information

Juan MR Parrondo, Jordan M Horowitz, and Takahiro Sagawa. Thermodynamics of information. Nature Physics, 11(2):131– 139, 2015

work page 2015

[5] [5]

The stochastic thermodynamics of compu- tation

David H Wolpert. The stochastic thermodynamics of compu- tation. Journal of Physics A: Mathematical and Theoretical , 52(19):193001, 2019. See arXiv:1905.05669 for updated ver- sion

work page arXiv 2019

[6] [6]

Dependence of in- tegrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes.Physical Review E, 104(5):054107, 2021

Artemy Kolchinsky and David H Wolpert. Dependence of in- tegrated, instantaneous, and fluctuating entropy production on the initial state in quantum and classical processes.Physical Review E, 104(5):054107, 2021

work page 2021

[7] [7]

On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings

Leo Szilard. On the decrease of entropy in a thermodynamic system by the intervention of intelligent beings. Behavioral Science, 9(4):301–310, 1964

work page 1964

[8] [8]

Irreversibility and heat generation in the com- puting process

Rolf Landauer. Irreversibility and heat generation in the com- puting process. IBM Journal of Research and Development , 5(3):183–191, 1961

work page 1961

[9] [9]

The thermodynamics of computation—a review

Charles H Bennett. The thermodynamics of computation—a review. International Journal of Theoretical Physics , 21:905– 940, 1982

work page 1982

[10] [10]

Thermodynamic and logical reversibilities revisited

Takahiro Sagawa. Thermodynamic and logical reversibilities revisited. Journal of Statistical Mechanics: Theory and Exper- iment, 2014(3):P03025, 2014

work page 2014

[11] [11]

ChristianVandenBroeckandMassimilianoEsposito.Ensemble and trajectory thermodynamics: a brief introduction.Physica A, 418:6–16, 2015

work page 2015

[12] [12]

Minimal entropy production rate of interact- ing systems.New Journal of Physics, 22(11):113013, 2020

David H Wolpert. Minimal entropy production rate of interact- ing systems.New Journal of Physics, 22(11):113013, 2020

work page 2020

[13] [13]

Dependenceofdissi- pationontheinitialdistributionoverstates

ArtemyKolchinskyandDavidHWolpert. Dependenceofdissi- pationontheinitialdistributionoverstates. Journal of Statistical Mechanics: Theory and Experiment , 2017(8):083202, 2017

work page 2017

[14] [14]

Thermodynamics of computingwithcircuits

David H Wolpert and Artemy Kolchinsky. Thermodynamics of computingwithcircuits. New Journal of Physics,22(6):063047, 2020

work page 2020

[15] [15]

Uncertainty relations and fluctuation theo- rems for bayes nets.Physical Review Letters, 125(20):200602, 2020

David H Wolpert. Uncertainty relations and fluctuation theo- rems for bayes nets.Physical Review Letters, 125(20):200602, 2020

work page 2020

[16] [16]

New Journal of Physics, 25(12):123013, 2023

ThomasEOuldridgeandDavidHWolpert.Thermodynamicsof deterministicfiniteautomataoperatinglocallyandperiodically. New Journal of Physics, 25(12):123013, 2023

work page 2023

[17] [17]

Minimum entropy production, detailed bal- ance and wasserstein distance for continuous-time markov pro- cesses

Andreas Dechant. Minimum entropy production, detailed bal- ance and wasserstein distance for continuous-time markov pro- cesses. Journal of Physics A: Mathematical and Theoretical , 55(9):094001, 2022

work page 2022

[18] [18]

Thermodynamic uncertainty relation for biomolecular processes

Andre C Barato and Udo Seifert. Thermodynamic uncertainty relation for biomolecular processes. Physical review letters , 114(15):158101, 2015

work page 2015

[19] [19]

Dissipation bounds all steady-state current fluctuations

Todd R Gingrich, Jordan M Horowitz, Nikolay Perunov, and Jeremy L England. Dissipation bounds all steady-state current fluctuations. Physical review letters, 116(12):120601, 2016

work page 2016

[20] [20]

Thermodynamic un- certainty relations constrain non-equilibrium fluctuations.Na- ture Physics, 16(1):15–20, 2020

Jordan M Horowitz and Todd R Gingrich. Thermodynamic un- certainty relations constrain non-equilibrium fluctuations.Na- ture Physics, 16(1):15–20, 2020

work page 2020

[21] [21]

A stochastic model of mathematicsandscience

David H Wolpert and David B Kinney. A stochastic model of mathematicsandscience. Foundations of Physics,2024. Invited Contribution

work page 2024

[22] [22]

Michael sipser

Lance Fortnow. Michael sipser. introduction to the theory of computation. pws publishing company, boston etc. 1997, xv + 396 pp. Journal of Symbolic Logic , 64(1):403–403, March 15 1999

work page 1997

[23] [23]

Cambridge University Press, Cambridge, 2009

Sanjeev Arora and Boaz Barak.Computational Complexity: A Modern Approach. Cambridge University Press, Cambridge, 2009

work page 2009

[24] [24]

Lecturenotesoninformation theory

YuryPolyanskiyandYihongWu. Lecturenotesoninformation theory. Lecture Notes for ECE563 (UIUC) and,6(2012-2016):7, 2014

work page 2012

[25] [25]

arXiv preprint arXiv:2208.06895, 2022

ThomasEOuldridgeandDavidHWolpert.Thermodynamicsof deterministicfiniteautomataoperatinglocallyandperiodically. arXiv preprint arXiv:2208.06895, 2022

work page arXiv 2022

[26] [26]

Speed limit for classical stochastic processes

Naoto Shiraishi, Ken Funo, and Keĳi Saito. Speed limit for classical stochastic processes. Physical review letters , 121(7):070601, 2018

work page 2018

[27] [27]

Physical Review E, 102(6):062132, 2020

VanTuanVo,TanVanVu,andYoshihikoHasegawa.Unifiedap- proach to classical speed limit and thermodynamic uncertainty relation. Physical Review E, 102(6):062132, 2020

work page 2020

[28] [28]

Unified thermodynamic–kineticuncertaintyrelation

Tan Van Vu, Yoshihiko Hasegawa, et al. Unified thermodynamic–kineticuncertaintyrelation. Journal of Physics A: Mathematical and Theoretical, 55(40):405004, 2022

work page 2022

[29] [29]

Kinetic uncertainty relation on first-passage time for accumulated current.Physical Review E, 103(5):L050103, 2021

Ken Hiura and Shin-ichi Sasa. Kinetic uncertainty relation on first-passage time for accumulated current.Physical Review E, 103(5):L050103, 2021. Appendix A: Methods of estimation of prior distribution Considerasystemwithdiscretespace X andlet ∆X denotethesimplexonthestatespace. Supposethataninitialdistribution pX0 ∈ ∆X transforms to a final distributionp...

work page 2021

[30] [30]

2 subject to the constrain thatP x∈X pX0(x) = 1 , we use the method of Lagrange multipliers

By using an iterative method In order to find the minimum ofC(pX0) in Eq. 2 subject to the constrain thatP x∈X pX0(x) = 1 , we use the method of Lagrange multipliers. Evaluating the partial derivative ofC(pX0) with respect topX0(x′) for anx′ ∈ X; ∂ ∂pX0(x′) C(pX0) = f(x′) − X x∈X G(x|x′) lnGpX0(x) + lnpX0(x′) (A7) The normalization constraintg(pX0) = P x ...

work page

[31] [31]

Start with a randomly initialized probability vector q ∈ ∆X

By using Monte Carlo method To minimize the cost function A2, we employ a simulated annealing algorithm that iteratively refines a candidate distribution over the simplex∆X while gradually reducing randomness to ensure convergence. Start with a randomly initialized probability vector q ∈ ∆X. Set the initial temperature T0, cooling rate α < 1, and interpol...

work page

[32] [32]

Sample a new random distributionr ∈ ∆X

work page

[33] [33]

Form a candidate distribution: q′ = (1 − s)r + sq, and normalize if necessary

work page

[34] [34]

Evaluate the cost: C(q′) = S(Gq′) − S(q′) + ⟨f ⟩q′

work page

[35] [35]

Accept q′ with probability: P = ( 1, if C(q′) < C(q), exp − C(q′)−C(q) T , otherwise. 17 Commands used in the RASP Symbol Definition LOAD Rn, value Load a constant or memory value into registerRn STORE Rn, addr Store value in registerRnto memory addressaddr READ Rn, addr Read memory value ataddrinto registerRn ADD R1, R2, R3 R3 ← R1 + R2 SUB R1, R2, R3 R3...

work page

[36] [36]

If accepted, setq ← q′ and update the best cost

work page

[37] [37]

The algorithm terminates after a fixed number of iterations or when successive updates no longer improve the cost

Update the temperature:T ← αT, and reduces gradually. The algorithm terminates after a fixed number of iterations or when successive updates no longer improve the cost. The final distribution q approximates the minimizer ofC(pX0). This annealing strategy ensures global exploration early on and local refinement as the temperature cools, making it suitable ...

work page

[38] [38]

In this setting, the prior distributionqX0 is unique

Lower bound on the worst-case mismatch cost To derive a lower bound on the MMC as a function of the state-dependent termf, we focus on the case where the mapG induces a single island decomposition. In this setting, the prior distributionqX0 is unique. The prior distributionqX0 satisfies Eq. A9, derived using the method of Lagrange multipliers: ln qX0(x) +...

work page

[39] [39]

output” y ∈ Y given“input

Proof that prior is always in the interior of the simplex Consideraconditionaldistribution P (y|x)thatspecifiestheprobabilityof“output” y ∈ Y given“input”x ∈ X,where X and Y are finite. Given some Z ⊆ X , the island decompositionLZ(P ) of P, and anyp ∈ ∆X, let p(c) = P x∈c p(x) indicate the total probability within islandc, and pc(x) := ( p(x) p(c) if x ∈...

work page

[40] [40]

Define ˜f : Z → R via ˜f(x) = f(x) for x ∈ Z

work page

[41] [41]

In addition, for any distributionp ∈ ∆X with supp p ⊆ Z, let ˜p be a distribution overZ defined via˜p(x) = p(x) for x ∈ Z

Define the conditional distribution˜P (y|x) for y ∈ Y , x ∈ Z via ˜P (y|x) = P (y|x) for ally ∈ Y , x ∈ Z. In addition, for any distributionp ∈ ∆X with supp p ⊆ Z, let ˜p be a distribution overZ defined via˜p(x) = p(x) for x ∈ Z. Now, by inspection, it can be verified that for anyp ∈ ∆X with supp p ⊆ Z, C(p) = S( ˜P ˜p) − S(˜p) + E˜p[ ˜f] =: ˜C(˜p) (B33) ...

work page