arxiv: 2604.04462 · v1 · submitted 2026-04-06 · 🪐 quant-ph

A Demon that remembers: An agential approach towards quantum thermodynamics of temporal correlations

Ruo Cheng Huang

Pith reviewed 2026-05-10 20:04 UTC · model grok-4.3

classification 🪐 quant-ph

keywords quantum thermodynamics · temporal correlations · adaptive work extraction · time-ordered free energy · reinforcement learning · classical agents · memory effects · quantum correlations

The pith

A classical agent with memory extracts more thermodynamic work from quantum temporal correlations than non-adaptive strategies allow.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a decision-theoretic model in which a classical agent lacking quantum memory performs continuous inference and adaptive decisions to extract work from a quantum system exhibiting temporal correlations. It introduces ρ*-ideal protocols to show that memory-enabled adaptive choices exceed the performance limits of non-adaptive operations. The Time-Ordered Free Energy is defined as an upper bound on extractable work under causal adaptive operations and identifies a thermodynamic gap tied to adaptive ordered discord. Reinforcement learning is then applied to the case of completely unknown sources, yielding polylogarithmic cumulative dissipation that improves on standard tomography.

Core claim

By modeling a classical agent that remembers past observations and adapts its future actions, the work shows that ρ*-ideal protocols let adaptive strategies surpass non-adaptive bounds; this is formalized by the Time-Ordered Free Energy bound, which quantifies a gap linked to adaptive ordered discord and is complemented by a reinforcement-learning procedure that simultaneously learns an unknown i.i.d. quantum state and extracts work with only polylogarithmic total dissipation.

What carries the argument

The Time-Ordered Free Energy (TOFE), a novel upper bound on work obtainable from causal, adaptive operations that accounts for memory effects and reveals the thermodynamic cost of adaptive ordered discord.
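
For orientation, the comparison point that Figure 5.2 refers to as the non-equilibrium free energy can be written in its usual textbook form. The block below is a sketch under stated assumptions: the symbols $H$ (system Hamiltonian), $\gamma$ (thermal state), $Z$ (partition function) and $D(\cdot\|\cdot)$ (quantum relative entropy, in nats) are standard notation chosen here, not notation confirmed from the paper. The definition of TOFE itself does not appear on this page, so it is not reconstructed.

```latex
\begin{align}
F(\rho) &= \operatorname{tr}(\rho H) - k_B T\, S(\rho),
  & S(\rho) &= -\operatorname{tr}(\rho \ln \rho), \\
\gamma &= \frac{e^{-H/k_B T}}{Z},
  & Z &= \operatorname{tr}\, e^{-H/k_B T}, \\
W_{\max} &= F(\rho) - F(\gamma) = k_B T\, D(\rho \,\|\, \gamma),
  & D(\rho \,\|\, \gamma) &= \operatorname{tr}\!\left[\rho\,(\ln \rho - \ln \gamma)\right].
\end{align}
```

The last line follows from $\ln\gamma = -H/k_B T - \ln Z$, which gives $F(\gamma) = -k_B T \ln Z$. The page describes TOFE as an upper bound for causal adaptive operations, with a gap relative to benchmarks of this type.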

If this is right

Adaptive memory-using strategies can achieve higher work yields than memoryless ones in any system whose correlations are time-ordered.
The thermodynamic gap quantified by the Time-Ordered Free Energy sets a concrete performance ceiling for causal agents.
Reinforcement-learning agents can learn and extract work from unknown quantum sources without first performing full tomography.
The framework separates the cost of inference from the cost of extraction, allowing quantitative comparison of different adaptive policies.
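
The first point can be illustrated with a purely classical toy, loosely following the alternating-box picture of Figure 1.2(b). This is a sketch, not the paper's quantum protocol: it assumes the classical information-ratchet picture in which the work per binary symbol, in units of k_B T ln 2, is bounded by one minus the entropy the agent still sees. The function names `entropy_bits` and `work_bounds` are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(counts):
    """Shannon entropy (in bits) of the empirical distribution given by counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def work_bounds(seq):
    """Return (memoryless, one_step_memory) bounds per symbol, in units of k_B T ln 2.

    Assumption: a binary symbol with residual entropy h bits allows at most
    (1 - h) units of work. The memoryless agent sees the single-symbol entropy
    H(X_t); the agent with one step of memory sees H(X_t | X_{t-1}).
    """
    h_single = entropy_bits(Counter(seq))
    h_pair = entropy_bits(Counter(zip(seq[:-1], seq[1:])))
    h_prev = entropy_bits(Counter(seq[:-1]))
    h_conditional = h_pair - h_prev  # chain rule: H(X_t | X_{t-1})
    return 1.0 - h_single, 1.0 - h_conditional


alternating = [0, 1] * 500   # pattern of Figure 1.2(b)
constant = [0] * 1000        # pattern of Figure 1.2(a)

print("alternating:", work_bounds(alternating))
print("constant:   ", work_bounds(constant))
```

For the alternating sequence the single-symbol statistics look like a fair coin, so the memoryless bound vanishes, while one remembered symbol makes the next one fully predictable. For the constant sequence both bounds coincide, matching the caption's remark that no feedback is needed there.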

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Extending the agent to possess limited quantum memory could close or widen the identified thermodynamic gap, depending on how the extra coherence interacts with the TOFE bound.
The same decision-theoretic structure may apply to other tasks such as cooling or state preparation, not only work extraction.
Testing the polylogarithmic scaling on near-term quantum hardware would require only repeated single-shot measurements and classical feedback, providing a low-overhead benchmark.
If the TOFE bound proves tight, it supplies a practical design rule for scheduling adaptive measurements in quantum thermodynamic engines.

Load-bearing premise

The assumption that a classical agent without quantum memory can realize ρ*-ideal adaptive protocols without incurring thermodynamic costs outside those already captured by the Time-Ordered Free Energy bound.

What would settle it

An experiment or simulation in which an adaptive classical agent extracts strictly more work from a known quantum state with temporal correlations than the non-adaptive bound permits, or in which a reinforcement-learning agent achieves cumulative dissipation scaling as polylog N for an unknown i.i.d. source.
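
One simple way to examine such a scaling claim on simulated or measured data is to fit the slope of cumulative dissipation against N on a log-log scale: a power law gives a constant slope, while a polylogarithmic curve gives a small slope that keeps shrinking as N grows. The sketch below uses synthetic curves only; the prefactors, the square-root stand-in for a power-law baseline, and the helper name `loglog_slope` are assumptions for illustration, not results or code from the paper.

```python
import math


def loglog_slope(n_values, dissipation):
    """Least-squares slope of log(dissipation) against log(N)."""
    xs = [math.log(n) for n in n_values]
    ys = [math.log(d) for d in dissipation]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


n_values = [10 ** k for k in range(3, 9)]

# Synthetic stand-ins (assumed shapes, not data from the paper).
power_law = [2.0 * math.sqrt(n) for n in n_values]
polylog = [5.0 * math.log(n) ** 2 for n in n_values]

print("power-law slope:", loglog_slope(n_values, power_law))
print("polylog slope:  ", loglog_slope(n_values, polylog))

# The polylog slope also decreases when only the larger N are kept.
print("polylog slope, large N only:", loglog_slope(n_values[3:], polylog[3:]))
```

A fitted slope that stays near a fixed positive value across ranges of N points to power-law growth; a slope that drifts toward zero is consistent with, though not proof of, polylogarithmic growth.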

Figures

Figures reproduced from arXiv: 2604.04462 by Ruo Cheng Huang.

**Figure 1.1.** Illustration of Szilard's engine. The box has an initial volume V = αL. After isothermal expansion of the single particle within the box, the attached weight gains energy ∆U = k_B T ln 2. However, for the demon to acquire knowledge of the particle's position again, a minimum energy cost of ∆W = k_B T ln 2 must be expended. The net energy change is at most zero, thereby preserving the second law of t…

**Figure 1.2.** Depiction of temporally correlated systems. Panel (a) shows a sequence of boxes that remains invariant over time; no feedback control is required to extract work from such a sequence. Panel (b) illustrates a sequence with an alternating pattern, where the agent must retain information about the preceding box to extract work effectively. Panel (c) depicts a system with more complex temporal correlations,…

**Figure 2.1.** Diagrammatic representation of qubits on a Bloch sphere. θ is the angle from the Z-axis and ϕ is the angle measured from the X-axis. Pure states reside on the surface while mixed states occupy the interior of the sphere. For a 2-dimensional quantum state or a "qubit", its parametrization is given by $|\psi\rangle = \cos\frac{\theta}{2}\,|0\rangle + e^{i\phi}\sin\frac{\theta}{2}\,|1\rangle$ (2.5), where θ ∈ [0, π] and ϕ ∈ [0, 2π] are the angles measured from…

**Figure 2.2.** Venn diagram of entropic quantities. The blue and red circles represent the entropies of random variables X and Y, respectively. Their union corresponds to the joint entropy H(X, Y), and the intersection represents the mutual information I(X; Y). In quantum information theory, the analogue of Shannon entropy is the von Neumann entropy, defined for a quantum state ρ as $S(\rho) = -\operatorname{tr}(\rho \log \rho) = -\sum_i \lambda$…

**Figure 2.3.** A circuit diagram representation of the work extraction protocol. Q is the system from which free energy is drawn, B is a battery, and R is a thermal reservoir acting as an ancillary system. The protocol aims to transform ρ_Q to a thermal state γ_Q with the help of the thermal states from the reservoir; the free energy lost in system Q will be balanced by the increase in energy of the batter…

**Figure 2.4.** Distinction between collective processing and single-copy (local) processing. Left panel: collective processing, where subsystems $A_1, \dots, A_N$ are jointly acted upon by a global operation $\mathcal{E}_{A_1,\dots,A_N}$. Right panel: local processing, where only individual subsystems are operated on separately, with operations restricted to one subsystem at a time.

**Figure 3.1.** An illustration of how the protocol works. At stage 1, $U^{(\rho^*)}_{QB}$ is applied to the system-battery joint state; it effectively attempts to diagonalize ρ* in the energy eigenbasis. Stage 2 consists of M swap operations with the tunable thermal reservoir; the thermal qubit in each step becomes increasingly mixed. All energy changes during the operations are stored in the battery, B.

**Figure 4.1.** Latent-state sources of correlated quantum processes. Each arrow represents a transition between latent states; the label $p : \sigma^{(x)}$ indicates that the transition happens with probability p and produces a quantum state $\sigma^{(x)}$. (a) Perturbed-coin process. (b) 2-1 golden-mean process.

**Figure 4.2.** Basic form of an information ratchet considered in [1]. The ratchet holds an internal state $X \in \mathcal{X}$ and interacts with an input tape of symbols $Y \in \mathcal{Y}$ according to a predetermined policy. The thermal reservoir provides the heat exchange necessary for work extraction.

**Figure 4.3.** Schematic diagram illustrating the evolution of belief states over time. At each time step, the agent accesses its internal belief state $K_{t-1}$, which determines its action $A_t$. The agent then interacts with the quantum state $\sigma^{(X_t)}_{Q_t}$ and receives a corresponding reward $W_t$. A fundamental drawback of this history-dependent approach is that the required memory capacity scales linearly with time, rendering…

**Figure 4.4.** The effective dynamic of Perturbed Coin in …

**Figure 4.5.** Schematic diagram of the sequential work extraction. At each time step, the process takes a quantum system $\sigma_{Q_t}$, a reservoir qubit R, a battery B, and a memory M as input. The 'Work extraction' box should be interpreted as a memory-dependent unitary. States of memory are recycled. The single wires represent quantum information being passed along, while the double wires represent classical information.…

**Figure 4.6.** The update rule for different parameters. Panel (a) shows the update map for parameters p = 0.9, r = 0.2, while panel (b) shows the update for p = 0.7, r = 0.2. The red and blue solid lines represent the update functions corresponding to the two work values $\{w^{(i)}\}_{i=0,1}$, and the black dotted line represents the identity map. … states can be parametrized by a single variable as $\eta_t = (1/2 + \epsilon_t,\ 1/2 - \epsilon_t)$, …

**Figure 4.7.** Comparison of average work-extraction rates across different approaches. The parameter p characterizes the transition probability between the two latent states in the perturbed-coin process, while r quantifies the overlap between the corresponding quantum outputs. (a) illustrates the enhancement in work extraction due to memory, and (b) shows the quantum advantage in work extraction. Panels (c) and (d) …

**Figure 4.8.** Comparison of the asymptotic work extraction rate of agents with varying block-length L, both memory-assisted and memoryless. The work extraction rate increases with L, but the phase boundary between the memory-advantageous and memory-apathetic regions remains invariant with L. The addition of quantum memory thereby improves the physical efficiency of the extraction process wi…

**Figure 5.1.** Graph of reward and dissipation, conditioned on the belief state K = π. The action space is parametrized by θ ∈ [0, 2π]. The blue line shows $V_1(K_0 = \pi)$ from Eq. (5.23); the black line shows the dissipation incurred at the second time step, Eq. (5.21). The blue and black dotted lines correspond to the optimal action taken at t = 1 and t = 2, respectively.

**Figure 5.2.** Comparison of asymptotic work extraction rate. The left panel shows the difference between the non-equilibrium free energy rate in Eq. (5.29) and the asymptotic TOFE rate in Eq. (5.26). The right panel shows the difference between the asymptotic TOFE rate and the asymptotic work extraction rate of an LO-agent that just aims to minimize immediate expected dissipation.

**Figure 5.3.** Comparison between simulated work extraction using dynamic programming (DP) and the analytical adaptive multipartite discord defined in Eq. (5.33). The top row corresponds to a four-subsystem state, while the bottom row corresponds to a tripartite system. Panels (a) and (d) show the simulated work deficit under the optimal adaptive policy. Panels (b) and (e) show the corresponding analytical values of th…

**Figure 5.4.** Comparison of Bloch vectors of expected states in panels (b) and (c) with tailored states in panels (a) and (d), under different parameters, against varying belief states parametrized by η = (1/2 + ϵ, 1/2 − ϵ). Panels (a), (b), (c) have parameters p = 0.6, r = 0.6; panels (d), (e), (f) have parameters p = 0.9, r = 0.2. Panels (c) and (f) show the expected dissipation. The orange line represents the Y-component, blue the …

**Figure 6.1.** Sketch of the sequential work extraction protocol with a thermal reservoir. At each time step k ∈ [N], the agent receives a copy of an unknown qubit state ψ and performs a thermal operation involving the reservoir and a battery. A measurement of the battery system is carried out to determine the extracted work $\Delta W_k$, which is then used as feedback to improve the extraction strategy in subsequent rounds.

**Figure 6.2.** Illustration of the iterations of the thermal operation in the full system QBR, where arrows represent Bloch vectors of states, showing that the system qubit becomes more and more mixed as the process proceeds. The energy gaps $\{\nu_{k,i}\}_i$ of successive reservoir Hamiltonians form a strictly decreasing sequence, making the successive thermal states more mixed. At each step, we take a new qubit from the reservoir …

**Figure 6.3.** Diagrammatic representation of the learning process. Based on past observations, the algorithm constructs a confidence region $C_t$ around the unknown state ψ. It then selects two directions $\{\Pi^{(+)}_k, \Pi^{(+)}_k\}$ to probe the space of maximum reward uncertainty, influencing the next state estimate $\psi_k$. Nevertheless, the learning protocol guarantees success with probability at least 1 − δ. By choosing $\delta = 1/N$,…

**Figure 6.4.** Performance scaling of the adaptive work extraction protocol. Cumulative dissipation (a) and dissipation rate (b) versus the number of rounds T (rate = average dissipation per copy). Blue: our adaptive protocol. Red: a tomography-first baseline that uses $O(1/\sqrt{T})$ of the available copies for learning and then applies the state-aware extraction protocol on the remainder. For our protocol, we probe four d…
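
The quantities quoted in the captions of Figures 1.1, 2.1 and 2.2 can be checked numerically with a few lines of standard-library Python. This is a minimal sketch: the function names are hypothetical, the entropy is reported in bits (the caption leaves the base of the logarithm unspecified), and the qubit entropy uses the standard fact that a state with Bloch vector of length r has eigenvalues (1 ± r)/2.

```python
import cmath
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def qubit_state(theta, phi):
    """Amplitudes (a, b) of cos(theta/2)|0> + e^{i phi} sin(theta/2)|1>, as in Eq. (2.5)."""
    return complex(math.cos(theta / 2)), cmath.exp(1j * phi) * math.sin(theta / 2)


def bloch_vector(state):
    """Bloch vector (x, y, z) of a normalized pure qubit state."""
    a, b = state
    coherence = a.conjugate() * b
    return (2 * coherence.real, 2 * coherence.imag, abs(a) ** 2 - abs(b) ** 2)


def von_neumann_entropy(bloch):
    """S(rho) = -tr(rho log2 rho) for a qubit with the given Bloch vector."""
    r = min(1.0, math.sqrt(sum(c * c for c in bloch)))  # clamp rounding overshoot
    eigenvalues = ((1 + r) / 2, (1 - r) / 2)
    return -sum(lam * math.log2(lam) for lam in eigenvalues if lam > 0)


def szilard_work(temperature_kelvin):
    """Work k_B T ln 2 (in joules) associated with one bit at temperature T."""
    return K_B * temperature_kelvin * math.log(2)


plus_x = bloch_vector(qubit_state(math.pi / 2, 0.0))
print("Bloch vector for theta = pi/2, phi = 0:", plus_x)
print("entropy of that pure state (bits):", von_neumann_entropy(plus_x))
print("entropy of the maximally mixed state (bits):", von_neumann_entropy((0.0, 0.0, 0.0)))
print("k_B T ln 2 at 300 K (J):", szilard_work(300.0))
```

Pure states sit on the surface of the sphere (r = 1, zero entropy) and the maximally mixed state at the centre (r = 0, one bit), which is the geometric picture the Figure 2.1 caption describes.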

read the original abstract

This thesis develops a decision-theoretic framework for extracting thermodynamic work from temporal correlations in quantum systems. We model a classical agent -- lacking quantum memory -- performing adaptive work extraction through continuous inference and decision-making under uncertainty. By introducing $\rho^*$-ideal protocols, we demonstrate that exploiting memory effects allows adaptive strategies to surpass non-adaptive bounds. We formalize this via the Time-Ordered Free Energy (TOFE), a novel upper bound for causal, adaptive operations that reveals a thermodynamic gap linked to adaptive ordered discord. Additionally, we tackle work extraction from unknown sources using reinforcement learning. By adapting multi-armed bandit algorithms, we show an agent can simultaneously learn an unknown i.i.d. quantum state and extract work, achieving polylogarithmic cumulative dissipation that significantly outperforms standard tomography. Overall, this work lays the foundation for predictive and learning-based quantum thermodynamics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This thesis sketches an agential framework for quantum work extraction with some fresh RL angles, but the claimed advantages rest on unverified protocol costs.

The main takeaway is that this work brings a decision-theoretic lens to temporal correlations in quantum thermodynamics. It models a classical agent performing adaptive work extraction from temporally correlated quantum sources and claims that memory effects let adaptive strategies beat non-adaptive ones, quantified by a new Time-Ordered Free Energy bound tied to adaptive ordered discord. The reinforcement-learning part adapts multi-armed bandit methods to learn an unknown i.i.d. state while extracting work, reporting polylogarithmic cumulative dissipation that beats standard tomography.

Referee Report

3 major / 2 minor

Summary. The manuscript develops a decision-theoretic framework for thermodynamic work extraction from temporal correlations in quantum systems by a strictly classical agent lacking quantum memory. It introduces ρ*-ideal protocols to show that adaptive strategies exploiting memory effects can surpass non-adaptive bounds, formalized via the novel Time-Ordered Free Energy (TOFE) upper bound that quantifies a thermodynamic gap tied to adaptive ordered discord. It further applies reinforcement learning (adapted multi-armed bandit algorithms) to simultaneously learn an unknown i.i.d. quantum state and extract work, claiming polylogarithmic cumulative dissipation that outperforms standard tomography.

Significance. If the derivations are sound and the key modeling assumptions hold, this work could meaningfully connect quantum thermodynamics with decision theory and online learning, providing new bounds for adaptive causal operations and efficient protocols for unknown sources. The TOFE construction and RL performance claims, if independently grounded, would represent a concrete advance in handling temporal correlations without requiring quantum memory.

major comments (3)

[Framework for ρ*-ideal protocols (near abstract and main derivation)] The central modeling assumption that ρ*-ideal protocols can be realized by a strictly classical agent without quantum memory, and without incurring additional thermodynamic costs (e.g., measurement back-action or control overhead from continuous inference) not captured by TOFE, is load-bearing for both the claimed advantage over non-adaptive bounds and the RL result. This requires explicit justification or an auxiliary bound in the section introducing ρ*-ideal protocols.
[TOFE definition and proof] Derivation of the Time-Ordered Free Energy (TOFE) as an upper bound for causal adaptive operations: it must be shown whether TOFE is independently derived from the decision-theoretic axioms or reduces to a quantity fitted to the protocol class, as the latter would undermine the claimed thermodynamic gap linked to adaptive ordered discord.
[Reinforcement learning application] The reinforcement learning result claiming polylogarithmic cumulative dissipation outperforming tomography: the protocol details, including how the agent handles quantum measurements on the unknown state and the precise error analysis or regret bound, need to be expanded to verify the scaling and the comparison.

minor comments (2)

Ensure consistent notation for TOFE, ρ*, and related quantities across the manuscript and figures.
Add explicit statements on the scope of the classical-agent assumption and any implicit costs in the protocol implementation.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed report. Their comments identify key areas where additional justification and expansion will strengthen the manuscript. We address each major comment below and commit to the indicated revisions.

Referee: [Framework for ρ*-ideal protocols (near abstract and main derivation)] The central modeling assumption that ρ*-ideal protocols can be realized by a strictly classical agent without quantum memory, and without incurring additional thermodynamic costs (e.g., measurement back-action or control overhead from continuous inference) not captured by TOFE, is load-bearing for both the claimed advantage over non-adaptive bounds and the RL result. This requires explicit justification or an auxiliary bound in the section introducing ρ*-ideal protocols.

Authors: We agree that the implementation of ρ*-ideal protocols by a strictly classical agent requires explicit justification to rule out unaccounted costs. In the revised manuscript we will add a dedicated subsection following the definition of ρ*-ideal protocols. This subsection will (i) specify that the agent maintains only classical memory of measurement outcomes, (ii) model inference as a classical Bayesian update whose thermodynamic cost is already subsumed in the TOFE accounting of temporal correlations, and (iii) supply an auxiliary inequality showing that any residual control overhead is bounded by a term that vanishes in the thermodynamic limit, thereby preserving the claimed advantage. revision: yes
Referee: [TOFE definition and proof] Derivation of the Time-Ordered Free Energy (TOFE) as an upper bound for causal adaptive operations: it must be shown whether TOFE is independently derived from the decision-theoretic axioms or reduces to a quantity fitted to the protocol class, as the latter would undermine the claimed thermodynamic gap linked to adaptive ordered discord.

Authors: TOFE is obtained directly from the decision-theoretic axioms of causal adaptive operations (ordered information processing and the second law applied to time-ordered channels). It is not fitted to any particular protocol class; the gap quantified by adaptive ordered discord emerges as a consequence of the derivation. We will revise the TOFE section to present the derivation in explicit axiomatic steps, beginning from the causal decision axioms, proceeding through the definition of time-ordered extractable work, and arriving at the TOFE bound, thereby making the independence of the construction transparent. revision: yes
Referee: [Reinforcement learning application] The reinforcement learning result claiming polylogarithmic cumulative dissipation outperforming tomography: the protocol details, including how the agent handles quantum measurements on the unknown state and the precise error analysis or regret bound, need to be expanded to verify the scaling and the comparison.

Authors: We will substantially expand the reinforcement-learning section. The revised text will describe the adapted multi-armed-bandit protocol in full: the classical agent maintains a posterior over the unknown i.i.d. state, selects the next measurement basis to maximize expected work minus information gain, performs a projective measurement, and updates its belief with the classical outcome. We will include the complete regret analysis, deriving the polylogarithmic bound on cumulative dissipation and contrasting it with the linear sample complexity of full tomography. These additions will allow independent verification of the scaling claim. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected; claims rest on independent formalization

The paper introduces ρ*-ideal protocols and defines TOFE as a novel upper bound for causal adaptive operations, linking a thermodynamic gap to adaptive ordered discord. It further applies multi-armed bandit RL to achieve polylogarithmic dissipation on unknown i.i.d. states, outperforming tomography. No equations, self-citations, or derivations are shown that reduce TOFE, the adaptive advantage, or the RL bound to fitted inputs or prior self-referential results by construction. The modeling assumptions (a classical agent without quantum memory, realizability of ρ*-ideal protocols) are explicit and external to the derivation chain itself. The work therefore remains self-contained against the stated non-adaptive benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

Ledger is constructed from abstract only; full paper likely contains additional assumptions. Free parameters and detailed axioms cannot be fully enumerated without the manuscript.

axioms (2)

domain assumption · Quantum systems possess exploitable temporal correlations that affect thermodynamic work extraction
Foundational modeling premise for the entire framework
domain assumption · A classical agent can perform continuous inference and adaptive decisions without quantum memory
Core restriction that defines the agent's capabilities

invented entities (2)

ρ*-ideal protocols · no independent evidence
purpose: Protocols that allow adaptive strategies to surpass non-adaptive bounds by exploiting memory effects
Newly introduced to demonstrate the thermodynamic gap
Time-Ordered Free Energy (TOFE) · no independent evidence
purpose: Novel upper bound on work extractable by causal adaptive operations
Formalized as the central theoretical contribution

pith-pipeline@v0.9.0 · 5435 in / 1531 out tokens · 41872 ms · 2026-05-10T20:04:47.215036+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

We introduce a family of ρ*-ideal protocols … Time-Ordered Free Energy (TOFE) … adaptive ordered discord … multi-armed bandit algorithms … polylogarithmic cumulative dissipation

IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · phi_golden_ratio · echoes

ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

2-1 golden-mean process … perturbed-coin process

IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection · unclear

dynamic programming … backward induction … search space … optimality of DP

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

130 extracted references · 31 canonical work pages

[1]

Identifying functional thermodynamics in autonomous maxwellian ratchets.New Journal of Physics, 18(2):023049, 2016

Alexander B Boyd, Dibyendu Mandal, and James P Crutchfield. Identifying functional thermodynamics in autonomous maxwellian ratchets.New Journal of Physics, 18(2):023049, 2016

2016
[2]

Optical quan- tum memory.Nature photonics, 3(12):706–714, 2009

Alexander I Lvovsky, Barry C Sanders, and Wolfgang Tittel. Optical quan- tum memory.Nature photonics, 3(12):706–714, 2009

2009
[3]

Quantum mem- ories: emerging applications and recent advances.Journal of modern optics, 63(20):2005–2028, 2016

Khabat Heshami, Duncan G England, Peter C Humphreys, Philip J Bustard, Victor M Acosta, Joshua Nunn, and Benjamin J Sussman. Quantum mem- ories: emerging applications and recent advances.Journal of modern optics, 63(20):2005–2028, 2016

2005
[4]

Thermodynamics and an introduction to thermostatistics

Herbert B Callen. Thermodynamics and an introduction to thermostatistics. John Wiley& Sons, 2, 1980

1980
[5]

Oup Oxford, 2010

Stephen J Blundell and Katherine M Blundell.Concepts in thermal physics. Oup Oxford, 2010

2010
[6]

Oxford University Press, 2021

James P Sethna.Statistical mechanics: entropy, order parameters, and com- plexity, volume 14. Oxford University Press, 2021

2021
[7]

Courier Corporation, 2012

James Clerk Maxwell.Theory of heat. Courier Corporation, 2012

2012
[8]

¨Uber die entropieverminderung in einem thermodynamischen system bei eingriffen intelligenter wesen.Zeitschrift f¨ ur Physik, 53(11):840– 856, 1929

Leo Szilard. ¨Uber die entropieverminderung in einem thermodynamischen system bei eingriffen intelligenter wesen.Zeitschrift f¨ ur Physik, 53(11):840– 856, 1929

1929
[9]

Irreversibility and heat generation in the computing process

Rolf Landauer. Irreversibility and heat generation in the computing process. IBM journal of research and development, 5(3):183–191, 1961

1961
[10]

The thermodynamics of computation—a review.Inter- national Journal of Theoretical Physics, 21:905–940, 1982

Charles H Bennett. The thermodynamics of computation—a review.Inter- national Journal of Theoretical Physics, 21:905–940, 1982

1982
[11]

Thermodynamic cost of computation, algorithmic com- plexity and the information metric.Nature, 341(6238):119–124, 1989

Wojciech H Zurek. Thermodynamic cost of computation, algorithmic com- plexity and the information metric.Nature, 341(6238):119–124, 1989

1989
[12]

Col- loquium: Non-markovian dynamics in open quantum systems.Reviews of Modern Physics, 88(2):021002, 2016

Heinz-Peter Breuer, Elsi-Mari Laine, Jyrki Piilo, and Bassano Vacchini. Col- loquium: Non-markovian dynamics in open quantum systems.Reviews of Modern Physics, 88(2):021002, 2016. 127 128BIBLIOGRAPHY

2016
[13]

Mem- ory effects in quantum dynamics modelled by quantum renewal processes

Nina Megier, Manuel Ponzi, Andrea Smirne, and Bassano Vacchini. Mem- ory effects in quantum dynamics modelled by quantum renewal processes. Entropy, 23(7):905, 2021

2021
[14]

Memory correc- tions to markovian langevin dynamics.Entropy, 26(5):425, 2024

Mateusz Wi´ sniewski, Jerzy Luczka, and Jakub Spiechowicz. Memory correc- tions to markovian langevin dynamics.Entropy, 26(5):425, 2024

2024
[15]

Leveraging environmental correlations: The thermodynamics of requisite variety.Journal of Statistical Physics, 167:1555–1585, 2017

Alexander B Boyd, Dibyendu Mandal, and James P Crutchfield. Leveraging environmental correlations: The thermodynamics of requisite variety.Journal of Statistical Physics, 167:1555–1585, 2017

2017
[16]

Infor- mation processing second law for an information ratchet with finite tape

Lianjie He, Andri Pradana, Jian Wei Cheong, and Lock Yue Chew. Infor- mation processing second law for an information ratchet with finite tape. Physical Review E, 105(5):054131, 2022

2022
[17]

The proper formula for relative entropy and its asymptotics in quantum probability.Communications in mathematical physics, 143:99–114, 1991

Fumio Hiai and D´ enes Petz. The proper formula for relative entropy and its asymptotics in quantum probability.Communications in mathematical physics, 143:99–114, 1991

1991
[18]

Discriminating states: The quantum chernoff bound.Physical review letters, 98(16):160501, 2007

Koenraad MR Audenaert, John Calsamiglia, Ram´ on Munoz-Tapia, Emilio Bagan, Ll Masanes, Antonio Acin, and Frank Verstraete. Discriminating states: The quantum chernoff bound.Physical review letters, 98(16):160501, 2007

2007
[20]

Learning pure quantum states (almost) without regret

Josep Lumbreras, Mikhail Terekhov, and Marco Tomamichel. Learning pure quantum states (almost) without regret.arXiv preprint arXiv:2406.18370, 2024

work page arXiv 2024
[21]

Cambridge university press, 2010

Michael A Nielsen and Isaac L Chuang.Quantum computation and quantum information. Cambridge university press, 2010

2010
[22]

Cambridge university press, 2013

Mark M Wilde.Quantum information theory. Cambridge university press, 2013

2013
[23]

Positive functions on c*-algebras.Proceedings of the American Mathematical Society, 6(2):211–216, 1955

W Forrest Stinespring. Positive functions on c*-algebras.Proceedings of the American Mathematical Society, 6(2):211–216, 1955

1955
[24]

Quantum discord: a measure of the quantumness of correlations.Physical review letters, 88(1):017901, 2001

Harold Ollivier and Wojciech H Zurek. Quantum discord: a measure of the quantumness of correlations.Physical review letters, 88(1):017901, 2001

2001
[25]

Classical, quantum and total correlations. Journal of Physics A: Mathematical and General, 34(35):6899, 2001

Leah Henderson and Vlatko Vedral. Classical, quantum and total correlations. Journal of Physics A: Mathematical and General, 34(35):6899, 2001

2001
[26]

Quantum discord, local operations, and Maxwell’s demons. Physical Review A—Atomic, Molecular, and Optical Physics, 81(6):062103, 2010

Aharon Brodutch and Daniel R Terno. Quantum discord, local operations, and Maxwell’s demons. Physical Review A—Atomic, Molecular, and Optical Physics, 81(6):062103, 2010

2010
[27]

Necessary and sufficient condition for nonzero quantum discord. Physical Review Letters, 105(19):190502, 2010

Borivoje Dakić, Vlatko Vedral, and Časlav Brukner. Necessary and sufficient condition for nonzero quantum discord. Physical Review Letters, 105(19):190502, 2010

2010
[28]

The classical-quantum boundary for correlations: discord and related measures. Reviews of Modern Physics, 84(4):1655–1707, 2012

Kavan Modi, Aharon Brodutch, Hugo Cable, Tomasz Paterek, and Vlatko Vedral. The classical-quantum boundary for correlations: discord and related measures. Reviews of Modern Physics, 84(4):1655–1707, 2012

2012
[29]

Quantum discord and Maxwell’s demons. Physical Review A, 67(1):012320, 2003

Wojciech Hubert Zurek. Quantum discord and Maxwell’s demons. Physical Review A, 67(1):012320, 2003

2003
[30]

Passive states and KMS states for general quantum systems. Communications in Mathematical Physics, 58:273–290, 1978

Wiesław Pusz and Stanisław L Woronowicz. Passive states and KMS states for general quantum systems. Communications in Mathematical Physics, 58:273–290, 1978

1978
[31]

Maximal work extraction from finite quantum systems. Europhysics Letters, 67(4):565, 2004

Armen E Allahverdyan, Roger Balian, and Th M Nieuwenhuizen. Maximal work extraction from finite quantum systems. Europhysics Letters, 67(4):565, 2004

2004
[32]

Fundamental limitations for quantum and nanoscale thermodynamics. Nature Communications, 4(1):2059, 2013

Michał Horodecki and Jonathan Oppenheim. Fundamental limitations for quantum and nanoscale thermodynamics. Nature Communications, 4(1):2059, 2013

2013
[33]

Truly work-like work extraction via a single-shot analysis

Johan Åberg. Truly work-like work extraction via a single-shot analysis. Nature Communications, 4(1):1925, 2013

2013
[34]

The second laws of quantum thermodynamics. Proceedings of the National Academy of Sciences, 112(11):3275–3279, 2015

Fernando Brandao, Michał Horodecki, Nelly Ng, Jonathan Oppenheim, and Stephanie Wehner. The second laws of quantum thermodynamics. Proceedings of the National Academy of Sciences, 112(11):3275–3279, 2015

2015
[35]

Thermodynamic cost of reliability and low temperatures: Tightening Landauer’s principle and the second law. International Journal of Theoretical Physics, 39:2717–2753, 2000

Dominik Janzing, Pawel Wocjan, Robert Zeier, Rubino Geiss, and Th Beth. Thermodynamic cost of reliability and low temperatures: Tightening Landauer’s principle and the second law. International Journal of Theoretical Physics, 39:2717–2753, 2000

2000
[36]

Resource theory of quantum states out of thermal equilibrium. Physical Review Letters, 111(25):250404, 2013

Fernando GSL Brandao, Michał Horodecki, Jonathan Oppenheim, Joseph M Renes, and Robert W Spekkens. Resource theory of quantum states out of thermal equilibrium. Physical Review Letters, 111(25):250404, 2013

2013
[37]

Quantum coherence, time-translation symmetry, and thermodynamics. Physical Review X, 5(2):021001, 2015

Matteo Lostaglio, Kamil Korzekwa, David Jennings, and Terry Rudolph. Quantum coherence, time-translation symmetry, and thermodynamics. Physical Review X, 5(2):021001, 2015

2015
[38]

The extraction of work from quantum coherence. New Journal of Physics, 18(2):023045, 2016

Kamil Korzekwa, Matteo Lostaglio, Jonathan Oppenheim, and David Jennings. The extraction of work from quantum coherence. New Journal of Physics, 18(2):023045, 2016

2016
[39]

The minimal work cost of information processing. Nature Communications, 6(1):7669, 2015

Philippe Faist, Frédéric Dupuis, Jonathan Oppenheim, and Renato Renner. The minimal work cost of information processing. Nature Communications, 6(1):7669, 2015

2015
[40]

The resource theory of informational nonequilibrium in thermodynamics. Physics Reports, 583:1–58, 2015

Gilad Gour, Markus P Müller, Varun Narasimhachar, Robert W Spekkens, and Nicole Yunger Halpern. The resource theory of informational nonequilibrium in thermodynamics. Physics Reports, 583:1–58, 2015

2015
[41]

A fully quantum asymptotic equipartition property. IEEE Transactions on Information Theory, 55(12):5840–5847, 2009

Marco Tomamichel, Roger Colbeck, and Renato Renner. A fully quantum asymptotic equipartition property. IEEE Transactions on Information Theory, 55(12):5840–5847, 2009

2009
[42]

Black box work extraction and composite hypothesis testing. Physical Review Letters, 133(25):250401, 2024

Kaito Watanabe and Ryuji Takagi. Black box work extraction and composite hypothesis testing. Physical Review Letters, 133(25):250401, 2024. doi: 10.1103/PhysRevLett.133.250401. URL https://doi.org/10.1103/PhysRevLett.133.250401

work page doi:10.1103/physrevlett.133.250401 2024
[43]

Universal work extraction in quantum thermodynamics. arXiv preprint arXiv:2504.12373, 2025

Kaito Watanabe and Ryuji Takagi. Universal work extraction in quantum thermodynamics. arXiv preprint arXiv:2504.12373, 2025

work page arXiv 2025
[44]

Extractable work from correlations. Physical Review X, 5(4):041011, 2015

Martí Perarnau-Llobet, Karen V Hovhannisyan, Marcus Huber, Paul Skrzypczyk, Nicolas Brunner, and Antonio Acín. Extractable work from correlations. Physical Review X, 5(4):041011, 2015

2015
[45]

Entanglement boost for extractable work from ensembles of quantum batteries. Physical Review E—Statistical, Nonlinear, and Soft Matter Physics, 87(4):042123, 2013

Robert Alicki and Mark Fannes. Entanglement boost for extractable work from ensembles of quantum batteries. Physical Review E—Statistical, Nonlinear, and Soft Matter Physics, 87(4):042123, 2013

2013
[46]

Quantum sequential hypothesis testing. Physical Review Letters, 126(18):180502, 2021

Esteban Martínez Vargas, Christoph Hirche, Gael Sentís, Michalis Skotiniotis, Marta Carrizo, Ramon Muñoz-Tapia, and John Calsamiglia. Quantum sequential hypothesis testing. Physical Review Letters, 126(18):180502, 2021

2021
[47]

Elements of information theory. John Wiley & Sons, 1999

Thomas M Cover. Elements of information theory. John Wiley & Sons, 1999

1999
[48]

A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989

Lawrence R Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989

1989
[49]

Computational mechanics: Pattern and prediction, structure and simplicity. Journal of Statistical Physics, 104:817–879, 2001

Cosma Rohilla Shalizi and James P Crutchfield. Computational mechanics: Pattern and prediction, structure and simplicity. Journal of Statistical Physics, 104:817–879, 2001

2001
[50]

Spectral simplicity of apparent complexity

Paul M Riechers and James P Crutchfield. Spectral simplicity of apparent complexity. I. The nondiagonalizable metadynamics of prediction. Chaos: An Interdisciplinary Journal of Nonlinear Science, 28(3), 2018

2018
[51]

Shannon entropy rate of hidden Markov processes. Journal of Statistical Physics, 183(2):32, 2021

Alexandra M Jurgens and James P Crutchfield. Shannon entropy rate of hidden Markov processes. Journal of Statistical Physics, 183(2):32, 2021

2021
[52]

Introduction to automata theory, languages, and computation. ACM SIGACT News, 32(1):60–65, 2001

John E Hopcroft, Rajeev Motwani, and Jeffrey D Ullman. Introduction to automata theory, languages, and computation. ACM SIGACT News, 32(1):60–65, 2001

2001
[53]

Finite-state controllers based on Mealy machines for centralized and decentralized POMDPs

Christopher Amato, Blai Bonet, and Shlomo Zilberstein. Finite-state controllers based on Mealy machines for centralized and decentralized POMDPs. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 24, pages 1052–1058, 2010

2010
[54]

Inferring statistical complexity. Physical Review Letters, 63(2):105, 1989

James P Crutchfield and Karl Young. Inferring statistical complexity. Physical Review Letters, 63(2):105, 1989

1989
[55]

Time’s barbed arrow: Irreversibility, crypticity, and stored information. Physical Review Letters, 103(9):094101, 2009

James P Crutchfield, Christopher J Ellison, and John R Mahoney. Time’s barbed arrow: Irreversibility, crypticity, and stored information. Physical Review Letters, 103(9):094101, 2009

2009
[56]

James P Crutchfield, Christopher J Ellison, Ryan G James, and John R Mahoney. Synchronization and control in intrinsic and designed computation: An information-theoretic analysis of competing models of stochastic computation. Chaos: An Interdisciplinary Journal of Nonlinear Science, 20(3), 2010

2010
[57]

Prediction, retrodiction, and the amount of information stored in the present. Journal of Statistical Physics, 136(6):1005–1034, 2009

Christopher J Ellison, John R Mahoney, and James P Crutchfield. Prediction, retrodiction, and the amount of information stored in the present. Journal of Statistical Physics, 136(6):1005–1034, 2009

2009
[58]

Measurement-induced randomness and structure in controlled qubit processes

Ariadna E Venegas-Li, Alexandra M Jurgens, and James P Crutchfield. Measurement-induced randomness and structure in controlled qubit processes. Physical Review E, 102(4):040102, 2020. doi: 10.1103/PhysRevE.102.040102. URL https://doi.org/10.1103/PhysRevE.102.040102

work page doi:10.1103/physreve.102.040102 2020
[59]

Optimality and complexity in measured quantum-state stochastic processes. Journal of Statistical Physics, 190(6):106, 2023

Ariadna Venegas-Li and James P Crutchfield. Optimality and complexity in measured quantum-state stochastic processes. Journal of Statistical Physics, 190(6):106, 2023

2023
[60]

Impossibility of achieving Landauer’s bound for almost every quantum state

Paul M Riechers and Mile Gu. Impossibility of achieving Landauer’s bound for almost every quantum state. Phys. Rev. A, 104:012214, Jul 2021. doi: 10.1103/PhysRevA.104.012214. URL https://link.aps.org/doi/10.1103/PhysRevA.104.012214

2021
[61]

Dominik Šafránek, Dario Rosa, and Felix C. Binder. Work extraction from unknown quantum sources. Phys. Rev. Lett., 130:210401, May 2023. doi: 10.1103/PhysRevLett.130.210401. URL https://link.aps.org/doi/10.1103/PhysRevLett.130.210401

work page doi:10.1103/physrevlett.130.210401 2023
[62]

Thermodynamics of information. Nature Physics, 11(2):131–139, 2015

Juan MR Parrondo, Jordan M Horowitz, and Takahiro Sagawa. Thermodynamics of information. Nature Physics, 11(2):131–139, 2015. doi: 10.1038/nphys3230. URL https://doi.org/10.1038/nphys3230

work page doi:10.1038/nphys3230 2015
[63]

Second law of thermodynamics for batteries with vacuum state. Quantum, 5:408, 2021

Patryk Lipka-Bartosik, Paweł Mazurek, and Michał Horodecki. Second law of thermodynamics for batteries with vacuum state. Quantum, 5:408, 2021

2021
[64]

Catalytic coherence

Johan Åberg. Catalytic coherence. Phys. Rev. Lett., 113:150402, Oct 2014. doi: 10.1103/PhysRevLett.113.150402. URL https://link.aps.org/doi/10.1103/PhysRevLett.113.150402

work page doi:10.1103/physrevlett.113.150402 2014
[65]

An improved Landauer principle with finite-size corrections. New Journal of Physics, 16(10):103011, 2014

David Reeb and Michael M Wolf. An improved Landauer principle with finite-size corrections. New Journal of Physics, 16(10):103011, 2014

2014
[66]

Implications of non-Markovian quantum dynamics for the Landauer bound. New Journal of Physics, 18(12):123018, 2016

Marco Pezzutto, Mauro Paternostro, and Yasser Omar. Implications of non-Markovian quantum dynamics for the Landauer bound. New Journal of Physics, 18(12):123018, 2016

2016
[67]

Initial-state dependence of thermodynamic dissipation for any quantum process. Physical Review E, 103(4):042145,

Paul M Riechers and Mile Gu. Initial-state dependence of thermodynamic dissipation for any quantum process. Physical Review E, 103(4):042145,
[68]

doi: 10.1103/PhysRevE.103.042145. URL https://doi.org/10.1103/PhysRevE.103.042145

work page doi:10.1103/physreve.103.042145
[70]

Finite-time quantum Landauer principle and quantum coherence

Tan Van Vu and Keiji Saito. Finite-time quantum Landauer principle and quantum coherence. Phys. Rev. Lett., 128:010602, Jan 2022. doi: 10.1103/PhysRevLett.128.010602. URL https://link.aps.org/doi/10.1103/PhysRevLett.128.010602

work page doi:10.1103/physrevlett.128.010602 2022
[71]

Landauer versus Nernst: What is the true cost of cooling a quantum system? PRX Quantum, 4(1):010332, 2023

Philip Taranto, Faraj Bakhshinezhad, Andreas Bluhm, Ralph Silva, Nicolai Friis, Maximilian PE Lock, Giuseppe Vitagliano, Felix C Binder, Tiago Debarba, Emanuel Schwarzhans, et al. Landauer versus Nernst: What is the true cost of cooling a quantum system? PRX Quantum, 4(1):010332, 2023

2023
[72]

Calculus

G. Strang. Calculus. Wellesley-Cambridge Press, 2019. ISBN 9780980232752. URL https://www.cambridge.org/us/universitypress/subjects/mathematics/real-and-complex-analysis/calculus-3rd-edition-1

2019
[73]

Quantifying coherence. Physical Review Letters, 113(14):140401, 2014

Tillmann Baumgratz, Marcus Cramer, and Martin B Plenio. Quantifying coherence. Physical Review Letters, 113(14):140401, 2014

2014
[74]

Autonomous quantum devices: When are they realizable without additional thermodynamic costs?

Mischa P. Woods and Michał Horodecki. Autonomous quantum devices: When are they realizable without additional thermodynamic costs? Phys. Rev. X, 13:011016, Feb 2023. doi: 10.1103/PhysRevX.13.011016. URL https://link.aps.org/doi/10.1103/PhysRevX.13.011016

work page doi:10.1103/physrevx.13.011016 2023
[75]

Unified view of quantum and classical correlations

Kavan Modi, Tomasz Paterek, Wonmin Son, Vlatko Vedral, and Mark Williamson. Unified view of quantum and classical correlations. Phys. Rev. Lett., 104:080501, Feb 2010. doi: 10.1103/PhysRevLett.104.080501. URL https://link.aps.org/doi/10.1103/PhysRevLett.104.080501

work page doi:10.1103/physrevlett.104.080501 2010
[76]

Work extraction and thermodynamics for individual quantum systems. Nature Communications, 5, 2014

Paul Skrzypczyk, Anthony J Short, and Sandu Popescu. Work extraction and thermodynamics for individual quantum systems. Nature Communications, 5, 2014. doi: 10.1038/ncomms5185. URL https://doi.org/10.1038/ncomms5185

work page doi:10.1038/ncomms5185 2014
[77]

Quantum and information thermodynamics: A unifying framework based on repeated interactions

Philipp Strasberg, Gernot Schaller, Tobias Brandes, and Massimiliano Esposito. Quantum and information thermodynamics: A unifying framework based on repeated interactions. Phys. Rev. X, 7:021003, Apr 2017. doi: 10.1103/PhysRevX.7.021003. URL https://link.aps.org/doi/10.1103/PhysRevX.7.021003

work page doi:10.1103/physrevx.7.021003 2017
[78]

Work and information processing in a solvable model of Maxwell’s demon

Dibyendu Mandal and Christopher Jarzynski. Work and information processing in a solvable model of Maxwell’s demon. Proceedings of the National Academy of Sciences, 109(29):11641–11645, 2012. doi: 10.1073/pnas.1204263109. URL https://doi.org/10.1073/pnas.1204263109

work page doi:10.1073/pnas.1204263109 2012
[79]

Correlation-powered information engines and the thermodynamics of self-correction

Alexander B. Boyd, Dibyendu Mandal, and James P. Crutchfield. Correlation-powered information engines and the thermodynamics of self-correction. Phys. Rev. E, 95:012152, Jan 2017. doi: 10.1103/PhysRevE.95.012152. URL https://link.aps.org/doi/10.1103/PhysRevE.95.012152

work page doi:10.1103/physreve.95.012152 2017
[80]

Transient dissipation and structural costs of physical information transduction. Physical Review Letters, 118(22):220602, 2017

Alexander B Boyd, Dibyendu Mandal, Paul M Riechers, and James P Crutchfield. Transient dissipation and structural costs of physical information transduction. Physical Review Letters, 118(22):220602, 2017

2017
[81]

Measurement of quantum mechanical operators. Physical Review, 120(2):622, 1960

Huzihiro Araki and Mutsuo M Yanase. Measurement of quantum mechanical operators. Physical Review, 120(2):622, 1960

1960
[82]

POMDP inference and robust solution via deep reinforcement learning: An application to railway optimal maintenance. Machine Learning, 113(10):7967–7995, 2024

Giacomo Arcieri, Cyprien Hoelzl, Oliver Schwery, Daniel Straub, Konstantinos G Papakonstantinou, and Eleni Chatzi. POMDP inference and robust solution via deep reinforcement learning: An application to railway optimal maintenance. Machine Learning, 113(10):7967–7995, 2024

2024

Showing first 80 references.