`kl_divergence_antitone`

proved

show as:

module: IndisputableMonolith.Thermodynamics.SecondLaw
domain: Thermodynamics
line: 199 · github
papers citing: none yet

plain-language theorem explainer

KL divergence to equilibrium decreases monotonically along trajectories generated by any J-descent operator. Information theorists and physicists reconstructing the second law from recognition-based first principles would cite this inequality. The argument is a direct one-line application of the single-step contraction property to the definition of antitone maps on the naturals.

Claim. Let $p_{eq}$ be a probability distribution on a finite nonempty set $Ω$. Let $R$ be an operator on probability distributions satisfying $R(p_{eq})=p_{eq}$ and $D_{KL}(R(q)‖p_{eq})≤D_{KL}(q‖p_{eq})$ for every distribution $q$. Then for any initial distribution $q_0$, the map $n↦D_{KL}((R^n q_0)‖p_{eq})$ is antitone.

background

In the Recognition Science derivation of thermodynamics, a J-descent operator abstracts the recognition step on probability distributions over a finite configuration space $Ω$. As defined in the module, it is a map that fixes the equilibrium distribution and is non-expansive in KL divergence to that equilibrium: for every $q$, $D_{KL}(R(q)‖p_{eq})≤D_{KL}(q‖p_{eq})$. The KL divergence is the standard sum $∑ q(ω)log(q(ω)/p(ω))$ (with the usual conventions that the summand vanishes when either probability is zero). This lemma sits inside the module that assembles the second law from RS first principles, assuming a positive recognition temperature and a Gibbs equilibrium measure constructed from a positive cost coordinate $X$.

proof idea

The proof is a term-mode one-liner. After introducing the natural numbers $n$, $m$ and the hypothesis $n≤m$, it applies the auxiliary one-step KL contraction lemma to conclude that the divergence at step $n$ is at least the divergence at step $m$.

why it matters

This inequality is invoked directly by the master second-law theorem, which packages KL monotonicity together with the corresponding statement for recognition free energy and the variational lower bound. It supplies the deepest single inequality in the derivation; the free-energy version follows by a single multiplication with the temperature factor. In the broader framework it realizes the data-processing inequality for the abstract recognition operator, consistent with the composition law for the J-cost function and the eight-tick octave structure.

Switch to Lean above to see the machine-checked source, dependencies, and usage graph.

papers checked against this theorem

No papers in the current corpus map onto this theorem yet.