log_sum_inequality_fiber
plain-language theorem explainer
The fiberwise log-sum inequality states that for nonnegative real functions a and b that are positive on the support of a decidable predicate P with at least one element in the support, the conditional weighted sum of log-ratios is bounded below by the total a-mass times the log of the ratio of total masses. Information theorists and Recognition Science researchers proving thermodynamic monotonicity would cite it when deriving the data processing inequality for KL divergence. The proof reduces the claim to the standard log-sum inequality by re-
Claim. Let $I$ be a finite index set and $P:I→Prop$ a decidable predicate with nonempty support. For nonnegative $a,b:I→ℝ$ that are strictly positive on the support of $P$, one has $∑_{i∈I} 1_{P(i)} a(i) log(a(i)/b(i)) ≥ (∑_{i∈I} 1_{P(i)} a(i)) log( (∑_{i∈I} 1_{P(i)} a(i)) / (∑_{i∈I} 1_{P(i)} b(i)) ).
background
This theorem sits inside the FreeEnergyMonotone module, whose goal is to show that Recognition Free Energy is non-increasing under coarse-graining and equilibration, thereby supplying the Recognition Science version of the Second Law. The log-sum inequality supplies the analytic engine for the Data Processing Inequality on relative entropy. Upstream structures include PhiForcingDerived.of (encoding convexity of the J-cost J(x)=(x+x^{-1})/2−1) and QuantumLedger.probability (Born-rule probabilities). The module also imports MaxEntFromCost, which links maximum-entropy distributions to J-cost minimization.
proof idea
The tactic proof constructs the subtype ι'={i//P i}, restricts a and b to positive functions a' and b' on ι', invokes the sibling log_sum_inequality on a' and b', then applies Finset.sum_subtype and sum_filter to translate the restricted sums back to the original conditional expressions. The final rw steps equate the left- and right-hand sides of the target inequality.
why it matters
The result is used directly by data_processing_inequality, which asserts that KL divergence is nonincreasing under push-forwards; that theorem in turn feeds coarse_graining_decreases_free_energy and rs_arrow_of_time. It therefore closes the analytic step from the Recognition Composition Law and eight-tick dynamics (T7) to thermodynamic stability. The parent statement in the downstream doc-comment reads: 'Coarse-graining reduces the distinguishability of distributions.'
Switch to Lean above to see the machine-checked source, dependencies, and usage graph.