entropy_is_expected_surprisal
plain-language theorem explainer
Shannon entropy of a discrete probability distribution equals its expected surprisal, the sum of each positive probability multiplied by its surprisal. Recognition Science researchers cite this when equating entropy to total J-cost over distributions. The proof is a term reduction that rewrites via shannon_equals_jcost then unfolds totalJCost, probJCost and surprisal to reach reflexivity.
Claim. For a probability distribution $d$ over $n$ outcomes with probabilities $p_i$, the Shannon entropy equals the sum over $i$ of $p_i$ times the surprisal of $p_i$ (equal to $-log p_i$) whenever $p_i > 0$, and zero otherwise.
background
The module Information.ShannonEntropy derives Shannon entropy from Recognition Science's J-cost structure. J-cost is $J(x) = (x + x^{-1})/2 - 1$, measuring recognition effort; the shifted $H(x) = J(x) + 1$ satisfies the d'Alembert equation from CostAlgebra. A ProbDist $n$ is a structure with probabilities that are non-negative and sum to one. The module shows entropy as total J-cost, building on the entropy definition from InitialCondition that equates it to total defect.
proof idea
The proof is a term-mode reduction. It rewrites shannonEntropy d using shannon_equals_jcost to obtain totalJCost. It then unfolds totalJCost, probJCost and surprisal. Reflexivity closes the goal because both sides expand to the identical conditional sum over positive probabilities.
why it matters
This theorem embeds Shannon entropy inside the Recognition Science framework by expressing it as expected recognition cost. It fills the information-theory derivation step referenced in the module doc-comment for the paper on foundations of physics from RS and connects to J-uniqueness T5. With zero downstream uses recorded, its integration into mass formulas or observer forcing remains open.
Switch to Lean above to see the machine-checked source, dependencies, and usage graph.