pith. sign in
def

arithmeticCoding

definition
show as:
module
IndisputableMonolith.Information.Compression
domain
Information
line
191 · github
papers citing
none yet

plain-language theorem explainer

Arithmetic coding is defined as the scheme achieving near-optimal compression with average length L converging to entropy H for long messages. Researchers applying Recognition Science to information theory cite it when linking compression to J-cost minimization. The declaration is a direct string assignment that summarizes the single-number encoding property and its relation to dictionary methods.

Claim. Arithmetic coding is the encoding that represents a message by a single number in the unit interval such that the expected code length $L$ satisfies $L → H$ as message length tends to infinity, where $H$ is the Shannon entropy.

background

The Information.Compression module derives compression limits from J-cost. In Recognition Science, information carries J-cost and compressed forms lower this cost while remaining faithful; the entropy bound is therefore the minimum J-cost for lossless representation. Shannon's source coding theorem supplies the classical limit: average code length is at least $H(X) = -∑ p(x) log₂ p(x)$. Upstream, the shifted cost $H(x) = J(x) + 1$ satisfies the d'Alembert equation $H(xy) + H(x/y) = 2 H(x) H(y)$, and configuration entropy is defined as total defect.

proof idea

The declaration is a direct string assignment. It records the three properties listed in the doc-comment (near-optimality for any distribution, $L → H$ in the long-message limit, and single-number encoding) together with a brief contrast to Lempel-Ziv dictionary methods.

why it matters

The definition fills the arithmetic-coding slot inside INFO-003, the module establishing that compression equals J-cost minimization. It supplies the concrete scheme that saturates the entropy bound derived from the Recognition Composition Law and the initial-condition entropy definition. No downstream theorems are recorded, leaving open the question of a formal Lean implementation that would discharge the descriptive string into an executable encoder.

Switch to Lean above to see the machine-checked source, dependencies, and usage graph.