Pareto-type finite-block optimality for source codes: a constrained Markov example

Stefano Della Fiore

arxiv: 2605.03552 · v1 · submitted 2026-05-05 · 💻 cs.IT · math.IT

Pareto-type finite-block optimality for source codes: a constrained Markov example

Stefano Della Fiore This is my paper

Pith reviewed 2026-05-07 13:46 UTC · model grok-4.3

classification 💻 cs.IT math.IT

keywords injective source codingPareto optimalityfinite-block lengthconstrained Markov sourceinformation costshortlex orderingDalai-Leonardi code

0 comments

The pith

A canonical shortlex code ordered by information cost achieves expected block lengths strictly below 3/2 n for every n greater than 1 on this constrained Markov source, showing the reversible Dalai-Leonardi code is not Pareto-optimal.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines a Pareto-type optimality for injective source codes that compares the full sequence of expected block lengths instead of asymptotic rates alone. It takes the four-symbol constrained Markov source from Dalai and Leonardi as a test case and builds a canonical code by sorting admissible strings first by increasing information cost K(u), then by length and lexicographic order, before mapping them to binary strings in shortlex order. For this source the code has expected length exactly 3/2 on single symbols and strictly less than 3/2 times n on longer blocks, with an extra saving of order c over square root of n for large n. This establishes that the reversible Dalai-Leonardi code fails the Pareto comparison. The proof uses exact enumeration of strings by cost class together with a shortlex gap identity that splits each cost class evenly between two consecutive lengths.

Core claim

By ordering admissible strings according to increasing information cost K(u) = -log2 P(u), then by length and lex, and assigning binary codes in shortlex order, the resulting code C satisfies E[|C(X1)|] = 3/2 and E[|C(X1^n)|] < (3/2) n for n >= 2, with E[|C(X1^n)|] <= (3/2)n - c/sqrt(n) for large n and sufficiently small c > 0. Consequently the reversible Dalai-Leonardi code fails to be Pareto-optimal with respect to finite-block average length on this source.

What carries the argument

The canonical injective mapping C that orders admissible strings by increasing K(u), then length and lex order, and assigns binary strings in shortlex order, which yields the stated length bounds through cost-class enumeration and the shortlex gap identity.

Load-bearing premise

The probabilities of the four-symbol constrained Markov source allow the canonical ordering by information cost to produce the claimed expected-length bounds.

What would settle it

Direct computation of the expected length under the reversible Dalai-Leonardi code for blocks of length n=2 or n=3; if this value is smaller than the bound proved for C, the Pareto non-optimality claim fails.

read the original abstract

We study a Pareto-type notion of finite-block optimality for injective source codes, where two codes are compared through the full sequence of expected block lengths. As a concrete and fully analyzable test case, we revisit the four-symbol constrained Markov source introduced by Dalai and Leonardi in their "meaningful example'' on constrained-source decodability. For each admissible nonempty string $u=x_1^m \in \mathscr{A} \subset \mathscr{X}^+$, let $$ K(u):=-\log_2 \mathbb{P}(X_1^m=u) $$ denote its information cost. We construct a canonical injective binary mapping $C:\mathscr{A} \to \{0,1\}^+$ by ordering admissible strings by increasing $K(u)$, then by length and lexicographic order, and assigning binary strings in shortlex order. For the length-$n$ block $X_1^n$ we prove $$ \mathbb{E}[|C(X_1)|]=\tfrac32, \qquad \mathbb{E}[|C(X_1^n)|]<\tfrac32\,n\quad (n\ge 2). $$ Moreover, for every fixed $$ 0<c<\frac{\sqrt2}{18\sqrt\pi} $$ we have $$ \mathbb{E}[|C(X_1^n)|]\le \tfrac32\,n-\frac{c}{\sqrt n} $$ for all sufficiently large $n$. Thus, for this source, the reversible Dalai-Leonardi code is not Pareto-optimal with respect to finite-block average length. The proof is based on an exact enumeration of admissible strings by information cost and on a shortlex gap identity implying that each cost class splits evenly between lengths $K(u)-1$ and $K(u)$. The example is simple, but it already exhibits the kind of finite-block Pareto comparison that seems natural for injective source coding under source constraints.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper shows that the reversible Dalai-Leonardi code for their four-symbol constrained Markov source loses on finite-block expected length to a simple canonical injective code, with an explicit gap of order 1/sqrt(n).

read the letter

The main thing to know is that for this specific constrained Markov source the reversible Dalai-Leonardi code is not Pareto optimal once you compare the full sequence of finite-block expected lengths. The authors build a canonical code that orders admissible strings first by increasing information cost K(u) = -log P(u), then by length and lex order, and assigns binary strings in shortlex order. They prove the new code gives exactly 3/2 for single symbols and strictly less than (3/2)n for blocks of length n >= 2, plus a further c/sqrt(n) improvement for large n with a small positive c. This directly beats the DL code on every finite n they check.

Referee Report

0 major / 3 minor

Summary. The paper introduces a Pareto-type finite-block optimality notion for injective source codes, comparing them via the full sequence of expected lengths E[|C(X_1^n)|]. For the four-symbol constrained Markov source of Dalai and Leonardi, it defines information cost K(u) = -log2 P(X_1^m = u) for admissible u, constructs a canonical injective code C by ordering admissible strings first by increasing K(u), then length and lexicographic order, and mapping to binary strings in shortlex order. It proves E[|C(X_1)|] = 3/2, E[|C(X_1^n)|] < (3/2)n for n >= 2, and the stronger bound E[|C(X_1^n)|] <= (3/2)n - c/sqrt(n) for 0 < c < sqrt(2)/(18 sqrt(pi)) and large n. This establishes that the reversible Dalai-Leonardi code is not Pareto-optimal under this finite-block criterion. The argument rests on exact enumeration of admissible strings by K(u) and a shortlex gap identity showing even splitting of each cost class between lengths K(u)-1 and K(u).

Significance. If the enumeration and gap identity are correct, the paper supplies a fully explicit, parameter-free example separating asymptotic optimality from finite-block Pareto optimality for constrained sources. The exact closed-form expectations, the 1/sqrt(n) improvement, and the direct comparison to the Dalai-Leonardi code constitute concrete, verifiable contributions that can serve as a benchmark for future work on finite-block coding. The derivation is self-contained, proceeding directly from the source probabilities and the ordering rule without fitted parameters or circularity.

minor comments (3)

[Proof of the main inequalities] The shortlex gap identity and the resulting even splitting of cost classes are central to the exact expressions; a dedicated lemma or subsection stating the identity formally (with a short proof or reference to the enumeration) would improve readability and verifiability.
[Asymptotic improvement] While the abstract states the bound on c, the derivation of the specific constant sqrt(2)/(18 sqrt(pi)) is not sketched; adding a brief outline of how the constant arises from the enumeration would strengthen the asymptotic claim.
[Construction of C] Notation for the admissible set (mathscr{A}) and the canonical mapping C is introduced concisely; an explicit small table of the first few admissible strings, their K(u) values, and assigned codewords would allow immediate checking of the E[|C(X_1)|] = 3/2 result.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive and detailed summary of our manuscript, as well as for recognizing its significance as an explicit, parameter-free example separating asymptotic optimality from finite-block Pareto optimality. The recommendation for minor revision is noted.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation begins from the explicit four-symbol constrained Markov source probabilities and defines K(u) directly as -log2 P(u). It then applies a fully specified canonical ordering (increasing K(u), then length and lex, with shortlex binary assignment) to construct the injective code C. The length expectations E[|C(X1)|]=3/2 and E[|C(X1^n)|]<(3/2)n are obtained via direct enumeration of admissible strings grouped by K(u) together with the shortlex gap identity that each cost class splits evenly across two consecutive lengths. These combinatorial identities yield the 1/sqrt(n) improvement bound without any fitted parameters, self-definitional loops, or load-bearing self-citations. The comparison establishing that the reversible Dalai-Leonardi code is not Pareto-optimal follows immediately from the strict inequality in the derived finite-block sequence. All steps are self-contained against the source model and ordering rule.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain-specific probability model of the constrained Markov source and standard properties of shortlex ordering and information costs; no free parameters are fitted and no new entities are postulated.

axioms (1)

domain assumption The source is the four-symbol constrained Markov source defined by Dalai and Leonardi
All probability calculations and admissible-string enumerations depend on this specific model.

pith-pipeline@v0.9.0 · 5636 in / 1317 out tokens · 94878 ms · 2026-05-07T13:46:22.959528+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages

[1]

Lagrange inversion,

I. M. Gessel, “Lagrange inversion,”J. Combin. Theory Ser. A, vol. 144, pp. 212–249, 2016

work page 2016
[2]

Fifty years of Shannon theory,

S. Verdú, “Fifty years of Shannon theory,”IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2057– 2078, Oct. 1998

work page 2057
[3]

A mathematical theory of communication,

C. E. Shannon, “A mathematical theory of communication,”Bell System Technical Journal, vol. 27, pp. 379–423 and 623–656, 1948

work page 1948
[4]

Two inequalities implied by unique decipherability,

B. McMillan, “Two inequalities implied by unique decipherability,”IRE Trans. Inform. Theory, vol. IT-2, no. 4, pp. 115–116, Dec. 1956

work page 1956
[5]

L. G. Kraft,A device for quantizing, grouping, and coding amplitude-modulated pulses, master’s thesis, Department of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 1949

work page 1949
[6]

T. M. Cover and J. A. Thomas,Elements of Information Theory. New York: Wiley, 1991

work page 1991
[7]

R. G. Gallager,Information Theory and Reliable Communication. New York: Wiley, 1968

work page 1968
[8]

On unique decodability,

M. Dalai and R. Leonardi, “On unique decodability,”IEEE Trans. Inform. Theory, vol. 54, no. 11, pp. 5068–5072, Nov. 2008

work page 2008
[9]

A lower bound on the expected length of one-to-one codes,

N. Alon and A. Orlitsky, “A lower bound on the expected length of one-to-one codes,”IEEE Trans. Inform. Theory, vol. 40, no. 5, pp. 1670–1672, Sep. 1994. 12

work page 1994

[1] [1]

Lagrange inversion,

I. M. Gessel, “Lagrange inversion,”J. Combin. Theory Ser. A, vol. 144, pp. 212–249, 2016

work page 2016

[2] [2]

Fifty years of Shannon theory,

S. Verdú, “Fifty years of Shannon theory,”IEEE Trans. Inform. Theory, vol. 44, no. 6, pp. 2057– 2078, Oct. 1998

work page 2057

[3] [3]

A mathematical theory of communication,

C. E. Shannon, “A mathematical theory of communication,”Bell System Technical Journal, vol. 27, pp. 379–423 and 623–656, 1948

work page 1948

[4] [4]

Two inequalities implied by unique decipherability,

B. McMillan, “Two inequalities implied by unique decipherability,”IRE Trans. Inform. Theory, vol. IT-2, no. 4, pp. 115–116, Dec. 1956

work page 1956

[5] [5]

L. G. Kraft,A device for quantizing, grouping, and coding amplitude-modulated pulses, master’s thesis, Department of Electrical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 1949

work page 1949

[6] [6]

T. M. Cover and J. A. Thomas,Elements of Information Theory. New York: Wiley, 1991

work page 1991

[7] [7]

R. G. Gallager,Information Theory and Reliable Communication. New York: Wiley, 1968

work page 1968

[8] [8]

On unique decodability,

M. Dalai and R. Leonardi, “On unique decodability,”IEEE Trans. Inform. Theory, vol. 54, no. 11, pp. 5068–5072, Nov. 2008

work page 2008

[9] [9]

A lower bound on the expected length of one-to-one codes,

N. Alon and A. Orlitsky, “A lower bound on the expected length of one-to-one codes,”IEEE Trans. Inform. Theory, vol. 40, no. 5, pp. 1670–1672, Sep. 1994. 12

work page 1994