A Statistical-Physics Refinement of Soft Covering

Neri Merhav

arxiv: 2605.01839 · v1 · submitted 2026-05-03 · 💻 cs.IT · math.IT

A Statistical-Physics Refinement of Soft Covering

Neri Merhav This is my paper

Pith reviewed 2026-05-09 16:08 UTC · model grok-4.3

classification 💻 cs.IT math.IT

keywords annealed free energyrandom codingsoft coveringphase transitionsRényi spectrumchannel resolvabilityguessworkhypothesis testing

0 comments

The pith

The annealed free energy of random code outputs splits into competing bulk and sparse branches from typical and atypical codewords.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives a single-letter formula for the annealed free energy of the output distribution produced by sending a fixed-type random code through a discrete memoryless channel. This formula decomposes into a bulk term driven by typical codewords and a sparse term driven by atypical codewords, whose relative strength depends on rate and inverse temperature. The authors characterize the phase boundary between these regimes explicitly when the inverse temperature is at least one and sketch the full phase diagram in the beta-R plane. This decomposition refines the soft-covering picture by showing how different codeword populations control the Rényi spectrum of the induced output distribution. The results supply exact expressions usable for guesswork, resolvability, and hypothesis testing.

Core claim

The annealed free energy, given by the normalized log of the sum over output sequences of the beta-power of the conditional output probability given the code, admits a single-letter expression that separates into a bulk branch psi_b(beta, R) governed by typical codewords and a sparse branch psi_s(beta, R) governed by atypical codewords. These branches compete, and for beta at least one their equality defines an explicit phase boundary R star of beta. The resulting phase diagram in the first quadrant is divided into four regions by the curves R equals I^b(beta), R equals R star of beta, and R equals I^s(beta), all three meeting at the point (one, mutual information).

What carries the argument

The partition function Z_n(beta | C) equal to the sum over y^n of [P_{Y^n | C}(y^n)]^beta, whose normalized logarithm is the annealed free energy; it encodes the full Rényi spectrum of the output distribution and is analyzed by separating contributions from typical and atypical codeword populations.

If this is right

The phase diagram consists of four regions separated by three boundaries that meet at (beta, R) equals (one, mutual information).
The bulk branch governs the free energy below the phase boundary while the sparse branch governs it above, for beta at least one.
The same single-letter expressions apply directly to refined analyses of guesswork, channel resolvability, and binary hypothesis testing.
Both branches are derived for every beta greater than zero, although the explicit competition boundary is given only for beta at least one.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the bulk-sparse decomposition is accurate, then atypical codewords can dominate output statistics at moderate rates in ways invisible to ordinary typical-set arguments.
The same statistical-physics partition-function approach could be applied to other coding problems whose performance depends on the entire output distribution rather than only error probability.
Numerical verification on finite-length codes near the predicted phase boundary would test whether the typicality separation remains sharp outside the asymptotic regime.

Load-bearing premise

The derivation assumes a random code ensemble with a fixed input type transmitted over a discrete memoryless channel, so that standard typicality arguments cleanly separate the bulk population of typical codewords from the sparse population of atypical codewords.

What would settle it

For the Z-channel example in the paper, fix beta greater than or equal to one and a rate near the predicted R star of beta, then compute the empirical annealed free energy from many short random codebooks; the switch from the bulk formula to the sparse formula should occur at the same rate predicted by the closed-form boundary.

Figures

Figures reproduced from arXiv: 2605.01839 by Neri Merhav.

**Figure 1.** Figure 1: plots all phase boundaries together in the relevant quadrant of the (β, R) plane. 1 2 3 4 5 6 (inverse temperature, 1) 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 R (n ats) I(X; Y) A = b b: unc. s: constr. B = b b: unc. s < b C = s = R(1 ) + C( ) D b: unc. s > b = s = R(1 ) + C( ) b: cond. s > b = 1 R = I * ( ) (bulk condensation boundary) R = R * ( ) (bulk/sparse crossover) R = I * * ( ) (sparse closed-form boundary) view at source ↗

**Figure 2.** Figure 2: R1 = 0.12 < I(X; Y ). The sparse branch ψs(β, R1) lies above the bulk branch for all β ≥ 1, so the annealed free energy is dominated by the sparse branch; both lie well above the i.i.d. reference, reflecting ensemble inflation by rare codebooks. 16 view at source ↗

**Figure 3.** Figure 3: R2 = I(X; Y ) = 0.2441 nats (the soft-covering threshold). Both branches and the i.i.d. reference all vanish at β = 1 (since Zn(1) = 1), and diverge differently for β > 1. 1 2 3 4 5 6 3.5 3.0 2.5 2.0 1.5 1.0 0.5 0.0 Free energy (nats) R3 = 0.35 > I(X; Y) b( , R) s( , R) iid( ) = log y PY(y) view at source ↗

**Figure 4.** Figure 4: R3 = 0.35 > I(X; Y ). The bulk branch dominates throughout and tracks the i.i.d. reference closely, confirming that above the soft-covering threshold the output mixture approximates P ⊗n Y in the R´enyi sense. In all three figures, ψiid(β) ≤ ψ(β, R), with strict inequality below I(X; Y ): the random-coding mixture has strictly higher R´enyi entropy than the i.i.d. output distribution at every order β ≥ 1. 17 view at source ↗

read the original abstract

We study the channel output distribution induced by a rate-$R$ random code via statistical physics. The partition function is $Z_n(\beta|\mathcal{C}) = \sum_{y^n}[P_{Y^n|\mathcal{C}}(y^n)]^\beta$, where $\mathcal{C}$ is the code and $\beta > 0$ is inverse temperature. Our focus is on the free energy which is the normalized logarithm of this quantity, which encodes the full R\'{e}nyi spectrum of the output distribution. The single-letter formula derived for the annealed free energy decomposes into two branches which reflect a ``competition'' between two populations of codewords. One is the \emph{bulk branch}, $\psi_{\mbox{\tiny b}}(\beta,R)$, which is driven by typical codewords and the other one is the \emph{sparse branch} $\psi_{\mbox{\tiny s}}(\beta,R)$, which is driven by a-typical codewords, where the qualifiers `typical' and `atypical' are in a sense to become apparent later. We analyze the phase structure of each branch separately and characterize their competition. Both branches are derived for all $\beta > 0$. The phase boundary $R^\star(\beta)$, where the two branches are equal, is analyzed for $\beta \geq 1$, where it has an explicit closed-form expression. The phase diagram in the first quadrant of the $(\beta, R)$ plane has four regions separated by three boundaries: $R = I^{\mbox{\tiny b}}(\beta)$ (bulk branch transition), $R = R^\star(\beta)$ (bulk--sparse competition boundary), and $R = I^{\mbox{\tiny s}}(\beta)$ (sparse branch transition), all meeting at the point $(\beta, R) = (1, I(X;Y))$, where $I(X;Y)$ is the mutual information induced by the input type and the channel. Applications to guesswork, channel resolvability, and hypothesis testing are discussed, and all results are illustrated with a numerical example of a Z-channel.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Merhav splits the annealed free energy of random-code outputs into bulk and sparse branches, with a closed-form phase boundary R-star(beta) for beta at least 1 and a four-region diagram.

read the letter

Merhav's paper applies statistical physics ideas to the output distribution of a random code over a DMC. It defines a partition function Z as the sum over y of the output prob to the power beta, then looks at the annealed free energy, which is (1/n) log E[Z]. The main move is to split this into a bulk part from typical codewords and a sparse part from atypical ones, then find where they compete. What stands out is the explicit phase boundary R-star(beta) for beta at least 1, and the four regions in the beta-R plane. The boundaries meet at the mutual information point when beta equals 1. This organizes the Rényi spectrum in a way that previous soft covering results did not. The applications to guesswork, resolvability, and hypothesis testing are sketched, and the Z-channel numbers make the regions concrete. The work is grounded in standard random coding and method of types. That helps. The decomposition looks reproducible from the setup. The potential issue is whether the separation between bulk and sparse really holds without leftover terms. The stress-test point is right to ask about large-deviation precision on the atypical contributions. If the paper shows that the cross terms vanish and the exponents match exactly, then the phase diagram is solid. But if it relies on rough typicality without o(1) control inside the log E, the location of the boundary could move. Since the closed form is only for beta >=1, the beta <1 side might need more work. This paper is for people in information theory who already know soft covering and want to see a statistical mechanics angle on it. It will interest those working on Rényi divergences or random code ensembles. It is worth sending to peer review because the claims are precise enough to be checked and the topic connects several areas.

Referee Report

2 major / 2 minor

Summary. The paper derives a single-letter expression for the annealed free energy (1/n) log E_C[Z_n(β|C)] of the partition function Z_n(β|C) = sum_y [P_{Y^n|C}(y^n)]^β induced by a rate-R random code ensemble with fixed input type over a DMC. The expression decomposes into a bulk branch ψ_b(β,R) driven by typical codewords and a sparse branch ψ_s(β,R) driven by atypical codewords; their competition is governed by an explicit phase boundary R^*(β) for β ≥ 1. The phase diagram in the (β,R) plane has four regions delimited by R = I^b(β), R = R^*(β), and R = I^s(β), all meeting at (1, I(X;Y)). Results are illustrated on the Z-channel and applied to guesswork, channel resolvability, and hypothesis testing.

Significance. If the claimed decomposition and error control hold, the explicit single-letter formulas and closed-form R^*(β) constitute a concrete refinement of soft-covering analysis, supplying a phase diagram that separates regimes where typical versus atypical codewords dominate the Rényi spectrum of the output distribution. The Z-channel numerics and applications to guesswork/resolvability provide immediate testability and potential utility for exponent calculations.

major comments (2)

[§3] §3 (sparse-branch derivation): the additivity of bulk and sparse contributions to log E[Z] after partitioning by input type requires an explicit large-deviation bound showing that the probability of the atypical set times the conditional growth factor yields an exponent with o(1) remainder inside the log E[·] operation; without this, the location of R^*(β) can shift by a sub-exponential term.
[§4] §4 (phase-boundary analysis for β ≥ 1): the claim that R^*(β) is the unique crossing point where ψ_b = ψ_s assumes the two branches are each convex and that cross terms vanish; the manuscript must verify that the method-of-types approximation for the atypical conditional output distribution introduces no additional linear-in-n terms that would alter the equality.

minor comments (2)

[Abstract] The abstract states that both branches are derived for all β > 0 while the explicit R^*(β) is given only for β ≥ 1; a brief remark on the technical obstacle for β < 1 would improve clarity.
[Figure 1] Figure 1 (Z-channel phase diagram): label the three boundaries explicitly with the corresponding expressions I^b(β), R^*(β), I^s(β) to match the text description.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and insightful comments. We address each major comment below and indicate the revisions planned to strengthen the derivations.

read point-by-point responses

Referee: [§3] §3 (sparse-branch derivation): the additivity of bulk and sparse contributions to log E[Z] after partitioning by input type requires an explicit large-deviation bound showing that the probability of the atypical set times the conditional growth factor yields an exponent with o(1) remainder inside the log E[·] operation; without this, the location of R^*(β) can shift by a sub-exponential term.

Authors: We agree that an explicit large-deviation bound is needed to rigorously control the error in the sparse-branch derivation. The current §3 analysis uses method-of-types bounds that capture the leading exponential behavior, but the o(1) remainder inside log E[·] is implicit. In the revision we will insert a dedicated lemma deriving the large-deviation estimate for atypical input types, confirming that the sparse contribution to (1/n) log E[Z_n] has an o(1) remainder and that R^*(β) is unaffected by sub-exponential shifts. revision: yes
Referee: [§4] §4 (phase-boundary analysis for β ≥ 1): the claim that R^*(β) is the unique crossing point where ψ_b = ψ_s assumes the two branches are each convex and that cross terms vanish; the manuscript must verify that the method-of-types approximation for the atypical conditional output distribution introduces no additional linear-in-n terms that would alter the equality.

Authors: We acknowledge the need to verify convexity and the absence of linear-in-n error terms at the crossing point. Convexity of both branches follows from the underlying convex rate functions, but we will state this explicitly. We will also add a refined error analysis showing that the method-of-types approximation for the atypical conditional output distribution contributes only o(n) to the exponent, with no linear-in-n terms that could shift ψ_b = ψ_s. These clarifications will be incorporated in the revised §4. revision: yes

Circularity Check

0 steps flagged

Standard method-of-types derivation with no self-referential reduction or load-bearing self-citation

full rationale

The annealed free energy decomposition into bulk branch ψ_b(β,R) and sparse branch ψ_s(β,R) is obtained by partitioning the random code ensemble according to whether codewords satisfy the fixed input type (typical) or lie in the atypical set, then applying large-deviation exponents to E[Z_n]. This is a direct application of the method of types to the DMC with fixed composition; the phase boundary R^*(β) for β≥1 arises from equating the resulting exponents without any parameter fitted inside the paper or renamed as a prediction. No equation reduces to a prior result by the paper's own definitions, and no uniqueness theorem or ansatz is imported via self-citation. The derivation is self-contained against external benchmarks of random coding and typicality.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on standard random-coding assumptions rather than new free parameters or invented entities.

axioms (1)

domain assumption Random code ensemble generated according to a fixed input type over a discrete memoryless channel
Invoked to define the output distribution P_{Y^n|C} and to separate typical and atypical codeword populations.

pith-pipeline@v0.9.0 · 5679 in / 1432 out tokens · 53916 ms · 2026-05-09T16:08:47.488743+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Soft Covering Through the Lens of Hypothesis Testing
cs.IT 2026-05 unverdicted novelty 7.0

Exact single-letter exponents for false-alarm and missed-detection probabilities are derived for the hypothesis-testing formulation of soft covering, exposing a phase diagram with collapse at the mutual-information rate.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages · cited by 1 Pith paper

[1]

Approximation theory of output statistics,

Han, T. S.; Verd´ u, S. “Approximation theory of output statistics,”IEEE Trans. Inf. Theory1993, vol. 39, no. 3, pp. 752–772

work page
[2]

Soft covering with high probability,

Cuff, P. “Soft covering with high probability,”Proc. IEEE Int. Symp. Inf. Theory (ISIT)2016, Barcelona, Spain, pp. 2963–2967

work page 2016
[3]

General nonasymptotic and asymptotic formulas in channel resolv- ability and identification capacity and their application to wire-tap channel,

Hayashi, M. “General nonasymptotic and asymptotic formulas in channel resolv- ability and identification capacity and their application to wire-tap channel,”IEEE Trans. Inf. Theory2006, vol. 52, no. 4, pp. 1562–1575

work page
[4]

Effective secrecy: Reliability, confusion and stealth,

Hou, J.; Kramer, G. “Effective secrecy: Reliability, confusion and stealth,”Proc. IEEE Int. Symp. Inf. Theory (ISIT)2014, pp. 601–605

work page 2014
[5]

M´ ezard, M.; Montanari, A.Information, Physics, and Computation, Oxford Univer- sity Press, New York, 2009

work page 2009
[6]

Statistical physics and information theory,

Merhav, N. “Statistical physics and information theory,”Foundations and Trends in Communications and Information Theory2010, vol. 6, no. 1–2, pp. 1–212

work page
[7]

Exact channel resolvability exponents for the soft-covering problem,

Yu, L.; Tan, V. Y.-F. “Exact channel resolvability exponents for the soft-covering problem,”IEEE Trans. Inf. Theory2020, vol. 66, no. 4, pp. 2098–2112

work page 2098
[8]

R´ enyi resolvability and its applications to the wiretap chan- nel,

Yu, L.; Tan, V. Y.-F. “R´ enyi resolvability and its applications to the wiretap chan- nel,”IEEE Trans. Inf. Theory2019, vol. 65, no. 3, pp. 1862–1897

work page
[9]

An inequality on guessing and its application to sequential decoding,

Arıkan, E. “An inequality on guessing and its application to sequential decoding,” IEEE Trans. Inf. Theory1996, vol. 42, no. 1, pp. 99–105

work page
[10]

Hypothesis testing and information theory,

Blahut, R. E. “Hypothesis testing and information theory,”IEEE Trans. Inf. Theory 1974, vol. 20, no. 4, pp. 405–417

work page 1974
[11]

Spin-glass models as error-correcting codes,

Sourlas, N. “Spin-glass models as error-correcting codes,”Nature1989, vol. 339, pp. 693–695

work page
[12]

Turbo codes: The phase transition,

Montanari, A. “Turbo codes: The phase transition,”European Physical Journal B 2001, vol. 18, pp. 321–333

work page 2001
[13]

Cambridge University Press, 2011

Csisz´ ar, I.; K¨ orner, J,Information Theory: Coding Theorems for Discrete Memo- ryless Systems, 2nd ed. Cambridge University Press, 2011

work page 2011
[14]

The method of types,

Csisz´ ar, I. “The method of types,”IEEE Trans. Inf. Theory1998, vol. 44, no. 6, pp. 2505–2523

work page
[15]

A toolbox for refined information-theoretic analyses with applications,

Merhav, N.; Weinberger, N. “A toolbox for refined information-theoretic analyses with applications,”Foundations and Trends in Communications and Information Theory2025, vol. 22, no. 1, pp. 1–184. 22

work page

[1] [1]

Approximation theory of output statistics,

Han, T. S.; Verd´ u, S. “Approximation theory of output statistics,”IEEE Trans. Inf. Theory1993, vol. 39, no. 3, pp. 752–772

work page

[2] [2]

Soft covering with high probability,

Cuff, P. “Soft covering with high probability,”Proc. IEEE Int. Symp. Inf. Theory (ISIT)2016, Barcelona, Spain, pp. 2963–2967

work page 2016

[3] [3]

General nonasymptotic and asymptotic formulas in channel resolv- ability and identification capacity and their application to wire-tap channel,

Hayashi, M. “General nonasymptotic and asymptotic formulas in channel resolv- ability and identification capacity and their application to wire-tap channel,”IEEE Trans. Inf. Theory2006, vol. 52, no. 4, pp. 1562–1575

work page

[4] [4]

Effective secrecy: Reliability, confusion and stealth,

Hou, J.; Kramer, G. “Effective secrecy: Reliability, confusion and stealth,”Proc. IEEE Int. Symp. Inf. Theory (ISIT)2014, pp. 601–605

work page 2014

[5] [5]

M´ ezard, M.; Montanari, A.Information, Physics, and Computation, Oxford Univer- sity Press, New York, 2009

work page 2009

[6] [6]

Statistical physics and information theory,

Merhav, N. “Statistical physics and information theory,”Foundations and Trends in Communications and Information Theory2010, vol. 6, no. 1–2, pp. 1–212

work page

[7] [7]

Exact channel resolvability exponents for the soft-covering problem,

Yu, L.; Tan, V. Y.-F. “Exact channel resolvability exponents for the soft-covering problem,”IEEE Trans. Inf. Theory2020, vol. 66, no. 4, pp. 2098–2112

work page 2098

[8] [8]

R´ enyi resolvability and its applications to the wiretap chan- nel,

Yu, L.; Tan, V. Y.-F. “R´ enyi resolvability and its applications to the wiretap chan- nel,”IEEE Trans. Inf. Theory2019, vol. 65, no. 3, pp. 1862–1897

work page

[9] [9]

An inequality on guessing and its application to sequential decoding,

Arıkan, E. “An inequality on guessing and its application to sequential decoding,” IEEE Trans. Inf. Theory1996, vol. 42, no. 1, pp. 99–105

work page

[10] [10]

Hypothesis testing and information theory,

Blahut, R. E. “Hypothesis testing and information theory,”IEEE Trans. Inf. Theory 1974, vol. 20, no. 4, pp. 405–417

work page 1974

[11] [11]

Spin-glass models as error-correcting codes,

Sourlas, N. “Spin-glass models as error-correcting codes,”Nature1989, vol. 339, pp. 693–695

work page

[12] [12]

Turbo codes: The phase transition,

Montanari, A. “Turbo codes: The phase transition,”European Physical Journal B 2001, vol. 18, pp. 321–333

work page 2001

[13] [13]

Cambridge University Press, 2011

Csisz´ ar, I.; K¨ orner, J,Information Theory: Coding Theorems for Discrete Memo- ryless Systems, 2nd ed. Cambridge University Press, 2011

work page 2011

[14] [14]

The method of types,

Csisz´ ar, I. “The method of types,”IEEE Trans. Inf. Theory1998, vol. 44, no. 6, pp. 2505–2523

work page

[15] [15]

A toolbox for refined information-theoretic analyses with applications,

Merhav, N.; Weinberger, N. “A toolbox for refined information-theoretic analyses with applications,”Foundations and Trends in Communications and Information Theory2025, vol. 22, no. 1, pp. 1–184. 22

work page