The Random Subsequence Model and Uniform Codes for the Deletion Channel
Pith reviewed 2026-05-10 17:09 UTC · model grok-4.3
The pith
Uniformly random codes achieve positive rate in the deletion channel for every deletion probability less than one.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the regime where N and M go to infinity at fixed ratio alpha = M/N in (0,1), strict asymptotic separations hold between the null annealed free energy and the quenched free energies of the null and planted models at every alpha. These separations imply that uniformly random codes achieve positive rate in the deletion channel for all deletion probabilities p in [0,1). An exact analytic formula for the annealed free energy of the planted model supplies a complementary upper bound on the same rate.
What carries the argument
The Random Subsequence Model, a spin glass model on pairs of random strings whose partition function counts the number of subsequence embeddings of one string inside the other.
If this is right
- Uniformly random codes achieve positive rate for deletion probabilities arbitrarily close to one.
- The null and planted models remain in a zero-temperature spin-glass phase throughout the dense regime.
- The new upper and lower bounds on the random-coding rate for the deletion channel lie much closer together than all previously known bounds on deletion-channel capacity.
- An exact formula for the planted annealed free energy is now available for every density.
Where Pith is reading between the lines
- The same free-energy separations may supply new upper or lower bounds on the length of the longest common subsequence between two random strings.
- The Random Subsequence Model could be adapted to analyze random codes for other synchronization-error channels such as insertions or substitutions.
- The gap between the analytic upper bound and the positive-rate lower bound leaves room for tighter characterizations of the exact random-coding capacity of the deletion channel.
Load-bearing premise
The strict asymptotic separations between the null annealed free energy and the quenched free energies of the null and planted models imply that uniformly random codes achieve positive rate in the deletion channel.
What would settle it
A direct computation or numerical check showing that the quenched free energy equals the annealed free energy for some fixed density alpha in (0,1) would remove the separation and thereby falsify the positive-rate claim for random codes.
Figures
read the original abstract
We introduce the Random Subsequence Model, a spin glass model on pairs of random strings $(X,Y) \in \{0,1\}^N \times \{0,1\}^M$ whose partition function counts subsequence embeddings of $Y$ into $X$. We study two variants: the null model, where $X$ and $Y$ are independent and uniform, and the planted model, where $X$ is uniform and $Y$ is a uniformly-random length-$M$ subsequence of $X$. We connect the Random Subsequence Model to longstanding problems in various fields, including the best rate achievable by uniformly-random codes in the deletion channel, the longest common subsequence problem between two random strings, and models of directed polymers in statistical physics. In the regime where $N,M\to\infty$ at a fixed ratio $\alpha = M/N \in (0,1)$, we exhibit strict asymptotic separations between the null annealed free energy and the quenched free energies of the null and planted models at all values of the density parameter $\alpha$. This suggests that these models are in a spin glass phase at zero temperature throughout the entire dense regime. As a consequence, we show that uniformly-random codes achieve a positive rate in the deletion channel for all deletion probabilities $p\in [0,1),$ settling multiple conjectures of the second author, Isik and Weissman (2024) and proving the first such positive rate result for the regime $p \geq 1/2$. We also give an exact analytic formula for the annealed free energy of the planted model for all values of the density parameter. This implies a corresponding analytic upper bound on the best rate achievable by uniformly-random codes in the deletion channel, complementing the lower bound from our first result. Our upper and lower bounds for the capacity of the deletion channel under uniform codes are far closer to each other than the best known upper and lower bounds for the capacity of the deletion channel.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces the Random Subsequence Model (RSM), a spin-glass model whose partition function counts subsequence embeddings of one random string into another. It considers null (independent strings) and planted (one string a random subsequence of the other) variants. For N,M→∞ at fixed ratio α=M/N∈(0,1), the paper proves strict asymptotic separations between the null annealed free energy and the quenched free energies of both models at every α. This separation is invoked to conclude that uniformly random codes achieve positive rate over the deletion channel for every deletion probability p=1−α∈[0,1), including the previously open regime p≥1/2. An exact closed-form expression is also derived for the annealed free energy of the planted model, supplying a matching upper bound on the uniform-code rate.
Significance. If the central claims hold, the work supplies the first positive-rate result for uniform random codes on the deletion channel when p≥1/2, thereby settling multiple conjectures from Isik–Weissman (2024) and earlier literature. The exact analytic formula for the planted annealed free energy is parameter-free and constitutes a concrete strength. The resulting upper and lower bounds on the uniform-code deletion-channel capacity are substantially tighter than the best known bounds for the unrestricted deletion-channel capacity. The RSM also furnishes a new rigorous link between deletion-channel coding, the longest-common-subsequence problem, and zero-temperature spin-glass models.
major comments (1)
- [section deriving the implication for the deletion channel] The load-bearing step from the strict asymptotic separation between the null annealed free energy and the quenched free energies of the null and planted models (for all α∈(0,1)) to a strictly positive uniform-code rate r>0 must be made fully explicit. In particular, the inequality or theorem that converts the free-energy gap ΔF(α) into a coding-rate lower bound r>0 needs to be stated without non-strict steps, vanishing o(1) terms, or regime-specific concentration arguments that could fail when α≤1/2 (p≥1/2).
minor comments (2)
- [Abstract] The abstract refers to 'settling multiple conjectures of the second author, Isik and Weissman (2024)' without naming the specific statements; adding a one-sentence pointer to the conjectures would aid readability.
- [model definition] Early introduction of the partition function Z_N,M and its free-energy normalizations would benefit from an explicit displayed equation to facilitate later cross-references to the annealed and quenched quantities.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work and for the constructive feedback. We address the major comment in detail below and will revise the manuscript accordingly to improve clarity.
read point-by-point responses
-
Referee: [section deriving the implication for the deletion channel] The load-bearing step from the strict asymptotic separation between the null annealed free energy and the quenched free energies of the null and planted models (for all α∈(0,1)) to a strictly positive uniform-code rate r>0 must be made fully explicit. In particular, the inequality or theorem that converts the free-energy gap ΔF(α) into a coding-rate lower bound r>0 needs to be stated without non-strict steps, vanishing o(1) terms, or regime-specific concentration arguments that could fail when α≤1/2 (p≥1/2).
Authors: We agree that the connection between the free-energy separation and the positive uniform-code rate requires a more explicit statement to ensure rigor, especially for the regime p ≥ 1/2. In the revised manuscript, we will add a clear derivation in the relevant section that directly links the strict asymptotic gap ΔF(α) > 0 to a strictly positive lower bound on the achievable rate r > 0. This will be done by explicitly stating the relevant inequality from the random coding argument for deletion channels, taking the asymptotic limit first to remove any o(1) terms, and verifying that the argument holds uniformly without relying on concentration inequalities that might not apply at high deletion probabilities. We will present this as a standalone proposition for transparency. revision: yes
Circularity Check
No circularity: free-energy separations and coding-rate implication are independently derived
full rationale
The paper defines the Random Subsequence Model explicitly, derives an exact closed-form expression for the annealed free energy of the planted model, and proves strict asymptotic separations between the null annealed free energy and the quenched free energies of both models for every alpha in (0,1). These separations are exhibited via model-specific analysis rather than by fitting parameters or renaming inputs. The positive-rate result for uniform codes in the deletion channel is stated as a direct consequence of the separations (with p = 1 - alpha), without any reduction by construction, self-citation load-bearing, or ansatz smuggling. The upper bound on the uniform-code rate is likewise analytic from the planted annealed free energy. No step in the provided derivation chain collapses to a tautology or fitted input; the central claims rest on asymptotic arguments internal to the model definitions and are self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Existence of the limit of the normalized log-partition function as N,M -> infinity with M/N = alpha fixed
- domain assumption The Random Subsequence Model partition function directly governs the achievable rate of uniform random codes over the deletion channel
invented entities (1)
-
Random Subsequence Model
no independent evidence
Reference graph
Works this paper leans on
-
[1]
On transmission over deletion channels
[Ale94] K. S. Alexander. The Rate of Convergence of the Mean Length of the Longest Common Subsequence.The Annals of Applied Probability4.4 (1994), pp. 1074–1082. [Bou99] J. Boutet de Monvel. Extensive Simulations for Longest Common Subsequences: Finite Size Scaling, a Cavity Solution, and Configuration Space Properties.The European Physical Journal B7 (19...
work page 1994
-
[2]
Mutual information for a deletion channel
Citeseer. 2001, pp. 573–582. [DM06] E. Drinea and M. Mitzenmacher. A Simple Lower Bound for the Capacity of the Deletion Channel.IEEE Transactions on Information Theory52.10 (2006), pp. 4657–4660. [Dob67] R. L. Dobrushin. Shannon’s theorems for channels with synchronization errors.Problemy Peredachi Informatsii3.4 (1967), pp. 18–36. [DP95] V. Dan ˇc´ık an...
work page 2001
-
[3]
[EA75] S. F. Edwards and P. W. Anderson. Theory of spin glasses.Journal of Physics F: Metal Physics5.5 (1975), pp. 965–974. [Gal61] R. G. Gallager.Sequential Decoding for Binary Channels with Noise and Synchronization Errors. Tech. rep. MIT Lincoln Laboratory,
work page 1975
-
[4]
The zero-rate threshold for adversarial bit-deletions is less than 1/2
[GHL22] V. Guruswami, X. He, and R. Li. “The zero-rate threshold for adversarial bit-deletions is less than 1/2”.2021 IEEE 62nd Annual Symposium on Foundations of Computer Science (FOCS). IEEE. 2022, pp. 727–738. [Hei+24] G. T. Heineman, C. Miller, D. Reichman, A. Salls, G. S ´ark¨ozy, and D. Soiffer. Improved Lower Bounds on the Expected Length of Longes...
-
[5]
[Ste82] J. M. Steele. Long Common Subsequences and the Proximity of Two Random Strings.SIAM Journal on Applied Math- ematics42.4 (1982), pp. 731–737. [WJ08] M. J. Wainwright and M. I. Jordan. Graphical models, exponential families, and variational inference.Foundations and Trends®in Machine Learning1.1-2 (2008), pp. 1–305. [Zig69] K. S. Zigangirov. Sequen...
work page 1982
- [6]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.