Simple Finite-Length Achievability and Converse Bounds for the Deletion Channel and the Insertion Channel
Pith reviewed 2026-05-22 18:21 UTC · model grok-4.3
The pith
A specific reference output distribution yields tighter finite-length upper bounds on code size for deletion and insertion channels than the binary erasure channel.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For a general discrete channel a reference output distribution can be selected so that the general converse bound simplifies to a closed-form expression; when this expression is evaluated on the deletion channel with finite block length it produces an upper bound on code size that is tighter than the binary-erasure-channel bound for the same length and frame-error target.
What carries the argument
A reference output distribution chosen for any finite-input finite-output channel that turns the general converse into a simple, explicit upper bound on code size.
If this is right
- The upper bound applies directly to both i.i.d. deletion and insertion channels.
- It is computable without optimizing over distributions or using data-dependent tuning.
- For deletion channels the bound is strictly tighter than the only previously available finite-length converse based on the binary erasure channel.
- An exponential-time algorithm computes a corresponding achievability bound for any discrete-input discrete-output channel.
- The same reference-distribution technique can be reused for other channels with finite alphabets.
Where Pith is reading between the lines
- These bounds could serve as quick sanity checks when designing practical deletion-correcting codes for short packets.
- Extending the reference distribution to channels with memory or continuous alphabets might produce analogous finite-length results.
- Comparing the bound against known asymptotic capacity expressions for large block lengths would test consistency.
- Implementing the algorithm for small block lengths could reveal how close the converse and achievability bounds sit for deletion channels.
Load-bearing premise
The chosen reference output distribution produces a bound that remains valid and reasonably tight for the specific deletion and insertion channels without requiring further optimization or data-dependent tuning.
What would settle it
For small block lengths, compute the actual largest code size that achieves the target frame error probability over the deletion channel and check whether it exceeds the new upper bound.
Figures
read the original abstract
We develop upper bounds on code size for an independent and identically distributed deletion and insertion channels for a given code length and target frame error probability. The bounds are obtained as a variation of a general converse bound, which, though available for any channel, is inefficient and not easily computable without a good reference distribution over the output alphabet. We obtain a reference output distribution for a general finite-input finite-output channel and provide a simple formula for the converse bound on the capacity employing this distribution. We then evaluate the bound for the deletion channel with a finite block length, and show that the resulting upper bound on the code size is tighter than that for a binary erasure channel, which is the only alternative converse bound for the finite-length setting. We also provide similar results for the insertion channel. Furthermore, we present a simple algorithm for computing an achievability bound for a general discrete-input discrete-output channel. Although the algorithm has exponential complexity, it is useful for comparison purposes.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper develops finite-length upper bounds on code size for i.i.d. deletion and insertion channels by applying a variation of a general converse bound that relies on an explicit reference output distribution over the (variable-length) output alphabet. It derives a simple closed-form expression for the resulting converse and evaluates it numerically to claim a strictly tighter bound than the binary erasure channel alternative; it also supplies an exponential-time algorithm for computing matching achievability bounds on general discrete channels.
Significance. If the reference distribution is shown to be valid for variable-length outputs, the work supplies the first explicit, non-optimized finite-length converse for deletion/insertion channels that improves on the only previously available bound (BEC). The explicit reference distribution and the simple achievability algorithm constitute practical, reproducible tools that enable direct numerical comparison without parameter tuning.
major comments (2)
- [Section describing the reference distribution and its application to deletion/insertion channels] The central claim that the derived upper bound is strictly tighter than the BEC bound rests on the reference output distribution remaining valid once the output alphabet is the union of {0,1}^k for k = 0 … n. The manuscript must explicitly construct this distribution over the variable-length space and verify that it dominates the true output measure induced by the capacity-achieving input (or at least that the resulting information-spectrum bound stays valid).
- [Evaluation section for the deletion channel] The abstract states that the bound is obtained 'as a variation of a general converse bound' and then 'evaluated' for the deletion channel. The manuscript should include the precise statement of the general converse (including the exact divergence or information-spectrum quantity used) together with the numerical values of the resulting bound versus the BEC bound for at least one block length and error probability, so that the claimed improvement can be reproduced.
minor comments (2)
- The achievability algorithm is described as having 'exponential complexity' yet 'useful for comparison purposes.' Adding a short pseudocode listing or a complexity table for small n would improve clarity.
- Notation for the output alphabet of the deletion channel (variable-length binary strings) should be introduced once and used consistently when defining the reference distribution.
Simulated Author's Rebuttal
We thank the referee for the thorough review and insightful comments on our manuscript. We address each major comment below in a point-by-point manner and indicate planned revisions to strengthen the presentation and reproducibility of the results.
read point-by-point responses
-
Referee: [Section describing the reference distribution and its application to deletion/insertion channels] The central claim that the derived upper bound is strictly tighter than the BEC bound rests on the reference output distribution remaining valid once the output alphabet is the union of {0,1}^k for k = 0 … n. The manuscript must explicitly construct this distribution over the variable-length space and verify that it dominates the true output measure induced by the capacity-achieving input (or at least that the resulting information-spectrum bound stays valid).
Authors: We agree that an explicit construction over the variable-length output space is essential for rigor. In the revised manuscript we will insert a dedicated subsection that defines the reference distribution explicitly as the product measure induced by independent Bernoulli(1/2) symbols for each possible output length k = 0 to n, normalized appropriately to account for the deletion probability. We will then prove that this distribution satisfies the required dominance property with respect to the true output measure under the uniform input distribution, thereby ensuring the validity of the information-spectrum converse bound. This construction follows directly from the general finite-input finite-output reference distribution derived earlier in the paper but is specialized to the union of binary strings of length at most n. revision: yes
-
Referee: [Evaluation section for the deletion channel] The abstract states that the bound is obtained 'as a variation of a general converse bound' and then 'evaluated' for the deletion channel. The manuscript should include the precise statement of the general converse (including the exact divergence or information-spectrum quantity used) together with the numerical values of the resulting bound versus the BEC bound for at least one block length and error probability, so that the claimed improvement can be reproduced.
Authors: We concur that the current presentation would benefit from greater explicitness. We will revise the manuscript to include the precise statement of the general converse bound, specifying that the upper bound on the code size is given by the information-spectrum quantity involving the divergence between the induced output distribution and the chosen reference distribution. In addition, we will add a table in the evaluation section that reports the numerical values of our bound and the BEC bound for at least one concrete block length (e.g., n = 20) and target error probability (e.g., 10^{-3}), allowing direct reproduction of the claimed improvement. revision: yes
Circularity Check
No significant circularity; bounds derived from general converse with independently obtained reference distribution
full rationale
The paper presents a variation of a general converse bound for finite-length deletion and insertion channels by first obtaining a reference output distribution for any finite-input finite-output channel and then evaluating the resulting code-size upper bound. This reference distribution and the simple formula for the converse are derived as a general tool rather than fitted to deletion/insertion data or reduced to a self-citation chain. The claimed tightness is shown by direct comparison to the external binary erasure channel bound, with no equations or steps in the provided text reducing the target result to its own inputs by construction. The derivation remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Existence of a general converse bound applicable to any channel
Reference graph
Works this paper leans on
-
[1]
Three decades of na nopore sequencing,
D. Deamer, M. Akeson, and D. Branton, “Three decades of na nopore sequencing,” Nature biotechnology, vol. 34, pp. 518–524, 05 2016. 10 10−2 100 6 9 12 FER log2 of code size LO−CVB ι=0.05 AVB ι=0.05 LO−CVB ι=0.2 AVB ι=0.2 LO−CVB ι=0.5 AVB ι=0.5 insertion channel, m=12, n=1 Fig. 3: GA VB for channel I (ι) 12 . 0.7 0.8 0.9 1 1.1 10 20 50 100 200 500 1000 200...
work page 2016
-
[2]
Next-generation dig ital information storage in DNA,
G. M. Church, Y . Gao, and S. Kosuri, “Next-generation dig ital information storage in DNA,” Science, vol. 337, no. 6102, p. 1628, 2012
work page 2012
-
[3]
Towards practical, high-capacity, low-maintenance inf ormation storage in synthesized DNA,
N. Goldman, P . Bertone, S. Chen et al. , “Towards practical, high-capacity, low-maintenance inf ormation storage in synthesized DNA,” Nature, vol. 494, no. 7435, pp. 77–80, Feb 2013
work page 2013
-
[4]
P . L. Antkowiak et al. , “Low cost DNA data storage using photolithographic synthe sis and advanced information reconstruction and error corr ection,” Nature Commun. , vol. 11, no. 1, pp. 1–10, Dec 2020
work page 2020
-
[5]
An end-to-end coding scheme fo r DNA-based data storage with nanopore-sequenced reads,
L. Welter, R. Sokolovskii, T. Heinis, A. Wachter-Zeh, E. Rosnes, and A. Graell i Amat, “An end-to-end coding scheme fo r DNA-based data storage with nanopore-sequenced reads,” 2024. [Online]. Availabl e: https://arxiv.org/abs/2406.12955
-
[6]
Channel coding r ate in the finite blocklength regime,
Y . Polyanskiy, H. V . Poor, and S. V erdu, “Channel coding r ate in the finite blocklength regime,” IEEE Transactions on Information Theory , vol. 56, no. 5, pp. 2307–2359, 2010
work page 2010
-
[7]
J. Piao, K. Niu, J. Dai, and C. Dong, “Approaching the norm al approximation of the finite blocklength capacity within 0 .025 db by short polar codes,” IEEE Wireless Communications Letters , vol. 9, no. 7, pp. 1089–1092, 2020
work page 2020
-
[8]
Fini te blocklength performance bound for the DNA storage channe l,
I. Maarouf, G. Liva, E. Rosnes, and A. Graell i Amat, “Fini te blocklength performance bound for the DNA storage channe l,” in 2023 12th International Symposium on Topics in Coding (ISTC) , 2023, pp. 1–5
work page 2023
-
[9]
C. E. Shannon, Lower Bounds to Error Probability for Coding on Discrete Mem oryless Channels. I . IEEE, 1993, pp. 385–423
work page 1993
-
[10]
T. Erseghe, “On the evaluation of the Polyanskiy-Poor– Verdú converse bound for finite block-length coding in AWGN, ” IEEE Transactions on Information Theory, vol. 61, no. 12, pp. 6578–6590, 2015
work page 2015
-
[11]
The DNA storage channel: C apacity and error probability bounds,
N. Weinberger and N. Merhav, “The DNA storage channel: C apacity and error probability bounds,” IEEE Transactions on Information Theory , vol. 68, no. 9, pp. 5657–5700, 2022
work page 2022
-
[12]
Information rates of the noisy nanopore channel,
B. McBain, E. Viterbo, and J. Saunderson, “Information rates of the noisy nanopore channel,” IEEE Transactions on Information Theory , vol. 70, no. 8, pp. 5640–5652, 2024
work page 2024
-
[13]
Asymptotic Estimates in Information Theory with Non-Vanishing Error Probabilities
V . Y . F. Tan, “Asymptotic estimates in information theo ry with non-vanishing error probabilities,” CoRR, vol. abs/1504.02608, 2015. [Online]. Available: http://arxiv.org/abs/1504.02608
work page internal anchor Pith review Pith/arXiv arXiv 2015
-
[14]
A general formula for channel cap acity,
S. V erdu and T. S. Han, “A general formula for channel cap acity,” IEEE Transactions on Information Theory , vol. 40, no. 4, pp. 1147–1157, 1994
work page 1994
-
[15]
Novel bounds on the capaci ty of the binary deletion channel,
D. Fertonani and T. M. Duman, “Novel bounds on the capaci ty of the binary deletion channel,” IEEE Transactions on Information Theory , vol. 56, no. 6, pp. 2753–2765, 2010
work page 2010
-
[16]
Upper bounds on the capacity of deletion channels using channel fragmentation,
M. Rahmati and T. M. Duman, “Upper bounds on the capacity of deletion channels using channel fragmentation,” IEEE Transactions on Information Theory, vol. 61, no. 1, pp. 146–156, 2015
work page 2015
-
[17]
Bounds on the capacity of random insertion and dele tion-additive noise channels,
——, “Bounds on the capacity of random insertion and dele tion-additive noise channels,” IEEE Transactions on Information Theory , vol. 59, no. 9, pp. 5534–5546, 2013
work page 2013
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.