Some New Results on Sequence Reconstruction Problem for Deletion Channels
Pith reviewed 2026-05-16 15:42 UTC · model grok-4.3
The pith
The maximum intersection size of two deletion balls of radius 4 with centers at least distance 3 apart is exactly 20n-166 for sequence length n at least 13.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors prove that N(n,3,4) equals 20n-166 for every n at least 13. They first construct two centers at distance 3 whose balls of radius 4 intersect in exactly 20n-166 sequences, establishing the lower bound. Then they show that no pair of centers at distance 3 can have a larger intersection by exhaustive enumeration of possible overlapping patterns for n at least 13. This confirms the conjectured value and provides a lower bound expression that is tight only for t=4.
What carries the argument
N(n,d,t), the largest possible intersection size between two balls of radius t in the Levenshtein deletion metric whose centers are at least distance d apart.
If this is right
- The exact number of common neighbors for distance-3 centers in the deletion graph is now known for radius 4.
- Reconstruction algorithms for deletion channels can use this bound to determine the minimum number of traces needed.
- The linear growth rate of 20n is the precise worst-case overlap for these parameters.
- Techniques involving case analysis on deletion patterns may extend to nearby parameters.
Where Pith is reading between the lines
- Similar exact formulas might exist for other small values of d and t beyond those covered here.
- DNA sequencing applications could apply this bound directly when modeling deletion errors.
- The general lower bound for t greater than 4 suggests that overlaps grow roughly linearly with n regardless of t in this regime.
Load-bearing premise
The case analysis and constructions remain valid without missing exceptional cases for every sequence length n at least 13.
What would settle it
An explicit pair of length-n sequences at distance exactly 3 whose common deletion ball of radius 4 contains more than 20n-166 strings, for some n at least 13.
Figures
read the original abstract
Levenshtein first introduced the sequence reconstruction problem in $2001$. In the realm of combinatorics, the sequence reconstruction problem is equivalent to determining the value of $N(n,d,t)$, which represents the maximum size of the intersection of two metric balls of radius $t$, given that the distance between their centers is at least $d$ and the sequence length is $n$. In this paper, We present a lower bound on $N(n,3,t)$ for $n\geq \max\{13,t+8\}$ and $t \geq 4$. For $t=4$, we prove that this lower bound is tight. This settles an open question posed by Pham, Goyal, and Kiah, confirming that $N(n,3,4)=20n-166$ for all $n \geq 13$.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript studies the sequence reconstruction problem equivalent to determining N(n,d,t), the maximum intersection size of two Levenshtein balls of radius t whose centers are at distance at least d. It establishes a general lower bound on N(n,3,t) for n ≥ max{13,t+8} and t ≥ 4, and proves that the bound is tight for t=4, yielding the exact formula N(n,3,4)=20n−166 for all n≥13. This resolves an open question posed by Pham, Goyal, and Kiah.
Significance. If the claimed equality holds, the result supplies the first exact closed-form determination of N(n,3,4) for sufficiently large n. This is a concrete advance in combinatorial coding theory for deletion channels, as the quantity directly governs the worst-case number of common subsequences that must be disambiguated when reconstructing a sequence from two reads at Levenshtein distance 3. The explicit constructions achieving the lower bound and the matching upper-bound argument via case analysis on ball intersections constitute a non-trivial combinatorial contribution.
major comments (2)
- [§4] §4 (upper-bound argument): the claim that N(n,3,4) ≤ 20n−166 for all n≥13 rests on an exhaustive enumeration of all pairs of centers at distance exactly 3 together with the common subsequences they can share inside radius-4 balls. The manuscript must demonstrate that the case division is complete, overlap-free, and free of missed configurations for every n≥13; a single overlooked pair permitting a larger intersection would falsify the exact formula.
- [§3] §3 (lower-bound construction): the explicit constructions achieving 20n−166 are stated to work for n≥13, but the verification that they remain valid without exceptions for arbitrarily large n (including the handling of boundary effects when n grows) is not fully detailed in the visible derivation steps.
minor comments (1)
- [§2] Notation: the definition of the Levenshtein ball and the precise meaning of “common subsequence inside radius t” should be restated once in the main text (rather than only by reference) to improve readability for readers outside the immediate subfield.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for acknowledging the resolution of the open question on N(n,3,4). We address the major comments point by point below and will revise the manuscript to add the requested clarifications.
read point-by-point responses
-
Referee: [§4] §4 (upper-bound argument): the claim that N(n,3,4) ≤ 20n−166 for all n≥13 rests on an exhaustive enumeration of all pairs of centers at distance exactly 3 together with the common subsequences they can share inside radius-4 balls. The manuscript must demonstrate that the case division is complete, overlap-free, and free of missed configurations for every n≥13; a single overlooked pair permitting a larger intersection would falsify the exact formula.
Authors: We agree that explicit verification of completeness is essential. The case analysis in §4 enumerates all possible Levenshtein edit combinations yielding distance exactly 3 and classifies the resulting ball intersections by the positions and types of edits. For n≥13 the patterns are independent of boundary effects. We will add a dedicated subsection (or appendix) that tabulates the cases, proves they are exhaustive and disjoint via exhaustive enumeration of edit position tuples, and confirms no larger intersections exist. revision: yes
-
Referee: [§3] §3 (lower-bound construction): the explicit constructions achieving 20n−166 are stated to work for n≥13, but the verification that they remain valid without exceptions for arbitrarily large n (including the handling of boundary effects when n grows) is not fully detailed in the visible derivation steps.
Authors: The constructions in §3 are defined via fixed-length differing blocks embedded in a periodic background that scales linearly with n. The condition n≥13 ensures the blocks fit without overlap or boundary interference. We will expand the derivation with an inductive argument on n (base case n=13 verified directly, inductive step shows the intersection size increases exactly by 20) together with an explicit check that boundary positions never produce extra common subsequences. revision: yes
Circularity Check
No circularity: exact value obtained via explicit constructions and case analysis
full rationale
The derivation establishes N(n,3,4)=20n-166 through an explicit combinatorial construction achieving the lower bound and a separate exhaustive case analysis proving the matching upper bound for all n≥13. Neither step reduces to a self-definition, fitted parameter renamed as prediction, or load-bearing self-citation; the case division enumerates pairs of centers at distance 3 and their radius-4 intersections directly from the metric definition. The result settles an external open question without internal reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math The deletion distance and the metric-ball intersection size N(n,d,t) are well-defined for all positive integers n, d, t.
Reference graph
Works this paper leans on
-
[1]
Next-generation digital information storage in DNA,
G. M. Church, Y . Gao, and S. Kosuri, “Next-generation digital information storage in DNA,”Science, vol. 337, no. 6102, pp. 1628–1628, 2012
work page 2012
-
[2]
DNA-based storage: Trends and methods,
S. Yazdi, H.M. Kiah, E.R. Garcia, J. Ma, H. Zhao, and O. Milenkovic, “DNA-based storage: Trends and methods,”IEEE Trans. Molecular, Biological, Multi-Scale Commun., vol. 1, no. 3, pp. 230–248, 2015
work page 2015
-
[3]
Coding over sets for DNA storage,
A. Lenz, P. H. Siegel, A. Wachter-Zeh, and E. Yaakobi E, “Coding over sets for DNA storage,”IEEE Trans. Inf. Theory, vol. 66, no. 4, pp. 2331–2351, 2020
work page 2020
-
[4]
Magnetic domain-wall racetrack memory,
S. S. Parkin, M. Hayashi, and L. Thomas, “Magnetic domain-wall racetrack memory,”Science, vol. 320, pp. 190–194, 2008
work page 2008
-
[5]
Coding for racetrack memories,
Y . M. Chee, H. M. Kiah, A. Vardy, E. Yaakobi, and V . K. Vu, “Coding for racetrack memories,”IEEE Trans. Inf. Theory, vol. 64, no. 11, pp. 7094–7112, 2018
work page 2018
-
[6]
Reconstruction of permutations distorted by single reversal errors,
E. Konstantinova, “Reconstruction of permutations distorted by single reversal errors,”Discrete Applied Mathematics, vol. 155, pp. 2426–2434, 2007
work page 2007
-
[7]
Efficient reconstruction of sequences,
V . I. Levenshtein, “Efficient reconstruction of sequences,”IEEE Trans. Inf. Theory, vol. 47, no. 1, pp. 2–22, 2001
work page 2001
-
[8]
Efficient reconstruction of sequences from their subsequences or supersequences,
V . I. Levenshtein, “Efficient reconstruction of sequences from their subsequences or supersequences,”Journal of Combin. Theory, Ser. A, vol. 93, no.2, pp. 310–332, 2001
work page 2001
-
[9]
Reconstruction of a graph from2-vicinities of its vertices,
V . I. Levenshtein, E. Konstantinova, E. Konstantinov, and S. Molodtsov, “Reconstruction of a graph from2-vicinities of its vertices,”Discrete Applied Mathematics, vol. 156, pp. 1399–1406, 2008
work page 2008
-
[10]
Error graphs and the reconstruction of elements in groups,
V . I. Levenshtein and J. Siemons, “Error graphs and the reconstruction of elements in groups,”Journal of Combin. Theory, Ser. A, vol. 116, pp. 795–815, 2009
work page 2009
-
[11]
Exact reconstruction from insertions in synchronization codes,
F. Sala, R. Gabrys, C. Schoeny, and L. Dolecek, “Exact reconstruction from insertions in synchronization codes,”IEEE Trans. Inf. Theory, vol. 63, no. 4, pp. 2428–2445, 2017
work page 2017
-
[12]
Sequence reconstruction over the deletion channel,
R. Gabrys and E. Yaakobi, “Sequence reconstruction over the deletion channel,”IEEE Trans. Inf. Theory, vol. 64, no. 4, pp. 2924–2931, 2018
work page 2018
-
[13]
Sequence reconstruction problem for deletion channels: a complete asymptotic solution,
V . L. P. Pham, K. Goyal, and H. M. Kiah, “Sequence reconstruction problem for deletion channels: a complete asymptotic solution,” InProc. IEEE Int. Symp. Inform. Theory, pp. 992–997, 2022. 32
work page 2022
-
[14]
Sequence reconstruction problem for deletion channels: a complete asymptotic solution,
V . L. P. Pham, K. Goyal, and H. M. Kiah, “Sequence reconstruction problem for deletion channels: a complete asymptotic solution,”Journal of Combin. Theory, Ser. A, https://doi.org/10.1016/j.jcta.2024.105980, 2025
-
[15]
Sequence reconstruction for Grassmann graphs and permutations,
E. Yaakobi, M. Schwartz, M. Langberg, and J. Bruck, “Sequence reconstruction for Grassmann graphs and permutations,” InProc. Int. Symp. Inform. Theory, pp. 874–878, 2013
work page 2013
-
[16]
Reconstruction of permutations distorted by single Kendallτ-errors,
X. Wang, “Reconstruction of permutations distorted by single Kendallτ-errors,”Cryptogr. Commun., vol. 15, pp. 131–144, 2023
work page 2023
-
[17]
The sequence reconstruction problem for permutations with the Hamming distance,
X. Wang and E. V . Konstantinova, “The sequence reconstruction problem for permutations with the Hamming distance,”Cryptogr. Commun., vol. 16, pp. 1033–1057, 2024
work page 2024
-
[18]
The sequence reconstruction problem for permutations with the Hamming distance,
X. Wang, F.-W. Fu, and E. V . Konstantinova, “The sequence reconstruction problem for permutations with the Hamming distance,”Des. Codes Cryptogr., vol. 93, no. 1, pp. 11–37, 2025
work page 2025
-
[19]
On the computation of Levenshtein’s distances,
L. Calabi, “On the computation of Levenshtein’s distances,”TN-9-0030, Parke Math. Labs., Inc., Carlisle, MA., 1967
work page 1967
-
[20]
Correcting deletions with multiple reads,
J. Chrisnata, H. M. Kiah, and E. Yaakobi, “Correcting deletions with multiple reads,”IEEE Trans. Inf. Theory, vol. 68, no. 11, pp. 7141–7158, 2022
work page 2022
-
[21]
Sequence reconstruction under channels with multiple bursts of insertions or deletions,
Z. Lan, Y . Sun, W. Yu, and G. Ge, “Sequence reconstruction under channels with multiple bursts of insertions or deletions,”IEEE Trans. Inf. Theory, vol. 72, no. 1, pp. 315-330, 2026
work page 2026
-
[22]
Sequence reconstruction under single-burst insertion/deletion/edit channel,
Y . Sun, Y . Xi, and G. Ge, “Sequence reconstruction under single-burst insertion/deletion/edit channel,”IEEE Trans. Inf. Theory, vol. 69, no. 7, pp. 4466-4483, 2023
work page 2023
-
[23]
Sequence reconstruction over 3-deletion Channels,
D. Zhang, G. Ge, and Y . Zhang, “Sequence reconstruction over 3-deletion Channels,” InProc. IEEE Int. Symp. Inf. Theory, pp. 891-896, 2024
work page 2024
-
[24]
Sequence reconstruction for the single-deletion single-substitution channel,
W. Song, K. Cai, and T. Q. S. Quek, “Sequence reconstruction for the single-deletion single-substitution channel,”IEEE Jou. on Sel. Areas Inf. Theory, vol. 6, pp. 232-247, 2025
work page 2025
-
[25]
Coding for sequence reconstruction for single edits,
K. Cai, H. M. Kiah, T. T. Nguyen, and E. Yaakobi, “Coding for sequence reconstruction for single edits,”IEEE Trans. Inf. Theory, vol. 68, no. 1, pp. 66-79, 2022
work page 2022
-
[26]
Correcting two-deletion with a constant number of reads,
Y . Sun and G. Ge, “Correcting two-deletion with a constant number of reads,”IEEE Trans. Inf. Theory, vol. 69, no. 5, pp. 2969-2982, 2023
work page 2023
-
[27]
Bounds and constructions ofℓ-read codes under the Hamming metric,
Y . Sun and G. Ge, “Bounds and constructions ofℓ-read codes under the Hamming metric,”IEEE Trans. Inf. Theory, vol. 71, no. 8, pp. 5868-5883, 2025
work page 2025
-
[28]
Codes for correcting a burst of edits using weighted-summation VT sketch,
Y . Sun and G. Ge, “Codes for correcting a burst of edits using weighted-summation VT sketch,”IEEE Trans. Inf. Theory, vol. 71, no. 3, pp. 1631-1646, 2025
work page 2025
-
[29]
Reconstruction of sequences distorted by two insertions,
Z. Ye, X. Liu, X. Zhang, and G. Ge, “Reconstruction of sequences distorted by two insertions,”IEEE Trans. Inf. Theory, vol. 69, no. 8, pp. 4977-4992, 2023
work page 2023
-
[30]
Balanced reconstruction codes for single edits,
R. Wu and X. Zhang, “Balanced reconstruction codes for single edits,”Des. Codes Cryptogr.,vol. 92, pp. 2011-2029, 2024
work page 2011
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.