Sequence Reconstruction for Sticky Insertion/Deletion Channels
Pith reviewed 2026-05-22 18:12 UTC · model grok-4.3
The pith
A recursive formula determines the fewest distinct outputs needed to uniquely recover messages sent through channels with bounded sticky insertions and deletions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For the (t, s)-sticky-insdel channel, the minimum number of distinct outputs required to uniquely reconstruct the transmitted vector is given by a recursive formula. An efficient algorithm reconstructs the transmitted vector from a set of erroneous sequences once their count meets or exceeds this number.
What carries the argument
The recursive formula that computes the minimum number of distinct traces sufficient for unique reconstruction in the (t, s)-sticky-insdel model.
If this is right
- The minimal number of distinct outputs for any fixed t and s can be calculated directly from the recursion.
- Reconstruction becomes possible in polynomial time once the required number of outputs is supplied.
- The sticky-insertion-only case reduces to a known tandem-duplication reconstruction problem.
- Unique recovery is guaranteed precisely when the number of distinct outputs reaches the recursive threshold.
Where Pith is reading between the lines
- Storage systems that read multiple copies of the same data block could use the formula to decide how many reads suffice for reliable recovery.
- The same recursive counting technique might apply to other channels whose errors are confined to runs of repeated symbols.
- An implementation of the reconstruction algorithm could be tested on synthetic traces generated from the exact (t, s) model to measure runtime scaling.
Load-bearing premise
The channel produces only runs of at most t identical-symbol insertions and s identical-symbol deletions, and the distinct outputs behave exactly as this bounded process predicts.
What would settle it
A concrete counter-example in which the number of distinct outputs equals the value given by the recursion yet two different input vectors remain consistent with every output.
read the original abstract
The sequence reconstruction problem for insertion/deletion channels has attracted significant attention owing to their applications recently in some emerging data storage systems, such as racetrack memories, DNA-based data storage. Our goal is to investigate the reconstruction problem for sticky-insdel channels where both sticky-insertions and sticky-deletions occur. If there are only sticky-insertion errors, the reconstruction problem for sticky-insertion channel is a special case of the reconstruction problem for tandem-duplication channel which has been well-studied. In this work, we consider the $(t, s)$-sticky-insdel channel where there are at most $t$ sticky-insertion errors and $s$ sticky-deletion errors when we transmit a message through the channel. For the reconstruction problem, we are interested in the minimum number of distinct outputs from these channels that are needed to uniquely recover the transmitted vector. We first provide a recursive formula to determine the minimum number of distinct outputs required. Next, we provide an efficient algorithm to reconstruct the transmitted vector from erroneous sequences.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies the sequence reconstruction problem over the (t,s)-sticky-insdel channel, in which a transmitted vector experiences at most t sticky insertions and s sticky deletions. It claims to derive a recursive formula that gives the minimum number of distinct channel outputs sufficient for unique recovery of the input and to supply an efficient algorithm that performs the reconstruction from those outputs. Sticky insertions are positioned as a special case of tandem-duplication channels.
Significance. A correct recursive characterization together with a polynomial-time reconstruction procedure would be useful for DNA-based storage and racetrack memory applications, where sticky errors arise naturally. The combination of an exact counting formula and an explicit algorithm is a constructive strength; however, the practical value hinges on whether the recursion fully enumerates worst-case distinguishable error patterns.
major comments (2)
- [Problem setup and recursive formula] The central claim that the recursive formula yields the exact minimum number of outputs rests on the assumption that all interactions between sticky insertions and deletions (especially on runs of identical symbols) are captured. The abstract and problem formulation do not exhibit an explicit accounting for sequential or overlapping errors on the same run; if such cases are undercounted, the stated threshold would be insufficient to guarantee uniqueness.
- [Reconstruction algorithm] The reconstruction algorithm is asserted to be efficient, yet no complexity analysis or correctness argument relative to the recursive count is supplied in the visible sections. This is load-bearing because the algorithm must succeed precisely when the number of outputs meets the recursive threshold.
minor comments (2)
- [Abstract] Notation for the (t,s)-sticky-insdel channel should be introduced once and used consistently; the current abstract mixes “sticky-insertion errors” and “tandem-duplication” without a clear mapping.
- [Main results] A small example (e.g., t=1, s=1, short binary vector) illustrating both the recursive count and the algorithm output would improve readability.
Simulated Author's Rebuttal
We thank the referee for the careful reading of the manuscript and the constructive comments. We respond to each major comment below and outline the revisions that will be incorporated in the next version.
read point-by-point responses
-
Referee: [Problem setup and recursive formula] The central claim that the recursive formula yields the exact minimum number of outputs rests on the assumption that all interactions between sticky insertions and deletions (especially on runs of identical symbols) are captured. The abstract and problem formulation do not exhibit an explicit accounting for sequential or overlapping errors on the same run; if such cases are undercounted, the stated threshold would be insufficient to guarantee uniqueness.
Authors: We thank the referee for highlighting this point. The recursive formula is derived by enumerating error patterns that explicitly include sequential and overlapping sticky insertions and deletions acting on the same run of identical symbols; the recursion tracks changes to run lengths after each error type. Nevertheless, we agree that the initial problem formulation section does not make this accounting sufficiently explicit. In the revised manuscript we will expand the problem setup with a dedicated paragraph and two short examples that illustrate how the recursion handles overlapping errors on a single run, thereby confirming that the threshold accounts for all such interactions. revision: yes
-
Referee: [Reconstruction algorithm] The reconstruction algorithm is asserted to be efficient, yet no complexity analysis or correctness argument relative to the recursive count is supplied in the visible sections. This is load-bearing because the algorithm must succeed precisely when the number of outputs meets the recursive threshold.
Authors: We agree that an explicit correctness argument tied to the recursive threshold and a formal complexity analysis are required. The current manuscript describes the algorithm and states that it is efficient, but does not supply the supporting proofs. We will add a new subsection that (i) proves by induction on the number of outputs that the algorithm recovers the unique input whenever the number of distinct outputs meets or exceeds the value given by the recursion, and (ii) establishes that the running time is polynomial in the sequence length and the number of outputs. revision: yes
Circularity Check
Derivation is self-contained; recursive formula and algorithm introduce independent content
full rationale
The paper states a recursive formula for the minimum number of distinct outputs needed to uniquely recover the transmitted vector under the (t,s)-sticky-insdel channel and supplies an efficient reconstruction algorithm. The observation that sticky-insertion errors form a special case of tandem-duplication channels is presented as a reduction to previously studied results rather than a self-referential definition or fitted input. No equations or steps in the provided description reduce the claimed minimum or the algorithm to the inputs by construction, nor do they rely on load-bearing self-citations whose validity is assumed without external verification. The central claims therefore remain independent of the inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Errors consist of at most t sticky insertions and s sticky deletions.
Reference graph
Works this paper leans on
-
[1]
Binary codes capable of correcting spu rious insertions and deletion of ones,
V . Levenshtein, “Binary codes capable of correcting spu rious insertions and deletion of ones,” Problems of Information Transmission, vol.. 1, no.. 1, pp. 8–17, 1965
work page 1965
-
[2]
Capacity Bounds for Sticky Channels,
M. Mitzenmacher “Capacity Bounds for Sticky Channels,” IEEE Transactions on Information Theory , vol. 54, no. 1, pp. 72–77, 2008
work page 2008
-
[3]
Sharp analytical capacit y upper bounds for sticky and related channels,
M. Cheraghchi and J. Ribeiro, “Sharp analytical capacit y upper bounds for sticky and related channels,” IEEE Transaction on Information Theory, vol. 65, no. 11, pp. 6950–6974, 2019
work page 2019
-
[4]
Repetition error correct ing sets: Ex- plicit constructions and prefixing methods,
L. Dolecek and V . Anantharam, “Repetition error correct ing sets: Ex- plicit constructions and prefixing methods,” SIAM Journal on Discrete Mathematics, vol. 23, no. 4, pp. 2120–2146, 2010
work page 2010
-
[5]
H. Mahdavifar, A. V ardy “Asymptotically Optimal Sticky -Insertion-Correcting Codes with Efficient Encoding and De coding“ 2017 IEEE International Symposium on Information Theory (ISIT) , pp. 2683–2687, 2017
work page 2017
-
[6]
On a new class of error control c odes and symmetric functions,
L. G. Tallini and B. Bose, “On a new class of error control c odes and symmetric functions,” in IEEE International Symposium on Information Theory, pp. 980–984, 2008
work page 2008
-
[7]
On efficient repet ition error correcting codes,
L. G. Tallini, N. Elarief, and B. Bose, “On efficient repet ition error correcting codes,” in IEEE International Symposium on Information Theory, pp. 1012–1016, 2010
work page 2010
-
[8]
C oding for racetrack memories,
Y . M. Chee, H. M. Kiah, A. V ardy, V . K. Vu, and E. Y aakobi, “C oding for racetrack memories,” IEEE Transactions on Information Theory, vol. 64, no. 11, pp. 7094–7112, 2018
work page 2018
-
[9]
C odes correcting limited-shift errors in racetrack memorie s,
Y . M. Chee, H. M. Kiah, A. V ardy, V . K. Vu, and E. Y aakobi, “C odes correcting limited-shift errors in racetrack memorie s,” Proc. IEEE International Symposium on Information Theory, pp. 96–100, 2018
work page 2018
-
[10]
Funda mental bounds for sequence reconstruction from nanopore se quencers,
A. Magner, J. Duda, W. Szpankowski, and A. Grama, “Funda mental bounds for sequence reconstruction from nanopore se quencers,” IEEE Trans Mol Biol Multiscale Commun, vol. 2, no. 1, pp. 92–106, 2016
work page 2016
-
[11]
D uplication- correcting codes for data storage in the DNA of l iving organisms,
S. Jain, F. F. Hassanzadeh, M. Schwartz, and J. Bruck, “D uplication- correcting codes for data storage in the DNA of l iving organisms,” IEEE Transactions on Information Theory, vol. 63, no. 8, pp. 4996–5010, 2017
work page 2017
-
[12]
Asymptotically optimal cod es correcting fixed-length duplication errors in DNA storag e systems,
M. Kovacevic and V . Y . Tan, “Asymptotically optimal cod es correcting fixed-length duplication errors in DNA storag e systems,” IEEE Communications Letters, vol. 22, no. 11, pp. 2194–2197, 2018. 9
work page 2018
-
[13]
Efficient reconstruction of sequen ces,
V . I. Levenshtein, “Efficient reconstruction of sequen ces,” IEEE Transaction on Information Theory, vol. 47, no. 1, pp. 2–22, 2001
work page 2001
-
[14]
Efficient reconstruction of sequen ces from their sub-sequences or super-sequences,
V . I. Levenshtein, “Efficient reconstruction of sequen ces from their sub-sequences or super-sequences,” Journal of Combinatorial Theory, Series A, vol. 93, pp. 310–332, 2001
work page 2001
-
[15]
Next-generation di gital information storage in DNA,
G. M. Church, Y . Gao, and S. Kosuri, “Next-generation di gital information storage in DNA,” Science, vol. 337, no. 6102, pp. 1628, 2012
work page 2012
-
[16]
DNA-based storage: Trends and methods,
S. Y azdi, H. M. Kiah, E. R. Garcia, J. Ma, H. Zhao, and O. Mi lenkovic., “DNA-based storage: Trends and methods,” IEEE Trans. Molecular , Biological, Multi-Scale Commun., vol. 1, no. 3, pp. 230–248, 2015
work page 2015
-
[17]
Reconstruction from seletions in racetrack memories,
Y . M. Chee, R. Gabrys, A. V ardy, V . K. Vu, and E. Y aakobi, “ Reconstruction from seletions in racetrack memories,” in IEEE Information Theory W orkshop,2018
work page 2018
-
[18]
Correcting deletions in multiple -heads racetrack memories
J. Sima and J. Bruck, “Correcting deletions in multiple -heads racetrack memories”, in IEEE International Symposium on Information Theory, 2019
work page 2019
-
[19]
Sequence reconstruction ov er the deletion channel,
R. Gabrys, and E. Y aakobi, “Sequence reconstruction ov er the deletion channel,” IEEE Trans. on Information Theory, vol. 64, no. 4, pp. 2924–2931, 2018
work page 2018
-
[20]
Levenshtein’s reconstru ction problem under insertions, deletions, and substituti ons,
M. Abu-Sini, and E. Y aakobi, “Levenshtein’s reconstru ction problem under insertions, deletions, and substituti ons,” IEEE Transaction on Information Theory, vol. 67, no. 11, pp. 7132–7158, 2021
work page 2021
-
[21]
Optimal reconstr uction codes for deletion channels,
J. Chrisnata, H. M. Kiah, E. Y aakobi, “Optimal reconstr uction codes for deletion channels,” in Proc. International Symposium on Information Theory and its Applications (ISITA), pp. 279–283, 2020
work page 2020
-
[22]
Sequence reconst ruction problem for deletion channels: a complete asymptot ic solution,
V . L. P . Pham, K. Goyal, and H. M. Kiah, “Sequence reconst ruction problem for deletion channels: a complete asymptot ic solution,” Proc. International Symposium on Information Theory, pp. 992–997, 2022
work page 2022
-
[23]
Sequence reconstruction for li mited-magnitude errors,
H. Wei and M. Schwartz, “Sequence reconstruction for li mited-magnitude errors,” IEEE Transaction on Information Theory, vol. 68, no. 7, pp. 4422–4434, 2022
work page 2022
-
[24]
Reconstruction codes for DNA sequences with uniform tandem-duplication errors,
Y . Y ehezkeally and M. Schwartz, “Reconstruction codes for DNA sequences with uniform tandem-duplication errors, ” IEEE Transaction on Information Theory, vol. 66, no. 5, pp. 2658–2668, 2019. 10 Algorithm 1: Check validity of inputs y1, . . . ,yM . Input : y1, . . . ,yM Output: x or FAILURE 1 formation1← [y1,1] 2 pos← 1 3 for j = 2 to n do 4 if y1,j̸= y1...
work page 2019
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.