pith. sign in

arxiv: 2504.14087 · v2 · pith:ZX4P7HREnew · submitted 2025-04-18 · 💻 cs.IT · math.IT

Channels with Input-Correlated Synchronization Errors

Pith reviewed 2026-05-22 19:02 UTC · model grok-4.3

classification 💻 cs.IT math.IT
keywords synchronization errorsinput-correlated channelschannel capacitycoding capacityDNA data storagedeletion channelsmulti-trace channelsstationary ergodic inputs
0
0 comments X

The pith

Under identified conditions on input-correlated synchronization channels, information capacity equals coding capacity and is achieved by stationary ergodic inputs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies channels in which synchronization errors such as deletions or insertions applied to each symbol can depend on the entire input string rather than occurring independently. It identifies conditions under which the information capacity of such a channel is achieved by a stationary ergodic input source and equals the coding capacity. These conditions are claimed to hold for a wide class of channels that includes error patterns observed in DNA-based data storage and their multi-trace versions, generalizing earlier results on synchronization channels. The general theorem is combined with existing coding techniques to produce explicit capacity-achieving codes for multi-trace channels with runlength-dependent deletions.

Core claim

Under identified conditions on the input-correlated synchronization channel, the channel's information capacity is achieved by a stationary ergodic input source and is equal to its coding capacity. This generalizes prior work and yields explicit capacity-achieving codes for multi-trace runlength-dependent deletion channels when combined with techniques from earlier papers.

What carries the argument

The conditions on the input-correlated synchronization channel that make information capacity equal to coding capacity and achievable by stationary ergodic sources.

If this is right

  • The result extends earlier capacity theorems from independent-error synchronization channels to the input-correlated case.
  • It applies to channels modeling correlated errors in DNA-based data storage systems and their multi-trace versions.
  • Explicit capacity-achieving codes become available for multi-trace channels with runlength-dependent deletions.
  • The theorem supplies a modular step that can be paired with other coding constructions for related deletion channels.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the same conditions turn out to hold for other practical storage or transmission media with input-dependent errors, the capacity equality would carry over.
  • The modular use of the general theorem with existing coding methods suggests a template for deriving explicit codes in additional correlated-error settings.
  • Empirical measurement of error statistics from real DNA sequencing runs could test whether the conditions are satisfied in deployed systems.

Load-bearing premise

The central capacity result assumes the existence of particular conditions on the input-correlated synchronization channel that the paper states are satisfied by a wide class including DNA-storage error patterns.

What would settle it

Direct verification of whether the stated conditions hold for the multi-trace runlength-dependent deletion channels; if the conditions fail, the claimed equality between information and coding capacity need not follow.

Figures

Figures reproduced from arXiv: 2504.14087 by Jo\~ao Ribeiro, Roni Con.

Figure 1
Figure 1. Figure 1: Lower bounds on the capacity of the BDC-Thr [PITH_FULL_IMAGE:figures/full_fig_p051_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Lower bounds on the capacity of the BDC-Thr [PITH_FULL_IMAGE:figures/full_fig_p052_2.png] view at source ↗
read the original abstract

"Independent and identically distributed" errors do not accurately capture the noisy behavior of real-world data storage and information transmission technologies. Motivated by this, we study channels with input-correlated synchronization errors, meaning that the distribution of synchronization errors (such as deletions and insertions) applied to the $i$-th input $x_i$ may depend on the whole input string $x$. We begin by identifying conditions on the input-correlated synchronization channel under which the channel's information capacity is achieved by a stationary ergodic input source and is equal to its coding capacity. These conditions capture a wide class of channels, including channels with correlated errors observed in DNA-based data storage systems and their multi-trace versions, and generalize prior work. To showcase the usefulness of the general capacity theorem above, we combine it with techniques of Pernice-Li-Wootters (ISIT 2022) and Brakensiek-Li-Spang (FOCS 2020) to obtain explicit capacity-achieving codes for multi-trace channels with runlength-dependent deletions, motivated by error patterns observed in DNA-based data storage systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper studies channels with input-correlated synchronization errors, where the distribution of deletions/insertions on the i-th symbol may depend on the entire input string x. It identifies technical conditions on such channels under which the information capacity is achieved by a stationary ergodic input process and equals the coding capacity; these conditions are claimed to hold for a broad class including DNA-storage error patterns and their multi-trace versions, generalizing prior independent results. The paper then combines the capacity theorem with techniques from Pernice-Li-Wootters and Brakensiek-Li-Spang to construct explicit capacity-achieving codes for multi-trace runlength-dependent deletion channels.

Significance. If the stated conditions are rigorously verified for the target models, the result supplies a useful generalization of capacity theorems for synchronization channels and yields the first explicit constructions for a practically motivated multi-trace deletion model. The combination of the general theorem with existing code-construction machinery is a clear strength when the conditions apply.

major comments (1)
  1. [Application section] Application section (following the general capacity theorem): the manuscript invokes the general theorem for the multi-trace runlength-dependent deletion channel after describing the error pattern, but does not supply an intermediate lemma or explicit verification that each listed condition (ergodicity/mixing of the joint input-error process, etc.) holds under the runlength-dependent deletion probability law and the multi-trace observation structure. Without this check, the equality of information and coding capacity and the subsequent explicit-code claim do not follow for the concrete model.
minor comments (1)
  1. Clarify the precise statement of the technical conditions in the general theorem (e.g., which mixing or ergodicity properties are required) so that readers can directly check them against new channel models.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and for identifying the need for explicit verification in the application section. We address the major comment below and will revise the manuscript accordingly.

read point-by-point responses
  1. Referee: [Application section] Application section (following the general capacity theorem): the manuscript invokes the general theorem for the multi-trace runlength-dependent deletion channel after describing the error pattern, but does not supply an intermediate lemma or explicit verification that each listed condition (ergodicity/mixing of the joint input-error process, etc.) holds under the runlength-dependent deletion probability law and the multi-trace observation structure. Without this check, the equality of information and coding capacity and the subsequent explicit-code claim do not follow for the concrete model.

    Authors: We agree that the application section would benefit from an explicit intermediate verification. In the revised manuscript we will insert a new lemma (Lemma 5.1) immediately after the channel model definition. The lemma will confirm that the runlength-dependent deletion process (with finite memory) combined with the multi-trace observation structure yields a stationary ergodic joint input-error process, satisfies the required mixing and continuity conditions of the general theorem, and preserves the equality between information and coding capacity. The argument relies on the Markovian structure of the runlength dependence (which ensures ergodicity by standard results for functions of Markov chains) and the independence of traces. This addition will make the subsequent invocation of the general capacity theorem and the explicit-code construction fully rigorous for the concrete model. revision: yes

Circularity Check

0 steps flagged

No significant circularity; theorem generalizes independent prior results using external techniques.

full rationale

The paper identifies conditions under which information capacity equals coding capacity for input-correlated synchronization channels, states that these conditions cover DNA-storage models including multi-trace runlength-dependent deletions, and generalizes prior work. It then combines the resulting theorem with techniques from Pernice-Li-Wootters (ISIT 2022) and Brakensiek-Li-Spang (FOCS 2020) to construct explicit codes. No equations or steps reduce a claimed prediction or capacity result to a fitted parameter, self-definition, or unverified self-citation chain; the derivation remains self-contained against the stated external benchmarks and cited independent results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work relies on standard information-theoretic background (stationary ergodic processes, channel capacity definitions) plus the domain-specific assumption that the target channels satisfy the (unspecified) conditions needed for the capacity theorem.

axioms (1)
  • domain assumption The input-correlated synchronization channel satisfies conditions under which stationary ergodic inputs achieve capacity and information capacity equals coding capacity.
    This is the load-bearing premise invoked to obtain the general capacity theorem and its application to DNA-storage channels.

pith-pipeline@v0.9.0 · 5713 in / 1318 out tokens · 52593 ms · 2026-05-22T19:02:12.125567+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · 1 internal anchor

  1. [1]

    Stronger polarization for the d eletion channel

    [AT23] Dar Arava and Ido Tal. Stronger polarization for the d eletion channel. In 2023 IEEE International Symposium on Information Theory (ISIT) , pages 1711–1716. IEEE,

  2. [2]

    [BLS20] Joshua Brakensiek, Ray Li, and Bruce Spang

    As- sociation for Computing Machinery. [BLS20] Joshua Brakensiek, Ray Li, and Bruce Spang. Coded trace reconstruction in a constant number of traces. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), pages 482–493,

  3. [3]

    Servedio, and Sa ndip Sinha

    [CDL+22] Xi Chen, Anindya De, Chin Ho Lee, Rocco A. Servedio, and Sa ndip Sinha. Near- optimal average-case approximate trace reconstruction fr om few traces. In Proceedings of the 2022 Annual ACM-SIAM Symposium on Discrete Algorithm s (SODA) , pages 779–821. SIAM,

  4. [4]

    Deletion codes in the high-noise and high- rate regimes

    [GW17] Venkatesan Guruswami and Carol Wang. Deletion codes in the high-noise and high- rate regimes. IEEE Transactions on Information Theory , 63(4):1961–1970,

  5. [5]

    [MD24] Ruslan Morozov and Tolga M. Duman. On the capacity of c hannels with Markov insertions, deletions and substitutions. In 2024 IEEE International Symposium on Information Theory (ISIT) , pages 3444–3449,

  6. [6]

    Channels with Markov Synchronization Errors: Information Stability and Capacity Bounds

    https://arxiv.org/abs/2401.16063. [MDK18] Wei Mao, Suhas N. Diggavi, and Sreeram Kannan. Model s and information- theoretic bounds for nanopore sequencing. IEEE Transactions on Information Theory , 64(4):3216–3236,

  7. [7]

    On noisy duplication channels with Markov sources

    54 [MSV24] Brendon McBain, James Saunderson, and Emanuele Viter bo. On noisy duplication channels with Markov sources. In 2024 IEEE International Symposium on Information Theory (ISIT) , pages 3438–3443,

  8. [8]

    Efficie nt capacity-achieving codes for general repeat channels

    [PL W22] Francisco Pernice, Ray Li, and Mary Wootters. Efficie nt capacity-achieving codes for general repeat channels. In 2022 IEEE International Symposium on Information Theory (ISIT) , pages 3097–3102,

  9. [9]

    Pfister and Ido Tal

    [PT21] Henry D. Pfister and Ido Tal. Polar codes for channels w ith insertions, deletions, and substitutions. In 2021 IEEE International Symposium on Information Theory (I SIT), pages 2554–2559,

  10. [10]

    Explicit and efficient constructi on of nearly optimal rate codes for the binary deletion channel and the Poisson repeat channel

    [Rub22] Ittai Rubinstein. Explicit and efficient constructi on of nearly optimal rate codes for the binary deletion channel and the Poisson repeat channel. In 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022), pages 105:1– 105:17. Schloss Dagstuhl – Leibniz-Zentrum für Informatik ,

  11. [11]

    A verage-case to (shifted) worst -case reduction for the trace recon- struction problem

    [Rub23] Ittai Rubinstein. A verage-case to (shifted) worst -case reduction for the trace recon- struction problem. In Kousha Etessami, Uriel Feige, and Gab riele Puppis, editors, 50th International Colloquium on Automata, Languages, and Programming (ICALP 2023), volume 261 of Leibniz International Proceedings in Informatics (LIPIcs ), pages 102:1–102:20, ...

  12. [12]

    55 [SGPY21] Sundara Rajan Srinivasavaradhan, Sivakanth Gopi , Henry D

    Schloss Dagstuhl – Leibniz-Zentrum für In- formatik. 55 [SGPY21] Sundara Rajan Srinivasavaradhan, Sivakanth Gopi , Henry D. Pfister, and Sergey Yekhanin. Trellis BMA: Coded trace reconstruction on IDS cha nnels for DNA stor- age. In 2021 IEEE International Symposium on Information Theory (I SIT), pages 2453–2458,