pith. sign in

arxiv: 2604.08234 · v1 · submitted 2026-04-09 · 💻 cs.IT · math.IT

On the Capacity of Sequences of Coloring Channels

Pith reviewed 2026-05-10 17:22 UTC · model grok-4.3

classification 💻 cs.IT math.IT
keywords coloring channelssequence capacitypairs graphsunflower sequencesintersecting setspath sequencessequence reconstructionmolecular storage
0
0 comments X

The pith

The capacity of sequences of coloring channels depends solely on a pairs graph built from the allowed letter sets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper studies sequences of coloring channels, each defined by a subset of letters that pass through while others are deleted, which model multiple observations of the same transmitted sequence in applications like protein identification. The authors compute exact capacities for uniform sunflower sequences, sequences from two arbitrary intersecting sets, and path sequences. They establish that these capacities are completely determined by a pairs graph whose vertices are alphabet letters and whose edges connect letters that co-occur in at least one allowed set. If correct, this equivalence reduces the information-theoretic rate calculation to a graph property, enabling exact results for alphabet size 4 in all cases except a 4-cycle, where bounds are given instead.

Core claim

We provide exact capacities of several sequences of coloring channels: uniform sunflowers, two arbitrary intersecting sets, and paths. We also show how this capacity depends solely on a related graph we define, called the pairs graph. Using this equivalence, we prove lower and upper bounds on the capacity, and a tailored bound for a coloring-channel sequence forming a cycle. In particular, for an alphabet of size 4, these results give the exact capacity of all coloring-channel sequences except for a cycle of length 4, for which we only provide bounds.

What carries the argument

The pairs graph, with letters as vertices and edges between any pair that appears together inside at least one allowed set of the sequence; it carries the argument by proving that channel capacity equals a function of this graph alone.

If this is right

  • Exact capacities are now available for all uniform sunflower sequences via the pairs graph.
  • Exact capacities are now available for all path sequences of coloring channels via the pairs graph.
  • Exact capacities are now available for sequences formed by any two intersecting allowed sets via the pairs graph.
  • For any 4-letter alphabet the only remaining open coloring-channel sequence is the 4-cycle, for which the pairs graph supplies tight bounds but not necessarily the exact value.
  • Lower and upper bounds derived from the pairs graph apply to arbitrary cycle sequences beyond length 4.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The pairs-graph reduction might allow exact capacity formulas for longer cycles or other regular graphs once the cycle case is settled.
  • This graph equivalence could be tested on non-coloring sequence-reconstruction models that also involve multiple partial observations of the same string.
  • If the pairs graph fully governs capacity in general, then capacity computation for large alphabets reduces to enumerating edges rather than optimizing over all possible channel sequences.

Load-bearing premise

The capacity of any sequence of coloring channels is completely determined by the pairs graph constructed from the allowed letter sets.

What would settle it

Directly compute the capacity for the 4-cycle sequence on a 4-letter alphabet by enumerating typical sequences or using linear programming and verify whether the value lies strictly inside the paper's derived bounds or exactly matches one of them.

read the original abstract

A single coloring channel is defined by a subset of letters it allows to pass through, while deleting all others. A sequence of coloring channels provides multiple views of the same transmitted letter sequence, forming a type of sequence-reconstruction problem useful for protein identification and information storage at the molecular level. We provide exact capacities of several sequences of coloring channels: uniform sunflowers, two arbitrary intersecting sets, and paths. We also show how this capacity depends solely on a related graph we define, called the pairs graph. Using this equivalence, we prove lower and upper bounds on the capacity, and a tailored bound for a coloring-channel sequence forming a cycle. In particular, for an alphabet of size $4$, these results give the exact capacity of all coloring-channel sequences except for a cycle of length $4$, for which we only provide bounds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper defines a coloring channel by a subset of the alphabet that it transmits (deleting others) and considers sequences of such channels as a reconstruction problem. It claims exact capacities for sequences corresponding to uniform sunflowers, two arbitrary intersecting sets, and paths. It defines a pairs graph from the allowed letter sets and asserts that capacity depends solely on this graph for the examined families. Using the reduction it derives bounds in general and a tailored bound for cycles; for alphabet size 4 the results yield exact capacities for all sequences except the 4-cycle, where only bounds are given.

Significance. If the derivations hold, the work supplies concrete exact capacities for several natural families of coloring channels arising in molecular sequence reconstruction. The pairs-graph reduction, when rigorously shown to be non-circular and independent of the capacity expression, would be a useful structural tool. The near-complete characterization for |alphabet|=4 is a tangible advance for the small-alphabet regime.

major comments (2)
  1. [Main results / pairs-graph section] The central claim that capacity is completely determined by the pairs graph (stated for the examined families) is load-bearing for all exact-capacity results. The manuscript must make explicit whether the graph is constructed independently of any capacity expression and whether the equivalence proof avoids circularity by using only combinatorial properties of the allowed sets.
  2. [Cycle bound subsection] For the cycle-of-length-4 case with |alphabet|=4 the paper supplies only bounds; if this is the sole remaining open case, the manuscript should state whether the gap is due to a technical obstruction in the pairs-graph method or merely computational complexity, and whether the bounds are conjectured to be tight.
minor comments (2)
  1. [Abstract] The abstract asserts 'exact capacities' for the listed families but does not restate the alphabet size; adding this would clarify the scope.
  2. [Notation / definitions] Notation for the pairs graph (vertices, edges, induced subgraphs) should be introduced once and used consistently; any re-definition in later sections risks confusion.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful review and positive recommendation. We address the major comments point by point below, indicating the changes we will implement to improve the manuscript.

read point-by-point responses
  1. Referee: [Main results / pairs-graph section] The central claim that capacity is completely determined by the pairs graph (stated for the examined families) is load-bearing for all exact-capacity results. The manuscript must make explicit whether the graph is constructed independently of any capacity expression and whether the equivalence proof avoids circularity by using only combinatorial properties of the allowed sets.

    Authors: The pairs graph is constructed independently of any capacity expression, relying solely on the combinatorial properties of the allowed letter sets in the sequence of coloring channels. Specifically, vertices represent letters and edges capture pairwise co-occurrences permitted by the channels. The equivalence proof that capacity depends only on this graph uses combinatorial arguments based on set intersections and mappings between code sequences, without invoking information-theoretic quantities in a circular manner. We will revise the relevant section to explicitly state this independence and outline the combinatorial steps in the proof to eliminate any potential ambiguity. revision: yes

  2. Referee: [Cycle bound subsection] For the cycle-of-length-4 case with |alphabet|=4 the paper supplies only bounds; if this is the sole remaining open case, the manuscript should state whether the gap is due to a technical obstruction in the pairs-graph method or merely computational complexity, and whether the bounds are conjectured to be tight.

    Authors: Yes, the 4-cycle with alphabet size 4 is the only open case for exact capacity in this regime. The gap stems from a technical obstruction within the pairs-graph method itself, as the reduction provides tight bounds for other structures but leaves a separation for this cyclic configuration due to its specific intersection patterns. It is not merely computational complexity. We conjecture that the upper and lower bounds coincide and are tight, supported by exhaustive checks for small lengths, though a general proof is not yet available. We will include a clarifying statement in the cycle bound subsection addressing these aspects. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained for examined families

full rationale

The paper defines coloring channels and the pairs graph directly from the allowed letter sets (independent of capacity). It then derives exact capacities for uniform sunflowers, intersecting sets, and paths by proving an equivalence to properties of this graph, with bounds for the C4 case. No step reduces a claimed result to a fitted parameter, self-citation, or definitional renaming; the graph construction and capacity expressions are shown to be related via explicit arguments for the listed families only. The central results remain independent of the target quantities.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claims rest on standard definitions of channel capacity together with the newly introduced pairs graph that encodes confusability across the sequence of filters.

axioms (1)
  • standard math Channel capacity is the supremum of achievable rates with vanishing error probability under the given sequence of channels
    Invoked to define the quantity whose exact value is computed for the specific channel families.
invented entities (1)
  • pairs graph no independent evidence
    purpose: Graph whose structure determines the capacity of the coloring-channel sequence
    Defined in the paper to reduce the capacity computation to a graph-theoretic quantity

pith-pipeline@v0.9.0 · 5427 in / 1296 out tokens · 39967 ms · 2026-05-10T17:22:02.932727+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

15 extracted references · 15 canonical work pages

  1. [1]

    On Levenshtein’s reconstruction problem under insertions, deletions, and substitutions,

    M. Abu-Sini and E. Yaakobi, “On Levenshtein’s reconstruction problem under insertions, deletions, and substitutions,”IEEE Trans. Inform. Theory, vol. 67, no. 11, pp. 7132–7158, Nov. 2021

  2. [2]

    Sequence reconstruction over coloring channels for protein identification,

    J. Bariffi, A. Wachter-Zeh, and E. Yaakobi, “Sequence reconstruction over coloring channels for protein identification,” inProceedings of the 2025 IEEE International Symposium on Information Theory (ISIT2025), Ann Arbor, MI, USA, Jun. 2025, pp. 1–6

  3. [3]

    Coding for sequence reconstruction for single edits,

    K. Cai, H. M. Kiah, T. T. Nguyen, and E. Yaakobi, “Coding for sequence reconstruction for single edits,”IEEE Trans. Inform. Theory, vol. 68, no. 1, pp. 66–79, Jan. 2022

  4. [4]

    Correcting deletions with multiple reads,

    J. Chrisnata, H. M. Kiah, and E. Yaakobi, “Correcting deletions with multiple reads,”IEEE Trans. Inform. Theory, vol. 68, no. 11, pp. 7141–7158, Nov. 2022

  5. [5]

    The representation of a graph by set intersections,

    P. Erd ˝os, A. W. Goodman, and L. P ´osa, “The representation of a graph by set intersections,”Canadian Journal of Mathematics, vol. 18, pp. 106–112, 1966. 17 TABLE II THE CAPACITY OF ALL IRREDUCIBLE COLORING CHANNELS OVER AN ALPHABET OF SIZE4,THAT USE ALL POSSIBLE LETTERS,TO5SIGNIFICANT DIGITS PI cap(I)Example minimalIType Location 1({1,2,3,4})clique [2,...

  6. [6]

    Sequence reconstruction over the deletion channel,

    R. Gabrys and E. Yaakobi, “Sequence reconstruction over the deletion channel,”IEEE Trans. Inform. Theory, vol. 64, no. 4, pp. 2924–2931, Apr. 2018

  7. [7]

    Reconstruction of sequences over non-identical channels,

    M. Horovitz and E. Yaakobi, “Reconstruction of sequences over non-identical channels,”IEEE Trans. Inform. Theory, vol. 65, no. 2, pp. 1267–1286, Feb. 2018

  8. [8]

    On unique error patterns in the Levenshtein’s sequence reconstruction model,

    V . Junnila, T. Laihonen, and T. Lehtil ¨a, “On unique error patterns in the Levenshtein’s sequence reconstruction model,”IEEE Trans. Inform. Theory, vol. 71, no. 7, pp. 5720–5736, Jul. 2025

  9. [9]

    Efficient reconstruction of sequences,

    V . I. Levenshtein, “Efficient reconstruction of sequences,”IEEE Trans. Inform. Theory, vol. 47, no. 1, pp. 2–22, Jan. 2001

  10. [10]

    Efficient reconstruction of sequences from their subsequences or supersequences,

    ——, “Efficient reconstruction of sequences from their subsequences or supersequences,”J. Combin. Theory Ser. A, vol. 93, no. 2, pp. 310–332, Feb. 2001

  11. [11]

    F. J. MacWilliams and N. J. A. Sloane,The Theory of Error-Correcting Codes. North-Holland, 1978

  12. [12]

    J. C. Mason and D. C. Handscomb,Chebyshev Polynomials. Chapman & Hall/CRC, 2003

  13. [13]

    Simulation of single-protein nanopore sensing shows feasibility for wholeproteome identification,

    S. Ohayon, A. Girsault, M. Nasser, S. Shen-Orr, and A. Meller, “Simulation of single-protein nanopore sensing shows feasibility for wholeproteome identification,”PLoS Computational Biology, vol. 15, no. 5, p. e1007067, 2019

  14. [14]

    Sequence reconstruction problem for deletion channels: a complete asymptotic solution,

    V . L. P. Pham, K. Goyal, and H. M. Kiah, “Sequence reconstruction problem for deletion channels: a complete asymptotic solution,”J. Combin. Theory Ser. A, vol. 211, no. 105980, pp. 1–29, 2025

  15. [15]

    H. S. Wall,Analytic Theory of Continued Fractions. Chelsea Publishing Company, 1948