Smallest Suffixient Sets: Effectiveness, Resilience, and Calculation

Cristian Urbina; Giuseppe Romana; Gonzalo Navarro; Hiroto Fujimaru

arxiv: 2506.05638 · v5 · submitted 2025-06-05 · 💻 cs.FL · cs.DS· math.CO

Smallest Suffixient Sets: Effectiveness, Resilience, and Calculation

Hiroto Fujimaru , Gonzalo Navarro , Giuseppe Romana , Cristian Urbina This is my paper

Pith reviewed 2026-05-19 10:59 UTC · model grok-4.3

classification 💻 cs.FL cs.DSmath.CO

keywords suffixient setsrepetitiveness measuresBurrows-Wheeler transformpattern matchingstring operationsepisturmian wordslexicographic parseFibonacci words

0 comments

The pith

The size of the smallest suffixient set is at most linear in the number of Burrows-Wheeler runs and strictly smaller than the smallest lexicographic parse on some string families.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

A suffixient set captures the minimal information from a repetitive string that, together with random access to the text, suffices for several pattern-matching tasks. This paper measures the size χ of the smallest such set and shows that χ grows by at most two when a character is appended or prepended, which immediately yields simple linear-time online algorithms for computing it. The authors further locate χ among other repetitiveness measures by proving χ equals O(r) for the Burrows-Wheeler run count r, by exhibiting families where χ is asymptotically smaller than the smallest lexicographic parse v, and by proving tight upper bounds such as χ ≤ σ + 2 on episturmian words.

Core claim

The paper establishes that the smallest suffixient set size χ satisfies χ = O(r) where r is the number of runs in the Burrows-Wheeler transform, that there exist infinite string families with χ = o(v) for v the size of the smallest lexicographic parse, and that χ ≤ σ + 2 on episturmian words over an alphabet of size σ, with the exact value χ ≤ 4 and a complete characterization of the two possible smallest sets on Fibonacci words.

What carries the argument

The size χ of the smallest suffixient set, the minimal collection of positions whose suffixes allow reconstruction of all necessary matching information for the full string.

If this is right

Smallest suffixient sets can be maintained with simple linear-time online algorithms because appending or prepending a character increases χ by at most two.
Reversing a string may increase χ by up to O(n) yet the ratio between χ of a string and its reverse never exceeds two.
Arbitrary edit operations or rotations can increase χ by an additive Ω(√n) in the worst case.
χ is incomparable to nearly all repetitiveness measures based on explicit copy-paste mechanisms.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Because χ stays small on episturmian and Fibonacci words, suffixient sets may yield compact indexes for morphic or Sturmian sequences that arise in combinatorics on words.
The O(r) bound together with the linear-time construction suggests suffixient sets could be combined with BWT-based indexes to reduce space on highly repetitive data without sacrificing query support.
The Ω(√n) sensitivity to edits implies that dynamic maintenance of exact smallest suffixient sets will require sublinear update structures only when the number of edits remains small.
Empirical measurement of χ on DNA or versioned-text collections could reveal how tightly the theoretical O(r) relation holds in practice.

Load-bearing premise

The practical value of a suffixient set depends on the availability of random access to the underlying string so that the set can be used for pattern matching.

What would settle it

A concrete string family in which the size of the smallest suffixient set grows faster than linearly with the number of Burrows-Wheeler runs would refute the claimed O(r) bound.

Figures

Figures reproduced from arXiv: 2506.05638 by Cristian Urbina, Giuseppe Romana, Gonzalo Navarro, Hiroto Fujimaru.

**Figure 1.** Figure 1: Relations between relevant repetitiveness measures and how our results place χ among them. An arrow µ1 → µ2 means that µ1 = O(µ2) for all strings and, save for c → z, there is a string family where µ1 = o(µ2). The dotted arrows mark only this last condition, so they are not transitive. Measures in light gray nodes are known to be reachable; those in dark gray are accessible and searchable; and r is hatched… view at source ↗

read the original abstract

A suffixient set is a novel combinatorial object that captures the essential information of repetitive strings in a way that, provided with a random access mechanism, supports various forms of pattern matching. In this paper, we study the size $\chi$ of the smallest suffixient set as a repetitiveness measure. First, we study its sensitivity to various string operations. We show that $\chi$ cannot increase by more than 2 after appending or prepending a character to the string. As a consequence, we are able to give simple linear-time online algorithms to compute smallest suffixient sets. We also show that, although reversing the string can increase $\chi$ by an arbitrary $O(n)$ value, it always holds $\chi(T)/\chi(T^R)\le 2$. We also prove lower and upper bounds for the additive or multiplicative increase of $\chi$ after applying arbitrary edit operations, or rotating the text. In particular, we show that the additive increase can be as large as $\Omega(\sqrt{n})$ for all those operations. Secondly, we place $\chi$ in between known repetitiveness measures. In particular, we show $\chi = O(r)$ (where $r$ is the number of runs in the Burrows-Wheeler Transform of the string), that there are string families where $\chi=o(v)$ (where $v$ is the size of the smallext lexicographic parse of the string), and that $\chi$ is uncomparable to almost all reachable measures based on copy-paste mechanisms. In passing, we give precise bounds for $\chi$ for some relevant string families, for example $\chi \le \sigma+2$ on episturmian words over alphabets of size $\sigma$ (e.g., $\chi \le 4$ on Fibonacci strings, for which we precisely characterize the only two smallest suffixient sets).

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper defines suffixient sets as a new repetitiveness measure and proves explicit bounds showing chi = O(r) plus at most +2 growth on append or prepend.

read the letter

The main point is that this work defines a new object called a suffixient set and studies the size chi of the smallest one as a repetitiveness measure for strings. They prove several direct combinatorial bounds on how chi behaves under edits and how it compares to prior measures like BWT runs r and lex-parse size v. The results include chi increasing by at most 2 on append or prepend, which yields linear-time online algorithms, a ratio of at most 2 under reversal, an Omega(sqrt n) additive lower bound on the increase after arbitrary edits or rotations, and chi = O(r) in general. They also show chi = o(v) on some string families and chi <= sigma + 2 on episturmian words, with a precise characterization for Fibonacci strings. These are all worst-case statements backed by explicit constructions or counting arguments rather than reductions to fitted parameters. The placement relative to existing measures is new and avoids circularity. The proofs appear checkable from the stated claims, and the absence of self-referential loops or invented parameters keeps the work grounded. A minor soft spot is that the pattern-matching utility still requires a separate random-access mechanism, so the concrete advantage over run-based indexes is not yet quantified in the results. The Omega(sqrt n) edit lower bound is interesting but would benefit from matching upper-bound constructions to show tightness. Overall the central claims hold up on their own terms. This is for researchers in string algorithms and compressed data structures who track repetitiveness measures. A reader already working with BWT or parsing would get immediate value from the comparisons. The paper deserves serious peer review because the new object and the sensitivity results are fresh enough to warrant checking the derivations and potential follow-up work.

Referee Report

2 major / 2 minor

Summary. The paper defines a suffixient set as a combinatorial object capturing essential information in repetitive strings to support pattern matching given random access. It studies the minimal size χ of such sets as a repetitiveness measure. The authors prove sensitivity bounds under string operations, including that appending or prepending a character increases χ by at most 2 (enabling linear-time online algorithms), that χ(T)/χ(T^R) ≤ 2 despite possible O(n) increases on reversal, and that the additive increase under edits or rotations can be Ω(√n). They also relate χ to other measures by showing χ = O(r) (r = BWT runs), χ = o(v) for some families (v = smallest lexicographic parse size), incomparability to most copy-paste-based measures, and χ ≤ σ + 2 for episturmian words (with exact characterization of the two smallest suffixient sets on Fibonacci words).

Significance. If the bounds hold, the work introduces a new repetitiveness measure with favorable sensitivity and online computability properties that complement existing measures. The explicit linear-time algorithms, the O(r) relation to BWT runs, the o(v) separation from lexicographic parses, and the concrete bounds for episturmian and Fibonacci families are strengths that could inform compressed indexing and combinatorics on words.

major comments (2)

[§3] §3 (sensitivity results): the central claim that χ increases by at most 2 on append/prepend rests on an explicit construction of a new suffixient set from the old one; the manuscript should include a self-contained verification that the constructed set remains suffixient under the random-access assumption.
[§4.1] §4.1 (comparison to r): the proof that χ = O(r) is load-bearing for the positioning of the measure; the argument should state the multiplicative constant explicitly rather than leaving it implicit in the run-counting construction.

minor comments (2)

[Abstract] Abstract: 'smallext' is a typographical error and should read 'smallest'.
[Abstract] Abstract and §4.2: the statement that χ is 'uncomparable to almost all reachable measures based on copy-paste mechanisms' would benefit from a brief pointer to the specific families or theorem establishing the incomparability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading, positive assessment, and constructive suggestions. We address the major comments point by point below and have incorporated revisions to improve clarity and rigor.

read point-by-point responses

Referee: [§3] §3 (sensitivity results): the central claim that χ increases by at most 2 on append/prepend rests on an explicit construction of a new suffixient set from the old one; the manuscript should include a self-contained verification that the constructed set remains suffixient under the random-access assumption.

Authors: We agree that a self-contained verification strengthens the presentation. In the revised manuscript we have added a dedicated paragraph immediately following the construction that verifies suffixiency directly from the definition: for any pattern P, if an occurrence exists in the extended string, the random-access oracle on the original positions (combined with the at-most-two new suffixient positions) suffices to report a match without external assumptions. The argument enumerates the possible cases for where the new occurrence starts or ends relative to the appended/prepended character. revision: yes
Referee: [§4.1] §4.1 (comparison to r): the proof that χ = O(r) is load-bearing for the positioning of the measure; the argument should state the multiplicative constant explicitly rather than leaving it implicit in the run-counting construction.

Authors: We thank the referee for highlighting this. The run-counting argument in the original proof yields χ ≤ 3r for all strings (with the factor 3 arising from at most two suffixient positions per BWT run plus a constant overhead for the first and last runs). We have revised §4.1 to state the bound explicitly as χ ≤ 3r and to include a short sentence deriving the constant from the per-run contribution. revision: yes

Circularity Check

0 steps flagged

No significant circularity in combinatorial bounds on χ

full rationale

The paper defines χ as the size of the smallest suffixient set and establishes worst-case bounds such as χ = O(r) (r = BWT runs), existence of families with χ = o(v) (v = lex-parse size), and χ ≤ σ+2 on episturmian words. These follow from explicit constructions and direct comparisons to independently defined measures r and v. Sensitivity results (e.g., χ increases by at most 2 on append/prepend, χ(T)/χ(T^R) ≤ 2) are proven via case analysis on string operations without reducing the target quantity to a fitted parameter or self-citation chain. All claims are self-contained combinatorial statements verifiable against external string families and run-counting arguments.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claims rest on the definition of suffixient sets as objects that capture essential repetitive information when random access is supplied. No numeric free parameters appear; the only invented entity is the suffixient set itself, introduced by definition with no external falsifiable prediction supplied in the abstract.

axioms (1)

domain assumption A suffixient set captures the essential information of repetitive strings in a way that, provided with a random access mechanism, supports various forms of pattern matching.
This is the foundational definition of the new combinatorial object on which all subsequent size and sensitivity results depend.

invented entities (1)

suffixient set no independent evidence
purpose: Combinatorial object that captures essential repetitive information for pattern matching under random access
New object defined in the paper; no independent evidence or falsifiable prediction outside the definition is given.

pith-pipeline@v0.9.0 · 5886 in / 1538 out tokens · 76457 ms · 2026-05-19T10:59:09.352041+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We show that χ = O(r) ... χ ≤ σ+2 on episturmian words
IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

χ is uncomparable to almost all reachable measures based on copy-paste mechanisms

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Smallest suffixient set maintenance in near-real-time
cs.DS 2026-04 unverdicted novelty 6.0

The smallest suffixient set can be maintained online in polyloglog time per letter in either left-to-right or right-to-left direction via Weiner's suffix tree primitives.
Compressing Suffix Trees by Path Decompositions
cs.DS 2025-06 unverdicted novelty 6.0

Introduces a suffix tree path decomposition technique that yields a suffix array sample of size at most r, improving the prior 2r bound for compressed indexes in external memory.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages · cited by 2 Pith papers

[1]

Information and Computation291, 104999 (2023)

Akagi, T., Funakoshi, M., Inenaga, S.: Sensitivity of string compressors and repet- itiveness measures. Information and Computation291, 104999 (2023)

work page 2023
[2]

Proceedings of the Section of Sciences of the Koninklijke Nederlandse Akademie van Wetenschappen te Amsterdam49(7), 758–764 (1946)

Bruijn, de, N.: A combinatorial problem. Proceedings of the Section of Sciences of the Koninklijke Nederlandse Akademie van Wetenschappen te Amsterdam49(7), 758–764 (1946)

work page 1946
[3]

Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Tech. Rep. 124, Digital Equipment Corporation (1994)

work page 1994
[4]

Suffixient arrays: a new efficient suffix array compression technique.arXiv preprint 2407.18753v2, 2025

Cenzato, D., Depuydt, L., Gagie, T., Kim, S.H., Manzini, G., Olivares, F., Prezza, N.: Suffixient arrays: a new efficient suffix array compression technique. CoRR 2407.18753 (2025)

work page arXiv 2025
[5]

In: Proc

Cenzato, D., Olivares, F., Prezza, N.: On computing the smallest suffixient set. In: Proc. 31st International Symposium on String Processing and Information Re- trieval (SPIRE 2024). Lecture Notes in Computer Science, vol. 14899, pp. 73–87. Springer (2024)

work page 2024
[6]

Suf- fixient sets.arXiv preprint 2312.01359v3, 2024

Depuydt, L., Gagie, T., Langmead, B., Manzini, G., Prezza, N.: Suffixient sets. CoRR 2312.01359 (2023)

work page arXiv 2023
[7]

Information Processing Letters 55(4), 217–221 (1995)

Droubay, X.: Palindromes in the Fibonacci word. Information Processing Letters 55(4), 217–221 (1995)

work page 1995
[8]

Theoretical Computer Science255(1), 539–553 (2001)

Droubay, X., Justin, J., Pirillo, G.: Episturmian words and some constructions of de Luca and Rauzy. Theoretical Computer Science255(1), 539–553 (2001)

work page 2001
[9]

In: Proc

Fici, G., Romana, G., Sciortino, M., Urbina, C.: On the impact of morphisms on BWT-runs. In: Proc. 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). LIPIcs, vol. 259, pp. 10:1–10:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)

work page 2023
[10]

CoRR2504.17443 (2025)

Fici, G., Romana, G., Sciortino, M., Urbina, C.: Morphisms and BWT-run sensi- tivity. CoRR2504.17443 (2025)

work page arXiv 2025
[11]

SIAM Review 24(2), 195–221 (1982)

Fredricksen, H.: A survey of full length nonlinear shift register cycle algorithms. SIAM Review 24(2), 195–221 (1982)

work page 1982
[12]

Discrete Mathematics341(11), 2977– 2987 (2018)

Gabric, D., Sawada, J., Williams, A., Wong, D.: A framework for constructing de Bruijn sequences via simple successor rules. Discrete Mathematics341(11), 2977– 2987 (2018)

work page 2018
[13]

In: Proc

Gawrychowski, P., Kosche, M., Manea, F.: On the number of factors in the LZ- end factorization. In: Proc. 30th International Symposium on String Processing and Information Retrieval (SPIRE 2023). Lecture Notes in Computer Science, vol. 14240, pp. 253–259. Springer (2023)

work page 2023
[14]

In: Proc

Giuliani, S., Inenaga, S., Lipták, Z., Prezza, N., Sciortino, M., Toffanello, A.: Novel results on the number of runs of the Burrows-Wheeler-Transform. In: Proc. 47th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2021). Lecture Notes in Computer Science, vol. 12607, pp. 249–

work page 2021
[15]

Theory of Computing Systems 69(2), 19 (2025)

Giuliani, S., Inenaga, S., Lipták, Z., Romana, G., Sciortino, M., Urbina, C.: Bit catastrophes for the Burrows-Wheeler transform. Theory of Computing Systems 69(2), 19 (2025)

work page 2025
[16]

RAIRO - Theoretical Informatics and Applications 43(3), 403–442 (2009)

Glen, A., Justin, J.: Episturmian words: a survey. RAIRO - Theoretical Informatics and Applications 43(3), 403–442 (2009)

work page 2009
[17]

Communications of the ACM65(6), 91–98 (2022) 14 G

Kempa, D., Kociumaka, T.: Resolution of the Burrows-Wheeler transform conjec- ture. Communications of the ACM65(6), 91–98 (2022) 14 G. Navarro, G. Romana, and C. Urbina

work page 2022
[18]

In: Proc

Kempa, D., Prezza, N.: At the roots of dictionary compression: String attractors. In: Proc. 50th Annual ACM Symposium on the Theory of Computing (STOC 2018). pp. 827–840. ACM (2018)

work page 2018
[19]

Theo- retical Computer Science483, 115–133 (2013)

Kreft, S., Navarro, G.: On compressing and indexing repetitive sequences. Theo- retical Computer Science483, 115–133 (2013)

work page 2013
[20]

IEEE Transactions on Information Theory 22(1), 75–81 (1976)

Lempel, A., Ziv, J.: On the complexity of finite sequences. IEEE Transactions on Information Theory 22(1), 75–81 (1976)

work page 1976
[21]

Encyclopedia of Mathematics and its Applications, Cambridge University Press, New York, NY, USA (2002)

Lothaire, M.: Algebraic Combinatorics on Words. Encyclopedia of Mathematics and its Applications, Cambridge University Press, New York, NY, USA (2002)

work page 2002
[22]

Information Pro- cessing Letters 12(4), 193–195 (1981)

de Luca, A.: A combinatorial property of the Fibonacci words. Information Pro- cessing Letters 12(4), 193–195 (1981)

work page 1981
[23]

Theoretical Computer Science850, 236–248 (2021)

Mantaci, S., Restivo, A., Romana, G., Rosone, G., Sciortino, M.: A combinatorial view on string attractors. Theoretical Computer Science850, 236–248 (2021)

work page 2021
[24]

ACM Computing Surveys54(2), article 29 (2021)

Navarro, G.: Indexing highly repetitive string collections, part I: Repetitiveness measures. ACM Computing Surveys54(2), article 29 (2021)

work page 2021
[25]

ACM Computing Surveys54(2), article 26 (2021)

Navarro, G.: Indexing highly repetitive string collections, part II: Compressed in- dexes. ACM Computing Surveys54(2), article 26 (2021)

work page 2021
[26]

CoRR 2004.02781 (2022)

Navarro, G.: Indexing highly repetitive string collections. CoRR 2004.02781 (2022)

work page arXiv 2004
[27]

IEEE Transactions on Information Theory67(2), 1008–1026 (2021)

Navarro,G.,Ochoa,C.,Prezza,N.:Ontheapproximationratiooforderedparsings. IEEE Transactions on Information Theory67(2), 1008–1026 (2021)

work page 2021
[28]

Acta Informatica 62(1), 14 (2025)

Navarro, G., Olivares, F., Urbina, C.: Generalized straight-line programs. Acta Informatica 62(1), 14 (2025)

work page 2025
[29]

The- oretical Computer Science1043, 115259 (2025)

Navarro, G., Urbina, C.: Repetitiveness measures based on string morphisms. The- oretical Computer Science1043, 115259 (2025)

work page 2025
[30]

Journal of the ACM29(4), 928–951 (1982)

Storer, J.A., Szymanski, T.G.: Data compression via textual substitution. Journal of the ACM29(4), 928–951 (1982)

work page 1982

[1] [1]

Information and Computation291, 104999 (2023)

Akagi, T., Funakoshi, M., Inenaga, S.: Sensitivity of string compressors and repet- itiveness measures. Information and Computation291, 104999 (2023)

work page 2023

[2] [2]

Proceedings of the Section of Sciences of the Koninklijke Nederlandse Akademie van Wetenschappen te Amsterdam49(7), 758–764 (1946)

Bruijn, de, N.: A combinatorial problem. Proceedings of the Section of Sciences of the Koninklijke Nederlandse Akademie van Wetenschappen te Amsterdam49(7), 758–764 (1946)

work page 1946

[3] [3]

Burrows, M., Wheeler, D.: A block sorting lossless data compression algorithm. Tech. Rep. 124, Digital Equipment Corporation (1994)

work page 1994

[4] [4]

Suffixient arrays: a new efficient suffix array compression technique.arXiv preprint 2407.18753v2, 2025

Cenzato, D., Depuydt, L., Gagie, T., Kim, S.H., Manzini, G., Olivares, F., Prezza, N.: Suffixient arrays: a new efficient suffix array compression technique. CoRR 2407.18753 (2025)

work page arXiv 2025

[5] [5]

In: Proc

Cenzato, D., Olivares, F., Prezza, N.: On computing the smallest suffixient set. In: Proc. 31st International Symposium on String Processing and Information Re- trieval (SPIRE 2024). Lecture Notes in Computer Science, vol. 14899, pp. 73–87. Springer (2024)

work page 2024

[6] [6]

Suf- fixient sets.arXiv preprint 2312.01359v3, 2024

Depuydt, L., Gagie, T., Langmead, B., Manzini, G., Prezza, N.: Suffixient sets. CoRR 2312.01359 (2023)

work page arXiv 2023

[7] [7]

Information Processing Letters 55(4), 217–221 (1995)

Droubay, X.: Palindromes in the Fibonacci word. Information Processing Letters 55(4), 217–221 (1995)

work page 1995

[8] [8]

Theoretical Computer Science255(1), 539–553 (2001)

Droubay, X., Justin, J., Pirillo, G.: Episturmian words and some constructions of de Luca and Rauzy. Theoretical Computer Science255(1), 539–553 (2001)

work page 2001

[9] [9]

In: Proc

Fici, G., Romana, G., Sciortino, M., Urbina, C.: On the impact of morphisms on BWT-runs. In: Proc. 34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023). LIPIcs, vol. 259, pp. 10:1–10:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2023)

work page 2023

[10] [10]

CoRR2504.17443 (2025)

Fici, G., Romana, G., Sciortino, M., Urbina, C.: Morphisms and BWT-run sensi- tivity. CoRR2504.17443 (2025)

work page arXiv 2025

[11] [11]

SIAM Review 24(2), 195–221 (1982)

Fredricksen, H.: A survey of full length nonlinear shift register cycle algorithms. SIAM Review 24(2), 195–221 (1982)

work page 1982

[12] [12]

Discrete Mathematics341(11), 2977– 2987 (2018)

Gabric, D., Sawada, J., Williams, A., Wong, D.: A framework for constructing de Bruijn sequences via simple successor rules. Discrete Mathematics341(11), 2977– 2987 (2018)

work page 2018

[13] [13]

In: Proc

Gawrychowski, P., Kosche, M., Manea, F.: On the number of factors in the LZ- end factorization. In: Proc. 30th International Symposium on String Processing and Information Retrieval (SPIRE 2023). Lecture Notes in Computer Science, vol. 14240, pp. 253–259. Springer (2023)

work page 2023

[14] [14]

In: Proc

Giuliani, S., Inenaga, S., Lipták, Z., Prezza, N., Sciortino, M., Toffanello, A.: Novel results on the number of runs of the Burrows-Wheeler-Transform. In: Proc. 47th International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM 2021). Lecture Notes in Computer Science, vol. 12607, pp. 249–

work page 2021

[15] [15]

Theory of Computing Systems 69(2), 19 (2025)

Giuliani, S., Inenaga, S., Lipták, Z., Romana, G., Sciortino, M., Urbina, C.: Bit catastrophes for the Burrows-Wheeler transform. Theory of Computing Systems 69(2), 19 (2025)

work page 2025

[16] [16]

RAIRO - Theoretical Informatics and Applications 43(3), 403–442 (2009)

Glen, A., Justin, J.: Episturmian words: a survey. RAIRO - Theoretical Informatics and Applications 43(3), 403–442 (2009)

work page 2009

[17] [17]

Communications of the ACM65(6), 91–98 (2022) 14 G

Kempa, D., Kociumaka, T.: Resolution of the Burrows-Wheeler transform conjec- ture. Communications of the ACM65(6), 91–98 (2022) 14 G. Navarro, G. Romana, and C. Urbina

work page 2022

[18] [18]

In: Proc

Kempa, D., Prezza, N.: At the roots of dictionary compression: String attractors. In: Proc. 50th Annual ACM Symposium on the Theory of Computing (STOC 2018). pp. 827–840. ACM (2018)

work page 2018

[19] [19]

Theo- retical Computer Science483, 115–133 (2013)

Kreft, S., Navarro, G.: On compressing and indexing repetitive sequences. Theo- retical Computer Science483, 115–133 (2013)

work page 2013

[20] [20]

IEEE Transactions on Information Theory 22(1), 75–81 (1976)

Lempel, A., Ziv, J.: On the complexity of finite sequences. IEEE Transactions on Information Theory 22(1), 75–81 (1976)

work page 1976

[21] [21]

Encyclopedia of Mathematics and its Applications, Cambridge University Press, New York, NY, USA (2002)

Lothaire, M.: Algebraic Combinatorics on Words. Encyclopedia of Mathematics and its Applications, Cambridge University Press, New York, NY, USA (2002)

work page 2002

[22] [22]

Information Pro- cessing Letters 12(4), 193–195 (1981)

de Luca, A.: A combinatorial property of the Fibonacci words. Information Pro- cessing Letters 12(4), 193–195 (1981)

work page 1981

[23] [23]

Theoretical Computer Science850, 236–248 (2021)

Mantaci, S., Restivo, A., Romana, G., Rosone, G., Sciortino, M.: A combinatorial view on string attractors. Theoretical Computer Science850, 236–248 (2021)

work page 2021

[24] [24]

ACM Computing Surveys54(2), article 29 (2021)

Navarro, G.: Indexing highly repetitive string collections, part I: Repetitiveness measures. ACM Computing Surveys54(2), article 29 (2021)

work page 2021

[25] [25]

ACM Computing Surveys54(2), article 26 (2021)

Navarro, G.: Indexing highly repetitive string collections, part II: Compressed in- dexes. ACM Computing Surveys54(2), article 26 (2021)

work page 2021

[26] [26]

CoRR 2004.02781 (2022)

Navarro, G.: Indexing highly repetitive string collections. CoRR 2004.02781 (2022)

work page arXiv 2004

[27] [27]

IEEE Transactions on Information Theory67(2), 1008–1026 (2021)

Navarro,G.,Ochoa,C.,Prezza,N.:Ontheapproximationratiooforderedparsings. IEEE Transactions on Information Theory67(2), 1008–1026 (2021)

work page 2021

[28] [28]

Acta Informatica 62(1), 14 (2025)

Navarro, G., Olivares, F., Urbina, C.: Generalized straight-line programs. Acta Informatica 62(1), 14 (2025)

work page 2025

[29] [29]

The- oretical Computer Science1043, 115259 (2025)

Navarro, G., Urbina, C.: Repetitiveness measures based on string morphisms. The- oretical Computer Science1043, 115259 (2025)

work page 2025

[30] [30]

Journal of the ACM29(4), 928–951 (1982)

Storer, J.A., Szymanski, T.G.: Data compression via textual substitution. Journal of the ACM29(4), 928–951 (1982)

work page 1982