The complexity of finite smooth words over binary alphabets

Julien Cassaigne; Rapha\"el Henry

arxiv: 2603.10733 · v3 · submitted 2026-03-11 · 💻 cs.FL · math.CO· math.DS

The complexity of finite smooth words over binary alphabets

Julien Cassaigne , Rapha\"el Henry This is my paper

Pith reviewed 2026-05-15 13:42 UTC · model grok-4.3

classification 💻 cs.FL math.COmath.DS

keywords smooth wordsf-smooth wordsfactorscomplexity functionderivabilityKolakoski wordbinary alphabetword combinatorics

0 comments

The pith

Finite smooth words over any binary alphabet are exactly the factors of infinite smooth words.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Smooth words are infinite sequences that remain infinitely derivable by repeatedly replacing each run of identical symbols with its length. Their finite counterparts, called f-smooth words, are shown to be precisely the finite segments that occur inside some infinite smooth word. This equivalence is established directly for every binary alphabet. The paper further resolves the exact growth rate of the number of distinct f-smooth words of length n when the alphabet sum is even, supplies a matching lower bound that holds for every binary alphabet, and raises the best previous upper bound when the alphabet sum is odd, narrowing the gap to Sing's conjectured exponent.

Core claim

The paper proves that the language of f-smooth words coincides with the set of all factors of smooth words. Over even alphabets the complexity function p(n) satisfies p(n) = Theta(n^alpha) for the explicit exponent alpha = log(a+b) / log((a+b)/2). For arbitrary binary alphabets the same lower bound is established, while the known upper bound is improved for odd alphabets, bringing both sides closer to the conjectured Theta(n^alpha) growth.

What carries the argument

The derivability map that replaces each run in a word by the integer equal to its length, iterated to define both infinite smooth words and their finite f-smooth approximations.

If this is right

The language of all smooth words can now be studied entirely through the finite f-smooth words they contain.
Over even alphabets the number of distinct blocks of length n is known to grow exactly like n raised to the explicit power log(a+b)/log((a+b)/2).
A matching lower bound on complexity now holds uniformly for every binary alphabet.
The gap between lower and upper bounds on complexity has been reduced for odd alphabets.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Enumeration algorithms for f-smooth words can now be used to generate candidate factors of smooth words without separately checking infinite extendability.
The same factor-closed property may allow transfer of complexity results from finite to infinite smooth words in both directions.
The improved bounds suggest that computational checks of the exact exponent for small odd alphabets are now within reach.

Load-bearing premise

The derivability process remains well-defined for every finite and infinite word over the alphabet without any hidden constraints that would break the exact correspondence between f-smooth words and factors of smooth words.

What would settle it

Discovery of a single finite word that is f-smooth yet fails to appear as a consecutive block inside any infinite smooth word, or an explicit computation of the complexity function for large n that falls outside the proven lower and improved upper bounds.

Figures

Figures reproduced from arXiv: 2603.10733 by Julien Cassaigne, Rapha\"el Henry.

**Figure 1.** Figure 1: The tree T over {1, 2}. 3.4 Computation of the complexity In this subsection we apply the method explained in Section 1.1.2 to compute pC ∞ f (n). We show that it suffices to study T in order to bound pC ∞ f (n), even when a < b − 1 and the strong and weak bispecial words are spread among T, T (1) , T (2) , T (3) and T (4) . 15 [PITH_FULL_IMAGE:figures/full_fig_p015_1.png] view at source ↗

read the original abstract

Smooth words over an alphabet of non-negative integers $\{a,b\}$ are infinite words that are infinitely derivable, the most famous example being the Oldenburger-Kolakoski word over $\{1,2\}$. The main way to study their language is to consider a finite version of smooth words that we call f-smooth words. In this paper we prove that the f-smooth words are exactly the factors of smooth words, and we make progress towards the conjecture of Sing that the complexity of f-smooth words over $\{a,b\}$ grows like $\Theta\left(n^{\log(a+b)/\log((a+b)/2)}\right)$: we prove it over even alphabets, we prove the lower bound over any binary alphabet and we improve the known upper bound over odd alphabets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper establishes that f-smooth words are exactly the factors of smooth words and delivers case-by-case progress on Sing's complexity conjecture for binary alphabets.

read the letter

The core contribution is the proof that f-smooth words coincide with the factors of smooth words. That equivalence looks clean and removes a layer of indirection that previous work had to carry. On top of it the authors settle the full asymptotic for even alphabets, prove the matching lower bound for any binary alphabet, and tighten the known upper bound for odd alphabets. Those distinctions are new and directly address the conjecture rather than just restating it. The arguments rest on the derivability process applied uniformly to finite and infinite words, and nothing in the outline suggests circularity or hidden constraints that would break the factor relation. The even/odd split appears to be handled by separate counting arguments that track the growth rate correctly. The main limitation is that the full derivations and edge-case checks are not visible from the abstract, so a referee will still need to verify that the process stays well-defined when words are truncated or when the alphabet values interact in longer factors. That is a standard verification task rather than a structural flaw. The paper is aimed at specialists in combinatorics on words who already know the Kolakoski and smooth-word literature; a reader outside that niche will find the setup opaque. It is narrow in scope but the results are concrete and falsifiable, so it merits a serious referee who can check the combinatorial details. I would send it to review.

Referee Report

0 major / 2 minor

Summary. The manuscript proves that f-smooth words (finite versions of infinitely derivable smooth words) over a binary alphabet are exactly the factors of (infinite) smooth words. It further establishes the conjectured complexity growth Θ(n^{log(a+b)/log((a+b)/2)}) for even alphabets, proves the matching lower bound for arbitrary binary alphabets, and improves the known upper bound for odd alphabets.

Significance. The characterization supplies a foundational equivalence for studying the language of smooth words such as the Kolakoski word. The complexity results constitute concrete progress on Sing's conjecture, with the exact asymptotic determination for even alphabets being a strong, parameter-free confirmation of the predicted exponent.

minor comments (2)

[§1] §1: the definition of the derivability process for finite words could include an explicit statement that it terminates after finitely many steps without altering the factor relation.
[§4] The proof of the even-alphabet case (presumably in §4) would benefit from a short remark confirming that the exponent is independent of the specific even pair {a,b}.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive report, accurate summary of our results on the characterization of f-smooth words and progress toward Sing's conjecture, and the recommendation of minor revision. We appreciate the recognition of the foundational equivalence and the concrete advances on the complexity exponent.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper defines smooth words as infinitely derivable infinite words over non-negative integers and introduces f-smooth words as the finite analogue. The central result proves by direct combinatorial analysis that f-smooth words coincide exactly with the factors of smooth words; this equivalence is established from the definition of the derivability process without any reduction to fitted parameters, self-referential definitions, or load-bearing self-citations. The complexity bounds (exact growth for even alphabets, lower bound for any binary alphabet, improved upper bound for odd alphabets) are then derived from this equivalence using standard counting arguments on word factors and derivability depth. No step renames a known result, imports a uniqueness theorem from the authors' prior work, or treats a fitted quantity as a prediction. The argument remains independent of any external fitted data or circular self-reference.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard domain definitions of derivability and smoothness; no free parameters, new axioms beyond the field, or invented entities are indicated in the abstract.

axioms (1)

domain assumption Smooth words are infinite words that remain infinitely derivable under the block-replacement rule
This is the foundational definition invoked for both infinite and finite versions.

pith-pipeline@v0.9.0 · 5430 in / 1169 out tokens · 45784 ms · 2026-05-15T13:42:27.061250+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Frequency of patterns in smooth sequences over the alphabet {1, 3}
math.DS 2026-04 unverdicted novelty 6.0

Asymptotic frequencies of any finite pattern in smooth sequences over {1,3} are well-defined and depend on the type sequence of derivatives, with all such subshifts uniquely ergodic.

Reference graph

Works this paper leans on

23 extracted references · 23 canonical work pages · cited by 1 Pith paper · 3 internal anchors

[1]

Baake and B

M. Baake and B. Sing. Kolakoski-(3,1) is a (deformed) model set.Canadian Mathematical Bulletin, 47(2):168–190, 2004.doi:10.4153/CMB-2004-018-6

work page doi:10.4153/cmb-2004-018-6 2004
[2]

Brlek, D

S. Brlek, D. Jamet, and G. Paquin. Smooth words on 2-letter alphabets having same parity.Theoretical Computer Science, 393(1):166–181, 2008. doi:10.1016/j.tcs.2007.11.019

work page doi:10.1016/j.tcs.2007.11.019 2008
[3]

Brlek, G

S. Brlek, G. Melançon, and G. Paquin. Properties of extremal infinite smooth words.Discrete Mathematics and Theoretical Computer Science, 9, 11 2007.doi:10.46298/dmtcs.412

work page doi:10.46298/dmtcs.412 2007
[4]

A. Carpi. On repeated factors inC ∞-words.Information Processing Letters, 52(6):289–294, 1994.doi:10.1016/0020-0190(94)00162-6

work page doi:10.1016/0020-0190(94)00162-6 1994
[5]

Cassaigne

J. Cassaigne. Complexité et facteurs spéciaux.Bulletin of the Belgian Mathe- matical Society - Simon Stevin, 4, 01 1997.doi:10.36045/bbms/1105730624

work page doi:10.36045/bbms/1105730624 1997
[6]

F. M. Dekking. Regularity and irregularity of sequences generated by automata.Seminaire de Théorie des Nombres de Bordeaux, 9:1–10, 1979. URL: http://eudml.org/doc/182065

work page 1979
[7]

F. M. Dekking. On the structure of self-generating sequences.Séminaire de théorie des nombres de Bordeaux, pages 1–6, 1981. URL:http://www.jstor. org/stable/44166389

work page arXiv 1981
[8]

F. M. Dekking. What is the long range order in the Kolakoski sequence ? 08 2001.doi:10.1007/978-94-015-8784-6_5

work page doi:10.1007/978-94-015-8784-6_5 2001
[9]

Devyatov

R. Devyatov. On factor complexity of morphic sequences. Moscow Mathematical Journal, 18:211–303, 2018.doi:10.17323/ 1609-4514-2018-18-2-211-303

work page 2018
[10]

Y. B. Huang. The complexity of smooth words on 2-letter alphabets.The- oretical Computer Science, 412(45):6327–6339, 2011.doi:10.1016/j.tcs. 2011.07.002

work page doi:10.1016/j.tcs 2011
[11]

Y. B. Huang. The powers of smooth words over arbitrary 2-letter alphabets. 2011.arXiv:0904.0562

work page internal anchor Pith review Pith/arXiv arXiv 2011
[12]

Y. B. Huang and W. D. Weakley. A note on the complexity ofC ∞- words.Theor. Comput. Sci., 411(40-42):3731–3735, 2010.doi:10.1016/ J.TCS.2010.06.024

work page 2010
[13]

The on-line encyclopedia of integer sequences

OEIS Foundation Inc. The on-line encyclopedia of integer sequences. 2026. URL:https://oeis.org

work page 2026
[14]

Frequency of patterns in smooth sequences over the alphabet {1, 3}

D. Jamet, I. Marcovici, T. de la Rue, and L. Poirier. Frequency of patterns in smooth sequences over the alphabet{1,3}. 2026.arXiv:2604.11387. 23

work page internal anchor Pith review Pith/arXiv arXiv 2026
[15]

M. S. Keane. Ergodic theory and subshifts of finite type. pages 35–70, 1991

work page 1991
[16]

Kolakoski

W. Kolakoski. Self generating runs, problem 5304.The American Mathe- matical Monthly, 72:674, 1965.doi:10.2307/2313883

work page doi:10.2307/2313883 1965
[17]

Kolakoski and N

W. Kolakoski and N. Üçoluk. Solution of advanced problem 5304.The American Mathematical Monthly, 73:681–682, 1966

work page 1966
[18]

J. Nilsson. Letter frequencies in the Kolakoski sequence.Acta Physica Polonica A, 126:549–552, 2014.doi:10.12693/APhysPolA.126.549

work page doi:10.12693/aphyspola.126.549 2014
[19]

Oldenburger

R. Oldenburger. Exponent trajectories in symbolic dynamics.Transactions of the American Mathematical Society, 46(3):453–466, 1939.doi:10/2307/ 1989933

work page 1939
[20]

M. Rao. Trucs et bidules sur la séquence de Kolakoski. 2012. URL: https://www.arthy.org/kola/kola.php

work page 2012
[21]

B. Sing. Spektrale eigenschaften der Kolakoski-sequenzen.Diploma thesis, Universität Tübingen, 2002

work page 2002
[22]

B. Sing. More Kolakoski sequences.Integers, 11B:Paper No. A14, 17, 2011. arXiv:1009.4061

work page internal anchor Pith review Pith/arXiv arXiv 2011
[23]

W. D. Weakley. On the number ofC∞-words of each length.J. Comb. Theory A, 51(1):55–62, 1989.doi:10.1016/0097-3165(89)90076-9. 24

work page doi:10.1016/0097-3165(89)90076-9 1989

[1] [1]

Baake and B

M. Baake and B. Sing. Kolakoski-(3,1) is a (deformed) model set.Canadian Mathematical Bulletin, 47(2):168–190, 2004.doi:10.4153/CMB-2004-018-6

work page doi:10.4153/cmb-2004-018-6 2004

[2] [2]

Brlek, D

S. Brlek, D. Jamet, and G. Paquin. Smooth words on 2-letter alphabets having same parity.Theoretical Computer Science, 393(1):166–181, 2008. doi:10.1016/j.tcs.2007.11.019

work page doi:10.1016/j.tcs.2007.11.019 2008

[3] [3]

Brlek, G

S. Brlek, G. Melançon, and G. Paquin. Properties of extremal infinite smooth words.Discrete Mathematics and Theoretical Computer Science, 9, 11 2007.doi:10.46298/dmtcs.412

work page doi:10.46298/dmtcs.412 2007

[4] [4]

A. Carpi. On repeated factors inC ∞-words.Information Processing Letters, 52(6):289–294, 1994.doi:10.1016/0020-0190(94)00162-6

work page doi:10.1016/0020-0190(94)00162-6 1994

[5] [5]

Cassaigne

J. Cassaigne. Complexité et facteurs spéciaux.Bulletin of the Belgian Mathe- matical Society - Simon Stevin, 4, 01 1997.doi:10.36045/bbms/1105730624

work page doi:10.36045/bbms/1105730624 1997

[6] [6]

F. M. Dekking. Regularity and irregularity of sequences generated by automata.Seminaire de Théorie des Nombres de Bordeaux, 9:1–10, 1979. URL: http://eudml.org/doc/182065

work page 1979

[7] [7]

F. M. Dekking. On the structure of self-generating sequences.Séminaire de théorie des nombres de Bordeaux, pages 1–6, 1981. URL:http://www.jstor. org/stable/44166389

work page arXiv 1981

[8] [8]

F. M. Dekking. What is the long range order in the Kolakoski sequence ? 08 2001.doi:10.1007/978-94-015-8784-6_5

work page doi:10.1007/978-94-015-8784-6_5 2001

[9] [9]

Devyatov

R. Devyatov. On factor complexity of morphic sequences. Moscow Mathematical Journal, 18:211–303, 2018.doi:10.17323/ 1609-4514-2018-18-2-211-303

work page 2018

[10] [10]

Y. B. Huang. The complexity of smooth words on 2-letter alphabets.The- oretical Computer Science, 412(45):6327–6339, 2011.doi:10.1016/j.tcs. 2011.07.002

work page doi:10.1016/j.tcs 2011

[11] [11]

Y. B. Huang. The powers of smooth words over arbitrary 2-letter alphabets. 2011.arXiv:0904.0562

work page internal anchor Pith review Pith/arXiv arXiv 2011

[12] [12]

Y. B. Huang and W. D. Weakley. A note on the complexity ofC ∞- words.Theor. Comput. Sci., 411(40-42):3731–3735, 2010.doi:10.1016/ J.TCS.2010.06.024

work page 2010

[13] [13]

The on-line encyclopedia of integer sequences

OEIS Foundation Inc. The on-line encyclopedia of integer sequences. 2026. URL:https://oeis.org

work page 2026

[14] [14]

Frequency of patterns in smooth sequences over the alphabet {1, 3}

D. Jamet, I. Marcovici, T. de la Rue, and L. Poirier. Frequency of patterns in smooth sequences over the alphabet{1,3}. 2026.arXiv:2604.11387. 23

work page internal anchor Pith review Pith/arXiv arXiv 2026

[15] [15]

M. S. Keane. Ergodic theory and subshifts of finite type. pages 35–70, 1991

work page 1991

[16] [16]

Kolakoski

W. Kolakoski. Self generating runs, problem 5304.The American Mathe- matical Monthly, 72:674, 1965.doi:10.2307/2313883

work page doi:10.2307/2313883 1965

[17] [17]

Kolakoski and N

W. Kolakoski and N. Üçoluk. Solution of advanced problem 5304.The American Mathematical Monthly, 73:681–682, 1966

work page 1966

[18] [18]

J. Nilsson. Letter frequencies in the Kolakoski sequence.Acta Physica Polonica A, 126:549–552, 2014.doi:10.12693/APhysPolA.126.549

work page doi:10.12693/aphyspola.126.549 2014

[19] [19]

Oldenburger

R. Oldenburger. Exponent trajectories in symbolic dynamics.Transactions of the American Mathematical Society, 46(3):453–466, 1939.doi:10/2307/ 1989933

work page 1939

[20] [20]

M. Rao. Trucs et bidules sur la séquence de Kolakoski. 2012. URL: https://www.arthy.org/kola/kola.php

work page 2012

[21] [21]

B. Sing. Spektrale eigenschaften der Kolakoski-sequenzen.Diploma thesis, Universität Tübingen, 2002

work page 2002

[22] [22]

B. Sing. More Kolakoski sequences.Integers, 11B:Paper No. A14, 17, 2011. arXiv:1009.4061

work page internal anchor Pith review Pith/arXiv arXiv 2011

[23] [23]

W. D. Weakley. On the number ofC∞-words of each length.J. Comb. Theory A, 51(1):55–62, 1989.doi:10.1016/0097-3165(89)90076-9. 24

work page doi:10.1016/0097-3165(89)90076-9 1989