Letter frequency in shifts of finite type with one forbidden word
Pith reviewed 2026-06-28 00:11 UTC · model grok-4.3
The pith
Forbidding one binary word shifts the frequency of 1s above, below, or exactly to 1/2 according to its border polynomial.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In binary shifts of finite type defined by a single forbidden word, the limiting frequency of the letter 1 is given by an explicit formula involving the border polynomial of that word. This formula yields a complete description of the forbidden words that produce frequency greater than 1/2, less than 1/2, or exactly equal to 1/2. The paper also states the conjecture that, among forbidden words of equal length except for four exceptions, this frequency is monotone in the number of 1s appearing in the forbidden word.
What carries the argument
The border polynomial, the two-variable auto-correlation polynomial attached to the forbidden word, which encodes all combinatorial data needed to compute the letter frequency.
If this is right
- Forbidding a word whose border polynomial satisfies a certain positivity condition forces the frequency of 1s to exceed 1/2.
- The patterns that keep the frequency exactly 1/2 are precisely those whose border polynomial vanishes at a specific point.
- Among words of fixed length the frequency ordering is conjectured to follow the ordering by the number of 1s in the forbidden word, except for four listed exceptions.
- Explicit injections between shift spaces can be constructed to prove inequalities between frequencies when the border polynomials stand in a suitable relation.
Where Pith is reading between the lines
- The same border-polynomial technique could be tested on shifts that forbid two or more words simultaneously.
- Changing the forbidden word to tune letter frequency might also alter the topological entropy or the rate of mixing of the shift.
- Selecting forbidden words according to the conjecture could be used to bias the output distribution in constrained coding schemes without changing the constraint length.
Load-bearing premise
The letter frequency is fully determined by the border polynomial of the single forbidden word.
What would settle it
Compute the border polynomial for a concrete forbidden word, derive the predicted frequency, then count the proportion of 1s in a long random allowed word generated by the shift and check whether the two numbers agree.
read the original abstract
This work considers combinatorial and statistical aspects of {\em{shifts of finite type}}, which are families of words over a finite alphabet which avoid a fixed class of {\emph{forbidden}} sub-words. The overarching question we are interested in is: how do local statistics of a uniformly random element of the shift space depend on combinatorial features of the forbidden set? We focus on the binary alphabet $\{0,1\}$, the class of shift spaces where a single pattern is forbidden, and the average frequency of $1$s (equivalently, the probability of observing $1$ at a given position). In this case, the relevant combinatorial information is encoded by a two-variable auto-correlation polynomial associated to the forbidden word, which we call the {\em{border polynomial}}. We present several results and examples characterizing the ordering of all words by their letter frequencies: for example, we describe the set of patterns which, when forbidden, cause the frequency of $1$s to increase, decrease, or stay exactly $1/2$. We conjecture that, among forbidden patterns of the same length (except for four exceptional words), the letter frequency is monotone with respect to the number of $1$s in the forbidden word. Our methodologies include novel explicit local injections and bijections, generating function analysis, and a connection with a probabilistic notion of letter frequency.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies letter frequencies (specifically the density of 1s) in binary shifts of finite type defined by a single forbidden word. It introduces a two-variable autocorrelation object called the border polynomial to encode the combinatorial data of the forbidden word, then uses generating functions, explicit local injections/bijections, and probabilistic interpretations to characterize which forbidden words raise, lower, or fix the frequency at exactly 1/2. The authors also state a conjecture that, for fixed length and excluding four exceptional words, the frequency is monotone in the number of 1s appearing in the forbidden word.
Significance. If the border polynomial is shown to determine the frequencies via the stated methods, the work supplies a concrete combinatorial classification of local statistics in one-forbidden-word subshifts. The explicit bijections and the monotonicity conjecture (with carved-out exceptions) constitute falsifiable predictions that could guide further research in symbolic dynamics and combinatorics on words.
major comments (2)
- [Abstract / border polynomial introduction] The abstract asserts that the border polynomial encodes the relevant information and that results/examples exist characterizing the ordering, yet supplies neither the explicit definition of the polynomial nor the derivation linking it to the frequency via generating functions. This is load-bearing for every ordering claim and the conjecture.
- [Conjecture statement] The monotonicity conjecture is presented without a proof sketch or computational verification for small lengths; the four exceptional words are noted but the justification for carving them out is not supplied in the summary, leaving the scope of the claim unclear.
minor comments (2)
- Notation for the border polynomial (two variables, autocorrelation) should be fixed early and used consistently when stating the frequency formula.
- The abstract mentions 'several results and examples' but does not list the theorems or the lengths for which examples are given; a short table of computed frequencies for small forbidden words would clarify the claims.
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and valuable feedback on our manuscript. We address each of the major comments below, indicating where revisions will be made to improve clarity and completeness.
read point-by-point responses
-
Referee: [Abstract / border polynomial introduction] The abstract asserts that the border polynomial encodes the relevant information and that results/examples exist characterizing the ordering, yet supplies neither the explicit definition of the polynomial nor the derivation linking it to the frequency via generating functions. This is load-bearing for every ordering claim and the conjecture.
Authors: The abstract is intended as a high-level summary and does not include technical definitions or derivations, which are provided in the body of the paper. However, we acknowledge that this may cause confusion for readers relying primarily on the abstract. We will revise the abstract to include a concise definition of the border polynomial and a brief mention of the generating function approach. This will be a partial revision as the core content remains unchanged. revision: partial
-
Referee: [Conjecture statement] The monotonicity conjecture is presented without a proof sketch or computational verification for small lengths; the four exceptional words are noted but the justification for carving them out is not supplied in the summary, leaving the scope of the claim unclear.
Authors: As a conjecture, a full proof is not provided, but computational verification for small lengths and identification of the four exceptional words are included in the manuscript, with the justification based on explicit counterexamples to monotonicity for those words. To address the referee's concern, we will add a short proof sketch outline and expand the discussion of the exceptions and scope in the revised version, including clarification in the abstract. revision: yes
Circularity Check
No significant circularity
full rationale
The derivation introduces the border polynomial as an independent two-variable autocorrelation object associated to the forbidden word, then applies generating functions, explicit local injections/bijections, and probabilistic interpretations to obtain letter frequencies and ordering results. No equation or claim reduces a frequency prediction to a fitted parameter, self-citation chain, or definitional renaming; the four exceptional words are explicitly carved out. The central claims remain self-contained against external combinatorial benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Standard properties of shifts of finite type and generating functions for word avoidance
invented entities (1)
-
border polynomial
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Carrigan, I
J. Carrigan, I. Hollars, and E. Rowland. A natural bijection for contiguous pattern avoidance in words.Discrete Mathematics, 347(3):113793, 2024
2024
-
[2]
Chandgotia, B
N. Chandgotia, B. Marcus, J. Richey, and C. Wu. Shifts of finite type obtained by forbidding a single pattern.Discrete and Continuous Dynamical Systems, 48:538–576, 2026
2026
-
[3]
Flajolet and R
P. Flajolet and R. Sedgewick.Analytic Combinatorics. Cambridge University Press, Cam- bridge, 2009
2009
-
[4]
I. M. Gessel. An application of the Goulden–Jackson cluster theorem.Algebraic Combinatorics, 5(6):1279–1286, 2022
2022
-
[5]
I. P. Goulden and D. M. Jackson. An inversion theorem for cluster decompositions of sequences with distinguished subsequences.Journal of the London Mathematical Society, 20(3):567–576, 1979
1979
-
[6]
L. J. Guibas and A. M. Odlyzko. Periods in strings.Journal of Combinatorial Theory, Series A, 30(1):19–42, 1981
1981
-
[7]
L. J. Guibas and A. M. Odlyzko. String overlaps, pattern matching, and nontransitive games. Journal of Combinatorial Theory, Series A, 30(2):183–208, 1981
1981
-
[8]
R. Kenyon. Patterns in sequences.arXiv preprintarXiv:2601.04078, 2026
arXiv 2026
-
[9]
E. J. Kupin and D. S. Yuster. Generalizations of the Goulden–Jackson cluster method.Journal of Difference Equations and Applications, 16(12):1391–1404, 2010
2010
-
[10]
D. A. Levin, Y. Peres, and E. L. Wilmer.Markov Chains and Mixing Times. Second edition. American Mathematical Society, Providence, RI, 2017
2017
-
[11]
Lind and B
D. Lind and B. Marcus.An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, 2nd edition, 2021
2021
-
[12]
D. S. Ornstein. Bernoulli shifts with the same entropy are isomorphic.Advances in Mathe- matics, 4(3):337–352, 1970
1970
-
[13]
W. Parry. Intrinsic Markov chains.Transactions of the American Mathematical Society, 112:55–66, 1964
1964
-
[14]
Pollicott and M
M. Pollicott and M. Yuri.Dynamical Systems and Ergodic Theory. London Mathematical Society Student Texts, vol. 40. Cambridge University Press, Cambridge, 1998
1998
-
[15]
V. Vatter. Permutation classes. In M. B´ ona, editor,Handbook of Enumerative Combinatorics, pages 754–833. CRC Press, Boca Raton, FL, 2015. 21 Appendix A: Convergence to Parry Chain In this appendix we recall the construction and some properties of a finite state Markov chain corresponding to a given shift of finite type, commonly known as the Parry chain...
2015
-
[16]
Denote byS n the set of allowed paths inAof lengthn≥1: Sn ={s 1s2 · · ·s n :A sisi+1 = 1fori= 1,
Forn≥1and allx, y∈S, (P nπ)xy =µ −nℓxry (A.2) Theorem A.3.LetAbe any primitive transfer matrix with Parry chainP. Denote byS n the set of allowed paths inAof lengthn≥1: Sn ={s 1s2 · · ·s n :A sisi+1 = 1fori= 1, . . . , n−1}.(A.3) LetP n be the measure induced byPonS n, as in 3 of Fact A.2, and for integersi, j, letQ [i,n−j] be the uniform measure onS n re...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.