pith. sign in

arxiv: 2606.06655 · v1 · pith:ZYUIG52Znew · submitted 2026-06-04 · 🧮 math.CO · math.PR

Letter frequency in shifts of finite type with one forbidden word

Pith reviewed 2026-06-28 00:11 UTC · model grok-4.3

classification 🧮 math.CO math.PR
keywords shifts of finite typeforbidden wordsletter frequencyborder polynomialbinary alphabetmonotonicity conjectureauto-correlation polynomial
0
0 comments X

The pith

Forbidding one binary word shifts the frequency of 1s above, below, or exactly to 1/2 according to its border polynomial.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies binary shifts of finite type that avoid exactly one forbidden word and asks how that choice affects the average frequency of the symbol 1 in a typical allowed sequence. It shows that this frequency is completely determined by a two-variable auto-correlation polynomial called the border polynomial of the forbidden word. Using this encoding the authors classify all forbidden words that raise the frequency strictly above 1/2, lower it strictly below 1/2, or leave it exactly at 1/2. They further conjecture that, for fixed length and aside from four exceptions, the frequency increases monotonically with the number of 1s inside the forbidden word. These results matter because they link a purely local combinatorial condition to a global statistical property of the constrained system.

Core claim

In binary shifts of finite type defined by a single forbidden word, the limiting frequency of the letter 1 is given by an explicit formula involving the border polynomial of that word. This formula yields a complete description of the forbidden words that produce frequency greater than 1/2, less than 1/2, or exactly equal to 1/2. The paper also states the conjecture that, among forbidden words of equal length except for four exceptions, this frequency is monotone in the number of 1s appearing in the forbidden word.

What carries the argument

The border polynomial, the two-variable auto-correlation polynomial attached to the forbidden word, which encodes all combinatorial data needed to compute the letter frequency.

If this is right

  • Forbidding a word whose border polynomial satisfies a certain positivity condition forces the frequency of 1s to exceed 1/2.
  • The patterns that keep the frequency exactly 1/2 are precisely those whose border polynomial vanishes at a specific point.
  • Among words of fixed length the frequency ordering is conjectured to follow the ordering by the number of 1s in the forbidden word, except for four listed exceptions.
  • Explicit injections between shift spaces can be constructed to prove inequalities between frequencies when the border polynomials stand in a suitable relation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same border-polynomial technique could be tested on shifts that forbid two or more words simultaneously.
  • Changing the forbidden word to tune letter frequency might also alter the topological entropy or the rate of mixing of the shift.
  • Selecting forbidden words according to the conjecture could be used to bias the output distribution in constrained coding schemes without changing the constraint length.

Load-bearing premise

The letter frequency is fully determined by the border polynomial of the single forbidden word.

What would settle it

Compute the border polynomial for a concrete forbidden word, derive the predicted frequency, then count the proportion of 1s in a long random allowed word generated by the shift and check whether the two numbers agree.

read the original abstract

This work considers combinatorial and statistical aspects of {\em{shifts of finite type}}, which are families of words over a finite alphabet which avoid a fixed class of {\emph{forbidden}} sub-words. The overarching question we are interested in is: how do local statistics of a uniformly random element of the shift space depend on combinatorial features of the forbidden set? We focus on the binary alphabet $\{0,1\}$, the class of shift spaces where a single pattern is forbidden, and the average frequency of $1$s (equivalently, the probability of observing $1$ at a given position). In this case, the relevant combinatorial information is encoded by a two-variable auto-correlation polynomial associated to the forbidden word, which we call the {\em{border polynomial}}. We present several results and examples characterizing the ordering of all words by their letter frequencies: for example, we describe the set of patterns which, when forbidden, cause the frequency of $1$s to increase, decrease, or stay exactly $1/2$. We conjecture that, among forbidden patterns of the same length (except for four exceptional words), the letter frequency is monotone with respect to the number of $1$s in the forbidden word. Our methodologies include novel explicit local injections and bijections, generating function analysis, and a connection with a probabilistic notion of letter frequency.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper studies letter frequencies (specifically the density of 1s) in binary shifts of finite type defined by a single forbidden word. It introduces a two-variable autocorrelation object called the border polynomial to encode the combinatorial data of the forbidden word, then uses generating functions, explicit local injections/bijections, and probabilistic interpretations to characterize which forbidden words raise, lower, or fix the frequency at exactly 1/2. The authors also state a conjecture that, for fixed length and excluding four exceptional words, the frequency is monotone in the number of 1s appearing in the forbidden word.

Significance. If the border polynomial is shown to determine the frequencies via the stated methods, the work supplies a concrete combinatorial classification of local statistics in one-forbidden-word subshifts. The explicit bijections and the monotonicity conjecture (with carved-out exceptions) constitute falsifiable predictions that could guide further research in symbolic dynamics and combinatorics on words.

major comments (2)
  1. [Abstract / border polynomial introduction] The abstract asserts that the border polynomial encodes the relevant information and that results/examples exist characterizing the ordering, yet supplies neither the explicit definition of the polynomial nor the derivation linking it to the frequency via generating functions. This is load-bearing for every ordering claim and the conjecture.
  2. [Conjecture statement] The monotonicity conjecture is presented without a proof sketch or computational verification for small lengths; the four exceptional words are noted but the justification for carving them out is not supplied in the summary, leaving the scope of the claim unclear.
minor comments (2)
  1. Notation for the border polynomial (two variables, autocorrelation) should be fixed early and used consistently when stating the frequency formula.
  2. The abstract mentions 'several results and examples' but does not list the theorems or the lengths for which examples are given; a short table of computed frequencies for small forbidden words would clarify the claims.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and valuable feedback on our manuscript. We address each of the major comments below, indicating where revisions will be made to improve clarity and completeness.

read point-by-point responses
  1. Referee: [Abstract / border polynomial introduction] The abstract asserts that the border polynomial encodes the relevant information and that results/examples exist characterizing the ordering, yet supplies neither the explicit definition of the polynomial nor the derivation linking it to the frequency via generating functions. This is load-bearing for every ordering claim and the conjecture.

    Authors: The abstract is intended as a high-level summary and does not include technical definitions or derivations, which are provided in the body of the paper. However, we acknowledge that this may cause confusion for readers relying primarily on the abstract. We will revise the abstract to include a concise definition of the border polynomial and a brief mention of the generating function approach. This will be a partial revision as the core content remains unchanged. revision: partial

  2. Referee: [Conjecture statement] The monotonicity conjecture is presented without a proof sketch or computational verification for small lengths; the four exceptional words are noted but the justification for carving them out is not supplied in the summary, leaving the scope of the claim unclear.

    Authors: As a conjecture, a full proof is not provided, but computational verification for small lengths and identification of the four exceptional words are included in the manuscript, with the justification based on explicit counterexamples to monotonicity for those words. To address the referee's concern, we will add a short proof sketch outline and expand the discussion of the exceptions and scope in the revised version, including clarification in the abstract. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The derivation introduces the border polynomial as an independent two-variable autocorrelation object associated to the forbidden word, then applies generating functions, explicit local injections/bijections, and probabilistic interpretations to obtain letter frequencies and ordering results. No equation or claim reduces a frequency prediction to a fitted parameter, self-citation chain, or definitional renaming; the four exceptional words are explicitly carved out. The central claims remain self-contained against external combinatorial benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claims rest on the newly introduced border polynomial together with standard facts about shifts of finite type and generating functions; no free parameters or additional invented entities beyond the polynomial are indicated.

axioms (1)
  • standard math Standard properties of shifts of finite type and generating functions for word avoidance
    These are established background results in combinatorics on words and symbolic dynamics.
invented entities (1)
  • border polynomial no independent evidence
    purpose: Encodes the two-variable auto-correlation of the forbidden word to determine letter frequency
    Newly defined construct that links combinatorial overlaps to the statistical frequency; no independent evidence outside the paper is supplied.

pith-pipeline@v0.9.1-grok · 5773 in / 1423 out tokens · 36810 ms · 2026-06-28T00:11:20.001296+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

16 extracted references

  1. [1]

    Carrigan, I

    J. Carrigan, I. Hollars, and E. Rowland. A natural bijection for contiguous pattern avoidance in words.Discrete Mathematics, 347(3):113793, 2024

  2. [2]

    Chandgotia, B

    N. Chandgotia, B. Marcus, J. Richey, and C. Wu. Shifts of finite type obtained by forbidding a single pattern.Discrete and Continuous Dynamical Systems, 48:538–576, 2026

  3. [3]

    Flajolet and R

    P. Flajolet and R. Sedgewick.Analytic Combinatorics. Cambridge University Press, Cam- bridge, 2009

  4. [4]

    I. M. Gessel. An application of the Goulden–Jackson cluster theorem.Algebraic Combinatorics, 5(6):1279–1286, 2022

  5. [5]

    I. P. Goulden and D. M. Jackson. An inversion theorem for cluster decompositions of sequences with distinguished subsequences.Journal of the London Mathematical Society, 20(3):567–576, 1979

  6. [6]

    L. J. Guibas and A. M. Odlyzko. Periods in strings.Journal of Combinatorial Theory, Series A, 30(1):19–42, 1981

  7. [7]

    L. J. Guibas and A. M. Odlyzko. String overlaps, pattern matching, and nontransitive games. Journal of Combinatorial Theory, Series A, 30(2):183–208, 1981

  8. [8]

    R. Kenyon. Patterns in sequences.arXiv preprintarXiv:2601.04078, 2026

  9. [9]

    E. J. Kupin and D. S. Yuster. Generalizations of the Goulden–Jackson cluster method.Journal of Difference Equations and Applications, 16(12):1391–1404, 2010

  10. [10]

    D. A. Levin, Y. Peres, and E. L. Wilmer.Markov Chains and Mixing Times. Second edition. American Mathematical Society, Providence, RI, 2017

  11. [11]

    Lind and B

    D. Lind and B. Marcus.An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, 2nd edition, 2021

  12. [12]

    D. S. Ornstein. Bernoulli shifts with the same entropy are isomorphic.Advances in Mathe- matics, 4(3):337–352, 1970

  13. [13]

    W. Parry. Intrinsic Markov chains.Transactions of the American Mathematical Society, 112:55–66, 1964

  14. [14]

    Pollicott and M

    M. Pollicott and M. Yuri.Dynamical Systems and Ergodic Theory. London Mathematical Society Student Texts, vol. 40. Cambridge University Press, Cambridge, 1998

  15. [15]

    V. Vatter. Permutation classes. In M. B´ ona, editor,Handbook of Enumerative Combinatorics, pages 754–833. CRC Press, Boca Raton, FL, 2015. 21 Appendix A: Convergence to Parry Chain In this appendix we recall the construction and some properties of a finite state Markov chain corresponding to a given shift of finite type, commonly known as the Parry chain...

  16. [16]

    Denote byS n the set of allowed paths inAof lengthn≥1: Sn ={s 1s2 · · ·s n :A sisi+1 = 1fori= 1,

    Forn≥1and allx, y∈S, (P nπ)xy =µ −nℓxry (A.2) Theorem A.3.LetAbe any primitive transfer matrix with Parry chainP. Denote byS n the set of allowed paths inAof lengthn≥1: Sn ={s 1s2 · · ·s n :A sisi+1 = 1fori= 1, . . . , n−1}.(A.3) LetP n be the measure induced byPonS n, as in 3 of Fact A.2, and for integersi, j, letQ [i,n−j] be the uniform measure onS n re...