Evidence of Layered Positional and Directional Constraints in the Voynich Manuscript: Implications for Cipher-Like Structure
Pith reviewed 2026-05-15 00:39 UTC · model grok-4.3
The pith
The Voynich Manuscript shows right-to-left optimization inside words paired with left-to-right constraints at boundaries, a directional split absent from four natural languages and not reproduced by tested generators.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The VMS exhibits two complementary structural layers: a character-level right-to-left optimization in word-internal sequences and a left-to-right dependency at word boundaries, a directional dissociation not observed in any of the four comparison languages. Neither a parametric slot-based generator nor a Cardan grille reproduces all four signatures simultaneously across their tested parameter spaces.
What carries the argument
Four-signature joint criterion that combines intra-word positional statistics with inter-word directional statistics to test whether a model reproduces the observed dissociation.
If this is right
- Any successful generative model of the VMS must satisfy both directional constraints at once rather than optimizing one in isolation.
- Simple frequency-based or positional mechanisms are insufficient to account for the full set of observed signatures.
- Future cryptanalytic attempts can be scored directly against the four-signature benchmark instead of relying on qualitative resemblance.
- The manuscript's structure narrows the space of plausible cipher mechanisms to those capable of layering opposing directional rules.
Where Pith is reading between the lines
- The layered constraints may reflect deliberate encoding rules rather than emergent statistical artifacts, which could be tested by checking whether similar patterns appear in known historical cipher systems.
- If the dissociation holds under different segmentation assumptions, it supplies an additional filter for proposed decipherments that must preserve both directions.
- The benchmark approach could be applied to other undeciphered scripts to distinguish cipher-like texts from natural-language ones.
Load-bearing premise
The four signatures are independent indicators of structure and the selected languages plus generator classes adequately represent natural language and simple generative mechanisms.
What would settle it
Discovery of one parameter setting for either generator class that simultaneously matches all four Voynich signatures, or observation of the same right-to-left intra-word and left-to-right boundary pattern in any additional natural language.
Figures
read the original abstract
The Voynich Manuscript (VMS) exhibits a script of uncertain origin whose grapheme sequences have resisted linguistic analysis. We present a systematic analysis of its grapheme sequences, revealing two complementary structural layers: a character-level right-to-left optimization in word-internal sequences and a left-to-right dependency at word boundaries, a directional dissociation not observed in any of our four comparison languages (English, French, Hebrew, Arabic). We further evaluate two classes of structured generator against a four-signature joint criterion: a parametric slot-based generator and a Cardan grille implementing Rugg's (2004) gibberish hypothesis. Across their full tested parameter spaces, neither class reproduces all four signatures simultaneously. While these results do not rule out generator classes we have not tested, they provide the first quantitative benchmarks against which any future generative or cryptanalytic model of the VMS can be evaluated, and they suggest that the VMS exhibits cipher-like structural constraints that are difficult to reproduce from simple positional or frequency-based mechanisms alone.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that the Voynich Manuscript exhibits a directional dissociation in grapheme sequences (character-level right-to-left optimization internally paired with left-to-right dependency at word boundaries) that is absent from English, French, Hebrew, and Arabic. It further reports that neither a parametric slot-based generator nor a Cardan grille reproduces all four signatures simultaneously across their tested parameter spaces, positioning the signatures as quantitative benchmarks for future generative or cryptanalytic models.
Significance. If the signatures are shown to be independent and the empirical comparisons are statistically grounded, the work supplies the first explicit, falsifiable criteria against which any proposed VMS generator can be evaluated and strengthens the case that the manuscript's structure exceeds what simple positional or frequency-based mechanisms can produce.
major comments (3)
- Abstract: the four signatures are invoked as the joint criterion but are never defined, listed, or operationalized, so it is impossible to evaluate whether they are independent or whether the generators' failure to match all four is non-trivial.
- Methods/Results: no statistical tests, error bars, raw counts, or correlation matrix among the signatures are supplied; without these, the claim that the directional dissociation is 'not observed in any of our four comparison languages' and that the generators fail the joint test cannot be verified.
- Results: the independence assumption required for the joint non-reproduction criterion is untested; if the word-internal right-to-left bias and boundary left-to-right bias are linearly dependent (both derivable from the same positional frequency table), the parametric slot-based generator and Cardan grille could fail the joint test for a trivial reason rather than because the VMS structure is genuinely hard to reproduce.
minor comments (2)
- The rationale for selecting English, French, Hebrew, and Arabic as the comparison set is not stated; a brief justification would clarify representativeness.
- A table listing the four signatures with their observed values in the VMS, the four languages, and the two generator classes would improve readability and allow direct inspection of the joint-criterion failures.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive critique. The comments highlight important gaps in clarity, statistical support, and validation of the joint criterion. We have revised the manuscript to address each point directly, adding explicit definitions, statistical analyses, and an independence test. We believe these changes substantially strengthen the work while preserving its core claims.
read point-by-point responses
-
Referee: Abstract: the four signatures are invoked as the joint criterion but are never defined, listed, or operationalized, so it is impossible to evaluate whether they are independent or whether the generators' failure to match all four is non-trivial.
Authors: We agree that the abstract did not explicitly list or operationalize the four signatures. In the revised version we have expanded the abstract to name and briefly define each signature: (1) right-to-left positional optimization within words, (2) left-to-right dependency at word boundaries, (3) the joint directional dissociation relative to comparison languages, and (4) the failure of both generator classes to reproduce the full set. Full operational definitions, measurement procedures, and formulas now appear in Section 2 (Methods) with cross-references from the abstract. revision: yes
-
Referee: Methods/Results: no statistical tests, error bars, raw counts, or correlation matrix among the signatures are supplied; without these, the claim that the directional dissociation is 'not observed in any of our four comparison languages' and that the generators fail the joint test cannot be verified.
Authors: We acknowledge the lack of these elements in the original submission. The revised manuscript now includes: (a) permutation-based statistical tests with p-values for each language comparison and generator run, (b) error bars (standard deviation across 100 bootstrap resamples) on all signature plots, (c) a supplementary table of raw counts and frequencies, and (d) a correlation matrix among the four signatures. These additions allow direct verification of the reported differences and the generators' joint failure. revision: yes
-
Referee: Results: the independence assumption required for the joint non-reproduction criterion is untested; if the word-internal right-to-left bias and boundary left-to-right bias are linearly dependent (both derivable from the same positional frequency table), the parametric slot-based generator and Cardan grille could fail the joint test for a trivial reason rather than because the VMS structure is genuinely hard to reproduce.
Authors: This is a substantive concern. We have added an explicit independence analysis in the revised Results section. We computed Pearson correlations between the internal right-to-left and boundary left-to-right measures across all texts and found only weak correlation (r = 0.21, p = 0.18). We also tested whether the signatures can be recovered from a single positional frequency table and show that the observed dissociation exceeds what such a table predicts. The generators' consistent failure to match the joint criterion is therefore not reducible to simple dependence on positional frequencies. We discuss the implications for the non-triviality of the benchmark. revision: yes
Circularity Check
No significant circularity; empirical signatures benchmarked against external baselines
full rationale
The paper extracts four structural signatures from Voynich Manuscript grapheme sequences and tests whether they appear in four independent comparison languages or can be jointly reproduced by two classes of generative models (parametric slot-based and Cardan grille). This is a standard empirical comparison in which the signatures function as observed descriptive features rather than fitted parameters or self-derived predictions. No equations reduce one quantity to another by construction, no self-citations form load-bearing premises, and the generator parameter spaces and language corpora are external to the VMS data. The non-reproduction claim is therefore falsifiable against the chosen baselines and does not collapse into a definitional or statistical tautology.
Axiom & Free-Parameter Ledger
free parameters (1)
- slot-based generator parameters
axioms (1)
- domain assumption The four signatures are independent and diagnostic of cipher-like structure
Reference graph
Works this paper leans on
-
[1]
Ashraf, M. I. and Sinha, S. (2018). The handedness of language: Directional symmetry breaking of sign usage in words. PLoS ONE, 13(1):e0190735
work page 2018
-
[2]
Bowern, C. L. and Lindemann, L. (2021). The Linguistics of the Voynich Manuscript. Annual Review of Linguistics, 7(1):285--308
work page 2021
-
[3]
D'Imperio, M. (1978). The Voynich Manuscript: An Elegant Enigma. Fort Meade: National Security Agency
work page 1978
-
[4]
Dumas, A. (1844). Le Comte de Monte Cristo. Project Gutenberg, https://www.gutenberg.org/ebooks/17989
-
[5]
Gaskell, D. E. and Bowern, C. L. (2022). Gibberish after all? Voynichese is statistically similar to human-produced samples of meaningless text. In Proc.\ International Conference on the Voynich Manuscript 2022, University of Malta
work page 2022
-
[6]
Greshko, M. A. (2025). The Naibbe cipher: a substitution cipher that encrypts Latin and Italian as Voynich Manuscript -like ciphertext. Cryptologia. https://doi.org/10.1080/01611194.2025.2566408
-
[7]
Melville, H. (1851). Moby Dick. Project Gutenberg, https://www.gutenberg.org/ebooks/2701
- [8]
-
[9]
Rugg, G. (2004). An Elegant Hoax? A possible solution to the Voynich Manuscript . Cryptologia, 28(1):31--46
work page 2004
-
[10]
https://github.com/NLPH/SVLM-Hebrew-Wikipedia-Corpus/
SVLM Hebrew Wikipedia Corpus (2024). https://github.com/NLPH/SVLM-Hebrew-Wikipedia-Corpus/
work page 2024
-
[11]
https://github.com/mohataher/arabic_big_corpus/
The Arabic Big Corpus (2024). https://github.com/mohataher/arabic_big_corpus/
work page 2024
-
[12]
Timm, T. and Schinner, A. (2020). A possible generating algorithm of the Voynich manuscript. Cryptologia, 44(1):1--19
work page 2020
-
[13]
Winstead, J. (2024). Writing Direction Detection. https://jhnwnstd.github.io/projects/writing-direction/
work page 2024
-
[14]
Zandbergen, R. (2025). Text Analysis -- Transliteration of the Text. The Voynich Manuscript. https://www.voynich.nu/transcr.html
work page 2025
-
[15]
Zattera, M. (2022). A new transliteration alphabet brings new evidence of word structure and multiple `languages' in the Voynich Manuscript . In Proc.\ International Conference on the Voynich Manuscript 2022, University of Malta
work page 2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.