
arxiv: 2509.18103 · v3 · pith:U74IYUF3new · submitted 2025-09-09 · 💻 cs.LG · math.NT

Machine Learnability as a Measure of Order in Aperiodic Sequences

Pith reviewed 2026-05-21 22:03 UTC · model grok-4.3

classification 💻 cs.LG math.NT
keywords machine learning · Ulam spiral · prime numbers · number theory · aperiodic sequences · learnability · order in distributions

The pith

Machine learning models achieve higher accuracy classifying primes in Ulam spiral blocks near 500 million than in blocks below 25 million.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that image-based machine learning can compare the learnable regularity of prime distributions across different scales by training classifiers on extracted blocks from the Ulam spiral. Higher accuracy on regions around 500 million compared with regions below 25 million indicates that primes exhibit more detectable order at larger magnitudes. Precision and recall patterns further suggest the model shifts emphasis from spotting prime patterns at small numbers to rejecting composites at large numbers. This aligns with expectations that averages and equidistribution dominate while local randomness regularizes after scaling by log x. The work positions machine learning as a potential experimental instrument for number theory questions including patterns relevant to cryptography.

Core claim

By training image-focused models on blocks from the Ulam spiral, the authors find that classification accuracy is higher for prime fields in the vicinity of 500 million than for integers below 25 million. This difference implies the existence of more easily learnable order in the higher-magnitude region. A breakdown of precision and recall indicates the model adopts different classification strategies in the two regions, favoring prime pattern identification at lower numbers and composite elimination at higher numbers. The findings support the possibility that machine learning can act as a new experimental tool for investigating prime distributions and related conjectures.
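The distinction between accuracy and the precision/recall breakdown can be made concrete with a short Python sketch. The function name `classification_metrics` and the counts in the usage lines are hypothetical, chosen only to illustrate the definitions; they are not results from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision and recall for the 'prime' class.

    tp: primes predicted prime        fp: composites predicted prime
    fn: primes predicted composite    tn: composites predicted composite
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators (e.g. a model that never predicts 'prime').
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts: 1000 integers, 50 of them prime.
# A model that labels everything composite still scores 95% accuracy.
always_composite = classification_metrics(tp=0, fp=0, fn=50, tn=950)

# A model that finds 30 of the 50 primes with 10 false alarms.
pattern_finder = classification_metrics(tp=30, fp=10, fn=20, tn=940)
```

In this made-up example the two models differ by only two points of accuracy, while recall separates them completely, which is why the breakdown carries more information than accuracy alone.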

What carries the argument

Ulam spiral image blocks extracted from distinct magnitude regions and fed to image classification models that distinguish primes from composites, with accuracy serving as the comparative measure of order.
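A minimal Python sketch of how such blocks might be produced follows. It is not the authors' pipeline: the function names, the spiral orientation (first step to the right, then turning counter-clockwise), the choice to start the spiral at 1, and the non-overlapping tiling are all assumptions made here for illustration, and the paper's actual block size and extraction scheme are not stated in this summary.

```python
import numpy as np


def prime_sieve(n):
    """Boolean array is_prime[0..n] built with the sieve of Eratosthenes."""
    is_prime = np.ones(n + 1, dtype=bool)
    is_prime[:2] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            is_prime[p * p :: p] = False
    return is_prime


def ulam_coords(side):
    """Yield (row, col) for 1, 2, 3, ... spiralling out from the centre."""
    if side % 2 == 0:
        raise ValueError("side must be odd so the spiral fills the grid")
    r = c = side // 2
    yield r, c
    dr, dc = 0, 1  # assumed orientation: first step goes right
    step = 1
    while True:
        for _leg in range(2):  # two legs share each step length
            for _move in range(step):
                r, c = r + dr, c + dc
                if not (0 <= r < side and 0 <= c < side):
                    return  # walked off the grid: every cell is filled
                yield r, c
            dr, dc = -dc, dr  # turn: right -> up -> left -> down
        step += 1


def ulam_prime_image(side):
    """side x side uint8 image with 1 where the spiral integer is prime."""
    is_prime = prime_sieve(side * side)
    img = np.zeros((side, side), dtype=np.uint8)
    for n, (r, c) in enumerate(ulam_coords(side), start=1):
        img[r, c] = is_prime[n]
    return img


def extract_blocks(img, block):
    """Cut non-overlapping block x block tiles, dropping partial edge tiles."""
    h, w = img.shape
    tiles = [
        img[r : r + block, c : c + block]
        for r in range(0, h - block + 1, block)
        for c in range(0, w - block + 1, block)
    ]
    return np.stack(tiles)
```

For example, `extract_blocks(ulam_prime_image(5), 2)` returns four 2 x 2 tiles. A spiral grown from 1 places large integers only on its outer rings, so sampling a high-magnitude region would need either a very large grid or a spiral started at an offset; which of these the paper uses is not specified here.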

If this is right

  • Prime distributions at higher orders of magnitude contain more readily learnable structure than those at small integers.
  • The model employs region-specific strategies, identifying primes at low scales and eliminating composites at high scales.
  • Machine learning accuracy on spiral images can serve as an experimental probe for number-theoretic conjectures about density and equidistribution.
  • The same approach could be used to examine patterns in strong and weak primes relevant to cryptographic applications.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Applying the same block-extraction and classification protocol to other aperiodic sequences could quantify comparative order across different mathematical objects.
  • If the accuracy gap persists when the experiment is repeated with multiple distinct model families, the result would point more strongly to intrinsic distributional properties rather than model artifacts.
  • The observed shift toward average-dominated behavior at large x supplies a concrete, testable signature that future work could correlate with explicit bounds on prime gaps or discrepancy measures.

Load-bearing premise

That differences in model accuracy between the two spiral regions directly reflect differences in mathematical order within the prime distributions rather than arising from image encoding choices, dataset imbalances, or model-specific biases.
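To see how much of an accuracy gap density alone could produce, the prime number theorem gives a quick estimate: the density of primes near x is roughly 1 / ln x, so a classifier that labels every integer composite scores roughly 1 - 1 / ln x. The sketch below is illustrative only. The function names are hypothetical, and it assumes a per-integer classification task evaluated near a single magnitude x, which may not match the paper's setup.

```python
import math


def pnt_prime_density(x):
    """Approximate prime density near x from the prime number theorem."""
    return 1.0 / math.log(x)


def always_composite_accuracy(x):
    """Accuracy of a trivial classifier that labels every integer composite."""
    return 1.0 - pnt_prime_density(x)


for x in (25_000_000, 500_000_000):
    print(
        f"x = {x:>11,d}  density ~ {pnt_prime_density(x):.4f}  "
        f"trivial accuracy ~ {always_composite_accuracy(x):.4f}"
    )
```

Under these assumptions the trivial accuracy is about 0.941 near 25 million and about 0.950 near 500 million. The region below 25 million also contains much smaller integers, where primes are denser, so its average trivial accuracy would be lower still. Any reported gap has to be read against this baseline shift.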

What would settle it

Re-running the classification experiments after applying a different image encoding method or constructing balanced training sets drawn equally from both regions and checking whether the accuracy advantage for the higher-magnitude region disappears.
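One way such a balanced or density-matched set might be built is by randomly dropping composites until primes reach a chosen share of the samples. The following Python sketch assumes per-integer labels held in a flat array; the function name `subsample_to_density` and its parameters are hypothetical, and nothing here is taken from the paper's code.

```python
import numpy as np


def subsample_to_density(labels, target_density, seed=0):
    """Indices that keep every prime and a random subset of composites.

    labels: array-like of 0/1 (1 = prime).
    target_density: desired fraction of primes in the returned subset;
        0.5 gives a label-balanced set. Only dropping composites is
        supported, so the target must be at least the current density.
    """
    labels = np.asarray(labels, dtype=bool)
    rng = np.random.default_rng(seed)
    prime_idx = np.flatnonzero(labels)
    comp_idx = np.flatnonzero(~labels)
    # Composites needed so that primes / (primes + composites) == target.
    n_comp = int(round(len(prime_idx) * (1 - target_density) / target_density))
    if n_comp > len(comp_idx):
        raise ValueError("not enough composites to reach the target density")
    keep_comp = rng.choice(comp_idx, size=n_comp, replace=False)
    return np.sort(np.concatenate([prime_idx, keep_comp]))


# Hypothetical usage: 5 primes among 100 samples, rebalanced to 50/50.
labels = np.array([1] * 5 + [0] * 95)
balanced_idx = subsample_to_density(labels, target_density=0.5)
```

Dropping individual pixels would destroy the spatial layout an image model relies on, so in an image setting the same indices would more plausibly be used to mask the loss or the evaluation metric rather than to delete pixels. That choice is a design assumption, not something the source specifies.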

Figures

Figures reproduced from arXiv: 2509.18103 by Adarsh Singh Chauhan, Adith Ramdas, Akira Rafhael, Jennifer Dodgson, Michael Joedhitya, Nordine Lotfi, Surender Suresh Kumar, Wang Mingshu.

Figure 1: 25,010,001-integer Ulam spiral, with 256 …
Figure 2: Example masked input, inpainting result, error map, original image
Figure 3: Training metrics for one of the three sets of models produced. Note …
Figure 4: Density of primes lower than x calculated per the prime number …
Figure 5: Comparative drop in prime density/accuracy of naïve ratio-based …
Original abstract

Research on the distribution of prime numbers has revealed a dual character: deterministic in definition yet exhibiting statistical behavior reminiscent of random processes. In this paper we show that it is possible to use an image-focused machine learning model to measure the comparative regularity of prime number fields at specific regions of an Ulam spiral. Specifically, we demonstrate that in pure accuracy terms, models trained on blocks extracted from regions of the spiral in the vicinity of 500m outperform models trained on blocks extracted from the region representing integers lower than 25m. This implies existence of more easily learnable order in the former region than in the latter. Moreover, a detailed breakdown of precision and recall scores seems to imply that the model is favouring a different approach to classification in different regions of the spiral, focusing more on identifying prime patterns at lower numbers and more on eliminating composites at higher numbers. This aligns with number theory conjectures suggesting that at higher orders of magnitude we should see diminishing noise in prime number distributions, with averages (density, AP equidistribution) coming to dominate, while local randomness regularises after scaling by log x. Taken together, these findings point toward an interesting possibility: that machine learning can serve as a new experimental instrument for number theory. Notably, the method shows potential for investigating the patterns in strong and weak primes for cryptographic purposes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript claims that image-based machine learning models can serve as a measure of comparative order in the distribution of primes by training on blocks extracted from an Ulam spiral. It reports that models trained on blocks near 500 million outperform those trained on blocks below 25 million in classification accuracy, with shifts in precision/recall indicating a change from identifying prime patterns at low magnitudes to eliminating composites at high magnitudes. This is interpreted as evidence of greater learnable regularity at larger scales, consistent with number-theoretic expectations of diminishing local randomness after scaling by log x, and is positioned as a potential new experimental tool for number theory and cryptographic applications involving strong/weak primes.

Significance. If the accuracy differences can be shown to arise from intrinsic mathematical structure rather than statistical artifacts, the work would introduce a data-driven probe for regularity in aperiodic sequences that complements analytic number theory. The approach could be extended to other conjectures on prime distributions, though its current evidential basis remains preliminary.

major comments (3)
  1. [Abstract] Abstract and methods: the reported accuracy advantage for the 500 million region is presented without any description of model architecture, training protocol, dataset sizes, cross-validation procedure, or statistical tests. This omission makes it impossible to determine whether the performance gap is reproducible or exceeds what would be expected from random variation.
  2. [Abstract] Abstract: the central interpretation equates higher classification accuracy with greater mathematical order, yet no controls are described for the well-known decline in prime density (∼1/log x). Standard classifiers can achieve higher accuracy on more imbalanced tasks simply by biasing toward the majority (composite) class; the reported shift toward “eliminating composites” at higher numbers is exactly the behavior predicted by this imbalance alone. Density-matched or label-balanced ablations are required to isolate any contribution from local regularity.
  3. [Abstract] Abstract: the claim that the method reveals “diminishing noise” at higher orders of magnitude rests on the accuracy comparison, but the manuscript supplies no quantitative comparison of prime-density-normalized performance or of alternative encodings (e.g., rescaled pixel windows that preserve local density). Without these, the result remains underdetermined with respect to the number-theoretic interpretation.
minor comments (2)
  1. [Abstract] The abstract refers to “strong and weak primes for cryptographic purposes” without defining these terms or indicating how the spiral-block method would distinguish them; a brief clarification or reference would improve accessibility.
  2. Figure captions and axis labels (presumed present in the full manuscript) should explicitly state the integer range represented by each pixel block and the exact positive/negative class ratio in each training set.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and insightful comments on our manuscript. We believe these suggestions will significantly improve the clarity and rigor of our work. Below, we provide point-by-point responses to the major comments and outline the revisions we intend to make.

point-by-point responses
  1. Referee: [Abstract] Abstract and methods: the reported accuracy advantage for the 500 million region is presented without any description of model architecture, training protocol, dataset sizes, cross-validation procedure, or statistical tests. This omission makes it impossible to determine whether the performance gap is reproducible or exceeds what would be expected from random variation.

    Authors: We acknowledge this omission in the abstract. We will expand the abstract to include a brief overview of the model architecture, training protocol, dataset sizes, cross-validation, and statistical tests used. Additionally, we will ensure the Methods section provides full details and report measures of variability such as standard deviations across runs to demonstrate that the performance gap exceeds random variation. revision: yes

  2. Referee: [Abstract] Abstract: the central interpretation equates higher classification accuracy with greater mathematical order, yet no controls are described for the well-known decline in prime density (∼1/log x). Standard classifiers can achieve higher accuracy on more imbalanced tasks simply by biasing toward the majority (composite) class; the reported shift toward “eliminating composites” at higher numbers is exactly the behavior predicted by this imbalance alone. Density-matched or label-balanced ablations are required to isolate any contribution from local regularity.

    Authors: This is a valid concern, as the decreasing prime density with increasing magnitude naturally leads to greater class imbalance, which can inflate accuracy without reflecting additional order. The shift in behavior from identifying primes to eliminating composites aligns with what a biased classifier would do. We will incorporate the suggested controls by performing density-matched ablations, where composite samples are subsampled to match the density at lower scales, and label-balanced experiments with equal numbers of positive and negative examples. The results of these ablations will be added to the manuscript to clarify whether the observed differences are due to intrinsic regularity or solely to density effects. revision: yes

  3. Referee: [Abstract] Abstract: the claim that the method reveals “diminishing noise” at higher orders of magnitude rests on the accuracy comparison, but the manuscript supplies no quantitative comparison of prime-density-normalized performance or of alternative encodings (e.g., rescaled pixel windows that preserve local density). Without these, the result remains underdetermined with respect to the number-theoretic interpretation.

    Authors: We concur that additional quantitative controls are necessary to support the interpretation of diminishing noise. In the revision, we will compute and report prime-density-normalized performance metrics, such as the improvement over a density-adjusted baseline accuracy. We will also implement and compare alternative encodings, including rescaled pixel windows designed to preserve local prime density across different magnitude ranges. These additions will help isolate the contribution of local patterns from global density changes and strengthen the link to number-theoretic expectations. revision: yes
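A density-normalized metric of the kind discussed in these responses could take the form of a skill score: the share of the gap between the majority-class baseline and perfect accuracy that the model closes. The Python sketch below is one possible definition, not the authors' stated metric. The function names and the numbers in the usage lines are hypothetical.

```python
def majority_baseline(prime_density):
    """Accuracy of always predicting the more common class."""
    return max(prime_density, 1.0 - prime_density)


def skill_over_baseline(accuracy, prime_density):
    """Fraction of the baseline-to-perfect gap closed by the model.

    0 means no better than the majority-class guess, 1 means perfect,
    and negative values mean worse than the trivial guess.
    Assumes 0 < prime_density < 1 so the baseline is below 1.
    """
    base = majority_baseline(prime_density)
    return (accuracy - base) / (1.0 - base)


# Hypothetical numbers: the denser region has lower raw accuracy
# but closes more of the gap above its own baseline.
sparse_region = skill_over_baseline(accuracy=0.96, prime_density=0.05)
dense_region = skill_over_baseline(accuracy=0.955, prime_density=0.06)
```

In this made-up comparison the sparse region wins on raw accuracy (0.96 against 0.955) yet scores lower on skill (about 0.20 against 0.25), which shows how a normalized metric can reverse a conclusion drawn from accuracy alone.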

Circularity Check

0 steps flagged

No significant circularity: empirical accuracy comparison stands on its own experimental inputs

full rationale

The paper presents an empirical machine-learning experiment: separate classifiers are trained on image blocks extracted from distinct magnitude ranges of the Ulam spiral (below 25 million vs. near 500 million) and their test accuracies are compared. No derivation chain, fitted parameter, or self-referential equation is invoked to obtain the central claim; the reported performance gap is the direct output of the training procedure itself. The abstract and described results contain no self-citations used as load-bearing uniqueness theorems, no ansatz smuggled via prior work, and no renaming of a known result as a new derivation. Because the measurement is defined externally by the ML training loop and the chosen data partitions, the finding does not reduce to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that machine learning accuracy serves as a valid proxy for mathematical order in prime distributions; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)
  • domain assumption Machine learning classification accuracy on spiral images directly reflects comparative regularity or order in the prime number distribution
    Invoked when the accuracy difference is taken to imply more easily learnable order at higher scales.




Reference graph

Works this paper leans on

10 extracted references · 10 canonical work pages

  1. J. Hadamard. Sur la distribution des zéros de la fonction ζ(s) et ses conséquences arithmétiques. Bull. Soc. Math. France, 24:199–220, 1896.

  2. C.-J. de la Vallée Poussin. Recherches analytiques sur la théorie des nombres premiers. Ann. Soc. Sci. Bruxelles, 20:183–256, 1896.

  3. E. Bombieri. On the large sieve. Mathematika, 12(2):201–225, 1965.

  4. A. I. Vinogradov. The density hypothesis for Dirichlet's L-series. Izv. Akad. Nauk SSSR Ser. Mat., 29(4):903–934, 1965.

  5. B. Green and T. Tao. The primes contain arbitrarily long arithmetic progressions. Ann. of Math., 167(2):481–547, 2008.

  6. H. Cramér. On the order of magnitude of the difference between consecutive prime numbers. Acta Arith., 2:23–46, 1936.

  7. G. H. Hardy and J. E. Littlewood. Some problems of 'Partitio numerorum'; III: On the expression of a number as a sum of primes. Acta Math., 44:1–70, 1923.

  8. H. L. Montgomery. The pair correlation of zeros of the zeta function. In Proc. Sympos. Pure Math., volume 24, pages 181–193. AMS, 1973.

  9. A. M. Odlyzko. On the distribution of spacings between zeros of the zeta function. Math. Comp., 48(177):273–308, 1987.

  10. P. Sarnak. Möbius randomness and dynamics. In Proc. Int. Congress of Mathematicians, volume I, pages 594–617, 2012.