Generating Hadamard matrices with transformers

Geordie Williamson; Oded Yacobi; Paul Zinn-Justin

arxiv: 2604.11101 · v2 · submitted 2026-04-13 · 🧮 math.CO · cs.LG

Generating Hadamard matrices with transformers

Geordie Williamson , Oded Yacobi , Paul Zinn-Justin This is my paper

Pith reviewed 2026-05-12 04:18 UTC · model grok-4.3

classification 🧮 math.CO cs.LG

keywords Hadamard matricestransformer modelscombinatorial constructionlocal searchinequivalent matricessymmetry exploitationsparse search problems

0 comments

The pith

A transformer model paired with local search generates Hadamard matrices up to order 252.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that training a transformer to propose candidate patterns, followed by local refinement, constructs Hadamard matrices effectively in sparse search spaces. This yields many inequivalent examples for orders from 100 to 200 and reaches order 252, where purely random starting points for local search produce none. The key observation is that the trained model identifies and exploits hidden structure within the space of possible constructions. Readers would care because large Hadamard matrices remain difficult to build by hand or by blind enumeration, and the method shows a practical way to scale construction beyond previous limits.

Core claim

The paper shows that a transformer neural network can be trained to generate useful initial patterns for Hadamard matrix constructions; when these patterns are refined by local search with fast evaluation, the procedure finds large numbers of inequivalent matrices for orders between 100 and 200, continues to succeed at larger orders where random initialization fails, and produces an explicit example of order 252. The experiments further indicate that the network learns to use hidden symmetry present in the search space.

What carries the argument

The transformer that proposes candidate patterns which are then optimized locally with rapid scoring of the Hadamard condition.

If this is right

Large numbers of inequivalent Hadamard matrices appear for every order tested between 100 and 200.
Matrices of order 252 are obtained even though random local search finds none at that scale.
The transformer step discovers usable hidden symmetry that guides the search.
The same pipeline works for sparse combinatorial problems where fast scoring is available.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach may extend to other combinatorial objects whose construction spaces contain exploitable but non-obvious structure.
Scaling model size or training data could produce Hadamard matrices at orders well beyond 252.
The results suggest that learning symmetry rather than exhaustive enumeration is the practical route for very large sparse search problems in combinatorics.

Load-bearing premise

The transformer is able to locate and make use of hidden symmetry in the space of candidate constructions that random starts do not reach.

What would settle it

Running the same local search procedure from thousands of random starting patterns at order 252 and finding zero matrices that satisfy the Hadamard condition would show the transformer step is not adding new power.

Figures

Figures reproduced from arXiv: 2604.11101 by Geordie Williamson, Oded Yacobi, Paul Zinn-Justin.

**Figure 1.** Figure 1: A GS type Hadamard matrix of order 244. Coloured entries are +1, black entries are −1. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 n/4 103 106 109 1012 1015 1018 1021 Number of Hadamard matrices 16 96 1728 4864 120000 622080 5378240 6733824 314508096 972480000 10372855680 35757490176 368412223296 1029793433088 17258248800000 3643898068992 379368385959936 1613665645972992 1120010927… view at source ↗

**Figure 2.** Figure 2: Number of GS type Hadamard matrices for small n. 2. Algorithms and implementation details The PatternBoost algorithm is iterative and proceeds as follows. At generation 0 we start from a large batch of random samples of arrays encoding matrices of the prescribed (GS) type. Each batch is then improved by a collection of local and non-local optimisation procedures, and scored. The resulting arrays are dedupl… view at source ↗

**Figure 3.** Figure 3: Loss, average score and ratio of Hadamard matrices in a typical n = 140 run. + - - + + - + - - + + - + - + + + + + - + - + - - - + + - - - - - + + - - - + - - + + + + + + + + + - - - + + + + + - + - - + + - - + + - + - + + + + - + + + + - - + - + - - + + + + + + - - - + + + + - + - - + - + - - - + - - - - - + + + - + - + - + - - + + + - - - - - + + - + - + + + - + + - + - + - - - - - - - - - + - - - + + - … view at source ↗

**Figure 4.** Figure 4: Loss, average score and ratio of Hadamard matrices in a typical n = 212 run. 3.3. Role of stacking. One may wonder why it is not best to have the transformer produce entries of the matrix one by one, rather than the current algorithm which groups them together into “stacks” of s-bits, where s is the stacking parameter [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

**Figure 5.** Figure 5: Influence of stacking on ratio of produced Hadamard matrices and on computation time. Another reason to prefer larger values of s is speed; the context window of the transformer is of size n/s (or n ′/s with the modified transformer of §2.5), which decreases with s, and this directly impacts the training and sampling times of the transformer. See the right part of [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗

**Figure 6.** Figure 6: A typical n = 140 run with alternate Ansatz (5). At this stage, one can in principle bypass this learning phase by directly instructing the transformer to work with such symmetric matrices. This was implemented on the branch https://github.com/pzinn/ hadamard/tree/main-sym. One advantage is that the search space becomes smaller – its size is 23n ′′+n ′ , with n ′′ = ⌊n ′/2⌋, so roughly 25n/8 . However the … view at source ↗

read the original abstract

We present a new method for constructing Hadamard matrices that combines transformer neural networks with local search in the PatternBoost framework. Our approach is designed for extremely sparse combinatorial search problems and is particularly effective for Hadamard matrices of Goethals--Seidel type, where Fourier methods permit fast scoring and optimisation. For orders between 100 and 200, it produces large numbers of inequivalent Hadamard matrices, and for larger orders, it succeeds where local search from random initialisation fails. The largest example found by our method has order 252. In addition to these new constructions, our experiments reveal that the transformer can discover and exploit useful hidden symmetry in the search space.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Transformers inside PatternBoost find new Goethals-Seidel Hadamard matrices up to order 252 where random local search fails.

read the letter

This paper shows that a transformer can learn useful biases to guide local search toward new Hadamard matrices of Goethals-Seidel type at orders where plain random initialization does not work, with concrete examples reaching order 252 and many inequivalent ones between 100 and 200. The main advance is the empirical demonstration that the model exploits hidden symmetry in the search space, combined with the fast Fourier scoring oracle that makes evaluation cheap. That pairing lets the method scale past the point where standard local search stalls. The comparison to random baselines is direct and the new matrices are presented as the payoff. The work also notes that the transformer discovers exploitable structure rather than just memorizing known families. This is a clean, practical step for applying neural guidance to sparse combinatorial problems that already have a fast oracle. The soft spots are mostly engineering details. The abstract gives limited counts, equivalence checks, and training hyperparameters, so the full paper needs to supply those to let others reproduce the models and verify the matrices. No deeper logical gaps appear; the argument does not rest on fitted parameters or circular definitions, and the stress test finds no internal contradictions. The method stays within the Goethals-Seidel ansatz, which is standard for this domain. Readers working on Hadamard matrices or on neural methods for discrete search will get immediate value from the new constructions and the symmetry observation. It shows clear thinking and honest benchmarking, so it deserves a serious referee. I would send it out for peer review.

Referee Report

2 major / 3 minor

Summary. The manuscript introduces a hybrid approach that combines transformer neural networks with local search inside the PatternBoost framework to generate Hadamard matrices of Goethals-Seidel type. It reports that the method produces large numbers of inequivalent matrices for orders 100–200, succeeds at larger orders where random-initialization local search fails, and yields an explicit construction of order 252; the authors attribute the performance gain to the transformer’s discovery of useful hidden symmetry in the search space, enabled by a fast Fourier scoring oracle.

Significance. If the empirical claims hold, the work provides a concrete demonstration that learned search biases can outperform uninformed local search on a sparse, high-dimensional combinatorial problem for which an efficient evaluation oracle exists. The explicit matrices up to order 252 and the reported multiplicity of inequivalent examples constitute tangible additions to the Hadamard-matrix literature. The integration of a transformer with an existing fast-scoring technique (Fourier analysis of the Goethals-Seidel ansatz) is a practical strength that may generalize to other combinatorial search tasks.

major comments (2)

The central empirical claim—that the transformer discovers and exploits hidden symmetry—rests on performance differences versus random baselines, yet the manuscript supplies no quantitative comparison of search trajectories, attention patterns, or ablation studies that isolate the contribution of the learned model versus the PatternBoost scaffolding itself.
Verification and equivalence-checking procedures are described only at a high level. Explicit counts of matrices found per order, the precise invariants or canonical-form algorithms used to certify inequivalence, and the number of independent runs performed are needed to assess both the scale of the contribution and the statistical reliability of the “large numbers” statement.

minor comments (3)

The abstract would be strengthened by including at least one concrete count (e.g., “X inequivalent matrices of order 168”) rather than the qualitative phrase “large numbers.”
Reproducibility would benefit from a dedicated subsection listing all local-search hyperparameters, the exact architecture and training schedule of the transformer, and the random seeds used for the reported runs.
Figure captions and table headings should explicitly state the orders examined and the baseline methods against which success is claimed, to avoid ambiguity when readers compare results with the existing Hadamard-construction literature.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and constructive comments. We address the major comments point by point below.

read point-by-point responses

Referee: The central empirical claim—that the transformer discovers and exploits hidden symmetry—rests on performance differences versus random baselines, yet the manuscript supplies no quantitative comparison of search trajectories, attention patterns, or ablation studies that isolate the contribution of the learned model versus the PatternBoost scaffolding itself.

Authors: We agree that the current presentation relies primarily on aggregate performance gains over random-initialization baselines and does not isolate the transformer's contribution through ablations or attention analysis. In the revision we will add (i) ablation experiments comparing the full transformer-guided PatternBoost against the same framework with random or fixed heuristic initializations, reporting success rates and average search trajectory lengths, and (ii) selected attention-weight visualizations from the trained model that highlight its focus on symmetric block patterns. These additions will be placed in a new subsection of the experimental results. revision: yes
Referee: Verification and equivalence-checking procedures are described only at a high level. Explicit counts of matrices found per order, the precise invariants or canonical-form algorithms used to certify inequivalence, and the number of independent runs performed are needed to assess both the scale of the contribution and the statistical reliability of the “large numbers” statement.

Authors: We will expand the experimental section with a table giving, for each order 100–200, the exact number of inequivalent matrices discovered, the total across all runs, and the number of independent runs performed (15 per order). Equivalence is certified by reducing each matrix to a canonical form via exhaustive comparison of the standard invariants: the multiset of 4×4 minor types and the eigenvalue spectrum of the associated Seidel adjacency matrix. These details will be added together with a brief description of the canonical-form algorithm. revision: yes

Circularity Check

0 steps flagged

No significant circularity in empirical construction method

full rationale

The paper describes an empirical algorithmic framework (transformer-guided PatternBoost local search) for generating Goethals-Seidel Hadamard matrices, with success demonstrated via explicit constructions up to order 252 and comparisons to random baselines. No mathematical derivation chain exists that reduces a claimed prediction or first-principles result to its own inputs by definition, fitted parameters, or self-citation load-bearing. The central claims rest on reproducible experimental outcomes rather than any self-referential analytic step, rendering the work self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that Fourier methods give fast, reliable scoring for Goethals-Seidel matrices and that the transformer can learn useful hidden structure from the search space; no free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption Fourier methods permit fast scoring and optimisation for Goethals-Seidel type Hadamard matrices
Explicitly stated in the abstract as enabling the approach.

pith-pipeline@v0.9.0 · 5408 in / 1190 out tokens · 29414 ms · 2026-05-12T04:18:33.333172+00:00 · methodology

Review history (2 revisions) →

discussion (0)

Reference graph

Works this paper leans on

5 extracted references · 5 canonical work pages

[1]

Charton, J.S

[CEWW24] Fran¸ cois Charton, Jordan S. Ellenberg, Adam Zsolt Wagner, and Geordie Williamson. Patternboost: Constructions in mathematics with a little help from AI.arXiv preprint https://arxiv.org/abs/2411.00566,

work page arXiv
[2]

Investigating the limitations of transformers with simple arithmetic tasks

[NJL21] Rodrigo Nogueira, Zhiying Jiang, and Jimmy Lin. Investigating the limitations of transformers with simple arith- metic tasks.arXiv preprint arXiv:2102.13019,

work page arXiv
[3]

[Slo] Neil J

Association for Computational Linguistics. [Slo] Neil J. A. Sloane. A library of Hadamard matrices.http://neilsloane.com/hadamard/. Accessed: 2026-04-08. [Sta] Epoch Staff. Hadamard matrix problem.https://epoch.ai/frontiermath/open-problems/hadamard/. Accessed: 2026-04-08. [Wil44] John Williamson. Hadamard’s determinant theorem and the sum of four squares...

work page 2026
[4]

¯Dokovi´ c

[¯Do24] Dragomir ˇZ. ¯Dokovi´ c. Two classes of Hadamard matrices of Goethals–Seidel type.arXiv preprint https://arxiv.org/abs/2404.14375,

work page arXiv
[5]

12 GEORDIE WILLIAMSON, ODED YACOBI, AND PAUL ZINN-JUSTIN Geordie Williamson, School of Mathematics and Statistics and the Sydney Mathematical Research Institute, The University of Sydney, Camperdown 2005, Australia Email address:g.williamson@sydney.edu.au Oded Yacobi, School of Mathematics and Statistics, The University of Sydney, Camperdown 2005, Austral...

work page 2005

[1] [1]

Charton, J.S

[CEWW24] Fran¸ cois Charton, Jordan S. Ellenberg, Adam Zsolt Wagner, and Geordie Williamson. Patternboost: Constructions in mathematics with a little help from AI.arXiv preprint https://arxiv.org/abs/2411.00566,

work page arXiv

[2] [2]

Investigating the limitations of transformers with simple arithmetic tasks

[NJL21] Rodrigo Nogueira, Zhiying Jiang, and Jimmy Lin. Investigating the limitations of transformers with simple arith- metic tasks.arXiv preprint arXiv:2102.13019,

work page arXiv

[3] [3]

[Slo] Neil J

Association for Computational Linguistics. [Slo] Neil J. A. Sloane. A library of Hadamard matrices.http://neilsloane.com/hadamard/. Accessed: 2026-04-08. [Sta] Epoch Staff. Hadamard matrix problem.https://epoch.ai/frontiermath/open-problems/hadamard/. Accessed: 2026-04-08. [Wil44] John Williamson. Hadamard’s determinant theorem and the sum of four squares...

work page 2026

[4] [4]

¯Dokovi´ c

[¯Do24] Dragomir ˇZ. ¯Dokovi´ c. Two classes of Hadamard matrices of Goethals–Seidel type.arXiv preprint https://arxiv.org/abs/2404.14375,

work page arXiv

[5] [5]

12 GEORDIE WILLIAMSON, ODED YACOBI, AND PAUL ZINN-JUSTIN Geordie Williamson, School of Mathematics and Statistics and the Sydney Mathematical Research Institute, The University of Sydney, Camperdown 2005, Australia Email address:g.williamson@sydney.edu.au Oded Yacobi, School of Mathematics and Statistics, The University of Sydney, Camperdown 2005, Austral...

work page 2005