Transformers Linearly Represent Highly Structured World Models

Nathanaël Fijalkow; Roman Kniazev

arxiv: 2605.18847 · v1 · pith:MMAZ5EVQnew · submitted 2026-05-13 · 💻 cs.LG · cs.AI

Pith reviewed 2026-05-20 20:59 UTC · model grok-4.3

keywords: transformers · mechanistic interpretability · Sudoku · world models · constraint satisfaction · emergent representations · neural circuits · combinatorial reasoning

The pith

A transformer trained on Sudoku traces builds an internal model organized around rows, columns, and boxes rather than individual cells.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper trains an 8-layer transformer on sequences of Sudoku solving steps and applies mechanistic interpretability methods to inspect its internal states. It shows that the model represents the board by grouping information according to the puzzle's constraint groups instead of storing cell-by-cell values. This structure arises because the model aligns its representations with the algebra of the domain's rules. A sympathetic reader would care because it indicates that transformers can discover and use the underlying structure of combinatorial problems without being given that structure explicitly.

Core claim

The transformer develops a substructure world model in which information is organized around the rows, columns, and boxes that define Sudoku constraints, rather than representing the board state cell by cell. In addition, a small set of dedicated neurons in the final MLP layer forms a naked-single circuit that detects when exactly one digit remains possible for a given cell and promotes that digit.
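
To make the naked-single condition concrete, here is a minimal Python sketch of the rule the circuit is said to detect: an empty cell whose row, column, and box together exclude all but one digit. The board encoding (a 9x9 list with 0 for empty) and the helper names `candidates` and `naked_singles` are illustrative assumptions, not taken from the paper. The sketch describes the Sudoku rule itself, not how the model computes it.

```python
from typing import Dict, List, Set, Tuple

Board = List[List[int]]  # 9x9 grid, 0 marks an empty cell (assumed encoding)


def candidates(board: Board, r: int, c: int) -> Set[int]:
    """Digits not yet used in the row, column, or 3x3 box of cell (r, c)."""
    used = set(board[r])                                # same row
    used |= {board[i][c] for i in range(9)}             # same column
    br, bc = 3 * (r // 3), 3 * (c // 3)                 # top-left corner of the box
    used |= {board[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return set(range(1, 10)) - used


def naked_singles(board: Board) -> Dict[Tuple[int, int], int]:
    """Map each empty cell with exactly one candidate to that digit."""
    found = {}
    for r in range(9):
        for c in range(9):
            if board[r][c] == 0:
                cands = candidates(board, r, c)
                if len(cands) == 1:
                    found[(r, c)] = next(iter(cands))
    return found


# Example: row 0 already holds 1..8, so its first cell can only be 9.
example = [[0] * 9 for _ in range(9)]
example[0] = [0, 1, 2, 3, 4, 5, 6, 7, 8]
singles = naked_singles(example)
```

In this example `singles` contains a single entry, for cell (0, 0). Every other empty cell still has at least six candidates.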

What carries the argument

The substructure world model, in which the transformer organizes representations around the constraint groups of rows, columns, and boxes.
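
As an illustration of the difference between a cell-wise and a substructure-wise description of the board, the sketch below builds both label sets from one board: 81 cell labels and 243 binary labels, one per (substructure, digit) pair, matching the probe counts given in the Figure 1 caption. The board encoding, the function names, and the ordering of the 27 substructures (rows, then columns, then boxes) are assumptions made for this example.

```python
from typing import List, Tuple

Board = List[List[int]]  # 9x9 grid, 0 marks an empty cell (assumed encoding)


def substructures() -> List[List[Tuple[int, int]]]:
    """The 27 constraint groups: 9 rows, 9 columns, 9 boxes, as cell lists."""
    rows = [[(r, c) for c in range(9)] for r in range(9)]
    cols = [[(r, c) for r in range(9)] for c in range(9)]
    boxes = [[(3 * br + i, 3 * bc + j) for i in range(3) for j in range(3)]
             for br in range(3) for bc in range(3)]
    return rows + cols + boxes


def substructure_targets(board: Board) -> List[int]:
    """243 binary labels: is digit d present in substructure S?"""
    labels = []
    for cells in substructures():
        present = {board[r][c] for r, c in cells}
        labels.extend(int(d in present) for d in range(1, 10))
    return labels


def cell_targets(board: Board) -> List[int]:
    """81 labels: the digit in each cell (0 if empty, an assumed convention)."""
    return [board[r][c] for r in range(9) for c in range(9)]


board = [[0] * 9 for _ in range(9)]
board[0] = [0, 1, 2, 3, 4, 5, 6, 7, 8]
sub = substructure_targets(board)
cell = cell_targets(board)
```

Each filled cell switches on three substructure labels (its row, its column, its box), so the eight filled cells here yield 24 positive labels out of 243.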

If this is right

The geometry of an emergent world model inside a transformer is determined by the constraint algebra of the task domain rather than its surface presentation.
The resulting decision circuits can be sparse, with individual neurons performing monosemantic functions that are fully interpretable.
Mechanistic interpretability methods can recover an end-to-end algorithmic description of how the transformer solves a combinatorial reasoning task.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same substructure organization might appear in transformers trained on other constraint-satisfaction domains such as graph coloring or scheduling.
If the pattern holds, training data that explicitly exposes constraint structure could accelerate the emergence of interpretable internal models.
The finding suggests that linear representations in transformers can encode relational structure without requiring explicit architectural changes.

Load-bearing premise

The mechanistic interpretability techniques correctly isolate the causal mechanisms responsible for the model's Sudoku-solving behavior.

What would settle it

An experiment in which targeted interventions on the row-column-box representations or the candidate naked-single neurons fail to change the model's output accuracy on held-out puzzles.
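
A toy version of such an intervention can be sketched in a few lines. The example below zero-ablates one neuron feeding a linear readout and measures how far the target logit falls, then repeats the measurement for a control neuron. Zero-ablation, the three-neuron layer, and all numbers are assumptions chosen for illustration; the paper's own intervention procedure may differ.

```python
from typing import List, Sequence


def logits(acts: Sequence[float], w_out: Sequence[Sequence[float]]) -> List[float]:
    """Linear readout: logit[k] = sum_j acts[j] * w_out[j][k]."""
    n_out = len(w_out[0])
    return [sum(a * row[k] for a, row in zip(acts, w_out)) for k in range(n_out)]


def ablate(acts: Sequence[float], neurons: Sequence[int]) -> List[float]:
    """Zero-ablate the listed neurons, leaving the others untouched."""
    drop = set(neurons)
    return [0.0 if j in drop else a for j, a in enumerate(acts)]


def logit_drop(acts, w_out, neurons, target: int) -> float:
    """How much the target logit falls when `neurons` are ablated."""
    before = logits(acts, w_out)[target]
    after = logits(ablate(acts, neurons), w_out)[target]
    return before - after


# Toy numbers, not taken from the paper: neuron 0 writes strongly to output 2,
# neuron 1 is an unrelated control that does not write to output 2 at all.
acts = [2.0, 1.0, 0.5]
w_out = [
    [0.0, 0.0, 3.0],
    [0.5, 0.5, 0.0],
    [0.1, 0.2, 0.1],
]
targeted = logit_drop(acts, w_out, [0], target=2)  # ablate the candidate neuron
control = logit_drop(acts, w_out, [1], target=2)   # ablate a control neuron
```

A causal reading of the circuit predicts a large `targeted` drop and a `control` drop near zero. If both conditions degrade the output by similar amounts, the effect is nonspecific.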

Figures

Figures reproduced from arXiv: 2605.18847 by Nathanaël Fijalkow, Roman Kniazev.

**Figure 1.** (Left) Mean exact match accuracy of three families of linear probes trained at the [clues_end] token over different layers. Blue: 81 multi-class probes predicting the digit in a cell (top-1 accuracy); Orange: 729 binary probes predicting if a digit is a valid candidate in a cell (exact match across cell); Green: 243 binary probes predicting if a digit is present in a substructure (exact match over a substructu…

**Figure 2.** (Left) Cross-layer transfer of substructure-state probes: entry (i, j) is the mean exact match accuracy of probes trained on layer i activations and evaluated on layer j activations without retraining; (Right) Cross-position transfer of substructure-state probes trained on different layers: mean exact match accuracy of probes trained at [clues_end] and evaluated on later positions without retraining. accur…

**Figure 3.** Substructure constraint elimination in mid-layer attention heads. Attention map (left) shows …

**Figure 4.** (Left) The distribution of the within-cell logit margin (correct digit score − max incorrect digit score) across 80,919 states with unique NS placement. (Right) A representative logit lens trace of tokens of an NS cell. The correct digit candidate (red) separates from other digits of the cell (gray) across the layers, with a general promotion of the logits for all digits of the cell in the last MLP block. …

**Figure 5.** The distribution of activations of an NS neuron.

| Placement | Logit drop | Probability drop |
| --- | --- | --- |
| Target NS placement | 11.408 ± 0.084 | 0.585 ± 0.003 |
| Other NS placements | −0.655 ± 0.008 | – |

**Figure 6.** (Left) Mean squared error between the probes’ predicted probability vectors and the one-hot targets, per layer, for the three probe families: 81 cell-state probes (blue), 729 cell-candidate probes (orange), and 243 substructure-state probes (green). Cell-state probes settle around MSE≈0.03 in mid-layers and never reach zero, consistent with their imperfect top-1 accuracy; substructure-state probes drive MS…

**Figure 7.** Cosine similarities between the probe trained at layer 6 activations to predict if …

**Figure 8.** Cross-position transfer diagnostics complementing Fig. 2 (right). Frozen substructure …

**Figure 9.** Pairwise cosine similarity of the unembedding vectors. Each of the nine big cells is a row, …

**Figure 10.** Mean attention scores from the [clues_end] token to other tokens in the grid, averaged over all digits, computed over 6400 puzzles.

**Figure 11.** Substructure constraint elimination in mid-layer attention heads. Attention maps (left) …
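
Two metrics named in the captions above are simple enough to spell out. The sketch below gives one plausible reading of each: exact match as the fraction of groups whose binary predictions are all correct, and the within-cell logit margin as the correct digit score minus the best incorrect digit score. The function names and toy values are assumptions, and the grouping used for exact match in the paper may differ from this reading.

```python
from typing import Sequence


def exact_match(pred: Sequence[Sequence[int]], gold: Sequence[Sequence[int]]) -> float:
    """Fraction of groups whose binary predictions are all correct."""
    hits = sum(1 for p, g in zip(pred, gold) if list(p) == list(g))
    return hits / len(gold)


def logit_margin(cell_logits: Sequence[float], correct: int) -> float:
    """Correct digit score minus the best incorrect digit score."""
    best_wrong = max(s for i, s in enumerate(cell_logits) if i != correct)
    return cell_logits[correct] - best_wrong


# Toy values for illustration only.
em = exact_match([[1, 0, 1], [0, 0, 1]], [[1, 0, 1], [0, 1, 1]])
margin = logit_margin([0.5, 4.0, 1.5], correct=1)
```

A positive margin means the correct digit is ranked first among the digits of its cell. Exact match is stricter than per-label accuracy because a single wrong bit fails the whole group.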

Original abstract

Do transformers, when trained on sequential reasoning traces, build internal models of the underlying task? And if so, does the structure of those internal representations mirror the structure of the domain? We train an 8-layer transformer on Sudoku solving traces and perform a mechanistic analysis of its internal computation. We establish two results. First, the model builds a substructure world model: it does not represent the board state cell by cell, as a human analyst would expect, but organizes information around the rows, columns, and boxes that Sudoku's constraints act on. Second, we identify a naked-single circuit: a small set of dedicated neurons in the final MLP layer, each individually detecting when exactly one digit remains possible for a specific cell, and reliably promoting that digit. These findings show that the geometry of an emergent world model is shaped by the constraint algebra of the domain, not its surface presentation, and that the resulting decision circuit is sparse, monosemantic, and fully interpretable. More broadly, they demonstrate that mechanistic interpretability tools can recover an end-to-end algorithmic account of how a transformer solves a combinatorial reasoning task.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The transformer organizes Sudoku info around constraint groups and has a sparse naked-single circuit, but the causal evidence from patching is not yet airtight.

The main thing to know is that this paper trains a small transformer on Sudoku solving sequences, reports that the model internally organizes board information around rows, columns, and boxes rather than cell by cell, and isolates a small set of neurons in the last MLP that detect and promote naked singles. The geometry claim and the circuit are the new pieces. Prior interpretability work has looked at simpler tasks or surface features; here the structure follows the actual constraint algebra of the problem, which is a step up in complexity. The analysis uses activation patching and neuron inspection on an 8-layer model, and the authors show that the representations are not the naive cell-wise encoding one might expect. That part is worth noting because it gives a direct example of a model picking up domain structure from traces alone.

The soft spot is exactly the one the stress test flags. The patching results need to show that intervening on the row/column/box features produces targeted, structure-specific effects on solving behavior rather than nonspecific degradation. If a linear probe on individual cells recovers similar information, or if the interventions only correlate with performance drops, the world-model story stays more observational than mechanistic. The abstract does not include the quantitative ablation numbers or full patching tables, so it is hard to judge how complete or robust the circuit discovery is.

This is for people working on mechanistic interpretability of reasoning models and on whether transformers acquire structured internal representations for combinatorial problems. A reader already following circuit work on Sudoku or similar constraint tasks will get the most out of it. The paper is coherent on its own terms and shows honest engagement with the methods, so it deserves a serious referee to pressure-test the causality claims and ask for the missing metrics. I would send it to review.

Referee Report

2 major / 2 minor

Summary. The paper trains an 8-layer transformer on sequential Sudoku solving traces and performs mechanistic interpretability analysis using activation patching and neuron inspection. It claims two main results: the model builds a substructure world model organized around rows, columns, and boxes (rather than representing the board cell-by-cell), and a sparse naked-single circuit exists in the final MLP layer where individual neurons detect cells with exactly one possible digit and promote that digit.

Significance. If the causal claims hold, the work would provide evidence that transformer internal representations can mirror the constraint algebra of a combinatorial domain rather than its surface structure, and that standard MI tools can recover sparse, interpretable decision circuits. This would strengthen the case for using mechanistic analysis to obtain end-to-end algorithmic accounts of reasoning in trained models, with potential implications for understanding how neural networks solve structured tasks.

major comments (2)

[Mechanistic analysis of world model] Mechanistic analysis section on substructure representations: the activation patching and neuron inspection results establish correlations between row/column/box features and internal activations, but the manuscript does not report control experiments comparing performance drops when patching these putative substructure neurons versus neurons identified via cell-wise linear probes; without such specificity tests, the claim that the model organizes information around Sudoku constraints (rather than cells) remains vulnerable to the possibility that cell-level features are also linearly extractable and causally sufficient.
[Naked-single circuit] Naked-single circuit identification (final MLP layer): while the paper identifies a small set of dedicated neurons that detect single-possibility cells, no quantitative ablation results are provided showing the exact accuracy drop (e.g., fraction of puzzles solved or per-step prediction accuracy) when these neurons are ablated versus random or control neurons; this is needed to confirm they are causally responsible for promoting the correct digit rather than correlational.
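
The control comparison requested in the second comment could be organized along the following lines. The sketch ablates random neuron sets of the same size as the candidate set, collects the resulting accuracy drops, and reports what fraction of control sets hurt at least as much as the candidate set. The `evaluate` callable is a hypothetical stand-in for a real evaluation loop, and `toy_evaluate` with its numbers is invented purely so the example runs.

```python
import random
from typing import Callable, FrozenSet, List


def control_drops(evaluate: Callable[[FrozenSet[int]], float],
                  n_neurons: int, set_size: int, n_controls: int,
                  exclude: FrozenSet[int], seed: int = 0) -> List[float]:
    """Accuracy drops from ablating random neuron sets of a fixed size."""
    rng = random.Random(seed)
    pool = [j for j in range(n_neurons) if j not in exclude]
    baseline = evaluate(frozenset())
    return [baseline - evaluate(frozenset(rng.sample(pool, set_size)))
            for _ in range(n_controls)]


def specificity(evaluate: Callable[[FrozenSet[int]], float],
                target: FrozenSet[int], controls: List[float]) -> float:
    """Fraction of control sets that hurt at least as much as the target set."""
    target_drop = evaluate(frozenset()) - evaluate(target)
    return sum(d >= target_drop for d in controls) / len(controls)


# Hypothetical stand-in for a real evaluation: accuracy falls by 0.2 for each
# of neurons {3, 7} that is ablated and by 0.001 for any other ablated neuron.
def toy_evaluate(ablated: FrozenSet[int]) -> float:
    important = {3, 7}
    return 0.95 - 0.2 * len(ablated & important) - 0.001 * len(ablated - important)


target_set = frozenset({3, 7})
drops = control_drops(toy_evaluate, n_neurons=64, set_size=2,
                      n_controls=100, exclude=target_set)
fraction = specificity(toy_evaluate, target_set, drops)
```

A fraction near zero indicates that the candidate neurons matter far more than size-matched random sets. A fraction near one would suggest the reported effect is not specific to them.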

minor comments (2)

[Abstract and Introduction] The abstract and introduction could more explicitly state the training dataset size, number of Sudoku puzzles, and exact sequence format used for the solving traces to allow replication.
[Results figures] Figure legends for activation patching results should include error bars or statistical significance tests across multiple runs or puzzle sets.
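
One common way to obtain the error bars mentioned in the last comment is a percentile bootstrap over puzzles. The sketch below is a generic illustration under that assumption, with invented per-puzzle outcomes. It is not a description of how the paper computes its reported uncertainties.

```python
import random
from typing import List, Tuple


def bootstrap_ci(values: List[float], n_resamples: int = 2000,
                 alpha: float = 0.05, seed: int = 0) -> Tuple[float, float, float]:
    """Mean with a percentile bootstrap (1 - alpha) confidence interval."""
    rng = random.Random(seed)
    n = len(values)
    # Resample the puzzles with replacement and record each resample's mean.
    means = sorted(
        sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(values) / n, lo, hi


# Toy per-puzzle outcomes (1 = solved, 0 = failed), not from the paper.
outcomes = [1.0] * 90 + [0.0] * 10
mean, lo, hi = bootstrap_ci(outcomes)
```

Resampling whole puzzles rather than individual solving steps keeps steps from the same puzzle together, which matters if errors within a puzzle are correlated.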

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which help clarify the strength of our causal claims. We address each major comment point-by-point below.

Referee: [Mechanistic analysis of world model] Mechanistic analysis section on substructure representations: the activation patching and neuron inspection results establish correlations between row/column/box features and internal activations, but the manuscript does not report control experiments comparing performance drops when patching these putative substructure neurons versus neurons identified via cell-wise linear probes; without such specificity tests, the claim that the model organizes information around Sudoku constraints (rather than cells) remains vulnerable to the possibility that cell-level features are also linearly extractable and causally sufficient.

Authors: We agree that additional specificity controls are needed to rule out cell-level alternatives. In the revised manuscript we will add experiments that first train cell-wise linear probes, identify the top neurons by probe accuracy, and then compare activation-patching performance drops for those neurons against the substructure neurons. This will directly test whether constraint-based features are more causally relevant than cell-level ones.

revision: yes

Referee: [Naked-single circuit] Naked-single circuit identification (final MLP layer): while the paper identifies a small set of dedicated neurons that detect single-possibility cells, no quantitative ablation results are provided showing the exact accuracy drop (e.g., fraction of puzzles solved or per-step prediction accuracy) when these neurons are ablated versus random or control neurons; this is needed to confirm they are causally responsible for promoting the correct digit rather than correlational.

Authors: We acknowledge that the main text lacked precise quantitative ablation metrics. The revised version will include a new table and figure reporting the exact drop in puzzle-solving accuracy and per-step prediction accuracy when ablating the naked-single neurons, compared against random ablations of the same number of neurons and against ablations of other high-activation neurons in the same layer. These results will quantify the causal contribution.

revision: yes
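
For readers who want a feel for what a specificity test on representations involves, the sketch below shows a direction-level variant of patching: the component of a hidden vector along one probe direction is replaced with the value from a second run, and everything orthogonal to that direction is left alone. This is a swapped-in illustration, not the neuron-level patching the rebuttal describes. The vectors, the probe direction, and the function names are all hypothetical.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def patch_along(h: Sequence[float], h_source: Sequence[float],
                direction: Sequence[float]) -> List[float]:
    """Copy the component along `direction` from h_source into h.

    Everything orthogonal to `direction` is left as it was in h.
    """
    norm_sq = dot(direction, direction)
    shift = (dot(h_source, direction) - dot(h, direction)) / norm_sq
    return [x + shift * d for x, d in zip(h, direction)]


# Toy 3-d hidden vectors and a hypothetical probe direction.
h_clean = [1.0, 2.0, 3.0]
h_corrupt = [4.0, 0.0, -1.0]
probe_dir = [1.0, 0.0, 0.0]
other_dir = [0.0, 1.0, 0.0]  # a direction orthogonal to probe_dir

patched = patch_along(h_clean, h_corrupt, probe_dir)
```

Running the same intervention once with a substructure probe direction and once with a cell-wise probe direction, then comparing the downstream effect on the model's output, is one way to ask which representation the model actually relies on.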

Circularity Check

0 steps flagged

No circularity: empirical MI analysis of trained transformer

The paper trains an 8-layer transformer on Sudoku solving traces and applies mechanistic interpretability methods (activation patching, neuron inspection) to identify internal representations organized around rows/columns/boxes and a naked-single circuit in the final MLP. These results are obtained by direct empirical probing of the trained model rather than any derivation, equation, or first-principles argument that reduces to fitted parameters, self-definitions, or self-citation chains. No load-bearing step equates a claimed prediction or world-model geometry to quantities defined by the analysis itself. The work is self-contained against external benchmarks of model behavior and does not invoke uniqueness theorems or ansatzes from prior author work.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The work relies on standard transformer architecture assumptions and the validity of current mechanistic interpretability tools; no new physical or mathematical entities are postulated.

free parameters (1)

model depth and width
8-layer transformer size chosen for the experiment; typical hyperparameter not derived from first principles.

axioms (1)

domain assumption: Mechanistic interpretability methods can isolate causal circuits in transformer activations
Invoked when claiming the naked-single neurons are the dedicated mechanism.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "the model builds a substructure world model: it does not represent the board state cell by cell... organizes information around the rows, columns, and boxes that Sudoku's constraints act on"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: "Linear probes for “is digit d present in substructure S?” achieve perfect exact match accuracy across all 243 (substructure, digit) pairs"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
