pith:4GZLISFO
Progress measures for grokking via mechanistic interpretability
Transformers on modular addition learn a Fourier rotation algorithm that gradually replaces memorization during training.
arxiv:2301.05217 v3 · 2023-01-12 · cs.LG · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{4GZLISFORWYFPHMHLGN4SDP2P6}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We fully reverse engineer the algorithm learned by these networks, which uses discrete Fourier transforms and trigonometric identities to convert addition to rotation about a circle. We confirm the algorithm by analyzing the activations and weights and by performing ablations in Fourier space.
These claims assume that the identified Fourier circuit is the dominant mechanism, and that ablations in Fourier space fully isolate it without missing other co-occurring computations that could also produce the observed behavior.
Grokking arises from gradual amplification of a Fourier-based circuit in the weights followed by removal of memorizing components.
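The claim that trigonometric identities "convert addition to rotation about a circle" can be illustrated with a small sketch. This is not the trained network; it is a hand-written version of the idea, and the modulus `p`, the frequency set `freqs`, and the function name `fourier_mod_add` are illustrative assumptions that do not come from this record. Each input is mapped to a point on the unit circle, the two rotations are composed with the angle-addition identities, and the output is the candidate `c` whose angle best matches the composed rotation.

```python
import math


def fourier_mod_add(a, b, p=113, freqs=(1, 2, 3)):
    """Return argmax over c of sum_k cos(w_k * (a + b - c)).

    For prime p and frequencies in 1..p-1 this equals (a + b) % p.
    p and freqs are illustrative choices, not values from the record.
    """
    logits = [0.0] * p
    for k in freqs:
        w = 2 * math.pi * k / p
        # Represent each input as a point on the unit circle at frequency k.
        cos_a, sin_a = math.cos(w * a), math.sin(w * a)
        cos_b, sin_b = math.cos(w * b), math.sin(w * b)
        # Angle-addition identities: composing rotations adds the angles.
        cos_ab = cos_a * cos_b - sin_a * sin_b
        sin_ab = sin_a * cos_b + cos_a * sin_b
        for c in range(p):
            # cos(w(a+b-c)) = cos(w(a+b))cos(wc) + sin(w(a+b))sin(wc)
            logits[c] += cos_ab * math.cos(w * c) + sin_ab * math.sin(w * c)
    # Every term reaches its maximum of 1 only when c == (a + b) mod p.
    return max(range(p), key=logits.__getitem__)


print(fourier_mod_add(100, 50))  # 37, since 150 mod 113 = 37
```

Summing over several frequencies sharpens the peak at the correct answer: each cosine term equals 1 there, while at other candidates the terms fall below 1.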
Receipt and verification
| First computed | 2026-05-17T23:39:21.566368Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
e1b2b448ae8db0579d87599bc90dfa7f8b70549e054c6ffc140d0ac4dadecf36
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/4GZLISFORWYFPHMHLGN4SDP2P6 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e1b2b448ae8db0579d87599bc90dfa7f8b70549e054c6ffc140d0ac4dadecf36
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "21d2e0a1f4ee7261ee53ba1f358b8ab53995d1926f8e62a563a00647a1530f9c",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2023-01-12T18:56:49Z",
    "title_canon_sha256": "2b8da6a68b450756d12923b1c2f434a107b1e93fe063bab567c85bdd1c56c5f9"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2301.05217",
    "kind": "arxiv",
    "version": 3
  }
}
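The canonicalization step of the shell pipeline can also be run offline against the record printed above. The sketch below mirrors the same `json.dumps` settings (sorted keys, compact separators, non-ASCII preserved) and hashes the UTF-8 bytes. The helper name `canonical_sha256` is an assumption made for illustration. If the record shown here is exactly what the server returns, the digest should equal the canonical hash listed on this page; that match has not been checked here.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record the same way the shell pipeline does."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode()
    return hashlib.sha256(payload).hexdigest()


# Record copied from the "Canonical record JSON" listing on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "21d2e0a1f4ee7261ee53ba1f358b8ab53995d1926f8e62a563a00647a1530f9c",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2023-01-12T18:56:49Z",
        "title_canon_sha256": "2b8da6a68b450756d12923b1c2f434a107b1e93fe063bab567c85bdd1c56c5f9",
    },
    "schema_version": "1.0",
    "source": {"id": "2301.05217", "kind": "arxiv", "version": 3},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye with the canonical hash on this page
```

Because keys are sorted before serialization, the order in which the dictionary is written does not change the digest; only the keys and values do.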