pith:Y3UJ36HW
SOAP: Improving and Stabilizing Shampoo using Adam
SOAP runs Adam inside Shampoo's eigenbasis to cut large-batch iterations by over 40 percent versus AdamW.
arxiv:2409.11321 v2 · 2024-09-17 · cs.LG · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{Y3UJ36HWYODK3SBIVQAWN3ZXCB}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
In the large-batch regime, SOAP reduces the number of iterations by over 40% and wall-clock time by over 35% compared to AdamW, with approximately 20% improvements in both metrics compared to Shampoo.
The formal equivalence between 1/2-power Shampoo and Adafactor holds only inside the current eigenbasis. The paper assumes that keeping this basis fixed for many steps does not materially degrade preconditioning quality, an assumption validated only empirically on the tested model sizes.
SOAP runs Adam in the eigenbasis of Shampoo's preconditioner, cutting iterations by over 40% versus AdamW on 360M-660M language models while adding only one hyperparameter.
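The idea in the claims above, running Adam in the eigenbasis of Shampoo's preconditioner, can be sketched for a single 2-D weight matrix as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the names `SoapState`, `soap_step`, and `refresh_every`, the default hyperparameter values, and the toy quadratic loss are all hypothetical. The sketch assumes the one extra hyperparameter is the interval at which the eigenbasis is recomputed, omits Adam's bias correction, and does not re-align the second-moment estimate when the basis changes.

```python
import numpy as np


class SoapState:
    """Optimizer state for one 2-D weight matrix of shape (m, n)."""

    def __init__(self, shape):
        m, n = shape
        self.L = np.zeros((m, m))   # running average of G @ G.T
        self.R = np.zeros((n, n))   # running average of G.T @ G
        self.QL = np.eye(m)         # eigenvectors of L (left basis)
        self.QR = np.eye(n)         # eigenvectors of R (right basis)
        self.M = np.zeros(shape)    # first moment, kept in the original basis
        self.V = np.zeros(shape)    # second moment, kept in the rotated basis
        self.t = 0


def soap_step(W, G, state, lr=0.05, beta1=0.9, beta2=0.95, eps=1e-8,
              refresh_every=10):
    """One simplified SOAP-style update; returns the new weights."""
    state.t += 1

    # Shampoo-style statistics for the left and right preconditioners.
    state.L = beta2 * state.L + (1 - beta2) * (G @ G.T)
    state.R = beta2 * state.R + (1 - beta2) * (G.T @ G)

    # Recompute the eigenbasis only every `refresh_every` steps (assumed
    # to be the single additional hyperparameter).
    if (state.t - 1) % refresh_every == 0:
        _, state.QL = np.linalg.eigh(state.L)
        _, state.QR = np.linalg.eigh(state.R)

    # Momentum is accumulated in the original basis, then rotated.
    state.M = beta1 * state.M + (1 - beta1) * G
    G_rot = state.QL.T @ G @ state.QR
    M_rot = state.QL.T @ state.M @ state.QR

    # Adam-style second moment and normalized step in the rotated basis.
    state.V = beta2 * state.V + (1 - beta2) * G_rot ** 2
    N_rot = M_rot / (np.sqrt(state.V) + eps)

    # Rotate the step back to the original basis and apply it.
    return W - lr * (state.QL @ N_rot @ state.QR.T)


# Toy usage: minimize 0.5 * ||W - target||^2, whose gradient is W - target.
rng = np.random.default_rng(0)
target = rng.standard_normal((4, 3))
W = rng.standard_normal((4, 3))
state = SoapState(W.shape)
loss_before = 0.5 * np.sum((W - target) ** 2)
for _ in range(100):
    W = soap_step(W, W - target, state)
loss_after = 0.5 * np.sum((W - target) ** 2)
print(loss_before, loss_after)
```

Between refreshes the basis `QL`, `QR` stays fixed while the second moment `V` keeps updating every step, which is the fixed-basis assumption the second claim refers to.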
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:39:05.179384Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
c6e89df8f6c386adc828ac0166ef37105c60b407da96a0de22491f79bd188883
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/Y3UJ36HWYODK3SBIVQAWN3ZXCB \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c6e89df8f6c386adc828ac0166ef37105c60b407da96a0de22491f79bd188883
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "10304c10072863fbd852c8c755150cdf3e5f2321c96b18b2f4e2df8a7fcfe0d2",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2024-09-17T16:18:05Z",
    "title_canon_sha256": "19f51ed35d18fedb5acd0bd7125373799c5f6753272115eb11a91ac74cc69dcb"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2409.11321",
    "kind": "arxiv",
    "version": 2
  }
}
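The same canonicalization used by the shell one-liner can be run offline against the record shown above. The sketch below uses only the Python standard library and mirrors the one-liner's `json.dumps` arguments; the function name `canonical_hash` is a hypothetical helper, not part of any Pith tooling. It assumes the record embedded here has the same content as the `canonical_record` the API returns, which has not been verified, so compare the printed digest against the canonical hash listed on this page yourself.

```python
import hashlib
import json


def canonical_hash(record):
    """SHA-256 over the compact, key-sorted JSON serialization of `record`."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode()                   # str.encode() defaults to UTF-8
    return hashlib.sha256(payload).hexdigest()


# Record copied from the "Canonical record JSON" section of this page.
record_text = """
{
  "metadata": {
    "abstract_canon_sha256": "10304c10072863fbd852c8c755150cdf3e5f2321c96b18b2f4e2df8a7fcfe0d2",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2024-09-17T16:18:05Z",
    "title_canon_sha256": "19f51ed35d18fedb5acd0bd7125373799c5f6753272115eb11a91ac74cc69dcb"
  },
  "schema_version": "1.0",
  "source": {"id": "2409.11321", "kind": "arxiv", "version": 2}
}
"""

record = json.loads(record_text)
digest = canonical_hash(record)
print(digest)
```

Because the serialization sorts keys and strips whitespace, the indentation and key order of the JSON as displayed do not affect the digest; only the keys and values do.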