pith:4ZAE2CQ5
GQA-μP: The maximal parameterization update for grouped query attention
A modified spectral norm for non-full-rank matrices lets maximal update parameterization apply to grouped-query attention.
arxiv:2605.15290 v1 · 2026-05-14 · cs.LG · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{4ZAE2CQ5P2OAQA7JZEBMVDQNF5}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We demonstrate the efficacy of our theoretical derivations by showing learning-rate transfer across the GQA repetition hyperparameter, together with experiments on transfer across weight decay.
The modified spectral norm preserves the valid scaling law of network weights when weight matrices are not full rank; this premise is invoked to enable the GQA derivation and is stated as the key technical step after promoting spectral conditions to a definition.
Derives μP scalings for GQA by promoting the spectral-norm conditions to a definition of feature learning and by using a modified norm that preserves the scaling laws for non-full-rank matrices; experiments show learning-rate transfer.
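For context, the claims above build on the spectral view of μP. A minimal sketch of the standard spectral scaling condition for a dense layer is given below as background only; it is not the paper's modified norm, which this record does not reproduce. The symbols $W_\ell$, $n_\ell$, $n_q$, $n_{kv}$, $d_h$ and $r$ are notation assumed here for illustration. One plausible reading of why rank matters: when $n_{kv}$ key/value heads are each repeated $r = n_q / n_{kv}$ times to match $n_q$ query heads, the stacked key/value projection has rank at most $n_{kv} d_h$, which is smaller than its $n_q d_h$ rows.

```latex
% Background sketch (assumed notation): W_\ell \in \mathbb{R}^{n_\ell \times n_{\ell-1}},
% \lVert \cdot \rVert_* is the spectral norm, \Delta W_\ell is one gradient update.
\[
  \lVert W_\ell \rVert_* = \Theta\!\left(\sqrt{\frac{n_\ell}{n_{\ell-1}}}\right),
  \qquad
  \lVert \Delta W_\ell \rVert_* = \Theta\!\left(\sqrt{\frac{n_\ell}{n_{\ell-1}}}\right).
\]
% GQA repetition (assumed notation): the repeated key/value projection is rank deficient.
\[
  r = \frac{n_q}{n_{kv}},
  \qquad
  \operatorname{rank}\!\left(W^{K}_{\mathrm{repeated}}\right) \le n_{kv}\, d_h < n_q\, d_h
  \quad \text{for } r > 1 .
\]
```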
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:00:50.909862Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
e6404d0a1d7e9c0803e9c902ca8e0d2f488b932ee2a429751a1c5bc4e859073d
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/4ZAE2CQ5P2OAQA7JZEBMVDQNF5 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e6404d0a1d7e9c0803e9c902ca8e0d2f488b932ee2a429751a1c5bc4e859073d
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "9a236153d0ed19a973f09d5e91d5efffac6932927864cff9ce38cb74bc5e31b8",
"cross_cats_sorted": [
"cs.AI"
],
"license": "http://creativecommons.org/licenses/by/4.0/",
"primary_cat": "cs.LG",
"submitted_at": "2026-05-14T18:03:16Z",
"title_canon_sha256": "4f1ed45308da2ef4127da20d11a5a70c05ad7e972ddcc4c497973587c1bbc514"
},
"schema_version": "1.0",
"source": {
"id": "2605.15290",
"kind": "arxiv",
"version": 1
}
}
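The verification one-liner above can be unpacked into a short offline script. This is a minimal sketch, assuming the canonical record is exactly the JSON shown on this page and that canonicalization means sorted keys, compact separators, and UTF-8 encoding, as the `python3 -c` step suggests. The function name `canonical_sha256` is hypothetical and not part of any Pith tooling. The script prints the digest and whether it matches the canonical hash listed above; no result is claimed here.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the way the page's one-liner does (hypothetical helper).

    Serializes with sorted keys and compact separators, keeps non-ASCII
    characters unescaped, then takes SHA-256 of the UTF-8 bytes.
    """
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "9a236153d0ed19a973f09d5e91d5efffac6932927864cff9ce38cb74bc5e31b8",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-14T18:03:16Z",
        "title_canon_sha256": "4f1ed45308da2ef4127da20d11a5a70c05ad7e972ddcc4c497973587c1bbc514",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15290", "kind": "arxiv", "version": 1},
}

# Canonical hash as listed on this page.
EXPECTED = "e6404d0a1d7e9c0803e9c902ca8e0d2f488b932ee2a429751a1c5bc4e859073d"

digest = canonical_sha256(record)
print(digest)
print("matches canonical hash:", digest == EXPECTED)
```

Sorting keys before hashing makes the digest independent of the key order in which the record happens to be serialized, which is why the same record fetched from the API and copied by hand should hash identically.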