pith:MDJKPL5S
Massive Activations in Large Language Models
Large language models contain a small number of massive activations that remain constant across inputs and act as indispensable bias terms.
arxiv:2402.17762 v2 · 2024-02-27 · cs.CL · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{MDJKPL5S3IEMCXYQ4Z5CCVQHIW}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
very few activations exhibit significantly larger values than others (e.g., 100,000 times larger). We call them massive activations... their values largely stay constant regardless of the input, and they function as indispensable bias terms in LLMs... these massive activations lead to the concentration of attention probabilities to their corresponding tokens.
Assumes that the observed constancy of massive activation values, and their role as indispensable bias terms, generalizes across all LLMs, inputs, and architectures, although only a limited set of models was characterized.
Massive activations are constant large values in LLMs that function as indispensable bias terms and concentrate attention probabilities on specific tokens.
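To make the first claim concrete, the sketch below flags entries of a hidden-state vector whose magnitude dwarfs the typical one. It is only an illustration: the function name `find_massive_activations`, the two thresholds `min_abs` and `ratio_to_median`, and the toy vector are assumptions made here, not values taken from this record.

```python
from statistics import median


def find_massive_activations(hidden, min_abs=100.0, ratio_to_median=1000.0):
    """Return (index, value) pairs whose magnitude is large both in absolute
    terms and relative to the median magnitude of the vector.

    Both thresholds are hypothetical defaults chosen for illustration.
    """
    magnitudes = [abs(v) for v in hidden]
    med = median(magnitudes)
    flagged = []
    for i, (value, mag) in enumerate(zip(hidden, magnitudes)):
        # An entry must clear the absolute floor AND the relative ratio.
        if mag >= min_abs and mag >= ratio_to_median * med:
            flagged.append((i, value))
    return flagged


# Synthetic hidden state: small values plus one outlier feature dimension.
toy_hidden = [0.4, -0.7, 0.1, 2500.0, -0.3, 0.9, -0.2, 0.5]
print(find_massive_activations(toy_hidden))
```

Comparing against the median rather than the mean keeps the baseline from being inflated by the outliers themselves.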
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:48.754966Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
60d2a7afb2da08c15f10e67a215607459bca6ed57194e20a0f3dbc5b94bfe664
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/MDJKPL5S3IEMCXYQ4Z5CCVQHIW \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 60d2a7afb2da08c15f10e67a215607459bca6ed57194e20a0f3dbc5b94bfe664
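The hashing step of the command above can also be written as a small script. This is a minimal sketch that mirrors the Python one-liner: serialize with sorted keys, compact separators, and no ASCII escaping, then take the SHA-256 of the UTF-8 bytes. The helper names `canonical_sha256` and `verify` are hypothetical, and the sketch does not fetch anything; it assumes you already hold the canonical record as a parsed dict.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a JSON-compatible object using the same serialization as the
    one-liner: sorted keys, compact separators, ensure_ascii=False."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record, expected_hex):
    """Return True if the record's canonical hash matches expected_hex."""
    return canonical_sha256(record) == expected_hex.strip().lower()


# Key order in the input does not affect the digest, because keys are sorted
# before hashing.
a = {"source": {"id": "2402.17762", "kind": "arxiv", "version": 2}, "schema_version": "1.0"}
b = {"schema_version": "1.0", "source": {"version": 2, "kind": "arxiv", "id": "2402.17762"}}
print(canonical_sha256(a) == canonical_sha256(b))
```

To check a real record, parse the JSON returned under `canonical_record` with `json.loads` and pass it to `verify` together with the canonical hash listed on this page.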
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "161a0a2b92c9dbee51eb5242b3c2633f8c7a752a67326dc218027ce16a6a8324",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2024-02-27T18:55:17Z",
    "title_canon_sha256": "1375592bd25780fa45da9e4a454856fb6a1918f2dfb6bdb9df98135e4a994fe3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2402.17762",
    "kind": "arxiv",
    "version": 2
  }
}