Pith Number

pith:QEET3LAD

pith:2019:QEET3LADZ3NCLD36BMTOHQVDRT
not attested · not anchored · not stored · refs resolved

Root Mean Square Layer Normalization

Biao Zhang, Rico Sennrich

RMSNorm delivers re-scaling invariance and comparable accuracy to LayerNorm while cutting computation by skipping mean subtraction, yielding 7-64% runtime reductions across tested models.

arxiv:1910.07467 v1 · 2019-10-16 · cs.LG · cs.CL · stat.ML

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{QEET3LADZ3NCLD36BMTOHQVDRT}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Extensive experiments on several tasks using diverse network architectures show that RMSNorm achieves comparable performance against LayerNorm but reduces the running time by 7%~64% on different models.

C2 · weakest assumption

Re-centering invariance in LayerNorm is dispensable for the stabilization and convergence benefits the method provides.

C3 · one line summary

RMSNorm delivers re-scaling invariance and comparable accuracy to LayerNorm while cutting computation by skipping mean subtraction, yielding 7-64% runtime reductions across tested models.
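
The claims above contrast RMSNorm with LayerNorm on two points: RMSNorm skips mean subtraction, and it keeps re-scaling invariance while giving up re-centering invariance. The sketch below illustrates both in plain Python, assuming the usual formulation in which each input is divided by the root mean square of the vector and then multiplied by a learned gain. The function names, the eps stabilizer, and the sample vector are illustrative assumptions and are not taken from this record.

```python
import math


def rms_norm(a, gain=None, eps=1e-8):
    """Divide each element by the root mean square of the vector (no mean subtraction)."""
    n = len(a)
    if gain is None:
        gain = [1.0] * n  # assumed default: identity gain
    rms = math.sqrt(sum(x * x for x in a) / n + eps)  # eps is an assumed stabilizer
    return [g * x / rms for x, g in zip(a, gain)]


def layer_norm(a, gain=None, bias=None, eps=1e-8):
    """Subtract the mean, divide by the standard deviation, then apply gain and bias."""
    n = len(a)
    if gain is None:
        gain = [1.0] * n
    if bias is None:
        bias = [0.0] * n
    mean = sum(a) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in a) / n + eps)
    return [g * (x - mean) / std + b for x, g, b in zip(a, gain, bias)]


def all_close(u, v, tol=1e-6):
    """Element-wise comparison helper for the checks below."""
    return all(math.isclose(x, y, abs_tol=tol) for x, y in zip(u, v))


a = [1.0, 2.0, 3.0, 4.0]
scaled = [10.0 * x for x in a]   # re-scaled input
shifted = [x + 5.0 for x in a]   # re-centered input

print(all_close(rms_norm(a), rms_norm(scaled)))      # re-scaling leaves RMSNorm output unchanged
print(all_close(rms_norm(a), rms_norm(shifted)))     # re-centering changes RMSNorm output
print(all_close(layer_norm(a), layer_norm(shifted))) # LayerNorm absorbs the shift
```

RMSNorm needs one pass for the sum of squares, while LayerNorm also computes the mean and the centered deviations. That extra work is the source of the saving the summary refers to, although the 7-64% figure depends on the models and hardware tested in the paper.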

References

37 extracted · 37 resolved · 16 Pith anchors


Formal links

2 machine-checked theorem links

Cited by

24 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:13.382892Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05)
Schema pith-number/v1.0

Canonical hash

81093dac03ceda258f7e0b26e3c2a38cc7a8d7e8d5b8bf146611ab68d6d2dc25

Aliases

arxiv: 1910.07467 · arxiv_version: 1910.07467v1 · doi: 10.48550/arxiv.1910.07467 · pith_short_12: QEET3LADZ3NC · pith_short_16: QEET3LADZ3NCLD36 · pith_short_8: QEET3LAD
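
The identifiers on this record are consistent with a simple derivation: the 26-character Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the canonical SHA-256 hash, and the short aliases are prefixes of it. The sketch below checks that relationship for this record only. It is an observation from the values shown here, not a documented specification, and the function name pith_number_from_hash is hypothetical.

```python
import base64

# Canonical hash as listed on this record.
CANONICAL_HASH = "81093dac03ceda258f7e0b26e3c2a38cc7a8d7e8d5b8bf146611ab68d6d2dc25"


def pith_number_from_hash(hex_digest, length=26):
    """Base32-encode the raw digest bytes and keep the leading characters.

    Assumption: the 26-character length is inferred from this record alone.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = pith_number_from_hash(CANONICAL_HASH)
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)
for name, value in aliases.items():
    print(name, value)
```

If the relationship holds in general, any mirror holding the canonical hash could regenerate the identifier and its short forms without contacting the service.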
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/QEET3LADZ3NCLD36BMTOHQVDRT \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 81093dac03ceda258f7e0b26e3c2a38cc7a8d7e8d5b8bf146611ab68d6d2dc25
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "010165e0e38b20055e51086183e93942e4b020d8ca5e1b88c446daa3842a151b",
    "cross_cats_sorted": [
      "cs.CL",
      "stat.ML"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2019-10-16T16:44:22Z",
    "title_canon_sha256": "13f2e705c8860613b17cff09bdab6ee432490ed93dbe9c4b5baad45f52d78345"
  },
  "schema_version": "1.0",
  "source": {
    "id": "1910.07467",
    "kind": "arxiv",
    "version": 1
  }
}
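
The shell pipeline under "Verify this Pith Number yourself" can also be run offline against the record shown above. The sketch below is a minimal version that uses the same serialization settings as that one-liner (sorted keys, compact separators, ensure_ascii=False). It assumes the JSON printed on this page is the same object the API returns as canonical_record, which is not confirmed here, so the printed digest should be compared by hand with the canonical hash on the record. The helper name canonical_hash is hypothetical.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "010165e0e38b20055e51086183e93942e4b020d8ca5e1b88c446daa3842a151b",
    "cross_cats_sorted": ["cs.CL", "stat.ML"],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2019-10-16T16:44:22Z",
    "title_canon_sha256": "13f2e705c8860613b17cff09bdab6ee432490ed93dbe9c4b5baad45f52d78345"
  },
  "schema_version": "1.0",
  "source": {
    "id": "1910.07467",
    "kind": "arxiv",
    "version": 1
  }
}
"""


def canonical_hash(record):
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_hash(record)
print(digest)  # compare with the canonical hash listed on the record
```

Sorting keys and fixing the separators makes the digest independent of key order and whitespace in the input, which is what allows independent parties to arrive at the same hash from the same record.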