Pith Number

pith:CAI4IMOS

pith:2023:CAI4IMOSTWORB26XEK3HMLNKKA
not attested · not anchored · not stored · refs resolved

A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA

Damjan Kalajdzievski

LoRA adapters should be scaled by dividing by the square root of the rank rather than the full rank to stabilize learning.

arxiv:2312.03732 v1 · 2023-11-28 · cs.CL · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{CAI4IMOSTWORB26XEK3HMLNKKA}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

we study the impact of the scaling factor on the learning process and prove that LoRA adapters should be divided by a factor of the square root of the rank

C2 weakest assumption

The proof that 1/sqrt(rank) is optimal rests on unstated assumptions about initialization variance, gradient flow, and the precise form of the LoRA update rule during fine-tuning; these assumptions are not detailed in the provided abstract.

C3 one line summary

LoRA adapters should be scaled by 1/sqrt(rank) rather than 1/rank to stabilize learning and enable effective use of higher ranks during fine-tuning of large language models.
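The difference between the two scalings in the summary can be made concrete with a small sketch. It assumes the usual LoRA convention in which the adapter product is multiplied by a factor of alpha divided by a function of the rank; the `alpha` hyperparameter and the function name `lora_scaling` are illustrative assumptions, not taken from this record.

```python
import math


def lora_scaling(rank: int, alpha: float = 1.0, rank_stabilized: bool = True) -> float:
    """Return the multiplier applied to the low-rank adapter product.

    Assumption: `alpha` is the usual LoRA scaling hyperparameter.
    rank_stabilized=True  -> alpha / sqrt(rank)  (the scaling argued for here)
    rank_stabilized=False -> alpha / rank        (the conventional scaling)
    """
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    denominator = math.sqrt(rank) if rank_stabilized else float(rank)
    return alpha / denominator


# Compare how quickly each multiplier shrinks as the rank grows.
for r in (4, 64, 1024):
    conventional = lora_scaling(r, rank_stabilized=False)
    stabilized = lora_scaling(r, rank_stabilized=True)
    print(f"rank={r:5d}  alpha/rank={conventional:.6f}  alpha/sqrt(rank)={stabilized:.6f}")
```

With the conventional factor the multiplier falls linearly in the rank, while the square-root factor decays far more slowly. That gap is what the summary refers to when it says the square-root scaling enables effective use of higher ranks.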

References

91 extracted · 91 resolved · 6 Pith anchors

Formal links

2 machine-checked theorem links

Cited by

24 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:50.002668Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

1011c431d29d9d10ebd722b6762daa50049629f32c1a69c94a3fd0ed1f29b933

Aliases

arxiv: 2312.03732 · arxiv_version: 2312.03732v1 · doi: 10.48550/arxiv.2312.03732 · pith_short_12: CAI4IMOSTWOR · pith_short_16: CAI4IMOSTWORB26X · pith_short_8: CAI4IMOS
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/CAI4IMOSTWORB26XEK3HMLNKKA \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 1011c431d29d9d10ebd722b6762daa50049629f32c1a69c94a3fd0ed1f29b933
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "fc96d70f4339ef6a7ea4f91c0a84441b8dc996a69d85a3dd76178a6f181d107c",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-11-28T03:23:20Z",
    "title_canon_sha256": "af5df5ac73aa5e580bcc05a69a919e9f79e524916c2b8436161eafa0c94e374d"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2312.03732",
    "kind": "arxiv",
    "version": 1
  }
}
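The verification pipeline above serializes the canonical record with sorted keys and compact separators, then takes a SHA-256 digest. The same steps can be sketched offline in Python as follows. The helper name `canonical_sha256` is hypothetical, and whether the record as printed on this page reproduces the canonical hash listed above has not been checked here.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the way the shell pipeline does (hypothetical helper)."""
    # Sorted keys and compact separators make the serialization independent
    # of key order and whitespace in the input.
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "fc96d70f4339ef6a7ea4f91c0a84441b8dc996a69d85a3dd76178a6f181d107c",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2023-11-28T03:23:20Z",
        "title_canon_sha256": "af5df5ac73aa5e580bcc05a69a919e9f79e524916c2b8436161eafa0c94e374d",
    },
    "schema_version": "1.0",
    "source": {"id": "2312.03732", "kind": "arxiv", "version": 1},
}

# Compare this digest against the canonical hash listed in the record.
print(canonical_sha256(record))
```

Because the keys are sorted before hashing, two copies of the record that differ only in key order or formatting yield the same digest.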