Pith Number

pith:F3SQWA3G

pith:2024:F3SQWA3GM44IXO5Y3EQ3SSF4O5

not attested · not anchored · not stored · refs resolved

SpinQuant: LLM quantization with learned rotations

Bilge Soran, Changsheng Zhao, Dhruv Choudhary, Igor Fedorov, Raghuraman Krishnamoorthi, Tijmen Blankevoort, Vikas Chandra, Yuandong Tian, Zechun Liu

SpinQuant learns rotation matrices that leave the full-precision network's outputs unchanged while making weights, activations, and the KV cache easier to quantize to 4 bits.

arxiv:2405.16406 v4 · 2024-05-26 · cs.LG · cs.AI · cs.CL · cs.CV

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{F3SQWA3GM44IXO5Y3EQ3SSF4O5}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv. Learn more · Embed verified badge

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live · download bundle · merged state

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
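The page does not specify the merge algorithm, so the following is only a sketch of one way a merge can be made deterministic: deduplicate events by content hash, then fold them in sorted hash order so every mirror reaches the same state regardless of arrival order. The event fields (`type`, `paper`), the `merge` and `event_id` helpers, and the state layout are all hypothetical, and signature checking is omitted.

```python
import hashlib
import json

def event_id(event):
    """Content hash of an event (hypothetical), used as an order-independent key."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()

def merge(record, events):
    """Fold events into a state dict in a fixed, arrival-independent order."""
    state = {"record": record, "citations": set()}
    unique = {event_id(e): e for e in events}  # exact duplicates collapse
    for _, e in sorted(unique.items()):        # ids are unique, so dicts are never compared
        if e["type"] == "citation":
            state["citations"].add(e["paper"])
    return state

# Hypothetical bundle contents, for illustration only.
record = {"source": "arxiv:2405.16406"}
events = [
    {"type": "citation", "paper": "paper-a"},
    {"type": "citation", "paper": "paper-b"},
    {"type": "citation", "paper": "paper-a"},  # duplicate event
]

state_forward = merge(record, events)
state_reversed = merge(record, list(reversed(events)))
```

Both calls produce the same state, which is the property a mirror relies on when it recomputes the record from a downloaded bundle.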

Claims

C1 · strongest claim

With 4-bit quantization of weight, activation, and KV-cache, SpinQuant narrows the accuracy gap on zero-shot reasoning tasks with full precision to merely 2.9 points on the LLaMA-2 7B model, surpassing LLM-QAT by 19.1 points and SmoothQuant by 25.0 points.

C2 · weakest assumption

That learned rotation matrices found on calibration data will generalize to preserve full-precision outputs and improve quantization accuracy across diverse downstream tasks without introducing new errors.

C3 · one line summary

SpinQuant learns rotations that enable accurate 4-bit quantization of LLM weights, activations, and KV cache, narrowing the zero-shot accuracy gap to full precision to 2.9 points on LLaMA-2 7B.
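The idea behind these claims can be illustrated with a toy example. The sketch below is not SpinQuant itself: it uses a fixed, normalized 4x4 Hadamard matrix in place of a learned rotation, plus hypothetical activation and weight values. It shows two things: inserting an orthogonal rotation and its transpose leaves the full-precision product unchanged, and rotating an activation vector that has one outlier channel can reduce its 4-bit quantization error.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(r) for r in zip(*m)]

def quantize_4bit(row):
    """Symmetric per-tensor fake quantization to signed 4-bit levels (-8..7)."""
    scale = max(abs(v) for v in row) / 7.0
    return [max(-8, min(7, round(v / scale))) * scale for v in row]

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

# Normalized Hadamard matrix: orthogonal, so R times its transpose is the identity.
# Assumption: a fixed rotation stands in for a learned one.
R = [[0.5 * s for s in row] for row in (
    [1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1])]

x = [[0.1, -0.2, 8.0, 0.3]]  # hypothetical activation with one outlier channel
W = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0], [1.0, 0.5]]  # hypothetical weights

# Full precision: (x R)(R^T W) equals x W up to floating-point rounding.
y_ref = matmul(x, W)
x_rot = matmul(x, R)
W_rot = matmul(transpose(R), W)
y_rot = matmul(x_rot, W_rot)

# Quantization error of the activation before and after rotation.
err_plain = l2(quantize_4bit(x[0]), x[0])
err_rot = l2(quantize_4bit(x_rot[0]), x_rot[0])
print(err_plain, err_rot)
```

In this toy case the rotation spreads the outlier's magnitude across all channels, so the quantization grid is used more evenly and `err_rot` comes out smaller than `err_plain`. Whether a given rotation helps on a real model depends on the data, which is the assumption C2 flags.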

References

33 extracted · 33 resolved · 14 Pith anchors

[1] GPT-4 Technical Report · arXiv:2303.08774

[2] BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions · arXiv:1905.10044

[3] Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge · arXiv:1803.05457

[4] Extreme compression of large language models via additive quantization

[5] GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers · arXiv:2210.17323

Formal links

3 machine-checked theorem links

Cited by

37 papers in Pith

Not All Tasks Quantize Equally: Fisher-Guided Quantization for Visual Geometry Transformer

MixLLM: LLM Quantization with Global Mixed-precision between Output-features and Highly-efficient System Design

You Had One Job: Per-Task Quantization Using LLMs' Hidden Representations

Theory-optimal Quantization Based on Flatness

Receipt and verification

First computed	2026-05-17T23:38:51.013172Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

2ee50b036667388bbbb8d921b948bc7750779dea0b22b98eabe92be28d7cfed6

Aliases

arxiv: 2405.16406 · arxiv_version: 2405.16406v4 · doi: 10.48550/arxiv.2405.16406 · pith_short_12: F3SQWA3GM44I · pith_short_16: F3SQWA3GM44IXO5Y · pith_short_8: F3SQWA3G

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/F3SQWA3GM44IXO5Y3EQ3SSF4O5 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 2ee50b036667388bbbb8d921b948bc7750779dea0b22b98eabe92be28d7cfed6

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "1538f48e6afe2d097304cf1c4b0d8a7c258b50ad60ddaf75d8de3fc142c5dd7a",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.CV"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2024-05-26T02:15:49Z",
    "title_canon_sha256": "ba0aab9cac8a079304b5cac58e1b79301b39a03caf2c9306481eb82595e638bd"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2405.16406",
    "kind": "arxiv",
    "version": 4
  }
}
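The canonicalization step of the shell one-liner can also be run offline against the record shown above. This is a minimal sketch that applies the same `json.dumps` settings (sorted keys, compact separators, `ensure_ascii=False`) to a hand-copied dict; the helper name `canonical_hash` is an assumption, and the resulting digest is not asserted here to equal the published canonical hash, since that depends on the copy matching the served record exactly.

```python
import hashlib
import json

# Canonical record, copied from the JSON displayed above.
record = {
    "metadata": {
        "abstract_canon_sha256": "1538f48e6afe2d097304cf1c4b0d8a7c258b50ad60ddaf75d8de3fc142c5dd7a",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.CV"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2024-05-26T02:15:49Z",
        "title_canon_sha256": "ba0aab9cac8a079304b5cac58e1b79301b39a03caf2c9306481eb82595e638bd",
    },
    "schema_version": "1.0",
    "source": {"id": "2405.16406", "kind": "arxiv", "version": 4},
}

def canonical_hash(obj):
    """SHA-256 over compact, key-sorted JSON, mirroring the shell one-liner."""
    blob = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()
    return hashlib.sha256(blob).hexdigest()

digest = canonical_hash(record)

# Top-level key order in the input does not matter because sort_keys=True.
shuffled = {k: record[k] for k in reversed(list(record))}
same_digest = canonical_hash(shuffled) == digest

print(digest)
```

Compare the printed digest with the value under "Canonical hash"; a mismatch would mean either the copy or the served record differs from what was signed.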