pith:U5KALI76
Finite Scalar Quantization: VQ-VAE Made Simple
FSQ replaces vector quantization in VQ-VAEs by projecting latents to a few dimensions and quantizing each independently to fixed levels.
arxiv:2309.15505 v2 · 2023-09-27 · cs.CV · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{U5KALI76GX5OZF7IVTQ3GLLULE}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Despite the much simpler design of FSQ, we obtain competitive performance in all these tasks. We emphasize that FSQ does not suffer from codebook collapse and does not need the complex machinery employed in VQ (commitment losses, codebook reseeding, code splitting, entropy penalties, etc.) to learn expressive discrete representations.
Projecting the VAE latent to a small number of dimensions (typically fewer than 10) and quantizing each independently to a fixed set of levels preserves enough representational capacity for the downstream tasks to match VQ performance.
Finite scalar quantization simplifies VQ-VAE latents by independently rounding a few dimensions to fixed levels, producing an equivalent-sized implicit codebook with competitive performance and no collapse.
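The per-dimension rounding described in the claims above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function names `fsq_quantize` and `fsq_index`, the choice of three dimensions with 5 levels each, and the restriction to odd level counts (which keeps the grid symmetric around zero) are all assumptions made for the example. It also omits the straight-through gradient estimator that a trainable version would need.

```python
import numpy as np

def fsq_quantize(z, levels):
    """Round each latent dimension independently to a fixed grid.

    Assumption for this sketch: every entry of `levels` is odd, so the
    grid for a dimension with L levels is {-(L-1)/2, ..., (L-1)/2}.
    """
    levels = np.asarray(levels)
    half = (levels - 1) / 2        # e.g. 2.0 when L = 5
    bounded = np.tanh(z) * half    # squash each dimension into (-half, half)
    return np.round(bounded)       # snap to the nearest integer level

def fsq_index(codes, levels):
    """Map a vector of per-dimension codes to one implicit-codebook index."""
    levels = np.asarray(levels)
    half = (levels - 1) // 2
    digits = (codes + half).astype(np.int64)               # shift to {0, ..., L-1}
    basis = np.concatenate(([1], np.cumprod(levels[:-1])))  # mixed-radix place values
    return (digits * basis).sum(axis=-1)

levels = [5, 5, 5]                       # hypothetical configuration
z = np.array([[0.0, 0.0, 0.0],
              [10.0, -10.0, 0.3]])       # already-projected latents
codes = fsq_quantize(z, levels)          # [[0, 0, 0], [2, -2, 1]]
idx = fsq_index(codes, levels)           # [62, 79]
codebook_size = int(np.prod(levels))     # 125
```

The implicit codebook is simply the Cartesian product of the per-dimension grids, so its size is the product of the level counts. Nothing is learned or stored for the codebook itself.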
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:46.967163Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
a75405a3fe35faec97e8ace1b32d7459196712903675423b00ec02d36aade776
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/U5KALI76GX5OZF7IVTQ3GLLULE \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a75405a3fe35faec97e8ace1b32d7459196712903675423b00ec02d36aade776
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "38f0f105f4217052b88ab6b5dbfc6a4369c298813925934c25e84d837197197d",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2023-09-27T09:13:40Z",
    "title_canon_sha256": "62083644e49d78ab01722dcc2435bd6d5edf475e93cf1896c3bc04031e788647"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2309.15505",
    "kind": "arxiv",
    "version": 2
  }
}
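The same check can be run offline against the record shown above. The sketch below mirrors the serialization used in the shell one-liner (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) and hashes the result with SHA-256. The helper name `canonical_sha256` is hypothetical. The sketch also assumes that the JSON displayed here is the same object the endpoint returns as `canonical_record`. If it is, the digest should match the canonical hash listed on this page.

```python
import hashlib
import json

def canonical_sha256(record):
    """Serialize a record deterministically and return its SHA-256 hex digest."""
    payload = json.dumps(
        record,
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no whitespace between tokens
        ensure_ascii=False,        # keep non-ASCII characters as-is
    ).encode()                     # UTF-8 bytes
    return hashlib.sha256(payload).hexdigest()

# Record copied from the "Canonical record JSON" section of this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "38f0f105f4217052b88ab6b5dbfc6a4369c298813925934c25e84d837197197d",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2023-09-27T09:13:40Z",
        "title_canon_sha256": "62083644e49d78ab01722dcc2435bd6d5edf475e93cf1896c3bc04031e788647",
    },
    "schema_version": "1.0",
    "source": {"id": "2309.15505", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye with the canonical hash listed above
```

Because the keys are sorted before hashing, the digest does not depend on the order in which fields appear in the fetched JSON.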