Pith Number

pith:XTLBBB6Y

pith:2023:XTLBBB6Y4KTUBXU5EM27RGCOF7
not attested · not anchored · not stored · refs resolved

Language Modeling Is Compression

Anian Ruoss, Christopher Mattern, Elliot Catt, Grégoire Delétang, Joel Veness, Jordi Grau-Moya, Laurent Orseau, Li Kevin Wenliang, Marcus Hutter, Matthew Aitchison, Paul-Ambroise Duquenne, Tim Genewein

Large language models trained on text compress images and audio better than specialized tools.

arxiv:2309.10668 v2 · 2023-09-19 · cs.LG · cs.AI · cs.CL · cs.IT · math.IT

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{XTLBBB6Y4KTUBXU5EM27RGCOF7}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Chinchilla 70B, while trained primarily on text, compresses ImageNet patches to 43.4% and LibriSpeech samples to 16.4% of their raw size, beating domain-specific compressors like PNG (58.5%) or FLAC (30.3%), respectively.

C2 weakest assumption

That the predictive distribution produced by the language model can be directly converted into a lossless compression scheme via arithmetic coding without significant overhead or implementation-specific losses that would invalidate the reported ratios.
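
To make the assumption concrete, here is a minimal sketch of how a predictive distribution translates into a compressed size. It does not use Chinchilla or the paper's code: the predictor is a toy adaptive byte-frequency model chosen purely for illustration, and the function names are hypothetical. The ideal code length is the sum of -log2 p over the symbols, and an idealized (infinite-precision) arithmetic coder is assumed to add at most about 2 bits on top of that for the whole sequence.

```python
import math

def ideal_code_length_bits(data: bytes) -> float:
    """Sum of -log2 p(byte | past) under a toy adaptive order-0 byte model."""
    counts = [1] * 256   # Laplace smoothing: every byte value starts at count 1
    total = 256
    bits = 0.0
    for byte in data:
        bits += -math.log2(counts[byte] / total)
        counts[byte] += 1  # update the model after coding each symbol
        total += 1
    return bits

def compression_ratio(data: bytes) -> float:
    """Compressed size divided by raw size (lower is better)."""
    # Assumption: an idealized arithmetic coder costs about 2 extra bits overall.
    compressed_bits = ideal_code_length_bits(data) + 2
    return compressed_bits / (8 * len(data))

sample = b"abababababababab" * 64
print(f"compression ratio: {compression_ratio(sample):.3f}")
```

A practical coder works with finite precision and may also need to account for the size of the model itself, which is the kind of overhead this assumption is concerned with.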

C3 one line summary

Large language models serve as strong general-purpose lossless compressors for text, images, and audio, outperforming domain-specific methods and revealing insights into scaling, tokenization, and in-context learning.

References

20 extracted · 20 resolved · 8 Pith anchors

[1] On the Opportunities and Risks of Foundation Models · arXiv:2108.07258
[2] Sparks of Artificial General Intelligence: Early experiments with GPT-4 · arXiv:2303.12712
[3] Scaling Transformer to 1M Tokens and Beyond with RMT
[4] arXiv preprint arXiv:1710.09282
[5] Syntactically Informed Text Compression with Recurrent Neural Networks · arXiv:1608.02893

Cited by

18 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:12.795179Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

bcd61087d8e2a740de9d2335f8984e2ffad77aecf94016c81c774f0aefaddc2e

Aliases

arxiv: 2309.10668 · arxiv_version: 2309.10668v2 · doi: 10.48550/arxiv.2309.10668 · pith_short_12: XTLBBB6Y4KTU · pith_short_16: XTLBBB6Y4KTUBXU5 · pith_short_8: XTLBBB6Y
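
The values on this page suggest how the identifiers relate: the full Pith Number appears to be the first 26 characters of the RFC 4648 base32 encoding of the canonical SHA-256 hash, and the short aliases are prefixes of it. This is an observation from the values shown here, not a documented specification, and the function name below is hypothetical.

```python
import base64

def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the leading characters.

    Assumption: the 26-character truncation is inferred from this record,
    not taken from a published Pith specification.
    """
    raw = bytes.fromhex(hex_digest)
    encoded = base64.b32encode(raw).decode("ascii")
    return encoded[:length]

canonical_hash = "bcd61087d8e2a740de9d2335f8984e2ffad77aecf94016c81c774f0aefaddc2e"
pith = pith_number_from_hash(canonical_hash)

print(pith)                            # full Pith Number
print(pith[:8], pith[:12], pith[:16])  # short aliases as prefixes
```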
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/XTLBBB6Y4KTUBXU5EM27RGCOF7 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: bcd61087d8e2a740de9d2335f8984e2ffad77aecf94016c81c774f0aefaddc2e
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "def926c0e7ca4abc4365977dcb574ca4dabb545c6b3e14e3b6b81a9cc38c332a",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL",
      "cs.IT",
      "math.IT"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2023-09-19T14:50:38Z",
    "title_canon_sha256": "6e120a3a5dcd5a2ea8b8e58a3af16ddbf5cf63cc0fa224a78c89c0a65669247e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2309.10668",
    "kind": "arxiv",
    "version": 2
  }
}
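
For readers who prefer not to use the shell pipeline, the canonicalization step from the verification command can be sketched in Python on its own. The serialization settings (sorted keys, compact separators, no ASCII escaping) are copied from that command; the function name is hypothetical. The resulting digest is meant to be compared against the canonical hash listed on this page.

```python
import hashlib
import json

def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash as UTF-8."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# The canonical record as printed above.
record = {
    "metadata": {
        "abstract_canon_sha256": "def926c0e7ca4abc4365977dcb574ca4dabb545c6b3e14e3b6b81a9cc38c332a",
        "cross_cats_sorted": ["cs.AI", "cs.CL", "cs.IT", "math.IT"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2023-09-19T14:50:38Z",
        "title_canon_sha256": "6e120a3a5dcd5a2ea8b8e58a3af16ddbf5cf63cc0fa224a78c89c0a65669247e",
    },
    "schema_version": "1.0",
    "source": {"id": "2309.10668", "kind": "arxiv", "version": 2},
}

digest = canonical_sha256(record)
print(digest)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields.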