Pith Number

pith:WCHBCUAJ

pith:2024:WCHBCUAJDPJA3BI2DVHH5U5GUG

Status: not attested · not anchored · not stored · refs resolved

Titans: Learning to Memorize at Test Time

Ali Behrouz, Peilin Zhong, Vahab Mirrokni

Titans combine attention with a learnable neural long-term memory to handle contexts over two million tokens more effectively than Transformers or linear recurrent models.

arxiv:2501.00663 v1 · 2024-12-31 · cs.LG · cs.AI · cs.CL

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{WCHBCUAJDPJA3BI2DVHH5U5GUG}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

Our experimental results on language modeling, common-sense reasoning, genomics, and time series tasks show that Titans are more effective than Transformers and recent modern linear recurrent models. They further can effectively scale to larger than 2M context window size with higher accuracy in needle-in-haystack tasks compared to baselines.

C2 weakest assumption

That the neural memory module can reliably learn to store and retrieve relevant historical information without catastrophic forgetting or introducing new failure modes that offset the claimed gains, especially when the training objective does not explicitly supervise the memory contents.

C3 one line summary

Titans combine attention for current context with a learnable neural memory for long-term history, achieving better performance and scaling to over 2M-token contexts on language, reasoning, genomics, and time-series tasks.

References

139 extracted · 139 resolved · 24 Pith anchors

[1] GPT-4 Technical Report 2023 · arXiv:2303.08774

[2] Linear Transformers with Learnable Kernel Functions are Better In-Context Models 2024

[3] Learning to learn by gradient descent by gradient descent 2016

[4] Exploring length generalization in large language models 2022

[5] Simple linear attention language models balance the recall-throughput tradeoff 2024

Formal links

2 machine-checked theorem links

Cited by

42 papers in Pith

LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation

CompilerKV: Risk-Adaptive KV Compression via Offline Experience Compilation

Hybrid Architectures for Language Models: Systematic Analysis and Design Insights

Nirvana: A Specialized Generalist Model With Task-Aware Memory Mechanism

Higher-order Linear Attention

Receipt and verification

First computed	2026-05-17T23:39:21.525493Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

b08e1150091bd20d851a1d4e7ed3a6a1b85728467986a54b1264c50b5ba05ea7

Aliases

arxiv: 2501.00663 · arxiv_version: 2501.00663v1 · doi: 10.48550/arxiv.2501.00663 · pith_short_12: WCHBCUAJDPJA · pith_short_16: WCHBCUAJDPJA3BI2 · pith_short_8: WCHBCUAJ
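
The short aliases above are plain prefixes of the 26-character identifier. For this record, the identifier also matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. The page does not document that derivation, so the sketch below treats it as an observation about this one record, not as the schema's definition. The function names `pith_id_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "b08e1150091bd20d851a1d4e7ed3a6a1b85728467986a54b1264c50b5ba05ea7"
PITH_ID = "WCHBCUAJDPJA3BI2DVHH5U5GUG"


def pith_id_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this rule is inferred from a single record and is not
    confirmed by the Pith schema.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_id: str) -> dict[str, str]:
    """Build the pith_short_N aliases as prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_id[:n] for n in (8, 12, 16)}


if __name__ == "__main__":
    print(pith_id_from_hash(CANONICAL_HASH) == PITH_ID)
    print(short_aliases(PITH_ID))
```

If the relationship holds in general, any alias could be checked against the canonical hash without a network call; that generalization is untested here.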

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/WCHBCUAJDPJA3BI2DVHH5U5GUG \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: b08e1150091bd20d851a1d4e7ed3a6a1b85728467986a54b1264c50b5ba05ea7

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "2e206822891bb75ad3edfac5f675ae2117dd8dc18a8e770fe3e7037c8bcd6d5b",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2024-12-31T22:32:03Z",
    "title_canon_sha256": "68ab678edefb0c80939e9ec6ad62f8f70af0a8957580f19f811a11a8a0a22891"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2501.00663",
    "kind": "arxiv",
    "version": 1
  }
}
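
The shell one-liner under "Verify this Pith Number yourself" can also be run offline against the record shown above. The following is a minimal Python sketch of the same canonicalization (sorted keys, compact separators, UTF-8 bytes, SHA-256). The helper name `canonical_sha256` is hypothetical, and the script reports whether the digest matches the value stated on this page instead of assuming that it does.

```python
import hashlib
import json

# Expected digest as stated on this page.
EXPECTED = "b08e1150091bd20d851a1d4e7ed3a6a1b85728467986a54b1264c50b5ba05ea7"

# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "2e206822891bb75ad3edfac5f675ae2117dd8dc18a8e770fe3e7037c8bcd6d5b",
        "cross_cats_sorted": ["cs.AI", "cs.CL"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2024-12-31T22:32:03Z",
        "title_canon_sha256": "68ab678edefb0c80939e9ec6ad62f8f70af0a8957580f19f811a11a8a0a22891",
    },
    "schema_version": "1.0",
    "source": {"id": "2501.00663", "kind": "arxiv", "version": 1},
}


def canonical_sha256(record) -> str:
    """Serialize with sorted keys and no whitespace, then hash the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(RECORD)
    print(digest)
    print("matches page:", digest == EXPECTED)
```

Sorting keys and fixing the separators makes the serialization independent of key order and whitespace in the fetched JSON, which is what lets a mirror recompute the same digest.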