Pith Number

pith:RCPETT5H

pith:2024:RCPETT5HFXXTYZCIDGIVXYAWDV
not attested · not anchored · not stored · refs resolved

Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Arjun Vikram, Carlos Guestrin, Genghan Zhang, Jiarui Xu, Karan Dalal, Sanmi Koyejo, Tatsunori Hashimoto, Xiaolong Wang, Xinhao Li, Xinlei Chen, Yann Dubois, Yu Sun

RNNs can match Transformers in long-context performance by making the hidden state itself a learnable model that is updated with self-supervised gradient steps at test time.

arxiv:2407.04620 v4 · 2024-07-05 · cs.LG · cs.AI · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{RCPETT5HFXXTYZCIDGIVXYAWDV}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
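The page does not describe the merge algorithm itself. As an illustration only of what "deterministic merge" can mean, the sketch below folds a set of events into a state in an order that does not depend on how the events arrived. The event fields (`at`, `field`, `value`), the content-derived event ID, and the last-writer-wins rule are all assumptions made for this example, not Pith's actual format.

```python
import hashlib
import json


def event_id(event):
    """Content-derived ID (hypothetical): SHA-256 of the event's sorted-key JSON."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(body).hexdigest()


def merge(record, *event_sets):
    """Fold events into a copy of `record` in a fixed, arrival-independent order.

    Assumed rule: de-duplicate by content ID, sort by (timestamp, ID),
    then apply each event as a last-writer-wins field update.
    """
    unique = {event_id(e): e for events in event_sets for e in events}
    state = dict(record)
    for eid in sorted(unique, key=lambda i: (unique[i]["at"], i)):
        event = unique[eid]
        state[event["field"]] = event["value"]
    return state


# Two mirrors hold overlapping event sets in different orders.
record = {"citations": "open"}
mirror_a = [
    {"at": "2026-05-18T00:00:00Z", "field": "citations", "value": "resolved"},
    {"at": "2026-05-19T00:00:00Z", "field": "replications", "value": "open"},
]
mirror_b = list(reversed(mirror_a)) + [
    {"at": "2026-05-20T00:00:00Z", "field": "author_claim", "value": "open"},
]

state_ab = merge(record, mirror_a, mirror_b)
state_ba = merge(record, mirror_b, mirror_a)
print(state_ab == state_ba)
```

Because the fold order is derived from the events themselves, any mirror holding the same set of events arrives at the same state regardless of download or storage order.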

Claims

C1 · strongest claim

TTT-Linear and TTT-MLP can keep reducing perplexity by conditioning on more tokens, while Mamba cannot after 16k context.

C2 · weakest assumption

That performing gradient-based self-supervised updates on the hidden-state model at test time remains stable, computationally tractable, and beneficial without overfitting or excessive overhead at scale.

C3 · one line summary

TTT layers treat the hidden state as a trainable model updated at test time, allowing linear-complexity sequence models to keep reducing perplexity as context length grows, unlike Mamba.
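The idea in this summary can be made concrete with a minimal sketch: the hidden state is a weight matrix W, each incoming token triggers one gradient step on a self-supervised reconstruction loss, and the output is produced with the freshly updated weights. The function names, the squared-error loss on a (training view, target) pair, and the fixed learning rate are simplifications chosen for this example. The sketch leaves out learned projections, mini-batching, and everything else a practical layer needs.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * x_j for w, x_j in zip(row, x)) for row in W]


def loss(W, k, v):
    """Self-supervised loss ||W k - v||^2 for training view k and target v."""
    return sum((p - t) ** 2 for p, t in zip(matvec(W, k), v))


def ttt_linear_step(W, k, v, q, lr):
    """One test-time update of the hidden state W, then the output for view q.

    Update rule: W_new = W - lr * grad, where grad = 2 * (W k - v) k^T.
    Output rule: z = W_new q.
    """
    err = [p - t for p, t in zip(matvec(W, k), v)]
    W_new = [
        [w - lr * 2.0 * e * k_j for w, k_j in zip(row, k)]
        for row, e in zip(W, err)
    ]
    return W_new, matvec(W_new, q)


# Toy sequence: the same (view, target) pair repeated three times.
W = [[0.0, 0.0], [0.0, 0.0]]
k, v = [1.0, 0.0], [0.0, 1.0]
losses = []
for _ in range(3):
    losses.append(loss(W, k, v))
    W, z = ttt_linear_step(W, k, v, q=k, lr=0.25)

print(losses)  # loss before each step
print(z)       # output after the final update
```

Each step costs the same regardless of how many tokens came before, which is where the linear complexity in sequence length comes from. The state keeps improving as long as new tokens supply useful gradient signal.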

References

85 extracted · 85 resolved · 15 Pith anchors

[1] GPT-4 Technical Report 2023 · arXiv:2303.08774
[2] Learning to learn by gradient descent by gradient descent 2016
[3] You just found out your book was used to train ai 2023
[4] [...] Pöppel, Markus Spanring, Andreas Auer, Oleksandra Prudnikova, Michael Kopp, G[...] 2024
[5] Learning a synaptic learning rule 1990

Formal links

2 machine-checked theorem links

Cited by

38 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:53.408085Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

889e49cfa72def3c644819915be0161d71812901998c79e2d764dfbfa76e92d6

Aliases

arxiv: 2407.04620 · arxiv_version: 2407.04620v4 · doi: 10.48550/arxiv.2407.04620 · pith_short_12: RCPETT5HFXXT · pith_short_16: RCPETT5HFXXTYZCI · pith_short_8: RCPETT5H
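The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and the `pith_short_*` aliases are prefixes of it. This is an observation from the values shown here, not a documented rule, and the function name below is hypothetical.

```python
import base64

# Canonical hash as listed on this page.
canonical_hash = "889e49cfa72def3c644819915be0161d71812901998c79e2d764dfbfa76e92d6"


def pith_number_from_hash(hex_digest, length=26):
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: inferred from this one record, not from a published spec.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = pith_number_from_hash(canonical_hash)
aliases = {n: full[:n] for n in (8, 12, 16)}

print(full)
print(aliases)
```

If the relationship holds in general, a client could check that a Pith Number and a canonical hash belong together without any network access.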
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/RCPETT5HFXXTYZCIDGIVXYAWDV \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 889e49cfa72def3c644819915be0161d71812901998c79e2d764dfbfa76e92d6
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "05b4d8152342b055af443082b4000e3e33ae32d46334dbf1752401c3572d0c9e",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2024-07-05T16:23:20Z",
    "title_canon_sha256": "28bf260612ef235043aafc2f64009b40780baf577faf3678da216f6c9231734f"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2407.04620",
    "kind": "arxiv",
    "version": 4
  }
}
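The shell one-liner under "Verify this Pith Number yourself" can be unpacked into a small offline function. The sketch below applies the same serialization settings as that command (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) to the record shown above. The function name is hypothetical, and whether the digest equals the listed canonical hash depends on the served record matching this page exactly, so no expected value is asserted here.

```python
import hashlib
import json


def canonical_hash(record):
    """SHA-256 over the compact, key-sorted JSON form used by the one-liner."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "05b4d8152342b055af443082b4000e3e33ae32d46334dbf1752401c3572d0c9e",
        "cross_cats_sorted": ["cs.AI", "cs.CL"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2024-07-05T16:23:20Z",
        "title_canon_sha256": "28bf260612ef235043aafc2f64009b40780baf577faf3678da216f6c9231734f",
    },
    "schema_version": "1.0",
    "source": {"id": "2407.04620", "kind": "arxiv", "version": 4},
}

digest = canonical_hash(record)
print(digest)  # compare by eye with the "Canonical hash" value on this page
```

Sorting keys and fixing the separators removes the formatting freedom JSON otherwise allows, so two parties that hold the same data produce byte-identical input to the hash.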