Pith Number

pith:TN6TYKXH

pith:2025:TN6TYKXHBTFO5YS4ZQKFT4CEP5
not attested · not anchored · not stored · refs resolved

Test-Time Training Done Right

Fujun Luan, Hao Tan, Kai Zhang, Kalyan Sunkavalli, Sai Bi, Songlin Yang, Tianyuan Zhang, William T. Freeman, Yicong Hong

Large-chunk updates during inference make test-time training efficient enough to scale nonlinear states to 40 percent of model parameters.

arxiv:2505.23884 v1 · 2025-05-29 · cs.LG · cs.CL · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{TN6TYKXHBTFO5YS4ZQKFT4CEP5}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

LaCT improves hardware utilization by orders of magnitude, facilitates scaling of nonlinear state size (up to 40% of model parameters), and enables 14B-parameter AR video diffusion on 56K tokens and 1M-token novel view synthesis without custom kernels.

C2 · weakest assumption

That performing weight updates on extremely large chunks (2K–1M tokens) preserves or improves modeling quality compared with the fine-grained causal updates used in prior TTT work.

C3 · one-line summary

Large-chunk online updates during inference let test-time training scale state capacity to 40% of model size and handle contexts up to 1M tokens without custom kernels.

References

75 extracted · 75 resolved · 19 Pith anchors

[1] Attention is all you need 2017
[2] Learning to (Learn at Test Time): RNNs with Expressive Hidden States 2024 · arXiv:2407.04620
[3] Linear transformers are secretly fast weight programmers 2021
[4] Test-time regression: a unifying framework for designing sequence models with associative memory 2025
[5] Titans: Learning to Memorize at Test Time 2024 · arXiv:2501.00663

Formal links

3 machine-checked theorem links

Cited by

30 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:48.037169Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

9b7d3c2ae70ccaeee25ccc1459f0447f7e2a46c7cf1ca4d52ab35e26f0bf7927

Aliases

arxiv: 2505.23884 · arxiv_version: 2505.23884v1 · doi: 10.48550/arxiv.2505.23884 · pith_short_12: TN6TYKXHBTFO · pith_short_16: TN6TYKXHBTFO5YS4 · pith_short_8: TN6TYKXH
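In the aliases listed above, each `pith_short_N` value matches the first N characters of the full identifier. A minimal sketch of that observed pattern follows; the helper name `short_alias` is hypothetical, and the prefix rule is inferred from this record alone, not from a published specification.

```python
# Full identifier as shown on this record.
PITH_ID = "TN6TYKXHBTFO5YS4ZQKFT4CEP5"


def short_alias(pith_id: str, length: int) -> str:
    """Return the first `length` characters of the identifier.

    Assumption: short aliases are plain prefixes, as observed on this page.
    """
    if not 0 < length <= len(pith_id):
        raise ValueError("length must be between 1 and len(pith_id)")
    return pith_id[:length]


# Build the three short aliases listed in the record.
aliases = {f"pith_short_{n}": short_alias(PITH_ID, n) for n in (8, 12, 16)}
for name, value in aliases.items():
    print(f"{name}: {value}")
```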
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/TN6TYKXHBTFO5YS4ZQKFT4CEP5 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 9b7d3c2ae70ccaeee25ccc1459f0447f7e2a46c7cf1ca4d52ab35e26f0bf7927
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "3d69dd83bd8506725dbd63ec554147402277ee729a95527a096d51c9e74cc2b2",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.CV"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2025-05-29T17:50:34Z",
    "title_canon_sha256": "8d890dd60c819654346bdec1702e4247b845985dd9767bc783261caa25ebde1e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2505.23884",
    "kind": "arxiv",
    "version": 1
  }
}
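The shell pipeline above can also be reproduced offline from the canonical record JSON. The sketch below applies the same serialization as the one-liner (sorted keys, compact separators, no ASCII escaping, UTF-8) and hashes the result with SHA-256. The function name `canonical_sha256` is an illustrative choice, and the record is copied from this page by hand, so the printed digest should be compared against the canonical hash listed above instead of being assumed to match.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash as UTF-8."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "3d69dd83bd8506725dbd63ec554147402277ee729a95527a096d51c9e74cc2b2",
        "cross_cats_sorted": ["cs.CL", "cs.CV"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2025-05-29T17:50:34Z",
        "title_canon_sha256": "8d890dd60c819654346bdec1702e4247b845985dd9767bc783261caa25ebde1e",
    },
    "schema_version": "1.0",
    "source": {"id": "2505.23884", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare with the canonical hash listed on this page
```

Because the keys are sorted before hashing, the digest does not depend on the order in which a mirror stores the fields of the record.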