Pith Number

pith:V2HVCKOO

pith:2026:V2HVCKOORCCJHIEJ6WTESR6M5Z

not attested · not anchored · not stored · refs resolved

OSDN: Improving Delta Rule with Provable Online Preconditioning in Linear Attention

Chenyu Zhou, Dongdong Ge, Hongpei Li, Jianghao Lin, Yinyu Ye, Yuerou Liu

OSDN augments the Delta Rule with an online diagonal preconditioner equivalent to per-feature key scaling, delivering super-geometric convergence and 39% lower recall residual at 1.3B parameters.

arxiv:2605.13473 v1 · 2026-05-13 · cs.LG · cs.CL

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{V2HVCKOORCCJHIEJ6WTESR6M5Z}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live · download bundle · merged state

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
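
The page does not describe the merge algorithm itself, so the following is only a sketch of one way a merge can be made deterministic: deduplicate events by content hash, sort them by a total order, and fold them into the record. The event shape (`at`, `field`, `value`) and the functions `event_id` and `merge` are hypothetical and are not taken from the Pith schema.

```python
import hashlib
import json


def event_id(event):
    """Content hash of an event (hypothetical identifier, not the Pith one)."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def merge(record, events):
    """Fold events into a copy of the record in a fixed total order.

    Events are deduplicated by content hash and sorted by (timestamp, hash),
    so any mirror holding the same set of events reaches the same state,
    regardless of the order in which it received them.
    """
    unique = {event_id(e): e for e in events}
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["at"], kv[0]))
    state = dict(record)
    for _, event in ordered:
        # Assumed semantics: the latest event for a field wins.
        state[event["field"]] = event["value"]
    return state
```

Because the ordering depends only on event content, two mirrors that received the same events in different orders, or with duplicates, compute identical states.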

Claims

C1 · strongest claim

By exploiting the exact-quadratic structure of the inner regression loss, we establish super-geometric convergence against a right-Newton comparator and prove an algorithm-aligned token-local residual contraction bound; at 1.3B parameters OSDN achieves a 39% reduction in the recall residual ratio.

C2 · weakest assumption

The inner objective remains exactly quadratic and the online hypergradient update for the diagonal preconditioner can be maintained without breaking the chunkwise parallel pipeline or requiring high-dimensional state.

C3 · one line summary

OSDN adds online diagonal preconditioning to the Delta Rule, preserving chunkwise parallelism while proving super-geometric convergence and delivering 32-39% recall gains at 340M-1.3B scales.
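
To make the claims concrete, here is a minimal sketch of a single Delta Rule step with a diagonal preconditioner applied as per-feature key scaling. It assumes the standard Delta Rule form S ← S − β (S k − v) kᵀ for the inner loss ½‖S k − v‖², and it treats the preconditioner `d` as a given vector. The online hypergradient update that OSDN uses to learn `d` is not specified on this page and is not reproduced here. The function name `delta_step` and its arguments are illustrative.

```python
def delta_step(S, k, v, beta, d=None):
    """One Delta Rule update of the state matrix S (rows: value dims, cols: key dims).

    Plain rule:            S <- S - beta * (S k - v) k^T
    Diagonally scaled:     S <- S - beta * (S k - v) (d * k)^T
    Passing d=None uses d = 1 for every feature, i.e. the plain rule.
    """
    if d is None:
        d = [1.0] * len(k)
    # Recall residual of the current state on this token: S k - v.
    residual = [
        sum(S[i][j] * k[j] for j in range(len(k))) - v[i]
        for i in range(len(v))
    ]
    # Per-feature key scaling: the diagonal preconditioner acts on the key.
    scaled_k = [d[j] * k[j] for j in range(len(k))]
    return [
        [S[i][j] - beta * residual[i] * scaled_k[j] for j in range(len(k))]
        for i in range(len(v))
    ]
```

Under this form, one step multiplies the token's residual by (1 − β kᵀ D k), so the choice of `d` directly controls how much of the residual is removed per token.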

References

73 extracted · 73 resolved · 5 Pith anchors

Receipt and verification

First computed	2026-05-18T02:44:41.519789Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

ae8f5129ce888493a089f5a64947ccee65eaccf648772ff79bf2e6f538521557

Aliases

arxiv: 2605.13473 · arxiv_version: 2605.13473v1 · doi: 10.48550/arxiv.2605.13473 · pith_short_12: V2HVCKOORCCJ · pith_short_16: V2HVCKOORCCJHIEJ · pith_short_8: V2HVCKOO
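
The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the leading characters of the base32 encoding of the canonical hash, and each `pith_short_N` alias is its first N characters. This is an observation from the values shown here, not a rule stated by the schema. The helper `pith_from_hash` below is a hypothetical name used to illustrate it.

```python
import base64


def pith_from_hash(hex_digest, length=26):
    """Base32-encode a hex SHA-256 digest and keep the leading characters.

    Assumption: RFC 4648 base32 alphabet, truncated to `length` characters.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


canonical_hash = "ae8f5129ce888493a089f5a64947ccee65eaccf648772ff79bf2e6f538521557"
full = pith_from_hash(canonical_hash)

# Short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}
```

For this record, `full` evaluates to `V2HVCKOORCCJHIEJ6WTESR6M5Z`, which agrees with the identifier and the short aliases listed above.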

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/V2HVCKOORCCJHIEJ6WTESR6M5Z \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ae8f5129ce888493a089f5a64947ccee65eaccf648772ff79bf2e6f538521557

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "5093d0f0e9ba463cb4be5369fa3e04276e01fe1791c1dbd14e4898ace465fb14",
    "cross_cats_sorted": [
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T12:59:26Z",
    "title_canon_sha256": "8afc6aea63fda48a49767a60fbfc6e89d4c2a22468ae4b510ed73d342aa7b9e1"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13473",
    "kind": "arxiv",
    "version": 1
  }
}
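
The canonicalization used by the verification command (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes, SHA-256) can also be written as a small offline function. This is a sketch that mirrors that one-liner. The function name `canonical_sha256` is illustrative, and the dictionary is copied from the record shown above rather than fetched from the resolver, so its hash is not asserted here.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a record using the same serialization as the verification one-liner."""
    blob = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


record = {
    "metadata": {
        "abstract_canon_sha256": "5093d0f0e9ba463cb4be5369fa3e04276e01fe1791c1dbd14e4898ace465fb14",
        "cross_cats_sorted": ["cs.CL"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-13T12:59:26Z",
        "title_canon_sha256": "8afc6aea63fda48a49767a60fbfc6e89d4c2a22468ae4b510ed73d342aa7b9e1",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13473", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
```

Comparing `digest` against the canonical hash listed under "Canonical hash" is the offline counterpart of the `curl` check.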