Pith Number

pith:V2HVCKOO

pith:2026:V2HVCKOORCCJHIEJ6WTESR6M5Z
not attested · not anchored · not stored · refs resolved

OSDN: Improving Delta Rule with Provable Online Preconditioning in Linear Attention

Chenyu Zhou, Dongdong Ge, Hongpei Li, Jianghao Lin, Yinyu Ye, Yuerou Liu

OSDN augments the Delta Rule with an online diagonal preconditioner equivalent to per-feature key scaling, delivering super-geometric convergence and 39% lower recall residual at 1.3B parameters.

arxiv:2605.13473 v1 · 2026-05-13 · cs.LG · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{V2HVCKOORCCJHIEJ6WTESR6M5Z}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

By exploiting the exact-quadratic structure of the inner regression loss, we establish super-geometric convergence against a right-Newton comparator and prove an algorithm-aligned token-local residual contraction bound; at 1.3B parameters OSDN achieves a 39% reduction in the recall residual ratio.

C2 · weakest assumption

The inner objective remains exactly quadratic and the online hypergradient update for the diagonal preconditioner can be maintained without breaking the chunkwise parallel pipeline or requiring high-dimensional state.

C3 · one-line summary

OSDN adds online diagonal preconditioning to the Delta Rule, preserving chunkwise parallelism while proving super-geometric convergence and delivering 32-39% recall gains at 340M-1.3B scales.
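
As a rough illustration of what "a diagonal preconditioner equivalent to per-feature key scaling" can mean, the sketch below applies one Delta Rule write to the inner regression loss 0.5 * ||S k - v||^2, with the key scaled elementwise by a fixed vector p. The names S, k, v, beta, and p are assumptions introduced here for illustration, and p is held constant: the paper's online hypergradient update for the preconditioner is not described in this record and is not reproduced.

```python
# Illustrative sketch only, not the authors' implementation.
# Assumption: the preconditioner acts on the right, i.e. it rescales the key
# per feature in the rank-one write. p is fixed here; OSDN learns it online.

def matvec(S, k):
    """Return S @ k for a matrix stored as a list of rows."""
    return [sum(s_ij * k_j for s_ij, k_j in zip(row, k)) for row in S]

def delta_step(S, k, v, beta, p):
    """One step on 0.5 * ||S k - v||^2 with the key scaled per feature by p."""
    residual = [pred_i - v_i for pred_i, v_i in zip(matvec(S, k), v)]
    scaled_key = [p_j * k_j for p_j, k_j in zip(p, k)]
    # S <- S - beta * residual * scaled_key^T  (rank-one update)
    return [
        [s_ij - beta * r_i * q_j for s_ij, q_j in zip(row, scaled_key)]
        for row, r_i in zip(S, residual)
    ]

def residual_norm(S, k, v):
    """Euclidean norm of the recall residual S k - v."""
    return sum((a - b) ** 2 for a, b in zip(matvec(S, k), v)) ** 0.5

S0 = [[0.0, 0.0], [0.0, 0.0]]   # empty memory state
k = [1.0, 0.5]                  # key
v = [2.0, -1.0]                 # value to store
beta = 0.5                      # write strength
p = [1.0, 2.0]                  # hypothetical per-feature scales

S1 = delta_step(S0, k, v, beta, p)
before = residual_norm(S0, k, v)
after = residual_norm(S1, k, v)
# Predicted contraction factor for this token: |1 - beta * sum_j p_j * k_j^2|
contraction = abs(1.0 - beta * sum(p_j * k_j * k_j for p_j, k_j in zip(p, k)))
print(before, after, contraction)
```

Under these assumptions the residual on the written token shrinks by exactly |1 - beta * sum_j p_j * k_j^2|, which is 0.25 for the values above. With p set to all ones the update reduces to the plain Delta Rule.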

References

73 extracted · 73 resolved · 5 Pith anchors

[1] What learning algorithm is in-context learning? investigations with linear models 2023
[2] Zoology: Measuring and improving recall in efficient language models 2024
[3] Simple linear attention language models balance the recall-throughput tradeoff 2024
[4] Just read twice: closing the recall gap for recurrent language models 2024
[5] Hinton, Volodymyr Mnih, Joel Z 2016

Receipt and verification
First computed 2026-05-18T02:44:41.519789Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

ae8f5129ce888493a089f5a64947ccee65eaccf648772ff79bf2e6f538521557

Aliases

arxiv: 2605.13473 · arxiv_version: 2605.13473v1 · doi: 10.48550/arxiv.2605.13473 · pith_short_12: V2HVCKOORCCJ · pith_short_16: V2HVCKOORCCJHIEJ · pith_short_8: V2HVCKOO
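
The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical SHA-256 digest, and the `pith_short_*` aliases are its prefixes. This is inferred from the values shown here, not from a stated specification, so treat the sketch below as an observation about this record; the function name `pith_number_from_hash` is hypothetical.

```python
import base64

CANONICAL_HASH = "ae8f5129ce888493a089f5a64947ccee65eaccf648772ff79bf2e6f538521557"

def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]

full = pith_number_from_hash(CANONICAL_HASH)
# Short aliases are prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}

print(full)  # V2HVCKOORCCJHIEJ6WTESR6M5Z
print(aliases)
```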
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/V2HVCKOORCCJHIEJ6WTESR6M5Z \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ae8f5129ce888493a089f5a64947ccee65eaccf648772ff79bf2e6f538521557
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "5093d0f0e9ba463cb4be5369fa3e04276e01fe1791c1dbd14e4898ace465fb14",
    "cross_cats_sorted": [
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T12:59:26Z",
    "title_canon_sha256": "8afc6aea63fda48a49767a60fbfc6e89d4c2a22468ae4b510ed73d342aa7b9e1"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13473",
    "kind": "arxiv",
    "version": 1
  }
}
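
The verification one-liner above canonicalizes the record by serializing it with sorted keys, compact separators, and no ASCII escaping, then hashing the UTF-8 bytes with SHA-256. The same steps can be run offline against the JSON shown on this page. This is a minimal sketch that assumes the record printed here is byte-for-byte what the API returns under `canonical_record`; the function name `canonical_hash` is hypothetical.

```python
import hashlib
import json

EXPECTED = "ae8f5129ce888493a089f5a64947ccee65eaccf648772ff79bf2e6f538521557"

def canonical_hash(record: dict) -> str:
    """SHA-256 of the record serialized with sorted keys and compact separators."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

record = {
    "metadata": {
        "abstract_canon_sha256": "5093d0f0e9ba463cb4be5369fa3e04276e01fe1791c1dbd14e4898ace465fb14",
        "cross_cats_sorted": ["cs.CL"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-13T12:59:26Z",
        "title_canon_sha256": "8afc6aea63fda48a49767a60fbfc6e89d4c2a22468ae4b510ed73d342aa7b9e1",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13473", "kind": "arxiv", "version": 1},
}

digest = canonical_hash(record)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the record's fields were written, which is what lets a mirror recompute the same value.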