Pith Number

pith:B7KD4YH3

pith:2024:B7KD4YH37PEWBU7OB26KIDX3WY
not attested · not anchored · not stored · refs resolved

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Alon Albalak, Atsushi Saito, Bartłomiej Koptyra, Bingchen Zhao, Bo Peng, Cahya Wirawan, Daniel Goldstein, Eric Alcaide, Eugene Cheah, Fares Obeid, Guangyu Song, Haoqin Tu, Haowen Hou, Jan Kocoń, Jiaju Lin, Jian Zhu, Kranthi Kiran GV, Niklas Muennighoff, Peng Zhou, Przemysław Kazienko, Qihang Zhao, Quentin Anthony, Ronald McClelland Jr., Ruichong Zhang, Rui-Jie Zhu, Satyapriya Krishna, Stanisław Woźniak, Stella Biderman, Teddy Ferdinan, Xingjian Du

Matrix-valued states and dynamic recurrence let updated RWKV models reach competitive benchmark performance while keeping RNN inference speed.

arxiv:2404.05892 v4 · 2024-04-08 · cs.CL · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{B7KD4YH37PEWBU7OB26KIDX3WY}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

We trained four Eagle models, ranging from 0.46 to 7.5 billion parameters, and two Finch models with 1.6 and 3.1 billion parameters and find that they achieve competitive performance across a wide variety of benchmarks.

C2 weakest assumption

That the observed benchmark performance stems primarily from the matrix-valued states and dynamic recurrence rather than from the scale of the new 1.12-trillion-token corpus, tokenizer changes, or other unablated training choices.

C3 one-line summary

Eagle and Finch enhance RWKV with matrix-valued states and dynamic recurrence, trained on a 1.12-trillion-token multilingual corpus, and report competitive performance on standard benchmarks.

References

59 extracted · 59 resolved · 2 Pith anchors

Formal links

3 machine-checked theorem links

Cited by

20 papers in Pith

Receipt and verification
First computed 2026-05-17T23:38:12.646019Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

0fd43e60fbfbc960d3ee0ebca40efbb60b9ec47ea7bf337cdada864afbe7879b

Aliases

arxiv: 2404.05892 · arxiv_version: 2404.05892v4 · doi: 10.48550/arxiv.2404.05892 · pith_short_12: B7KD4YH37PEW · pith_short_16: B7KD4YH37PEWBU7O · pith_short_8: B7KD4YH3
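The short aliases above are prefixes of the full 26-character Pith Number. For this record, the full number also coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The sketch below checks both relationships. The base32 link is an observation about this one record, not a documented rule, and the helper names `pith_from_hash` and `short_alias` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "0fd43e60fbfbc960d3ee0ebca40efbb60b9ec47ea7bf337cdada864afbe7879b"
PITH_NUMBER = "B7KD4YH37PEWBU7OB26KIDX3WY"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: inferred from this single record, not from a published spec.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_alias(pith_number: str, length: int) -> str:
    """Return the prefix alias of the given length (8, 12, or 16 here)."""
    return pith_number[:length]


if __name__ == "__main__":
    print("hash maps to number:", pith_from_hash(CANONICAL_HASH) == PITH_NUMBER)
    for n in (8, 12, 16):
        print(f"pith_short_{n}:", short_alias(PITH_NUMBER, n))
```

Because the aliases are plain prefixes, a shorter alias can in principle match more than one record; the full 26-character form is the safer lookup key.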
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/B7KD4YH37PEWBU7OB26KIDX3WY \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 0fd43e60fbfbc960d3ee0ebca40efbb60b9ec47ea7bf337cdada864afbe7879b
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "64fa49491abe0152dd19bd1c90a34ac52150a7c69a047b9e9385d85718405466",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2024-04-08T22:20:59Z",
    "title_canon_sha256": "2f2257ddd3304956e8d5d6057363a5d9905c3bc11b60c4a7be995006f13926b2"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2404.05892",
    "kind": "arxiv",
    "version": 4
  }
}
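The verification command above amounts to re-serializing the canonical record with sorted keys and compact separators, then hashing the UTF-8 bytes with SHA-256. A minimal offline sketch of that step, applied to the record shown above, might look like the following. `canonical_bytes` and `canonical_sha256` are hypothetical helper names, and whether the digest matches the published canonical hash should be confirmed by running the code rather than assumed.

```python
import hashlib
import json

# Canonical record copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "64fa49491abe0152dd19bd1c90a34ac52150a7c69a047b9e9385d85718405466",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2024-04-08T22:20:59Z",
    "title_canon_sha256": "2f2257ddd3304956e8d5d6057363a5d9905c3bc11b60c4a7be995006f13926b2"
  },
  "schema_version": "1.0",
  "source": {"id": "2404.05892", "kind": "arxiv", "version": 4}
}
"""

# Hash published on this page, used only for comparison.
EXPECTED = "0fd43e60fbfbc960d3ee0ebca40efbb60b9ec47ea7bf337cdada864afbe7879b"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with the same json.dumps options as the page's one-liner."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(json.loads(RECORD_JSON))
    print(digest)
    print("matches published hash:", digest == EXPECTED)
```

Sorting keys and fixing the separators makes the serialization independent of key order and whitespace in the input, which is what lets any mirror recompute the same digest from the same record.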