Pith Number

pith:4TZ3RWFY

pith:2026:4TZ3RWFY4YUNRXZDRHBU4QWBXN
not attested · not anchored · not stored · refs resolved

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

Jaewon Jang, Junghoon Shin, Minseo Kim, Minsik Kim, Sunyoung Choi, Taebong Kim, Youngsik Hong

Evolutionary merging of existing language-model checkpoints yields models that outperform their parents on reasoning benchmarks without any gradient-based training.

arxiv:2605.14386 v1 · 2026-05-14 · cs.NE · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{4TZ3RWFY4YUNRXZDRHBU4QWBXN}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

The flagship Darwin-27B-Opus achieves 86.9% on GPQA Diamond, ranking #6 among 1,252 evaluated models, and outperforming its fully trained foundation model without any gradient-based training.

C2 · weakest assumption

That the evolutionary recombination guided by the 14-dimensional merge genome and MRI-Trust Fusion can reliably reorganize latent reasoning capabilities already present in existing checkpoints without loss of coherence or introduction of new failure modes.

C3 · one-line summary

Evolutionary merging with a 14-dimensional genome and MRI-Trust Fusion produces models that outperform their trained parents on reasoning benchmarks without any gradient updates.

References

31 extracted · 31 resolved · 3 Pith anchors

[1] Chain-of-thought prompting elicits reasoning in large language models. 2022
[2] Takeshi Kojima, Shixiang Gu, M. Reid, et al. Large language models are zero-shot reasoners. In Neural Information Processing Systems, 2022
[3] Self-consistency improves chain-of-thought reasoning in language models. 2023
[4] Least-to-most prompting enables complex reasoning in large language models. 2023
[5] BERT rediscovers the classical NLP pipeline. 2019

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-17T23:39:07.667925Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

e4f3b8d8b8e628d8df2389c34e42c1bb6858967182d0c392ae44719addf25ed4

Aliases

arxiv: 2605.14386 · arxiv_version: 2605.14386v1 · doi: 10.48550/arxiv.2605.14386 · pith_short_12: 4TZ3RWFY4YUN · pith_short_16: 4TZ3RWFY4YUNRXZD · pith_short_8: 4TZ3RWFY
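The short aliases above are plain prefixes of the full 26-character Pith Number. On this record, that number also matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. The sketch below checks both relationships. The derivation rule is inferred from the values shown on this page, not quoted from the Pith schema, and the function name `pith_number_from_hash` is hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "e4f3b8d8b8e628d8df2389c34e42c1bb6858967182d0c392ae44719addf25ed4"
PITH_NUMBER = "4TZ3RWFY4YUNRXZDRHBU4QWBXN"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Assumed rule: Base32-encode the SHA-256 digest, keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_number_from_hash(CANONICAL_HASH)

# The short aliases are prefixes of the full identifier.
aliases = {n: PITH_NUMBER[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)
print(aliases)
```

If the assumed rule holds for other records as well, a mirror could recompute the identifier from the canonical hash alone; that generalization is not confirmed by this page.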
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/4TZ3RWFY4YUNRXZDRHBU4QWBXN \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e4f3b8d8b8e628d8df2389c34e42c1bb6858967182d0c392ae44719addf25ed4
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "b2edeca0f92cfa052adfac5ff244477531547f1ea36f88c4a7da21f1c172fd40",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.NE",
    "submitted_at": "2026-05-14T05:09:12Z",
    "title_canon_sha256": "c461aa8820472e61db747a2d0a7182cb1675e71409a9b4300174836fc1f5f49f"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14386",
    "kind": "arxiv",
    "version": 1
  }
}
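The shell one-liner under "Verify this Pith Number yourself" can also be written as a short Python function, which is easier to reuse offline. This is a minimal sketch that mirrors the serialization flags of that command (`sort_keys=True`, compact separators, `ensure_ascii=False`). The function name `canonical_sha256` is hypothetical, and the record literal is copied from the JSON shown above.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    blob = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "b2edeca0f92cfa052adfac5ff244477531547f1ea36f88c4a7da21f1c172fd40",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.NE",
        "submitted_at": "2026-05-14T05:09:12Z",
        "title_canon_sha256": "c461aa8820472e61db747a2d0a7182cb1675e71409a9b4300174836fc1f5f49f",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14386", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare against the canonical hash listed on this page
```

Because keys are sorted before hashing, the order in which the record's fields arrive does not affect the digest.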