Pith Number

pith:4TZ3RWFY

pith:2026:4TZ3RWFY4YUNRXZDRHBU4QWBXN

not attested · not anchored · not stored · refs resolved

Darwin Family: MRI-Trust-Weighted Evolutionary Merging for Training-Free Scaling of Language-Model Reasoning

Jaewon Jang, Junghoon Shin, Minseo Kim, Minsik Kim, Sunyoung Choi, Taebong Kim, Youngsik Hong

Evolutionary merging of existing language model checkpoints produces superior reasoning performance without any training.

arxiv:2605.14386 v1 · 2026-05-14 · cs.NE · cs.AI

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{4TZ3RWFY4YUNRXZDRHBU4QWBXN}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
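
The page does not describe the bundle's event format or its merge rule, so the following is only an illustrative sketch of what "deterministic merge" can mean: every mirror deduplicates events, puts them in the same total order, and folds them into a state. The field names (`id`, `at`, `field`, `value`) and the last-writer-wins rule are hypothetical, and signature checking is omitted.

```python
def merge_events(*event_sets):
    """Fold events into a state that does not depend on arrival order or duplicates.

    Hypothetical event shape: {"id", "at", "field", "value"}; not the real bundle schema.
    """
    # Deduplicate by event id so receiving the same event twice changes nothing.
    unique = {e["id"]: e for events in event_sets for e in events}
    state = {}
    # Total order: timestamp first, event id as tie-breaker.
    # ISO-8601 UTC strings in one fixed format sort correctly as plain strings.
    for e in sorted(unique.values(), key=lambda e: (e["at"], e["id"])):
        state[e["field"]] = e["value"]  # last writer wins under the total order
    return state


mirror_a = [
    {"id": "e1", "at": "2026-05-17T23:39:07Z", "field": "status", "value": "computed"},
    {"id": "e3", "at": "2026-05-19T08:00:00Z", "field": "status", "value": "anchored"},
]
mirror_b = [
    {"id": "e2", "at": "2026-05-18T10:00:00Z", "field": "claim", "value": "open"},
    {"id": "e1", "at": "2026-05-17T23:39:07Z", "field": "status", "value": "computed"},
]
state = merge_events(mirror_a, mirror_b)
```

Because the fold depends only on the set of events and not on the order in which they arrive, two mirrors holding the same events reach the same state.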

Claims

C1 · strongest claim

The flagship Darwin-27B-Opus achieves 86.9% on GPQA Diamond, ranking #6 among 1,252 evaluated models, and outperforming its fully trained foundation model without any gradient-based training.

C2 · weakest assumption

That the evolutionary recombination guided by the 14-dimensional merge genome and MRI-Trust Fusion can reliably reorganize latent reasoning capabilities already present in existing checkpoints without loss of coherence or introduction of new failure modes.

C3 · one line summary

Evolutionary merging with a 14-dimensional genome and MRI-Trust Fusion produces models that outperform their trained parents on reasoning benchmarks without any gradient updates.
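
As a rough illustration of gradient-free merging, the toy below evolves a 14-value genome whose entries are treated as per-block interpolation coefficients between two parent checkpoints. This reading of the genome, the toy fitness function, and all names are assumptions; the sketch does not implement MRI-Trust Fusion or the paper's actual procedure.

```python
import random

GENOME_DIM = 14  # size taken from the record; what each dimension means is assumed here


def merge(parent_a, parent_b, genome):
    """Blend block i of the parents with coefficient genome[i] (0 -> A, 1 -> B)."""
    return [
        [(1.0 - g) * a + g * b for a, b in zip(block_a, block_b)]
        for g, block_a, block_b in zip(genome, parent_a, parent_b)
    ]


def fitness(model, target):
    """Toy stand-in for a benchmark score: negative squared distance to a target."""
    return -sum(
        (w - t) ** 2
        for block, t_block in zip(model, target)
        for w, t in zip(block, t_block)
    )


def evolve(parent_a, parent_b, target, generations=40, pop_size=16, seed=0):
    rng = random.Random(seed)

    def score(genome):
        return fitness(merge(parent_a, parent_b, genome), target)

    # Start with both pure parents so elitism can never end up worse than either.
    population = [[0.0] * GENOME_DIM, [1.0] * GENOME_DIM]
    population += [[rng.random() for _ in range(GENOME_DIM)] for _ in range(pop_size - 2)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        elite = population[: pop_size // 4]  # survivors are carried over unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            mom, dad = rng.choice(elite), rng.choice(elite)
            child = [rng.choice(pair) for pair in zip(mom, dad)]  # uniform crossover
            # Gaussian mutation, clamped to the valid coefficient range [0, 1].
            child = [min(1.0, max(0.0, g + rng.gauss(0.0, 0.1))) for g in child]
            children.append(child)
        population = elite + children
    return max(population, key=score)


data_rng = random.Random(1)
parent_a = [[data_rng.uniform(-1, 1) for _ in range(4)] for _ in range(GENOME_DIM)]
parent_b = [[data_rng.uniform(-1, 1) for _ in range(4)] for _ in range(GENOME_DIM)]
# The hidden optimum is itself a blend of the parents, so merging can beat both.
target = merge(parent_a, parent_b, [data_rng.random() for _ in range(GENOME_DIM)])
best_genome = evolve(parent_a, parent_b, target)
best_score = fitness(merge(parent_a, parent_b, best_genome), target)
```

No gradients are computed anywhere: the search only scores candidate merges and keeps the best, which is the sense in which such a procedure is training-free.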

References

31 extracted · 31 resolved · 3 Pith anchors

[1] Chain-of-thought prompting elicits reasoning in large language models, 2022

[2] Takeshi Kojima, Shixiang Gu, M. Reid, et al. Large language models are zero-shot reasoners. In Neural Information Processing Systems, 2022

[3] Self-consistency improves chain-of-thought reasoning in language models, 2023

[4] Least-to-most prompting enables complex reasoning in large language models, 2023

[5] BERT rediscovers the classical NLP pipeline, 2019

Formal links

2 machine-checked theorem links

Receipt and verification

First computed	2026-05-17T23:39:07.667925Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

e4f3b8d8b8e628d8df2389c34e42c1bb6858967182d0c392ae44719addf25ed4

Aliases

arxiv: 2605.14386 · arxiv_version: 2605.14386v1 · doi: 10.48550/arxiv.2605.14386 · pith_short_12: 4TZ3RWFY4YUN · pith_short_16: 4TZ3RWFY4YUNRXZD · pith_short_8: 4TZ3RWFY
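
In this record, the 26-character Pith Number coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash, and the `pith_short_*` aliases are prefixes of it. Whether the builder defines the identifier this way in general is an assumption; the sketch below only checks the relationship for the values shown on this page, and `pith_from_hash` is a hypothetical helper name.

```python
import base64

CANONICAL_HASH = "e4f3b8d8b8e628d8df2389c34e42c1bb6858967182d0c392ae44719addf25ed4"
PITH_NUMBER = "4TZ3RWFY4YUNRXZDRHBU4QWBXN"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the leading characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = pith_from_hash(CANONICAL_HASH)
# Short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}
```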

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/4TZ3RWFY4YUNRXZDRHBU4QWBXN \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e4f3b8d8b8e628d8df2389c34e42c1bb6858967182d0c392ae44719addf25ed4

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "b2edeca0f92cfa052adfac5ff244477531547f1ea36f88c4a7da21f1c172fd40",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.NE",
    "submitted_at": "2026-05-14T05:09:12Z",
    "title_canon_sha256": "c461aa8820472e61db747a2d0a7182cb1675e71409a9b4300174836fc1f5f49f"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14386",
    "kind": "arxiv",
    "version": 1
  }
}
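
The canonicalisation step of the shell pipeline under "Verify this Pith Number yourself" can also be run offline. This sketch mirrors the one-liner's settings (sorted keys, compact separators, non-ASCII kept as-is, UTF-8 bytes) and applies them to the record as listed above. It has not been run against the live endpoint, so it does not assert that the digest equals the published canonical hash; `canonical_sha256` is a hypothetical helper name.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialise with sorted keys and no extra whitespace, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = {
    "metadata": {
        "abstract_canon_sha256": "b2edeca0f92cfa052adfac5ff244477531547f1ea36f88c4a7da21f1c172fd40",
        "cross_cats_sorted": ["cs.AI"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.NE",
        "submitted_at": "2026-05-14T05:09:12Z",
        "title_canon_sha256": "c461aa8820472e61db747a2d0a7182cb1675e71409a9b4300174836fc1f5f49f",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14386", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)  # compare by eye with the "Canonical hash" value on this page
```

Sorting keys before hashing is what makes the digest independent of the order in which a server or mirror happens to emit the fields.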