Pith Number

pith:5GDYLWOL

pith:2026:5GDYLWOL7NGIY3SRLTSKHFM63X

not attested · not anchored · not stored · refs resolved

Bayesian Model Merging

Kaiyang Li, Qing Su, Shaobo Han, Shihao Ji

Bayesian Model Merging fuses task-specific models into one via inner Bayesian regression under anchor priors and outer Bayesian optimization of per-module hyperparameters.

arxiv:2605.12843 v1 · 2026-05-13 · cs.LG · cs.AI

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{5GDYLWOL7NGIY3SRLTSKHFM63X}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Across extensive benchmarks, including up to 20-task merging in vision and 5-task merging in language, BMM consistently outperforms all plug-and-play anchor baselines (e.g., TA, WUDI-Merging, and TSV). In particular, on the ViT-L/14 benchmark for 8-task merging, a single merged model reaches 95.1, closely matching the average performance of eight task-specific experts (95.8).

C2 · weakest assumption

The claimed alignment between activation statistics and task vectors that enables the data-free Gram-matrix estimation, together with the assumption that the inner-level Bayesian regression under the anchor prior produces a solution that generalizes without hidden post-hoc adjustments.

C3 · one-line summary

Bayesian Model Merging introduces a bi-level optimization framework that merges task-specific models via closed-form Bayesian regression with an anchor prior and global hyperparameter search, outperforming baselines and nearly matching expert averages on up to 20-task vision and 5-task language Merg

References

66 extracted · 66 resolved · 11 Pith anchors

Formal links

1 machine-checked theorem link

Receipt and verification

First computed	2026-05-18T03:09:11.934066Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`)
Schema	pith-number/v1.0

Canonical hash

e98785d9cbfb4c8c6e515ce4a3959eddff2175548d26e17b1e05dfdf0e48e916

Aliases

arxiv: 2605.12843 · arxiv_version: 2605.12843v1 · doi: 10.48550/arxiv.2605.12843 · pith_short_12: 5GDYLWOL7NGI · pith_short_16: 5GDYLWOL7NGIY3SR · pith_short_8: 5GDYLWOL
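The short aliases above are plain prefixes of the full Pith Number. For this record, the full number also coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. That second relationship is an observation from the values on this page, not a documented rule of the scheme, so treat the sketch below as a consistency check. The helper names `base32_prefix` and `short_aliases` are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "e98785d9cbfb4c8c6e515ce4a3959eddff2175548d26e17b1e05dfdf0e48e916"
PITH_NUMBER = "5GDYLWOL7NGIY3SRLTSKHFM63X"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: base32-encode the digest bytes and truncate.

    Assumption: the identifier is a base32 prefix of the SHA-256 digest.
    This holds for the values on this page; the scheme itself is not
    specified here.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str) -> dict:
    """Hypothetical helper: build the 8-, 12- and 16-character aliases."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


derived = base32_prefix(CANONICAL_HASH)
aliases = short_aliases(PITH_NUMBER)

print(derived == PITH_NUMBER)  # True for this record
print(aliases)
```

A mismatch in either check would point to a copy error in the hash or the identifier, which makes this a cheap sanity test before running the full network verification.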

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/5GDYLWOL7NGIY3SRLTSKHFM63X \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: e98785d9cbfb4c8c6e515ce4a3959eddff2175548d26e17b1e05dfdf0e48e916

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "c31255ea55fe59d514aec810f8bb685f645b85e58b45ffc7ddd80805d91a6a11",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T00:36:47Z",
    "title_canon_sha256": "7d511b61ff1cd74c70bf56c33d21e3963dd51d07833af05b906bc12ee40a9b8d"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12843",
    "kind": "arxiv",
    "version": 1
  }
}
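The verification command earlier on this page canonicalizes the record (sorted keys, compact separators, no ASCII escaping) and hashes the UTF-8 bytes with SHA-256. A minimal offline sketch of the same steps, applied to the JSON shown above, might look like the following. The function name `canonical_sha256` is hypothetical. The result should equal the published canonical hash only if the JSON displayed here is exactly the `canonical_record` the resolver serves, which this sketch does not check.

```python
import hashlib
import json

# The canonical record as displayed on this page.
RECORD_TEXT = """
{
  "metadata": {
    "abstract_canon_sha256": "c31255ea55fe59d514aec810f8bb685f645b85e58b45ffc7ddd80805d91a6a11",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T00:36:47Z",
    "title_canon_sha256": "7d511b61ff1cd74c70bf56c33d21e3963dd51d07833af05b906bc12ee40a9b8d"
  },
  "schema_version": "1.0",
  "source": {"id": "2605.12843", "kind": "arxiv", "version": 1}
}
"""


def canonical_sha256(record: dict) -> str:
    """Hypothetical helper mirroring the page's one-liner.

    Serializes with sorted keys and compact separators, then returns the
    SHA-256 hex digest of the UTF-8 bytes.
    """
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = json.loads(RECORD_TEXT)
digest = canonical_sha256(record)
print(digest)  # compare by eye with the "Canonical hash" field
```

Because keys are sorted before hashing, the order and whitespace in which a mirror stores the record do not affect the digest. Only the values do.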