pith:K4CIF6EL
MPM: Mutual Pair Merging for Efficient Vision Transformers
Mutual Pair Merging shortens vision transformer sequences for semantic segmentation by averaging mutual nearest-neighbor token pairs while preserving reconstruction for existing decoders.
arxiv:2604.05718 v1 · 2026-04-07 · cs.CV
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{K4CIF6ELOWVEHIM7A4BOYW5EGV}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
On ADE20K, MPM reduces per-image latency by up to 60% for ViT-Tiny on Raspberry Pi 5, and increases throughput by up to 20% on H100 with FlashAttention-2 while keeping the mIoU drop below 3%.
These gains rest on two assumptions: that the overhead of computing mutual nearest-neighbor pairs and of the subsequent gather-based reconstruction stays small enough on the target hardware to yield a net latency gain, and that the merge map lets existing segmentation heads be used unchanged without further accuracy degradation.
MPM merges mutual nearest-neighbor token pairs in cosine space for ViTs, records a merge map for reconstruction, and delivers up to 60% latency reduction on Raspberry Pi 5 and 20% throughput gain on H100 with under 3% mIoU drop on ADE20K.
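The merge-and-reconstruct idea summarized above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `mutual_pair_merge` and `reconstruct` are hypothetical, and the sketch assumes a single unbatched token matrix with non-zero rows, merges every mutual pair it finds, and takes similarity from the tokens themselves. The layer at which the paper merges, the features it compares, and how many pairs it merges are not stated in this record.

```python
import numpy as np


def mutual_pair_merge(tokens):
    """Average mutual nearest-neighbor token pairs (illustrative sketch).

    tokens: (n, d) array with non-zero rows (assumption).
    Returns (merged, merge_map), where merge_map[i] is the row of `merged`
    that original token i was mapped to.
    """
    n = tokens.shape[0]
    # Cosine similarity between every pair of tokens.
    normed = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, -np.inf)  # a token is not its own neighbor

    nn = sim.argmax(axis=1)         # nearest neighbor of each token
    idx = np.arange(n)
    mutual = nn[nn] == idx          # i and nn[i] point at each other
    owner = mutual & (idx < nn)     # lower index keeps the merged slot
    absorbed = mutual & (idx > nn)  # higher index is folded into the owner

    keep = ~absorbed
    new_pos = np.cumsum(keep) - 1   # row of each kept token after shortening
    # Absorbed tokens point at their owner's row; all others at their own row.
    merge_map = np.where(absorbed, new_pos[nn], new_pos)

    merged = tokens[keep].copy()
    merged[new_pos[owner]] = 0.5 * (tokens[owner] + tokens[nn[owner]])
    return merged, merge_map


def reconstruct(merged, merge_map):
    """Gather rows back to the original sequence length."""
    return merged[merge_map]
```

Because `reconstruct` is a single gather, a decoder that expects the original token count can consume its output without modification, which is the property the record attributes to the merge map.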
Receipt and verification
| First computed | 2026-06-03T01:05:50.228775Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
570482f88b75aa43a19f0702ec5ba435602d63f91baae81575c8c22a8e57762a
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/K4CIF6ELOWVEHIM7A4BOYW5EGV \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 570482f88b75aa43a19f0702ec5ba435602d63f91baae81575c8c22a8e57762a
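For readers who prefer a script to the one-liner, the hashing step can be written as a small Python function. This is a sketch that mirrors the serialization options used in the command above (sorted keys, compact separators, no ASCII escaping); the name `canonical_sha256` is hypothetical and not part of any Pith tooling.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a JSON-compatible object using the same serialization as the one-liner."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    ).encode()                   # str.encode() defaults to UTF-8
    return hashlib.sha256(payload).hexdigest()
```

Passing the parsed canonical record to this function should reproduce the canonical hash, provided the record matches the one served by the endpoint.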
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "7968cca217d129e3bae8b2cd1b9e35f83ac114439fddf5cb883732628f028cbd",
"cross_cats_sorted": [],
"license": "http://creativecommons.org/licenses/by-sa/4.0/",
"primary_cat": "cs.CV",
"submitted_at": "2026-04-07T11:16:18Z",
"title_canon_sha256": "9cae031688c14722c93202bc3e868fec19939cee50f2666bf964b449622204af"
},
"schema_version": "1.0",
"source": {
"id": "2604.05718",
"kind": "arxiv",
"version": 1
}
}