Pith Number

pith:GCGY5GYL

pith:2026:GCGY5GYL2WNFEH7SUV3CPVEPVC
not attested · not anchored · not stored · refs pending

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

Chengwei Qin, Chen Liu, Chonghan Liu, Hanzhen Zhao, Hao Tang, Hui Xiong, Shuicheng Yan, Wenjie Zhang, Xiaobin Hu, Xiaomin Yu, Xiaoxing Hu, Yi Xin, Yuhui Zhang, Yu Qiao, Ziyue Qiao

ReAlign aligns text embeddings to image distributions via a training-free three-step process using unpaired data, letting MLLMs pretrain without paired image-text examples.

arxiv:2602.07026 v3 · 2026-02-02 · cs.CV · cs.AI · cs.MM

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{GCGY5GYL2WNFEH7SUV3CPVEPVC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

ReAlign, a training-free three-step procedure (Anchor, Trace, Centroid Alignment) that uses statistics from massive unpaired data, explicitly rectifies geometric misalignment so that unpaired text can substitute for paired image-text data in MLLM pretraining.

C2 · weakest assumption

The Fixed-frame Modality Gap Theory assumes two things: that the decomposition into stable biases and anisotropic residuals remains valid when the reference frame is frozen, and that statistics computed from unpaired data accurately capture the target image distribution without introducing new distortions.

C3 · one-line summary

ReAlign corrects the modality gap in unpaired data to let MLLMs learn visual distributions from text alone before instruction tuning, reducing dependence on expensive paired corpora.

Formal links

1 machine-checked theorem link

Cited by

6 papers in Pith

Receipt and verification
First computed: 2026-06-08T01:03:58.232218Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

308d8e9b0bd59a521ff2a57627d48fa89f245c840d23165effe84d74f3930f9e

Aliases

arxiv: 2602.07026 · arxiv_version: 2602.07026v3 · doi: 10.48550/arxiv.2602.07026 · pith_short_12: GCGY5GYL2WNF · pith_short_16: GCGY5GYL2WNFEH7S · pith_short_8: GCGY5GYL
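
The identifiers on this page appear to be related by simple encodings. The values shown are consistent with the Pith Number being the first 26 characters of the RFC 4648 base32 encoding of the canonical hash, and with each pith_short_N alias being the first N characters of that number. This is an observation drawn from this one record, not a documented rule, and the function names below are hypothetical. A minimal Python sketch under that assumption:

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "308d8e9b0bd59a521ff2a57627d48fa89f245c840d23165effe84d74f3930f9e"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: inferred from the values on this page, not from a published spec.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


def short_aliases(pith_number: str, lengths=(8, 12, 16)) -> dict:
    """Build pith_short_N aliases as prefixes of the full number (assumed rule)."""
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}


number = pith_number_from_hash(CANONICAL_HASH)
aliases = short_aliases(number)
print(number)
print(aliases)
```

For the hash above, the sketch reproduces GCGY5GYL2WNFEH7SUV3CPVEPVC and the three short aliases listed here. Whether the relationship holds for other records is not something this page confirms.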
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/GCGY5GYL2WNFEH7SUV3CPVEPVC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 308d8e9b0bd59a521ff2a57627d48fa89f245c840d23165effe84d74f3930f9e
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "396849b50901b5c2d18b42f454bf27b4f63b4e4caa7b19452aa641d8a0379828",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.MM"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-02-02T13:59:39Z",
    "title_canon_sha256": "a3a004a856edd2a7c835034529bccf638be46badc1f9285774d0bc71a3b3d631"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2602.07026",
    "kind": "arxiv",
    "version": 3
  }
}
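
The shell one-liner under "Verify this Pith Number yourself" re-serializes the record with sorted keys and compact separators before hashing it with SHA-256. The same canonicalization can be sketched in plain Python against the record shown above, without a network call. The helper names are hypothetical, and the idea that the service hashes exactly these bytes is an assumption taken from that one-liner, so compare the printed digest against the Canonical hash rather than taking a match for granted.

```python
import hashlib
import json

# Canonical record copied verbatim from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "396849b50901b5c2d18b42f454bf27b4f63b4e4caa7b19452aa641d8a0379828",
        "cross_cats_sorted": ["cs.AI", "cs.MM"],
        "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2026-02-02T13:59:39Z",
        "title_canon_sha256": "a3a004a856edd2a7c835034529bccf638be46badc1f9285774d0bc71a3b3d631",
    },
    "schema_version": "1.0",
    "source": {"id": "2602.07026", "kind": "arxiv", "version": 3},
}


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no whitespace, as the one-liner does."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


digest = canonical_sha256(RECORD)
print(digest)
```

Sorting keys and fixing the separators makes the byte string independent of key order and pretty-printing, which is what lets two parties hash the same record and arrive at the same digest.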