Pith Number

pith:QIJS7POP

pith:2026:QIJS7POPXR6RYWPL2KUE7Y3VYC
not attested · not anchored · not stored · refs resolved

Utility-Oriented Visual Evidence Selection for Multimodal Retrieval-Augmented Generation

Haofeng Zhang, Weiqing Luo, Xiao Wang, Zhiyuan Yu, Ziyi Huang, Zongye Hu

Ranking visual evidence by information gain on a latent helpfulness variable matches its answer-space utility in multimodal RAG

arxiv:2605.13277 v1 · 2026-05-13 · cs.CL · cs.AI · cs.CV · cs.IR · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{QIJS7POPXR6RYWPL2KUE7Y3VYC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

ranking evidence by information gain on this latent variable is equivalent to answer-space utility

C2 weakest assumption

under mild assumptions

C3 one line summary

Evidence utility is defined as information gain on the model's output distribution, with ranking by gain on a latent helpfulness variable shown equivalent to answer-space utility under mild assumptions, enabling a training-free surrogate framework that outperforms baselines.
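The summary above can be illustrated with a small sketch. This record does not give the paper's exact formulas, so the code assumes one common reading of "information gain on the model's output distribution": the drop in Shannon entropy of the answer distribution once a piece of evidence is added. The function names, the evidence ids, and the probabilities are all hypothetical, and the latent helpfulness surrogate itself is not reproduced here.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution given as probabilities."""
    return -sum(x * math.log(x) for x in p if x > 0)

def information_gain(prior, posterior):
    """Assumed utility: entropy of the answer distribution without the evidence
    minus its entropy with the evidence."""
    return entropy(prior) - entropy(posterior)

def rank_evidence(prior, posteriors):
    """Return evidence ids sorted by descending information gain."""
    gains = {eid: information_gain(prior, post) for eid, post in posteriors.items()}
    return sorted(gains, key=gains.get, reverse=True)

# Hypothetical answer distributions over four candidate answers.
prior = [0.25, 0.25, 0.25, 0.25]           # model output with no visual evidence
posteriors = {
    "img_a": [0.85, 0.05, 0.05, 0.05],     # sharply concentrates the answer
    "img_b": [0.40, 0.30, 0.20, 0.10],     # mildly informative
    "img_c": [0.25, 0.25, 0.25, 0.25],     # changes nothing
}

ranking = rank_evidence(prior, posteriors)
print(ranking)
```

Under these assumed numbers the most concentrating image ranks first and the uninformative one last. Entropy reduction is only one possible instantiation; a KL-divergence between the two distributions would be another.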

References

43 extracted · 43 resolved · 11 Pith anchors

[1] VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks · arXiv:2410.05160
[2] VLM2Vec-V2: Advancing Multimodal Embedding for Videos, Images, and Visual Documents
[3] Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
[4] E5-V: Universal Embeddings with Multimodal Large Language Models · arXiv:2407.12580
[5] GME: Improving Universal Multimodal Retrieval by Multimodal
Receipt and verification
First computed 2026-05-18T02:44:49.228866Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

82132fbdcfbc7d1c59ebd2a84fe375c09ce915c88ee618fca9b944316fd2c959

Aliases

arxiv: 2605.13277 · arxiv_version: 2605.13277v1 · doi: 10.48550/arxiv.2605.13277 · pith_short_12: QIJS7POPXR6R · pith_short_16: QIJS7POPXR6RYWPL · pith_short_8: QIJS7POP
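The identifiers on this page appear to be related in a simple way: the 26-character Pith Number matches the first 26 characters of the RFC 4648 base32 encoding of the canonical hash, and each `pith_short_N` alias is the first N characters of that number. The sketch below checks this for the values shown here. Treat the derivation rule as an observation about this one record, not as a documented specification; the function names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "82132fbdcfbc7d1c59ebd2a84fe375c09ce915c88ee618fca9b944316fd2c959"
PITH_NUMBER = "QIJS7POPXR6RYWPL2KUE7Y3VYC"

def pith_from_hash(hex_digest, length=26):
    """Assumed rule: base32-encode the digest bytes and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]

def short_aliases(pith_number, lengths=(8, 12, 16)):
    """Assumed rule: each short alias is a prefix of the full Pith Number."""
    return {f"pith_short_{n}": pith_number[:n] for n in lengths}

derived = pith_from_hash(CANONICAL_HASH)
aliases = short_aliases(derived)

print(derived == PITH_NUMBER)
print(aliases)
```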
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/QIJS7POPXR6RYWPL2KUE7Y3VYC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 82132fbdcfbc7d1c59ebd2a84fe375c09ce915c88ee618fca9b944316fd2c959
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a0eb75b8a0c660ed2a4f81960274c6969fbfa096c82856ae7542a61892f5e7d0",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CV",
      "cs.IR",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-13T09:54:31Z",
    "title_canon_sha256": "9274f354ba30343f0b53f3f8d3261e368967506e66d00baef791b534caabfefa"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13277",
    "kind": "arxiv",
    "version": 1
  }
}
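The verification one-liner can also be written out as a short offline script. This is a sketch that mirrors the serialization used in that command (sorted keys, compact separators, non-ASCII preserved, UTF-8 bytes, SHA-256) and applies it to the record printed above. The function name `canonical_sha256` is hypothetical, and the digest is not asserted here; compare the printed value against the canonical hash yourself.

```python
import hashlib
import json

def canonical_sha256(record):
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

# Canonical record as shown on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a0eb75b8a0c660ed2a4f81960274c6969fbfa096c82856ae7542a61892f5e7d0",
        "cross_cats_sorted": ["cs.AI", "cs.CV", "cs.IR", "cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-13T09:54:31Z",
        "title_canon_sha256": "9274f354ba30343f0b53f3f8d3261e368967506e66d00baef791b534caabfefa",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13277", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)
```

Because keys are sorted before hashing, the digest does not depend on the order in which a mirror happens to store the fields; list order (for example in `cross_cats_sorted`) still matters.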