Pith Number

pith:RBKAZ7AD

pith:2024:RBKAZ7ADGTBMDY7K2SR4DB7UD5
not attested · not anchored · not stored · refs pending

ColPali: Efficient Document Retrieval with Vision Language Models

Bilel Omrani, Céline Hudelot, Gautier Viaud, Hugues Sibille, Manuel Faysse, Pierre Colombo, Tony Wu

Directly embedding images of document pages with a vision language model outperforms text extraction pipelines in retrieval tasks.

arxiv:2407.01449 v6 · 2024-06-27 · cs.IR · cs.CL · cs.CV

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{RBKAZ7ADGTBMDY7K2SR4DB7UD5}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

ColPali largely outperforms modern document retrieval pipelines while being drastically simpler, faster and end-to-end trainable.

C2 weakest assumption

That direct image embeddings from a vision-language model capture all necessary semantic and layout information better than text extraction pipelines across the tested domains and languages.

C3 one line summary

ColPali embeds document page images with a vision-language model and late interaction to outperform text-based retrieval pipelines on a new visual document benchmark while being simpler and faster.
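
The late-interaction scoring mentioned in the summary can be sketched as follows. This is a minimal pure-Python illustration of the MaxSim operator popularized by ColBERT, not the authors' implementation: each query token embedding is matched to its most similar page patch embedding, and those maxima are summed. The function name `late_interaction_score` and the toy 2-dimensional vectors are assumptions made for illustration; real embeddings are far higher-dimensional and come from the model.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def late_interaction_score(query: Sequence[Vector], page: Sequence[Vector]) -> float:
    """Sum, over query token vectors, of the best-matching page patch similarity."""
    return sum(max(dot(q, p) for p in page) for q in query)


# Toy data (hypothetical): two query token vectors and two candidate pages.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0]]  # has a patch matching each query token
page_b = [[1.0, 0.0], [1.0, 0.0]]  # only matches the first query token

scores = {
    "page_a": late_interaction_score(query, page_a),
    "page_b": late_interaction_score(query, page_b),
}
# Rank pages by descending score.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because every query token picks its own best patch, a page is rewarded for covering all parts of the query rather than for one strong overall match, which is the property a single pooled vector per page would lose.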

Formal links

2 machine-checked theorem links

Cited by

33 papers in Pith

Receipt and verification
First computed: 2026-05-17T23:38:53.807452Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

88540cfc0334c2c1e3ead4a3c187f41f59f6489d63cba4485a298a737bd882eb

Aliases

arxiv: 2407.01449 · arxiv_version: 2407.01449v6 · doi: 10.48550/arxiv.2407.01449 · pith_short_12: RBKAZ7ADGTBM · pith_short_16: RBKAZ7ADGTBMDY7K · pith_short_8: RBKAZ7AD
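
The values on this page suggest how the identifiers relate: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash bytes, and each `pith_short_N` alias is its first N characters. This relationship is inferred from the values shown here, not taken from a published specification, so treat the sketch below as an observation. The helper name `pith_number_from_hash` is hypothetical.

```python
import base64

# Canonical hash as displayed on this page.
CANONICAL_HASH = "88540cfc0334c2c1e3ead4a3c187f41f59f6489d63cba4485a298a737bd882eb"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this mirrors the pattern observed on this page only.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


full = pith_number_from_hash(CANONICAL_HASH)
# Short aliases are plain prefixes of the full identifier.
aliases = {f"pith_short_{n}": full[:n] for n in (8, 12, 16)}
print(full)
print(aliases)
```

If the observation holds in general, an identifier can be checked against a canonical hash offline, without contacting the service.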
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/RBKAZ7ADGTBMDY7K2SR4DB7UD5 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 88540cfc0334c2c1e3ead4a3c187f41f59f6489d63cba4485a298a737bd882eb
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "d5f28713d6a2afcb63490758f7208bc4f97ffac8fcf8140a266c97ffe4f0ecda",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.CV"
    ],
    "license": "http://creativecommons.org/publicdomain/zero/1.0/",
    "primary_cat": "cs.IR",
    "submitted_at": "2024-06-27T15:45:29Z",
    "title_canon_sha256": "fe7ac4341d12ea95464629e8bc1ed64f6ef6f6341694a7fe2fbe9923c7a430fd"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2407.01449",
    "kind": "arxiv",
    "version": 6
  }
}
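
The shell pipeline under "Verify this Pith Number yourself" can also be written as a small offline Python script. The sketch below applies the same canonicalization as that one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes) to the record shown above. It assumes the JSON displayed on this page is identical to what the service returns under `canonical_record`; if the served record differs in any field, the digest will not match. The name `canonical_sha256` is hypothetical.

```python
import hashlib
import json

# Expected digest as displayed under "Canonical hash".
EXPECTED = "88540cfc0334c2c1e3ead4a3c187f41f59f6489d63cba4485a298a737bd882eb"

# Canonical record copied from this page (assumed equal to the served record).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "d5f28713d6a2afcb63490758f7208bc4f97ffac8fcf8140a266c97ffe4f0ecda",
    "cross_cats_sorted": ["cs.CL", "cs.CV"],
    "license": "http://creativecommons.org/publicdomain/zero/1.0/",
    "primary_cat": "cs.IR",
    "submitted_at": "2024-06-27T15:45:29Z",
    "title_canon_sha256": "fe7ac4341d12ea95464629e8bc1ed64f6ef6f6341694a7fe2fbe9923c7a430fd"
  },
  "schema_version": "1.0",
  "source": {"id": "2407.01449", "kind": "arxiv", "version": 6}
}
"""


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and no whitespace, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Sorting keys and fixing the separators makes the serialization independent of key order and whitespace in the input, which is what allows two parties to recompute the same hash from the same logical record.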