pith:W4TPLI53
PaLI-X: On Scaling up a Multilingual Vision and Language Model
Scaling up PaLI-X sets a new state of the art on most vision-and-language benchmarks and shows emergent capabilities.
arxiv:2305.18565 v1 · 2023-05-29 · cs.CV · cs.CL · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{W4TPLI53LVIHYZ6UCVBRL3443S}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
PaLI-X advances the state-of-the-art on most vision-and-language benchmarks considered (25+ of them) and exhibits emerging capabilities such as complex counting and multilingual object detection.
Underlying assumption: increasing model size and broadening the training task mixture will reliably produce both higher benchmark scores and the observed emergent behaviors, without task-specific fine-tuning or additional architectural changes.
Scaling a multilingual vision-language model in size and training breadth yields new state-of-the-art results on over 25 benchmarks plus emerging abilities in counting and multilingual detection.
Receipt and verification
| First computed | 2026-05-17T23:38:13.814348Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
b726f5a3bb5d507c67d4154315ef9cdc823a705c1f6de9de4a167c79eed8008a
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/W4TPLI53LVIHYZ6UCVBRL3443S \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: b726f5a3bb5d507c67d4154315ef9cdc823a705c1f6de9de4a167c79eed8008a
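The same check can be sketched offline in Python, without curl or jq. This is a minimal sketch that assumes the record shown under "Canonical record JSON" is exactly the object the endpoint returns as `canonical_record`; the helper name `canonical_sha256` is hypothetical and not part of any Pith tooling. It applies the same serialization as the one-liner above (sorted keys, compact separators, UTF-8, no ASCII escaping) and compares the digest with the published canonical hash.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hash a record the way the one-liner does: sorted keys, compact JSON, UTF-8."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Record copied by hand from the "Canonical record JSON" listing on this page
# (assumption: it matches the served canonical_record byte for byte once re-serialized).
record = {
    "metadata": {
        "abstract_canon_sha256": "94e18e05584083e5b1cbbe254aba4a9c4cbb4c6b8dc63d01ec220ef24117dc7a",
        "cross_cats_sorted": ["cs.CL", "cs.LG"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CV",
        "submitted_at": "2023-05-29T18:58:38Z",
        "title_canon_sha256": "3c7150397808e13ff72bab5cc440587de27d0854e43b1fc23a199ea026595a24",
    },
    "schema_version": "1.0",
    "source": {"id": "2305.18565", "kind": "arxiv", "version": 1},
}

# Published canonical hash, as listed on this page.
EXPECTED = "b726f5a3bb5d507c67d4154315ef9cdc823a705c1f6de9de4a167c79eed8008a"

digest = canonical_sha256(record)
print("computed:", digest)
print("matches published hash:", digest == EXPECTED)
```

Because the keys are sorted before hashing, the order in which the fields are typed into `record` does not affect the digest; only the keys, values, and list order do.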
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "94e18e05584083e5b1cbbe254aba4a9c4cbb4c6b8dc63d01ec220ef24117dc7a",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2023-05-29T18:58:38Z",
    "title_canon_sha256": "3c7150397808e13ff72bab5cc440587de27d0854e43b1fc23a199ea026595a24"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2305.18565",
    "kind": "arxiv",
    "version": 1
  }
}