pith:AFWRNCUW
CArtBench: Evaluating Vision-Language Models on Chinese Art Understanding, Interpretation, and Authenticity
Vision-language models post high overall scores on Chinese art questions yet drop sharply on evidence linking, expert-style appreciation, and authenticity discrimination.
arxiv:2604.11632 v2 · 2026-04-13 · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{AFWRNCUWPFLOABIJIV46PCW2PV}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Record completeness
Claims
Across nine representative VLMs, we find that high overall CURATORQA accuracy can mask sharp drops on hard evidence linking and style-to-period inference; long-form appreciation remains far from expert references; and authenticity-oriented diagnostic discrimination stays near chance, underscoring the difficulty of connoisseur-level reasoning for current models.
The paper assumes that aligning Wikidata objects with authoritative Palace Museum catalog pages and expert ratings provides reliable, unbiased ground truth for the defensible reinterpretation and authenticity discrimination tasks.
CArtBench shows VLMs achieve high scores on easy recognition but drop sharply on evidence linking, style inference, long-form appreciation, and authenticity discrimination for Chinese art.
Receipt and verification
| First computed | 2026-05-26T02:05:09.334005Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
016d168a967956e005094579e78ada7d63bc186b7f647094b1a2a2b687fcddc2
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/AFWRNCUWPFLOABIJIV46PCW2PV \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 016d168a967956e005094579e78ada7d63bc186b7f647094b1a2a2b687fcddc2
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "8da9c408844116d6fb5ab29be49f9b1e05be9eb123eaaa0f2869994b4a94027d",
"cross_cats_sorted": [],
"license": "http://creativecommons.org/licenses/by/4.0/",
"primary_cat": "cs.CL",
"submitted_at": "2026-04-13T15:44:02Z",
"title_canon_sha256": "e12777a36cd3a8ff00d38005462bec5c0d026ef11809c2b9c5064260459bdb27"
},
"schema_version": "1.0",
"source": {
"id": "2604.11632",
"kind": "arxiv",
"version": 2
}
}
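The shell one-liner under "Verify this Pith Number yourself" can also be run offline against the record printed above. The following is a minimal sketch, assuming the canonical hash is nothing more than SHA-256 over the compact, key-sorted JSON serialization that the one-liner uses; the names `canonical_hash` and `matches_expected` are illustrative and not part of any Pith tooling.

```python
import hashlib
import json

# Canonical record copied from the "Canonical record JSON" section of this page.
CANONICAL_RECORD = {
    "metadata": {
        "abstract_canon_sha256": "8da9c408844116d6fb5ab29be49f9b1e05be9eb123eaaa0f2869994b4a94027d",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-04-13T15:44:02Z",
        "title_canon_sha256": "e12777a36cd3a8ff00d38005462bec5c0d026ef11809c2b9c5064260459bdb27",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.11632", "kind": "arxiv", "version": 2},
}

# Hash listed under "Canonical hash" on this page.
EXPECTED_HASH = "016d168a967956e005094579e78ada7d63bc186b7f647094b1a2a2b687fcddc2"


def canonical_hash(record):
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes.

    Mirrors the json.dumps arguments in the page's shell one-liner.
    """
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()
    return hashlib.sha256(payload).hexdigest()


def matches_expected(record, expected=EXPECTED_HASH):
    """Hypothetical helper: True if the recomputed digest equals the listed hash."""
    return canonical_hash(record) == expected


if __name__ == "__main__":
    digest = canonical_hash(CANONICAL_RECORD)
    print(digest)
    print("match" if digest == EXPECTED_HASH else "mismatch")
```

Because keys are sorted before hashing, the digest does not depend on the order in which the JSON fields arrive. A mismatch would suggest either that the record was altered or that the canonicalization assumed here differs from the one the builder actually uses.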