pith:EKKVNZMN
Visual Aesthetic Benchmark: Can Frontier Models Judge Beauty?
Frontier multimodal models correctly pick both the best and worst image in only 26.5 percent of controlled aesthetic tasks, while human experts reach 68.9 percent.
arxiv:2605.12684 v1 · 2026-05-12 · cs.CV · cs.AI · cs.HC
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{EKKVNZMNS5DASX2FSGHUW54TDN}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
The strongest system identifies both the best and the worst image correctly across three random permutations of the candidate order in only 26.5% of tasks, far below the 68.9% achieved by human experts.
This rests on the assumption that expert consensus on comparative preference over matched-subject sets constitutes a stable and generalizable ground truth for aesthetic quality, and that the 400 tasks sufficiently represent the space of visual aesthetic judgment.
The Visual Aesthetic Benchmark shows frontier MLLMs achieve 26.5% accuracy on comparative best/worst image selection versus 68.9% for human experts, with fine-tuning closing some of the gap.
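The strict scoring rule described in the claims (a task counts only if both the best and the worst image are picked correctly under every one of three shuffled candidate orders) can be sketched as follows. This is an illustrative reading of the claim, not the paper's evaluation code: the `judge` callable, the task tuple layout, and the seeded shuffling are all assumptions made for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical task layout: (candidate ids, id of best image, id of worst image).
Task = Tuple[Sequence[str], str, str]
# Hypothetical judge interface: takes candidates in presentation order,
# returns (picked_best_id, picked_worst_id).
Judge = Callable[[List[str]], Tuple[str, str]]


def task_is_correct(task: Task, judge: Judge,
                    n_permutations: int = 3, seed: int = 0) -> bool:
    """True only if the judge picks both best and worst under every shuffle."""
    candidates, best, worst = task
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    for _ in range(n_permutations):
        order = list(candidates)
        rng.shuffle(order)
        picked_best, picked_worst = judge(order)
        if picked_best != best or picked_worst != worst:
            return False  # a single miss in any permutation fails the task
    return True


def strict_accuracy(tasks: Sequence[Task], judge: Judge) -> float:
    """Fraction of tasks that pass the all-permutations best/worst check."""
    if not tasks:
        raise ValueError("no tasks to score")
    passed = sum(task_is_correct(t, judge) for t in tasks)
    return passed / len(tasks)
```

Under this rule a judge whose choice depends on presentation order is penalized even when it is right on some orderings, which is one plausible reason the reported strict accuracy is low.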
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-18T03:09:49.953603Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
229556e58d9746095f45918f4b77931b5a73fce8b6b8bc53e195b2871542fd43
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/EKKVNZMNS5DASX2FSGHUW54TDN \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 229556e58d9746095f45918f4b77931b5a73fce8b6b8bc53e195b2871542fd43
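The Python step in the command above can be unpacked into a small standalone function. The sketch below mirrors the same serialization settings (sorted keys, compact separators, non-ASCII characters kept as-is, UTF-8 encoding) and then hashes the result with SHA-256. The function names are made up for this example, and it assumes you have already fetched the record and extracted its `canonical_record` field as a Python dict.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize a record the same way the one-liner does before hashing."""
    text = json.dumps(
        record,
        sort_keys=True,            # key order must not affect the hash
        separators=(",", ":"),     # no whitespace between tokens
        ensure_ascii=False,        # keep non-ASCII characters unescaped
    )
    return text.encode("utf-8")


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def matches_expected(record: dict, expected_hex: str) -> bool:
    """Compare the computed digest with a published hash, ignoring case."""
    return canonical_sha256(record) == expected_hex.strip().lower()
```

To check a record, pass the parsed `canonical_record` object and the published canonical hash to `matches_expected`; a `False` result means the record or its serialization differs from what was hashed.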
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "4596c4afdac08db2cc45041be0ab0a872a85d4a94f3f47bab1190d76c43d8153",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.HC"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.CV",
    "submitted_at": "2026-05-12T19:33:28Z",
    "title_canon_sha256": "72ad442ed7f2ab054cf1953df25d10a234f1634f7f5550ae5d1a254a1bc03be4"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.12684",
    "kind": "arxiv",
    "version": 1
  }
}