pith:A2JKDCHH
Unsteady Metrics and Benchmarking Cultures of AI Model Builders
AI builders select benchmarks to fit marketing narratives rather than enable consistent scientific comparison.
arxiv:2605.14164 v1 · 2026-05-13 · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{A2JKDCHHRVJTZNMZSYNZ4DFZPJ}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We argue that highlighted benchmarks function less as standardized measurement tools and more as flexible narrative devices prioritizing market positioning over scientific evaluation.
The analysis assumes that the 139 model releases from 11 major builders, and the 231 highlighted benchmarks they chose, accurately capture the dominant evaluation practices and narrative strategies across the industry in 2025.
AI model builders mostly highlight unique benchmarks that act as flexible narrative tools for market positioning rather than standardized scientific measurements.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:39:11.436409Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
0692a188e78d533cb599961b9e0cb97a6079a414ecfdd85b52ac905d354f5bc7
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/A2JKDCHHRVJTZNMZSYNZ4DFZPJ \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 0692a188e78d533cb599961b9e0cb97a6079a414ecfdd85b52ac905d354f5bc7
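The one-liner above can be unpacked into a short script for readers who want to see each step. This is a minimal sketch, assuming the canonicalization is exactly what the command shows (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding, then SHA-256). The function names `canonical_bytes`, `record_hash`, and `matches` are hypothetical helpers introduced here for illustration and are not part of any Pith tooling; the toy record is made up and is not a real Pith record.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize a record the same way the shell one-liner does (assumed scheme)."""
    text = json.dumps(
        record,
        sort_keys=True,         # key order must not affect the hash
        separators=(",", ":"),  # no whitespace between tokens
        ensure_ascii=False,     # keep non-ASCII characters unescaped
    )
    return text.encode()  # str.encode() defaults to UTF-8


def record_hash(record: dict) -> str:
    """Return the lowercase hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def matches(record: dict, expected_hex: str) -> bool:
    """Compare a record's hash against a published hex digest."""
    return record_hash(record) == expected_hex.strip().lower()


# Hypothetical toy record, used only to show the canonical form.
toy = {"source": {"kind": "arxiv", "id": "0000.00000"}, "schema_version": "1.0"}
print(canonical_bytes(toy).decode())
print(record_hash(toy))
```

To check a real record, one would load the `canonical_record` object from the API response with `json.loads` and pass it to `matches` together with the canonical hash listed on this page.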
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "11b0518dd17c08084c46cafee5b859f185a629992a6d0fd01e68ea2bfd3d041f",
"cross_cats_sorted": [],
"license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
"primary_cat": "cs.AI",
"submitted_at": "2026-05-13T22:39:10Z",
"title_canon_sha256": "9e49ad973a3a1b8668c06496813326b3b941c1390bdc917806d74f9199b4820e"
},
"schema_version": "1.0",
"source": {
"id": "2605.14164",
"kind": "arxiv",
"version": 1
}
}