pith:UC5ZIRQI
The Frequency Confound in Language-Model Surprisal and Metaphor Novelty
Word frequency beats surprisal at predicting metaphor novelty
arxiv:2605.06506 v2 · 2026-05-07 · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{UC5ZIRQIB5PW23CNGFHPOJRJ2Z}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Across settings, word frequency is a stronger predictor of metaphor novelty than surprisal. Across training stages, the surprisal-novelty association peaks at an early stage and then falls again, mirroring a similarly timed increase in the surprisal-frequency association.
These claims rest on the assumption that the collected metaphor novelty ratings reflect genuine human judgments of novelty and are not influenced by frequency biases in the chosen stimuli or rater pool.
Word frequency is a stronger predictor of metaphor novelty than LM surprisal, with the surprisal-novelty association peaking early in training before declining.
Receipt and verification
| First computed | 2026-05-20T00:04:34.430097Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
a0bb9446080f5f6d6c4d314ef72629d65a3f32548d27f5714259d24fcd5bbb48
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/UC5ZIRQIB5PW23CNGFHPOJRJ2Z \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a0bb9446080f5f6d6c4d314ef72629d65a3f32548d27f5714259d24fcd5bbb48
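The same check can be sketched offline in Python, without curl or jq. This is a minimal sketch, not the service's reference implementation: it assumes the record printed under "Canonical record JSON" on this page is exactly what the endpoint returns in `canonical_record`, and the names `canonical_hash`, `record`, and `EXPECTED` are illustrative only. The serialization settings mirror the one-liner above (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes).

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """Return the SHA-256 hex digest of the record's canonical JSON form."""
    # Same settings as the shell one-liner: sorted keys, no whitespace,
    # non-ASCII characters kept as-is, then encoded as UTF-8.
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


# Assumption: copied by hand from the "Canonical record JSON" block on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "bd692ea9cdb6fa5a43f7e3e7532f960e881888f43a76e749f9026a8ab69f3594",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-07T16:20:37Z",
        "title_canon_sha256": "23e27d57568f142fbf113b691c28a405b870c7a6df449b8e956eaa396453add6",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.06506", "kind": "arxiv", "version": 2},
}

# Canonical hash as listed on this page.
EXPECTED = "a0bb9446080f5f6d6c4d314ef72629d65a3f32548d27f5714259d24fcd5bbb48"

if __name__ == "__main__":
    digest = canonical_hash(record)
    print(digest)
    print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the order in which fields appear in the fetched JSON does not affect the digest. A mismatch would mean either the record differs from the one hashed by the builder or the canonicalization assumed here is not the one the service uses.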
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "bd692ea9cdb6fa5a43f7e3e7532f960e881888f43a76e749f9026a8ab69f3594",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-07T16:20:37Z",
    "title_canon_sha256": "23e27d57568f142fbf113b691c28a405b870c7a6df449b8e956eaa396453add6"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.06506",
    "kind": "arxiv",
    "version": 2
  }
}