Pith Number

pith:UC5ZIRQI

pith:2026:UC5ZIRQIB5PW23CNGFHPOJRJ2Z
not attested · not anchored · not stored · refs pending

The Frequency Confound in Language-Model Surprisal and Metaphor Novelty

Omar Momen, Sina Zarrieß

Word frequency beats surprisal at predicting metaphor novelty

arxiv:2605.06506 v2 · 2026-05-07 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{UC5ZIRQIB5PW23CNGFHPOJRJ2Z}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live · download bundle · merged state
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Across settings, word frequency is a stronger predictor of metaphor novelty than surprisal. Across training stages, the surprisal-novelty association peaks at an early stage and then falls again, mirroring a similarly timed increase in the surprisal-frequency association.

C2 · weakest assumption

That the collected metaphor novelty ratings reflect genuine human judgments of novelty rather than being influenced by frequency biases in the chosen stimuli or rater pool.

C3 · one-line summary

Word frequency is a stronger predictor of metaphor novelty than LM surprisal, with the surprisal-novelty association peaking early in training before declining.

Receipt and verification
First computed: 2026-05-20T00:04:34.430097Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

a0bb9446080f5f6d6c4d314ef72629d65a3f32548d27f5714259d24fcd5bbb48

Aliases

arxiv: 2605.06506 · arxiv_version: 2605.06506v2 · doi: 10.48550/arxiv.2605.06506 · pith_short_12: UC5ZIRQIB5PW · pith_short_16: UC5ZIRQIB5PW23CN · pith_short_8: UC5ZIRQI
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/UC5ZIRQIB5PW23CNGFHPOJRJ2Z \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a0bb9446080f5f6d6c4d314ef72629d65a3f32548d27f5714259d24fcd5bbb48
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "bd692ea9cdb6fa5a43f7e3e7532f960e881888f43a76e749f9026a8ab69f3594",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-07T16:20:37Z",
    "title_canon_sha256": "23e27d57568f142fbf113b691c28a405b870c7a6df449b8e956eaa396453add6"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.06506",
    "kind": "arxiv",
    "version": 2
  }
}
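
The same check can be run offline against the record shown above. This is a minimal sketch that mirrors the serialization in the curl example (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes). It assumes the JSON listed here is exactly the `canonical_record` the endpoint returns; the function name `canonical_digest` is illustrative and not part of any Pith API.

```python
import hashlib
import json

# Canonical record copied from the "Canonical record JSON" listing.
record = {
    "metadata": {
        "abstract_canon_sha256": "bd692ea9cdb6fa5a43f7e3e7532f960e881888f43a76e749f9026a8ab69f3594",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-07T16:20:37Z",
        "title_canon_sha256": "23e27d57568f142fbf113b691c28a405b870c7a6df449b8e956eaa396453add6",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.06506", "kind": "arxiv", "version": 2},
}

# Hash published on this page under "Canonical hash".
PUBLISHED = "a0bb9446080f5f6d6c4d314ef72629d65a3f32548d27f5714259d24fcd5bbb48"


def canonical_digest(obj):
    """SHA-256 hex digest of the sorted-key, compact JSON encoding of obj."""
    payload = json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode()  # str.encode() defaults to UTF-8
    return hashlib.sha256(payload).hexdigest()


digest = canonical_digest(record)
print(digest)
print("matches published hash:", digest == PUBLISHED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields were written. A mismatch would suggest either a changed record or a difference between this listing and the served `canonical_record`.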