pith:JXNDUWUH
What Makes Words Hard? Sakura at BEA 2026 Shared Task on Vocabulary Difficulty Prediction
Spelling difficulty and test item construction often drive ratings in standard vocabulary difficulty lists beyond genuine word production demands.
arxiv:2605.14257 v1 · 2026-05-14 · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{JXNDUWUH4STOE2SI6PIJ6IMW23}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
The black-box model achieved r > 0.91 and topped the open track, while the explainable model reached r > 0.77 and showed that KVL item difficulty is affected by spelling difficulty or test item construction in addition to genuine production difficulty.
The underlying assumption is that the shared task dataset and KVL lists provide a clean measure of genuine word production difficulty, without significant confounding from test design or spelling factors that the models capture post hoc.
A fine-tuned LLM with soft-target loss tops the shared task on vocabulary difficulty prediction at r > 0.91, while an explainable model at r > 0.77 shows that spelling and item construction affect difficulty beyond word production.
Receipt and verification
| First computed | 2026-05-17T23:39:10.516776Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
4dda3a5a87e4a6e26a48f3d09f2196d6ed1794bb13e3fe9fec276c8429898ac1
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/JXNDUWUH4STOE2SI6PIJ6IMW23 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 4dda3a5a87e4a6e26a48f3d09f2196d6ed1794bb13e3fe9fec276c8429898ac1
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "388081cf826dabe090d85772e1c019e4773ab53e10efc84ed5535f1767cb96c2",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-14T01:57:35Z",
    "title_canon_sha256": "7d53ea9ef2e5768cab9f4ed43337e4448ba83cb0818ad7a4e6511ac45e362c0f"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.14257",
    "kind": "arxiv",
    "version": 1
  }
}
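
The canonicalization step of the verification command can also be run offline. The following is a minimal sketch, assuming the record listed above is the exact canonical record served by the API; the helper names `canonical_bytes` and `canonical_sha256` are illustrative and not part of any Pith tooling. It serializes the record with sorted keys and compact separators, the same `json.dumps` settings as the one-liner, and hashes the UTF-8 bytes.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" listing above.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "388081cf826dabe090d85772e1c019e4773ab53e10efc84ed5535f1767cb96c2",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-14T01:57:35Z",
        "title_canon_sha256": "7d53ea9ef2e5768cab9f4ed43337e4448ba83cb0818ad7a4e6511ac45e362c0f",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.14257", "kind": "arxiv", "version": 1},
}


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no whitespace, as in the one-liner."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode()  # UTF-8 by default


def canonical_sha256(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    # Compare this value by eye with the canonical hash shown on the page.
    print(canonical_sha256(RECORD))
```

Because keys are sorted before hashing, the digest does not depend on the order in which the fields appear in the listing. If the printed value differs from the canonical hash, the listing and the served record are not byte-identical after canonicalization, and the `curl` command is the authoritative check.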