Pith Number

pith:EOXZTLKF

pith:2026:EOXZTLKFND4XCD5Z3ZO2DV5UZC

not attested · not anchored · not stored · refs resolved

From Rosetta to Match-Up: A Paired Corpus of Linguistic Puzzles with Human and LLM Benchmarks

Anne Huang, Elena Filatova, Jinfan Frank Hu, Neh Majmudar

A conversion procedure turns Rosetta Stone puzzles into Match-Up versions that both expert humans and LLMs either solve completely or fail entirely.

arxiv:2605.13408 v1 · 2026-05-13 · cs.CL

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{EOXZTLKFND4XCD5Z3ZO2DV5UZC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
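
One way a mirror could recompute the same state regardless of the order in which it received events is to sort them by a stable key before folding them. The sketch below is illustrative only: the event fields (`id`, `ts`, `kind`, `payload`), the sample events, and the last-writer-wins rule are all hypothetical and are not taken from the Pith bundle format or its actual merge algorithm.

```python
import random


def merge_events(events):
    """Fold events into a state dict so the result does not depend on input order.

    Assumes (hypothetically) that each event has a unique `id`, an ISO 8601
    UTC timestamp `ts`, a `kind`, and a `payload`, and that duplicate copies
    of an event are identical.
    """
    # Drop duplicate copies of the same event.
    unique = {event["id"]: event for event in events}
    # Sort by a total order: timestamp first, event id as a tie-breaker.
    ordered = sorted(unique.values(), key=lambda e: (e["ts"], e["id"]))
    state = {}
    for event in ordered:
        # Last writer wins per kind (an assumed rule, for illustration).
        state[event["kind"]] = event["payload"]
    return state


# Hypothetical events, not real Pith data.
events = [
    {"id": "e1", "ts": "2026-05-18T02:44:47Z", "kind": "citations", "payload": "open"},
    {"id": "e2", "ts": "2026-05-18T03:00:00Z", "kind": "replications", "payload": "open"},
    {"id": "e3", "ts": "2026-05-18T04:00:00Z", "kind": "citations", "payload": "closed"},
]

shuffled = events[:]
random.shuffle(shuffled)
# Any arrival order yields the same merged state.
assert merge_events(shuffled) == merge_events(events)
```

The key design point is the total order: as long as every mirror sorts by the same key and applies the same fold, no coordination between mirrors is needed.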

Claims

C1 · strongest claim

Our results show that both expert human solvers and LLMs display an all-or-nothing pattern on Match-Up puzzles, either solving them completely or failing entirely.

C2 · weakest assumption

The paper assumes that the proposed systematic conversion procedure produces Match-Up puzzles that preserve the linguistic reasoning demands and difficulty level of the original Rosetta Stone versions.

C3 · one-line summary

A conversion method generates paired Rosetta Stone and Match-Up linguistic puzzles, with benchmarks revealing that both humans and LLMs either fully solve or completely fail the Match-Up puzzles.

References

13 extracted · 13 resolved · 2 Pith anchors

[1] From Rosetta to Match-Up: A Paired Corpus of Linguistic Puzzles with Human and LLM Benchmarks 2026 · arXiv:2605.13408

[2] They provide insight into what constitutes a well-designed linguistic puzzle

[3] They can inform and guide the development of future puzzle-generation procedures

[4] They can support the analysis of LLMs’ chain-of-thought reasoning, helping to better understand differences in decision-making between humans and LLMs.

[5] experienced solvers are better prepared to handle these [Rosetta Stone puzzles] than problems of other types 2024

Cited by

1 paper in Pith

From Rosetta to Match-Up: A Paired Corpus of Linguistic Puzzles with Human and LLM Benchmarks

Receipt and verification

First computed	2026-05-18T02:44:47.480397Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

23af99ad4568f9710fb9de5da1d7b4c8803aebd186c4dadc4d84bb430a5db2ba

Aliases

arxiv: 2605.13408 · arxiv_version: 2605.13408v1 · doi: 10.48550/arxiv.2605.13408 · pith_short_12: EOXZTLKFND4X · pith_short_16: EOXZTLKFND4XCD5Z · pith_short_8: EOXZTLKF
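
The short aliases above are prefixes of the full Pith Number. For this record, the full number also coincides with the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. The page does not state that this is the general derivation rule, so treat the sketch below as an observation about this one record; the function names are hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "23af99ad4568f9710fb9de5da1d7b4c8803aebd186c4dadc4d84bb430a5db2ba"
PITH_NUMBER = "EOXZTLKFND4XCD5Z3ZO2DV5UZC"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    The 26-character length is inferred from this record, not from a spec.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith: str) -> dict:
    """Build the pith_short_8 / 12 / 16 aliases as prefixes of the full number."""
    return {f"pith_short_{n}": pith[:n] for n in (8, 12, 16)}


assert pith_from_hash(CANONICAL_HASH) == PITH_NUMBER
print(short_aliases(PITH_NUMBER))
```

If the relationship holds in general, a client could recompute the Pith Number from the canonical hash without trusting the resolver.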

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/EOXZTLKFND4XCD5Z3ZO2DV5UZC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 23af99ad4568f9710fb9de5da1d7b4c8803aebd186c4dadc4d84bb430a5db2ba

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "2adc96a7298c922e89e4484b313b5463db7ffca7eaac6d328f46f3db02fc521b",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-13T12:03:35Z",
    "title_canon_sha256": "164259911cf09cb2afcbaf9df6f0ca83ab153f454bbf879a2f30d38a592ef7d3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13408",
    "kind": "arxiv",
    "version": 1
  }
}
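
The shell pipeline under "Verify this Pith Number yourself" can also be run offline against the record printed above. The sketch below applies the same canonicalization as that one-liner (sorted keys, compact separators, non-ASCII preserved, UTF-8 bytes) and compares the digest with the published canonical hash. It assumes the JSON shown on this page is identical to what the resolver serves; the function name `canonical_sha256` is hypothetical.

```python
import hashlib
import json

# Published canonical hash, copied from this record.
EXPECTED = "23af99ad4568f9710fb9de5da1d7b4c8803aebd186c4dadc4d84bb430a5db2ba"

# Canonical record, copied from this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "2adc96a7298c922e89e4484b313b5463db7ffca7eaac6d328f46f3db02fc521b",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-13T12:03:35Z",
    "title_canon_sha256": "164259911cf09cb2afcbaf9df6f0ca83ab153f454bbf879a2f30d38a592ef7d3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13408",
    "kind": "arxiv",
    "version": 1
  }
}
"""


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then hash the UTF-8 bytes."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the key order in which the record happens to be stored or transmitted.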