Pith Number

pith:EOXZTLKF

pith:2026:EOXZTLKFND4XCD5Z3ZO2DV5UZC
not attested · not anchored · not stored · refs resolved

From Rosetta to Match-Up: A Paired Corpus of Linguistic Puzzles with Human and LLM Benchmarks

Anne Huang, Elena Filatova, Jinfan Frank Hu, Neh Majmudar

A conversion procedure turns Rosetta Stone puzzles into Match-Up versions that both expert humans and LLMs either solve completely or fail entirely.

arxiv:2605.13408 v1 · 2026-05-13 · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{EOXZTLKFND4XCD5Z3ZO2DV5UZC}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
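
The bundle description above does not spell out the merge algorithm. As a minimal sketch of why such a merge can be deterministic, the Python below deduplicates events by a content hash and orders them by that hash, so any mirror holding the same set of events derives the same sequence regardless of arrival order. The event fields (`kind`, `target`, `by`) and the function names are hypothetical, and signature checking is left out entirely.

```python
import hashlib
import json
from typing import Iterable, List


def event_key(event: dict) -> str:
    """Content hash of an event, computed over a canonical JSON form."""
    blob = json.dumps(
        event, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def merge_events(events: Iterable[dict]) -> List[dict]:
    """Return events deduplicated and in an arrival-independent order."""
    # Identical events collapse to one entry because they share a key.
    unique = {event_key(e): e for e in events}
    # Sorting by key gives every mirror the same total order.
    return [unique[k] for k in sorted(unique)]


# Hypothetical events, used only to illustrate order independence.
a = {"kind": "citation", "target": "2605.13408"}
b = {"kind": "author_claim", "by": "example-author"}
print(merge_events([a, b]) == merge_events([b, a, a]))
```

Ordering by content hash is only one possible total order; a real bundle might order by timestamp or by signer, which this sketch does not attempt to model.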

Claims

C1 · strongest claim

Our results show that both expert human solvers and LLMs display an all-or-nothing pattern on Match-Up puzzles, either solving them completely or failing entirely.

C2 · weakest assumption

The proposed systematic conversion procedure is assumed to produce Match-Up puzzles that preserve the linguistic reasoning demands and difficulty level of the original Rosetta Stone versions.

C3 · one-line summary

A conversion method generates paired Rosetta Stone and Match-Up linguistic puzzles, with benchmarks revealing that both humans and LLMs either fully solve or completely fail the Match-Up puzzles.

References

13 extracted · 13 resolved · 2 Pith anchors

[1] From Rosetta to Match-Up: A Paired Corpus of Linguistic Puzzles with Human and LLM Benchmarks 2026 · arXiv:2605.13408
[2] They provide insight into what constitutes a well-designed linguistic puzzle
[3] They can inform and guide the development of future puzzle-generation procedures
[4] They can support the analysis of LLMs’ chain-of-thought reasoning, helping to better understand differences in decision-making between humans and LLMs.
[5] experienced solvers are better prepared to handle these [Rosetta Stone puzzles] than problems of other types 2024

Cited by

1 paper in Pith

Receipt and verification
First computed: 2026-05-18T02:44:47.480397Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

23af99ad4568f9710fb9de5da1d7b4c8803aebd186c4dadc4d84bb430a5db2ba

Aliases

arxiv: 2605.13408 · arxiv_version: 2605.13408v1 · doi: 10.48550/arxiv.2605.13408 · pith_short_12: EOXZTLKFND4X · pith_short_16: EOXZTLKFND4XCD5Z · pith_short_8: EOXZTLKF
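
The short aliases above are prefixes of the full identifier. For this record, the identifier also coincides with the first 26 characters of the RFC 4648 base32 encoding of the canonical hash. The page does not state that this is the official derivation, so the sketch below treats it as an observation about this one record; the function name `base32_prefix` is illustrative.

```python
import base64

# Canonical hash as shown on this page.
CANONICAL_HASH = "23af99ad4568f9710fb9de5da1d7b4c8803aebd186c4dadc4d84bb430a5db2ba"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Return the first `length` base32 characters of a hex digest."""
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


full_id = base32_prefix(CANONICAL_HASH)
# The 8-, 12-, and 16-character aliases are plain prefixes of the full id.
aliases = {n: full_id[:n] for n in (8, 12, 16)}
print(full_id)
print(aliases)
```

Comparing `full_id` against the identifier printed at the top of the record gives a quick consistency check between the hash and the Pith Number, under the assumption stated above.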
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/EOXZTLKFND4XCD5Z3ZO2DV5UZC \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 23af99ad4568f9710fb9de5da1d7b4c8803aebd186c4dadc4d84bb430a5db2ba
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "2adc96a7298c922e89e4484b313b5463db7ffca7eaac6d328f46f3db02fc521b",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-13T12:03:35Z",
    "title_canon_sha256": "164259911cf09cb2afcbaf9df6f0ca83ab153f454bbf879a2f30d38a592ef7d3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13408",
    "kind": "arxiv",
    "version": 1
  }
}
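
The verification recipe above can also be sketched as a small offline Python script that works on a saved copy of the canonical record instead of piping through `curl` and `jq`. It uses the same serialization settings as the one-liner (sorted keys, compact separators, non-ASCII characters kept as-is, UTF-8 bytes). The helper names `canonical_sha256` and `matches` are illustrative, and whether the digest equals the published hash depends on the page's recipe being the one actually used by the builder.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """SHA-256 of the record serialized as compact, key-sorted JSON."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def matches(record: dict, expected_hex: str) -> bool:
    """Compare the computed digest with a published hex digest."""
    return canonical_sha256(record) == expected_hex.strip().lower()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "2adc96a7298c922e89e4484b313b5463db7ffca7eaac6d328f46f3db02fc521b",
        "cross_cats_sorted": [],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-13T12:03:35Z",
        "title_canon_sha256": "164259911cf09cb2afcbaf9df6f0ca83ab153f454bbf879a2f30d38a592ef7d3",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13408", "kind": "arxiv", "version": 1},
}

print(canonical_sha256(record))
```

Because the keys are sorted before hashing, the order in which a mirror stores or prints the record's fields does not affect the digest.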