Pith Number

pith:TWV2C5X6

pith:2026:TWV2C5X66W4PZLZE2BGYUL7XZ3
not attested · not anchored · not stored · refs pending

Evaluating Relational Reasoning in LLMs with REL

Ada Fang, Lukas Fesser, Marinka Zitnik, Sham M. Kakade, Yasha Ektefaie

Frontier LLMs show steady performance drops on relational tasks as the number of entities that must be bound together increases, even when the total number of entities is held fixed and extra test-time compute is available.

arxiv:2604.12176 v2 · 2026-04-14 · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{TWV2C5X66W4PZLZE2BGYUL7XZ3}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp
2. Internet Archive
3. Author claim: open
4. Citations: open
5. Replications: open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Across frontier LLMs, performance degrades consistently and monotonically as relational complexity (RC) increases, even when the total number of entities is held fixed. This failure mode persists with increased test-time compute and in-context learning, suggesting a limitation tied to the arity of the required relational binding rather than to insufficient inference steps or lack of exposure to examples.

C2 · weakest assumption

That the generative tasks in REL truly isolate relational complexity (arity of binding) without introducing uncontrolled confounders in input structure, vocabulary, or task framing that could explain the performance drop instead.

C3 · one-line summary

LLMs show consistent performance degradation on higher-arity relational reasoning tasks in REL, a new benchmark that isolates relational complexity across scientific domains.

Cited by

1 paper in Pith

Receipt and verification
First computed: 2026-06-03T01:05:13.536512Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05) · public key
Schema: pith-number/v1.0

Canonical hash

9daba176fef5b8fcaf24d04d8a2ff7cef68d180e20b89c23839d79d1829eb18e

Aliases

arxiv: 2604.12176 · arxiv_version: 2604.12176v2 · doi: 10.48550/arxiv.2604.12176 · pith_short_12: TWV2C5X66W4P · pith_short_16: TWV2C5X66W4PZLZE · pith_short_8: TWV2C5X6
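
The identifiers on this page appear to be related in two simple ways. Each `pith_short_N` alias is the first N characters of the 26-character Pith Number, and the Pith Number itself coincides with the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. Both relationships are inferred from the values in this record, not from a published Pith specification, and the helper names `base32_prefix` and `short_alias` are hypothetical. A minimal Python sketch that checks them:

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "9daba176fef5b8fcaf24d04d8a2ff7cef68d180e20b89c23839d79d1829eb18e"
PITH_NUMBER = "TWV2C5X66W4PZLZE2BGYUL7XZ3"
ALIASES = {
    "pith_short_8": "TWV2C5X6",
    "pith_short_12": "TWV2C5X66W4P",
    "pith_short_16": "TWV2C5X66W4PZLZE",
}


def base32_prefix(hex_digest: str, length: int) -> str:
    """Base32-encode a hex digest (RFC 4648 alphabet) and keep `length` chars.

    Assumption: this mirrors how the identifier relates to the hash in this
    record; it is not taken from a Pith specification.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_alias(pith_number: str, length: int) -> str:
    """Hypothetical helper: a short alias as a plain prefix of the number."""
    return pith_number[:length]


if __name__ == "__main__":
    # Does the Base32 prefix of the hash reproduce the Pith Number?
    print(base32_prefix(CANONICAL_HASH, len(PITH_NUMBER)) == PITH_NUMBER)
    # Is each listed alias a prefix of the Pith Number?
    for name, value in ALIASES.items():
        n = int(name.rsplit("_", 1)[1])
        print(name, short_alias(PITH_NUMBER, n) == value)
```

If the prefix relationship holds in general, a short alias can be checked against a full Pith Number without a network call, though shorter prefixes are more likely to collide.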
Agent API
Verify this Pith Number yourself
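# Fetch the record as JSON-LD, extract the canonical record, re-serialize it
# with sorted keys and compact separators, and print its SHA-256 digest.
# -f makes curl fail on HTTP errors; -S still reports them despite -s.
curl -fsSH 'Accept: application/ld+json' https://pith.science/pith/TWV2C5X66W4PZLZE2BGYUL7XZ3 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 9daba176fef5b8fcaf24d04d8a2ff7cef68d180e20b89c23839d79d1829eb18e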
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "55d280e5ff2c9ff860bb26d96a76c6a5e20046c66f90111525e574df15e7e675",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-04-14T01:07:15Z",
    "title_canon_sha256": "a851621559ef56051cf7edbbffbc879b3072759f2dd251310457fead204c9472"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2604.12176",
    "kind": "arxiv",
    "version": 2
  }
}
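
The shell pipeline above serializes the canonical record with sorted keys, compact separators, and non-ASCII characters left unescaped, then takes the SHA-256 of the UTF-8 bytes. The same steps can be sketched offline in Python against the record printed on this page. The function name `canonical_sha256` is hypothetical, and whether the digest equals the published canonical hash depends on this record being byte-for-byte what the server returns.

```python
import hashlib
import json

# Canonical record as printed on this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "55d280e5ff2c9ff860bb26d96a76c6a5e20046c66f90111525e574df15e7e675",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2026-04-14T01:07:15Z",
        "title_canon_sha256": "a851621559ef56051cf7edbbffbc879b3072759f2dd251310457fead204c9472",
    },
    "schema_version": "1.0",
    "source": {"id": "2604.12176", "kind": "arxiv", "version": 2},
}

# Canonical hash published on this page.
EXPECTED = "9daba176fef5b8fcaf24d04d8a2ff7cef68d180e20b89c23839d79d1829eb18e"


def canonical_sha256(record: dict) -> str:
    """Hash a record using the same serialization as the shell one-liner."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the digest
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    digest = canonical_sha256(RECORD)
    print(digest)
    print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, two mirrors that store the same record with different key order should compute the same digest.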