Pith Number

pith:EFORJFNA

pith:2026:EFORJFNADYDAH7LN4WM5CM4KXJ
not attested · not anchored · not stored · refs pending

Frontier Lag: A Bibliometric Audit of Capability Misrepresentation in Academic AI Evaluation

David Gringras, Misha Salahshoor

Academic LLM evaluations test models a median of 10.85 ECI points behind the frontier; the lag is widening, and conclusions are frequently overgeneralized to claims about 'AI'.

arxiv:2605.04135 v2 · 2026-05-05 · cs.CY · cs.AI · cs.CL

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{EFORJFNADYDAH7LN4WM5CM4KXJ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

The median paper evaluates a model +10.85 ECI behind the contemporaneous frontier at evaluation time (H1); the gap is widening at +5.53 ECI/year (H2; 95% CI [+5.03, +5.83]). Only 3.2% of abstracts disclose reasoning-mode status and 52.5% state conclusions at the level of 'AI'.

C2 · weakest assumption

That the keyword-based sampling of 112,303 records and the 18,574 admissible papers form a representative sample of the LLM evaluation literature, and that the reproduced Epoch AI Capabilities Index accurately ranks models at the time of each paper's evaluation.

C3 · one-line summary

Academic LLM papers evaluate models a median of 10.85 ECI points behind the frontier at evaluation time, with the gap widening by 5.53 ECI per year, reasoning-mode status rarely disclosed, and conclusions frequently overgeneralized to 'AI'.

Formal links

2 machine-checked theorem links

Cited by

1 paper in Pith

Receipt and verification
First computed 2026-06-05T00:13:46.746257Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05)
Schema pith-number/v1.0

Canonical hash

215d1495a01e0603fd6de599d1338aba51bb268ef28fcafdd762511341b632d9

Aliases

arxiv: 2605.04135 · arxiv_version: 2605.04135v2 · doi: 10.48550/arxiv.2605.04135 · pith_short_12: EFORJFNADYDA · pith_short_16: EFORJFNADYDAH7LN · pith_short_8: EFORJFNA
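The short aliases above are prefixes of the full Pith Number. In this record, the full identifier also coincides with the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash. That relationship is an observation about the values on this page, not a documented rule of the pith-number/v1.0 schema, so the sketch below treats it as an assumption. The helper name `base32_prefix` is hypothetical.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "215d1495a01e0603fd6de599d1338aba51bb268ef28fcafdd762511341b632d9"
PITH_NUMBER = "EFORJFNADYDAH7LN4WM5CM4KXJ"


def base32_prefix(hex_digest: str, length: int) -> str:
    """Base32-encode a hex digest (RFC 4648 alphabet) and keep the first `length` characters."""
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


# Assumption (not stated by the schema): the identifier is a
# 26-character Base32 prefix of the SHA-256 canonical hash.
derived = base32_prefix(CANONICAL_HASH, len(PITH_NUMBER))
print("derived identifier:", derived)
print("matches Pith Number on this page:", derived == PITH_NUMBER)

# The short aliases are plain prefixes of the full identifier.
for n in (8, 12, 16):
    print(f"pith_short_{n}: {PITH_NUMBER[:n]}")
```

If the comparison holds for other records as well, a short alias can be checked against a canonical hash without fetching the full record; that generalization is untested here.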
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/EFORJFNADYDAH7LN4WM5CM4KXJ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 215d1495a01e0603fd6de599d1338aba51bb268ef28fcafdd762511341b632d9
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "4fb1f0e94802056817ba98e883a914168127740ac53bb5c0f1c6b600d686e278",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.CL"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CY",
    "submitted_at": "2026-05-05T17:58:35Z",
    "title_canon_sha256": "8cbb7c7abfe2f71c714433ec4e0464b603e5ecd19e4d1c6cafd4f19d670db941"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.04135",
    "kind": "arxiv",
    "version": 2
  }
}
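The verification one-liner above canonicalizes the record (sorted keys, compact separators, no ASCII escaping) and hashes the UTF-8 bytes with SHA-256. A minimal offline sketch of the same step, using the record exactly as printed on this page, might look like the following. The function name `canonical_sha256` is hypothetical, and the sketch assumes the JSON shown here is byte-for-byte what the service returns under `.canonical_record`.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Serialize with sorted keys and compact separators, then SHA-256 the UTF-8 bytes."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "4fb1f0e94802056817ba98e883a914168127740ac53bb5c0f1c6b600d686e278",
        "cross_cats_sorted": ["cs.AI", "cs.CL"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CY",
        "submitted_at": "2026-05-05T17:58:35Z",
        "title_canon_sha256": "8cbb7c7abfe2f71c714433ec4e0464b603e5ecd19e4d1c6cafd4f19d670db941",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.04135", "kind": "arxiv", "version": 2},
}

# Canonical hash listed on this page, used only for comparison.
EXPECTED = "215d1495a01e0603fd6de599d1338aba51bb268ef28fcafdd762511341b632d9"

digest = canonical_sha256(record)
print(digest)
print("matches hash on this page:", digest == EXPECTED)
```

Because keys are sorted before serialization, the digest does not depend on the order in which the fields are written; list order (for example in `cross_cats_sorted`) still matters.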