
Integrity report for The Evaluation Game: Beyond Static LLM Benchmarking

A machine-verified record of the checks Pith has run against this paper: detector runs, findings, signed bundle events, and canonical identifiers.

arXiv:2605.19377 · pith:2026:QH7YROHDI2KMERXY4PYAQHNO55

Critical: 0
Advisory: 1
Detectors run: 6
Last checked: 2026-05-20


Detector runs

citation_quote_validity completed v0.1.0 · findings 0 · 2026-05-20 21:49:49.473688+00:00
cited_work_retraction completed v1.0.0 · findings 0 · 2026-05-20 14:22:21.576328+00:00
doi_title_agreement completed v1.0.0 · findings 0 · 2026-05-20 08:01:22.489562+00:00
doi_compliance completed v1.0.0 · findings 1 · 2026-05-20 08:00:08.573339+00:00
ai_meta_artifact skipped v1.0.0 · findings 0 · 2026-05-20 02:33:28.684333+00:00
claim_evidence completed v1.0.0 · findings 0 · 2026-05-20 02:01:59.998856+00:00
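
The detector run lines above follow a regular layout (detector name, status, version, finding count, timestamp), so they can be tallied mechanically. The following is a minimal sketch in Python, assuming that layout holds for every line; the names `RUN_PATTERN` and `parse_run` are hypothetical helpers for illustration and are not part of any Pith API.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout, inferred from the listing in this report:
#   <detector> <status> v<version> · findings <n> · <timestamp with UTC offset>
RUN_PATTERN = re.compile(
    r"^(?P<detector>\S+) (?P<status>\S+) v(?P<version>\S+)"
    r" · findings (?P<findings>\d+) · (?P<timestamp>.+)$"
)

# The six detector run lines, copied verbatim from this report.
RUN_LINES = [
    "citation_quote_validity completed v0.1.0 · findings 0 · 2026-05-20 21:49:49.473688+00:00",
    "cited_work_retraction completed v1.0.0 · findings 0 · 2026-05-20 14:22:21.576328+00:00",
    "doi_title_agreement completed v1.0.0 · findings 0 · 2026-05-20 08:01:22.489562+00:00",
    "doi_compliance completed v1.0.0 · findings 1 · 2026-05-20 08:00:08.573339+00:00",
    "ai_meta_artifact skipped v1.0.0 · findings 0 · 2026-05-20 02:33:28.684333+00:00",
    "claim_evidence completed v1.0.0 · findings 0 · 2026-05-20 02:01:59.998856+00:00",
]


def parse_run(line):
    """Parse one detector run line into a dict (hypothetical helper)."""
    match = RUN_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized detector run line: {line!r}")
    run = match.groupdict()
    run["findings"] = int(run["findings"])
    # fromisoformat accepts the space separator and the +00:00 offset.
    run["timestamp"] = datetime.fromisoformat(run["timestamp"])
    return run


runs = [parse_run(line) for line in RUN_LINES]

# Summary figures comparable to the counters at the top of the report.
total_findings = sum(run["findings"] for run in runs)
status_counts = Counter(run["status"] for run in runs)
last_checked = max(run["timestamp"] for run in runs).date()
detectors_with_findings = [run["detector"] for run in runs if run["findings"] > 0]

print(f"Detectors run: {len(runs)}")
print(f"Total findings: {total_findings}")
print(f"Status counts: {dict(status_counts)}")
print(f"Last checked: {last_checked.isoformat()}")
print(f"Detectors with findings: {detectors_with_findings}")
```

Parsing timestamps into timezone-aware `datetime` objects, rather than comparing strings, keeps the "last checked" calculation correct even if a future line carries a different UTC offset.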

Findings

No public integrity findings for this paper.

Signed record

The machine-readable record for this paper lives at /pith/2605.19377/integrity.json. Pith Number bundles also include signed pith.integrity.v1 events where a Pith Number exists.