Integrity report for How Much Do Large Language Model Cheat on Evaluation? Benchmarking Overestimation under the One-Time-Pad-Based Framework

A machine-verified record of the checks Pith has run against this paper: detector runs, findings, signed bundle events, and canonical identifiers.

arXiv:2507.19219 · pith:2025:SX7PTBO5XB6PTT3ZLABYKAAWG2

Critical: 0

Advisory: 0

Detectors run: 0

Last checked: —


Detector runs

Findings

No public integrity findings for this paper.

Signed record

The machine-readable record for this paper lives at /pith/SX7PTBO5/integrity.json. Where a Pith Number exists, its bundle also includes signed pith.integrity.v1 events.