Pith Number

pith:TUMS7HT3

pith:2026:TUMS7HT3H47MWFF5JE2365U6H7
not attested · not anchored · not stored · refs pending

Beyond Static Leaderboards: Predictive Validity for the Evaluation of LLM Agents

Aaron Fan, Akshat Bhandari, Alimurtaza Mustafa Merchant, Alisha Vinod, Amaan Sheikh, Aman Upganlawar, Ananya Kapoor, Andrew Li, Ann Li, Aryaman Agrawal, Ayal Yakobe, Byeolah Kwon, Caroline Cahill, Charles Xu, Chengrui Li, Chun-Yi Tsai, Darief Maes, Dev Bahl, Dhaval C. Patel, Kaoutar El Maghraoui, Kirthana Natarajan, Krish Veera, Madhav Rajkondawar, Mana Abbaszadeh, Mao Le Jonathan Ang, Rohith Kanathur, Rui Li, Rujing Li, Rushin Bhatt, Sagar Chethan Kumar, Sajal Kumar Goyla, Sam Colman, Sanjaii Vijayakumar, Sanskruti Vijay Shejwal, Shambhawi Baswaraj Bhure, Shen Li, Shrey Arora, Shriya Aishani Rachakonda, Shuxin Lin, Siddharth Chethan Gowda, Tanisha Rathod, Tanmay Agarwal, Thai Quoc On, Thomas Ajai, Tianjun Feng, Tianyang Xu, Tomas Pasiecznik, Trisha Maturi, Vera Mazeeva, Vivek G. Iyer, Wei Alexander Xin, Winston Li, Yang-Jung Chen, Yassine Jebbouri, Yeshitha Bhuvanesh, Yihan Sun, Yitong Bai, Yubin Sally Go, Yunfeng Chen, Yusheng Li, Yuval Shemla

arxiv:2606.19704 v1 · 2026-06-18 · cs.AI

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{TUMS7HT3H47MWFF5JE2365U6H7}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
Receipt and verification
First computed: 2026-06-19T16:12:32.860063Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

9d192f9e7b3f3ecb14bd4935bf769e3febde159d4a636dd11075c5e88aff76f7

Aliases

arxiv: 2606.19704 · arxiv_version: 2606.19704v1 · doi: 10.48550/arxiv.2606.19704 · pith_short_12: TUMS7HT3H47M · pith_short_16: TUMS7HT3H47MWFF5 · pith_short_8: TUMS7HT3
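The aliases above appear to follow a simple rule. For this record, the 26-character identifier matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical SHA-256 digest, and each `pith_short_N` alias is the first N characters of that identifier. The page does not state this rule, so the sketch below is an observation that holds for this record, not the builder's specification. The function names `pith_number_from_hash` and `short_aliases` are hypothetical.

```python
import base64

# Canonical hash as listed on this page.
CANONICAL_HASH = "9d192f9e7b3f3ecb14bd4935bf769e3febde159d4a636dd11075c5e88aff76f7"


def pith_number_from_hash(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: this mapping is inferred from the record shown here,
    it is not documented on the page.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


def short_aliases(pith_number: str, sizes=(8, 12, 16)) -> dict:
    """Build the pith_short_N aliases as plain prefixes of the full identifier."""
    return {f"pith_short_{n}": pith_number[:n] for n in sizes}


pith_number = pith_number_from_hash(CANONICAL_HASH)
aliases = short_aliases(pith_number)

# Compare against the identifier printed on the page.
print(pith_number == "TUMS7HT3H47MWFF5JE2365U6H7")
print(aliases)
```

If the comparison prints `False` for another record, the inferred rule does not generalize and the identifier should be taken from the record itself.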
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/TUMS7HT3H47MWFF5JE2365U6H7 \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 9d192f9e7b3f3ecb14bd4935bf769e3febde159d4a636dd11075c5e88aff76f7
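The same check can be run offline on a saved copy of the record. The sketch below mirrors the serialization used in the one-liner above (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes) and compares the digest with an expected value. The helper names `canonical_hash` and `verify` are hypothetical, and the toy record is only there to show that key order does not affect the result.

```python
import hashlib
import json


def canonical_hash(record: dict) -> str:
    """SHA-256 over the record serialized the same way as the shell one-liner."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not influence the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as UTF-8
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record: dict, expected_hex: str) -> bool:
    """Return True when the recomputed digest matches the published one."""
    return canonical_hash(record) == expected_hex.lower()


# Toy records with the same content in a different key order.
record_a = {"schema_version": "1.0", "source": {"id": "x", "version": 1}}
record_b = {"source": {"version": 1, "id": "x"}, "schema_version": "1.0"}

print(canonical_hash(record_a) == canonical_hash(record_b))
print(verify(record_a, canonical_hash(record_b).upper()))
```

To check this record, load the `canonical_record` object from the API response with `json.load` and pass it to `verify` together with the canonical hash listed on the page.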
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a08d85c4518229e8d8935c75cdb8a5b82b7358c06bde6fe3a1d13547d0c4c24a",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2026-06-18T02:02:46Z",
    "title_canon_sha256": "e9e91e245b5881ce729a859ba11d5ae2a68ff34660bbf7d66becc24a1f115181"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2606.19704",
    "kind": "arxiv",
    "version": 1
  }
}
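The `source` block of the record lines up with the arXiv entries in the Aliases section. A minimal consistency check, assuming the alias formats are exactly the ones shown on this page (the function `source_aliases` is hypothetical and covers only the `arxiv` kind):

```python
def source_aliases(source: dict) -> dict:
    """Derive the arXiv aliases from a record's `source` block.

    Assumption: alias formats are copied from this page's Aliases line;
    other source kinds are not handled.
    """
    if source.get("kind") != "arxiv":
        raise ValueError(f"unsupported source kind: {source.get('kind')!r}")
    arxiv_id = source["id"]
    return {
        "arxiv": arxiv_id,
        "arxiv_version": f"{arxiv_id}v{source['version']}",
        "doi": f"10.48550/arxiv.{arxiv_id}",
    }


# The `source` block from the canonical record above.
source = {"id": "2606.19704", "kind": "arxiv", "version": 1}
derived = source_aliases(source)
print(derived)
```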