Pith Number

pith:5NWYXOMO

pith:2026:5NWYXOMOUGIRLZLH7IH2XRS34P
not attested · not anchored · not stored · refs pending

Croissant Baker: Metadata Generation for Discoverable, Governable, and Reusable ML Datasets

Anwai Archit, Christina Conrad Parry, Debanshu Das, Eric S. Rosenthal, Joan Giner-Miguelez, Joaquin Vanschoren, Lara Grosso, Luis Oala, Marzyeh Ghassemi, Matthew McDermott, Nobin Sarwar, Rafi Al Attrach, Rajat Ghosh, Rajna Fani, Sebastian Lobentanzer, Steffen Vogler, Sujata Goswami, Surbhi Motghare, Tom Pollard, Varuni H. K.

arxiv:2605.15079 v1 · 2026-05-14 · cs.LG · cs.DB · cs.DL · cs.IR

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{5NWYXOMOUGIRLZLH7IH2XRS34P}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle: live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
Receipt and verification
First computed: 2026-05-17T23:38:54.162577Z
Builder: pith-number-builder-2026-05-17-v1
Signature: Pith Ed25519 (pith-v1-2026-05)
Schema: pith-number/v1.0

Canonical hash

eb6d8bb98ea19115e567fa0fabc65be3c5489568cb40383b0829aa80c8c3cd36

Aliases

arxiv: 2605.15079 · arxiv_version: 2605.15079v1 · doi: 10.48550/arxiv.2605.15079 · pith_short_12: 5NWYXOMOUGIR · pith_short_16: 5NWYXOMOUGIRLZLH · pith_short_8: 5NWYXOMO
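
The identifiers on this page appear to be related: Base32-encoding the bytes of the canonical hash and keeping the first 26 characters reproduces the Pith Number shown above, and the `pith_short_*` aliases are prefixes of it. This relationship is inferred from the values on this page, not from a published Pith specification, so the sketch below is an illustration only. The helper name `pith_from_hash` is hypothetical.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "eb6d8bb98ea19115e567fa0fabc65be3c5489568cb40383b0829aa80c8c3cd36"
PITH_NUMBER = "5NWYXOMOUGIRLZLH7IH2XRS34P"


def pith_from_hash(hex_digest: str, length: int = 26) -> str:
    """Hypothetical helper: Base32-encode the digest bytes and keep a prefix.

    The 26-character length is an assumption taken from the identifier
    on this page, not from any Pith documentation.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


derived = pith_from_hash(CANONICAL_HASH)

# The short aliases listed above are plain prefixes of the full identifier.
short_aliases = {n: derived[:n] for n in (8, 12, 16)}

print(derived == PITH_NUMBER)  # True
print(short_aliases)           # {8: '5NWYXOMO', 12: '5NWYXOMOUGIR', 16: '5NWYXOMOUGIRLZLH'}
```

If the assumption holds for other records, a short alias can be checked against a full hash without any network call.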
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/5NWYXOMOUGIRLZLH7IH2XRS34P \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: eb6d8bb98ea19115e567fa0fabc65be3c5489568cb40383b0829aa80c8c3cd36
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a3fe3d2da2a53353118bb6852b8ae2c08fb22c9facc721bfcc75c54d4a62f6ff",
    "cross_cats_sorted": [
      "cs.DB",
      "cs.DL",
      "cs.IR"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T17:04:39Z",
    "title_canon_sha256": "05e6cca40cfa097227ceb856ef192b40b703aba4da0645fd29d2c23873e7a805"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15079",
    "kind": "arxiv",
    "version": 1
  }
}
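
The same check can be run offline against the record printed above. The following is a minimal sketch that mirrors the serialization settings of the one-liner in the verification command (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes). The function name `canonical_sha256` is hypothetical, and the exact canonicalization Pith uses is assumed to be what that one-liner shows.

```python
import hashlib
import json


def canonical_sha256(record: dict) -> str:
    """Hypothetical helper: hash a record using the same JSON settings
    as the verification one-liner on this page."""
    payload = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # compact form, no extra whitespace
        ensure_ascii=False,      # keep non-ASCII characters unescaped
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Canonical record copied from this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a3fe3d2da2a53353118bb6852b8ae2c08fb22c9facc721bfcc75c54d4a62f6ff",
        "cross_cats_sorted": ["cs.DB", "cs.DL", "cs.IR"],
        "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-14T17:04:39Z",
        "title_canon_sha256": "05e6cca40cfa097227ceb856ef192b40b703aba4da0645fd29d2c23873e7a805",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.15079", "kind": "arxiv", "version": 1},
}

# Hash published on this page, used only for comparison.
EXPECTED = "eb6d8bb98ea19115e567fa0fabc65be3c5489568cb40383b0829aa80c8c3cd36"

digest = canonical_sha256(record)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because the keys are sorted before hashing, the result does not depend on the order in which the fields are written in the dictionary.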