{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2026:DOUQCTQEBDATN4TIJNAZDSEOX5","short_pith_number":"pith:DOUQCTQE","schema_version":"1.0","canonical_sha256":"1ba9014e0408c136f2684b4191c88ebf7dbf6c8d73914048fd987316bb2ab74b","source":{"kind":"arxiv","id":"2603.02928","version":2},"attestation_state":"computed","paper":{"title":"LOO-PIT predictive model checking","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Leave-one-out PIT values are dependent in finite samples, so standard uniformity tests for Bayesian model calibration have lower power than expected.","cross_cats":["stat.CO"],"primary_cat":"stat.ME","authors_text":"Aki Vehtari, Herman Tesso","submitted_at":"2026-03-03T12:34:52Z","abstract_excerpt":"We consider predictive checking for Bayesian model assessment using leave-one-out probability integral transform (LOO-PIT). LOO-PIT values are conditional cumulative predictive probabilities given LOO predictive distributions and corresponding left out observations. For a well-calibrated model, LOO-PIT values should be near uniformly distributed, but in the finite sample case they are not independent, due to LOO predictive distributions being determined by nearly the same data (all but one observation). We prove that this dependency is non-negligible in the finite case and depends on model com"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":true},"canonical_record":{"source":{"id":"2603.02928","kind":"arxiv","version":2},"metadata":{"license":"http://creativecommons.org/licenses/by/4.0/","primary_cat":"stat.ME","submitted_at":"2026-03-03T12:34:52Z","cross_cats_sorted":["stat.CO"],"title_canon_sha256":"6a2a8d3dc75e2b67e2012e41c32df77994f26dab3d76d63ed63bfd6d6fda5a7a","abstract_canon_sha256":"58779e983ff2a6774c912d94a817648f4c8d52a8a3841da5d175993f22330203"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-18T02:44:30.919407Z","signature_b64":"vK8gVKvzghPri+A9O3U0qzwOo7vWfXsGTxAkKL/mQ9//FHC+w5WEk/Z6Dh1oUFnNPII5GslifGw2YbABrqyBDw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"1ba9014e0408c136f2684b4191c88ebf7dbf6c8d73914048fd987316bb2ab74b","last_reissued_at":"2026-05-18T02:44:30.918889Z","signature_status":"signed_v1","first_computed_at":"2026-05-18T02:44:30.918889Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"LOO-PIT predictive model checking","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Leave-one-out PIT values are dependent in finite samples, so standard uniformity tests for Bayesian model calibration have lower power than expected.","cross_cats":["stat.CO"],"primary_cat":"stat.ME","authors_text":"Aki Vehtari, Herman Tesso","submitted_at":"2026-03-03T12:34:52Z","abstract_excerpt":"We consider predictive checking for Bayesian model assessment using leave-one-out probability integral transform (LOO-PIT). LOO-PIT values are conditional cumulative predictive probabilities given LOO predictive distributions and corresponding left out observations. For a well-calibrated model, LOO-PIT values should be near uniformly distributed, but in the finite sample case they are not independent, due to LOO predictive distributions being determined by nearly the same data (all but one observation). We prove that this dependency is non-negligible in the finite case and depends on model com"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We prove that this dependency is non-negligible in the finite case and depends on model complexity. We propose three testing procedures that can be used for continuous and discrete dependent uniform values... Extensive numerical experiments... demonstrate that the proposed tests achieve competitive performance overall and have much higher power than standard uniformity tests based on the independence assumption.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The dependence structure induced by LOO predictive distributions can be adequately captured by the three proposed testing procedures without introducing new bias or power loss in realistic finite-sample regimes; the abstract provides no detail on how the tests are derived or calibrated.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"New tests for LOO-PIT uniformity account for non-negligible dependence caused by shared data across leave-one-out predictions, achieving higher power than independence-assuming alternatives.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Leave-one-out PIT values are dependent in finite samples, so standard uniformity tests for Bayesian model calibration have lower power than expected.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"c04e151ed21ffc4eb1be1c0cb035270d4809b508a5e0560d9a6f5f82849c8547"},"source":{"id":"2603.02928","kind":"arxiv","version":2},"verdict":{"id":"ec0475d8-5165-4941-9cc7-e147aee2d6df","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-15T17:13:36.737618Z","strongest_claim":"We prove that this dependency is non-negligible in the finite case and depends on model complexity. We propose three testing procedures that can be used for continuous and discrete dependent uniform values... Extensive numerical experiments... demonstrate that the proposed tests achieve competitive performance overall and have much higher power than standard uniformity tests based on the independence assumption.","one_line_summary":"New tests for LOO-PIT uniformity account for non-negligible dependence caused by shared data across leave-one-out predictions, achieving higher power than independence-assuming alternatives.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The dependence structure induced by LOO predictive distributions can be adequately captured by the three proposed testing procedures without introducing new bias or power loss in realistic finite-sample regimes; the abstract provides no detail on how the tests are derived or calibrated.","pith_extraction_headline":"Leave-one-out PIT values are dependent in finite samples, so standard uniformity tests for Bayesian model calibration have lower power than expected."},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":1,"snapshot_sha256":"7de5f377d1da5fc6f2d96d0410644e9529d0aafd1f23052e7fce49cbb3007673"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2603.02928","created_at":"2026-05-18T02:44:30.918954+00:00"},{"alias_kind":"arxiv_version","alias_value":"2603.02928v2","created_at":"2026-05-18T02:44:30.918954+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2603.02928","created_at":"2026-05-18T02:44:30.918954+00:00"},{"alias_kind":"pith_short_12","alias_value":"DOUQCTQEBDAT","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_16","alias_value":"DOUQCTQEBDATN4TI","created_at":"2026-05-18T12:33:37.589309+00:00"},{"alias_kind":"pith_short_8","alias_value":"DOUQCTQE","created_at":"2026-05-18T12:33:37.589309+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":0,"internal_anchor_count":0,"sample":[]},"formal_canon":{"evidence_count":1,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5","json":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5.json","graph_json":"https://pith.science/api/pith-number/DOUQCTQEBDATN4TIJNAZDSEOX5/graph.json","events_json":"https://pith.science/api/pith-number/DOUQCTQEBDATN4TIJNAZDSEOX5/events.json","paper":"https://pith.science/paper/DOUQCTQE"},"agent_actions":{"view_html":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5","download_json":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5.json","view_paper":"https://pith.science/paper/DOUQCTQE","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2603.02928&json=true","fetch_graph":"https://pith.science/api/pith-number/DOUQCTQEBDATN4TIJNAZDSEOX5/graph.json","fetch_events":"https://pith.science/api/pith-number/DOUQCTQEBDATN4TIJNAZDSEOX5/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5/action/timestamp_anchor","attest_storage":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5/action/storage_attestation","attest_author":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5/action/author_attestation","sign_citation":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5/action/citation_signature","submit_replication":"https://pith.science/pith/DOUQCTQEBDATN4TIJNAZDSEOX5/action/replication_record"}},"created_at":"2026-05-18T02:44:30.918954+00:00","updated_at":"2026-05-18T02:44:30.918954+00:00"}