{"state_type":"pith_open_graph_state","state_version":"1.0","pith_number":"pith:2026:TYFP47YCUZP2S3VOTW5JJBPB4J","merge_version":"pith-open-graph-merge-v1","event_count":2,"valid_event_count":2,"invalid_event_count":0,"equivocation_count":0,"current":{"canonical_record":{"metadata":{"abstract_canon_sha256":"37fa01c8078fcd21ad6b556a4f96d3ee0e352ad10bb897463c20c6a22e816d0d","cross_cats_sorted":["cs.LG","math.PR"],"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"stat.ML","submitted_at":"2026-05-13T07:54:37Z","title_canon_sha256":"4c3b6fc00a830b55b9144ad0c1642d3ff24bfb8d45287df12715478db6a34303"},"schema_version":"1.0","source":{"id":"2605.13127","kind":"arxiv","version":1}},"source_aliases":[{"alias_kind":"arxiv","alias_value":"2605.13127","created_at":"2026-05-18T03:08:57Z"},{"alias_kind":"arxiv_version","alias_value":"2605.13127v1","created_at":"2026-05-18T03:08:57Z"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2605.13127","created_at":"2026-05-18T03:08:57Z"},{"alias_kind":"pith_short_12","alias_value":"TYFP47YCUZP2","created_at":"2026-05-18T12:33:37Z"},{"alias_kind":"pith_short_16","alias_value":"TYFP47YCUZP2S3VO","created_at":"2026-05-18T12:33:37Z"},{"alias_kind":"pith_short_8","alias_value":"TYFP47YC","created_at":"2026-05-18T12:33:37Z"}],"graph_snapshots":[{"event_id":"sha256:dad8def18b68739aba6f85aaed65ea9e72db503d5893eef00b0b72f084a5b2c2","target":"graph","created_at":"2026-05-18T03:08:57Z","signer":{"key_id":"pith-v1-2026-05","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","signer_id":"pith.science","signer_type":"pith_registry"},"payload":{"graph_snapshot":{"author_claims":{"count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","strong_count":0},"builder_version":"pith-number-builder-2026-05-17-v1","claims":{"count":4,"items":[{"attestation":"unclaimed","claim_id":"C1","kind":"strongest_claim","source":"verdict.strongest_claim","status":"machine_extracted","text":"We propose new DPPs on the Euclidean space based on wavelets, with provably better accuracy guarantees than the best known rates. Second, we introduce a general method to convert such continuous DPPs into discrete kernels, which simultaneously preserves the desired variance decay and reveals a low-rank decomposition of the discrete kernel."},{"attestation":"unclaimed","claim_id":"C2","kind":"weakest_assumption","source":"verdict.weakest_assumption","status":"machine_extracted","text":"The discretization procedure preserves the variance reduction properties of the continuous wavelet DPPs with only negligible degradation when applied to finite datasets, and that the low-rank structure remains exploitable without hidden computational costs."},{"attestation":"unclaimed","claim_id":"C3","kind":"one_line_summary","source":"verdict.one_line_summary","status":"machine_extracted","text":"Wavelet DPP kernels deliver improved continuous variance reduction and a discretization procedure that preserves decay rates for discrete ML subsampling tasks including rough objectives."},{"attestation":"unclaimed","claim_id":"C4","kind":"headline","source":"verdict.pith_extraction.headline","status":"machine_extracted","text":"Wavelet-based DPPs on Euclidean space discretize to low-rank kernels that preserve superior variance reduction for minibatches on rough objectives."}],"snapshot_sha256":"b3b6cfae40857457276f3da9d3d30e7db947391561ce3c9c0fc6006054397c30"},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"paper":{"abstract_excerpt":"Determinantal point processes (DPPs) have emerged as a kernelized alternative to vanilla independent sampling for generating efficient minibatches, coresets and other parsimonious representations of large-scale datasets. While theoretical foundations and promising empirical performance have been demonstrated, there are two challenges for current proposals for DPP-based coresets or minibatches. The first is the need for families of DPPs with certain key variance reduction properties, usually constructed in a continuous setting, of which there are few known examples. The second is the need for a","authors_text":"Hoang-Son Tran, Pranav Gupta, R\\'emi Bardenet, Subhroshekhar Ghosh","cross_cats":["cs.LG","math.PR"],"headline":"Wavelet-based DPPs on Euclidean space discretize to low-rank kernels that preserve superior variance reduction for minibatches on rough objectives.","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"stat.ML","submitted_at":"2026-05-13T07:54:37Z","title":"State-of-art minibatches via novel DPP kernels: discretization, wavelets, and rough objectives"},"references":{"count":299,"internal_anchors":16,"resolved_work":299,"sample":[{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":1,"title":"Miramont, J. M. and Tan, K. A. and Mukherjee, S. S. and Bardenet, R. and Ghosh, S. , date-added =. arXiv preprint arXiv:2504.07720 , title =","work_id":"1b079da6-1530-4530-9466-ddb967f59414","year":null},{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":2,"title":"Miramont, J. M. and Auger, F. and Colominas, M. A. and Laurent, N. and Meignen, S. , date-added =. Signal Processing , title =","work_id":"919a03eb-7645-450a-a33a-8d692a141907","year":null},{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":3,"title":"Miramont, J. M. and Bardenet, R. and Chainais, P. and Auger, F. , booktitle =. Adaptive hyperparameter tuning for time-frequency algorithms based on the zeros of the spectrogram , year =","work_id":"249c1ea2-ebe6-440f-ad2f-843769014c62","year":null},{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":4,"title":"Lacoste--Julien, S. and Husz. Approximate inference for the loss-calibrated. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS) , organization =","work_id":"ae3fb55e-6d4b-45f6-9c4b-ddfee0def6db","year":null},{"cited_arxiv_id":"","doi":"","is_internal_anchor":false,"ref_index":5,"title":"Advances in Neural Information Processing Systems (NeurIPS) , title =","work_id":"bc34f26f-676a-459b-919b-5d9fba8529b3","year":null}],"snapshot_sha256":"00d7e81a1190c82abae8ac2c0e16cdd82fe8752df2535c9aa15a3c3ce0b0e919"},"source":{"id":"2605.13127","kind":"arxiv","version":1},"verdict":{"created_at":"2026-05-14T18:23:15.698803Z","id":"b75d245f-ec28-4bf7-8395-8a48dee0a16e","model_set":{"reader":"grok-4.3"},"one_line_summary":"Wavelet DPP kernels deliver improved continuous variance reduction and a discretization procedure that preserves decay rates for discrete ML subsampling tasks including rough objectives.","pipeline_version":"pith-pipeline@v0.9.0","pith_extraction_headline":"Wavelet-based DPPs on Euclidean space discretize to low-rank kernels that preserve superior variance reduction for minibatches on rough objectives.","strongest_claim":"We propose new DPPs on the Euclidean space based on wavelets, with provably better accuracy guarantees than the best known rates. Second, we introduce a general method to convert such continuous DPPs into discrete kernels, which simultaneously preserves the desired variance decay and reveals a low-rank decomposition of the discrete kernel.","weakest_assumption":"The discretization procedure preserves the variance reduction properties of the continuous wavelet DPPs with only negligible degradation when applied to finite datasets, and that the low-rank structure remains exploitable without hidden computational costs."}},"verdict_id":"b75d245f-ec28-4bf7-8395-8a48dee0a16e"}}],"author_attestations":[],"timestamp_anchors":[],"storage_attestations":[],"citation_signatures":[],"replication_records":[],"corrections":[],"mirror_hints":[],"record_created":{"event_id":"sha256:61a1aa196d642045f6b094642ffec20793701e1657d17754274f4a7d6b243ed4","target":"record","created_at":"2026-05-18T03:08:57Z","signer":{"key_id":"pith-v1-2026-05","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","signer_id":"pith.science","signer_type":"pith_registry"},"payload":{"attestation_state":"computed","canonical_record":{"metadata":{"abstract_canon_sha256":"37fa01c8078fcd21ad6b556a4f96d3ee0e352ad10bb897463c20c6a22e816d0d","cross_cats_sorted":["cs.LG","math.PR"],"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"stat.ML","submitted_at":"2026-05-13T07:54:37Z","title_canon_sha256":"4c3b6fc00a830b55b9144ad0c1642d3ff24bfb8d45287df12715478db6a34303"},"schema_version":"1.0","source":{"id":"2605.13127","kind":"arxiv","version":1}},"canonical_sha256":"9e0afe7f02a65fa96eae9dba9485e1e24e0a3c97a4edf33d682ca35143f11771","receipt":{"algorithm":"ed25519","builder_version":"pith-number-builder-2026-05-17-v1","canonical_sha256":"9e0afe7f02a65fa96eae9dba9485e1e24e0a3c97a4edf33d682ca35143f11771","first_computed_at":"2026-05-18T03:08:57.855857Z","key_id":"pith-v1-2026-05","kind":"pith_receipt","last_reissued_at":"2026-05-18T03:08:57.855857Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54","receipt_version":"0.3","signature_b64":"ZBYHRjoJqp1J67LubonoY8ST+nrobrjGJvVqjpCOY4Migg952RwNyeYLnPNMC0bZChU7S1cV2VXl3KMbWhWWDA==","signature_status":"signed_v1","signed_at":"2026-05-18T03:08:57.856488Z","signed_message":"canonical_sha256_bytes"},"source_id":"2605.13127","source_kind":"arxiv","source_version":1}}},"equivocations":[],"invalid_events":[],"applied_event_ids":["sha256:61a1aa196d642045f6b094642ffec20793701e1657d17754274f4a7d6b243ed4","sha256:dad8def18b68739aba6f85aaed65ea9e72db503d5893eef00b0b72f084a5b2c2"],"state_sha256":"ee155d9000a19357390abd0eae545ca95c29eff1da23f0209bec15634f8ac42d"}