pith:7JVW7MQ7
Statistical Unlearning of Distributions: A Hypothesis Testing Approach
A hypothesis test comparing edited data to desired and unwanted distributions supplies a criterion for choosing which samples to remove when unlearning entire domains.
arxiv:2605.16645 v1 · 2026-05-15 · math.ST · cs.IT · cs.LG · math.IT · stat.ML · stat.TH
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{7JVW7MQ7KD7ONX7VG6APWPLAXY}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We formalize this using a hypothesis test of the edited data with the desired and unwanted domains, leading to an interpretable and robust criterion for selecting samples to remove. Within this statistical framework, we characterize the fundamental region of the allowable edited data distributions and the removal-preservation Pareto frontier for a broad class of distribution families.
Domains of information can be accurately modeled as probability distributions, and a hypothesis test between the edited dataset and the desired versus unwanted distributions provides a sufficient criterion for sample removal that preserves performance on the target domain.
The paper develops a hypothesis testing approach to distributional unlearning that characterizes the allowable edited distributions and the removal-preservation Pareto frontiers for parametric and nonparametric families, including Gaussian, Poisson, and Gaussian white noise models.
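To make the idea of a test-based removal criterion concrete, here is a minimal sketch under strong assumptions that are not taken from the paper: the desired and unwanted domains are known one-dimensional Gaussians, each sample is scored by a log-likelihood ratio, and samples scoring above a threshold are removed. The function names (`gaussian_logpdf`, `llr_scores`, `remove_by_llr`), the parameters, and the threshold are all hypothetical; this illustrates a generic likelihood-ratio rule, not the authors' procedure.

```python
import math
import random


def gaussian_logpdf(x, mean, std):
    """Log-density of a univariate Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def llr_scores(samples, desired, unwanted):
    """Per-sample log(q(x) / p(x)), with p = desired and q = unwanted.

    A positive score means the sample looks more like the unwanted domain.
    `desired` and `unwanted` are (mean, std) tuples (an assumed parametrization).
    """
    return [
        gaussian_logpdf(x, *unwanted) - gaussian_logpdf(x, *desired)
        for x in samples
    ]


def remove_by_llr(samples, desired, unwanted, threshold=0.0):
    """Keep only samples whose log-likelihood ratio is at most `threshold`."""
    scores = llr_scores(samples, desired, unwanted)
    return [x for x, s in zip(samples, scores) if s <= threshold]


# Toy data: an equal mixture of the two assumed domains.
rng = random.Random(0)
desired = (0.0, 1.0)   # assumed desired domain N(0, 1)
unwanted = (3.0, 1.0)  # assumed unwanted domain N(3, 1)
data = [rng.gauss(*desired) for _ in range(500)]
data += [rng.gauss(*unwanted) for _ in range(500)]

kept = remove_by_llr(data, desired, unwanted)
print(f"kept {len(kept)} of {len(data)} samples; mean of kept = {sum(kept) / len(kept):.3f}")
```

With equal variances the score is linear in `x` (here `3x - 4.5`), so a threshold of zero removes every sample above the midpoint 1.5. Raising the threshold preserves more of the desired domain at the cost of leaving more unwanted samples in place, which is the kind of removal-preservation trade-off the claims describe.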
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:02:34.135065Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
fa6b6fb21f50fee6dff53780fb3d60be07392612b84d96c983fb42366df961c1
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/7JVW7MQ7KD7ONX7VG6APWPLAXY \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: fa6b6fb21f50fee6dff53780fb3d60be07392612b84d96c983fb42366df961c1
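The same check can be written as a small Python function. This is a sketch that mirrors the serialization used in the one-liner above (sorted keys, compact separators, non-ASCII characters left unescaped, UTF-8 encoding); the function names `canonical_sha256` and `matches` are hypothetical, and fetching the record from the API is left out.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a JSON-compatible object using the canonical form from the one-liner:
    sorted keys, no whitespace between tokens, UTF-8 bytes."""
    payload = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def matches(record, expected_hex):
    """Return True if the record's canonical hash equals the published hash."""
    return canonical_sha256(record) == expected_hex.strip().lower()


# Toy demonstration: key order in the input does not change the digest.
a = {"schema_version": "1.0", "source": {"id": "2605.16645", "kind": "arxiv", "version": 1}}
b = {"source": {"version": 1, "kind": "arxiv", "id": "2605.16645"}, "schema_version": "1.0"}
print(canonical_sha256(a) == canonical_sha256(b))
```

To verify the actual record, load the full canonical record JSON with `json.loads` and pass it to `matches` together with the canonical hash listed above.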
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "f8f4f8d837ff4722cdd7fe6dae8a1c16ac4b9e5fd4886e3856dad02d542eb934",
    "cross_cats_sorted": [
      "cs.IT",
      "cs.LG",
      "math.IT",
      "stat.ML",
      "stat.TH"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "math.ST",
    "submitted_at": "2026-05-15T21:33:38Z",
    "title_canon_sha256": "0bac3f9c2c2fabee2359b15caeb17b03ee20ee175c4999bb8075d9ce849a1708"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.16645",
    "kind": "arxiv",
    "version": 1
  }
}