pith:75ARSMLE
Understanding Self-Supervised Learning via Latent Distribution Matching
Self-supervised learning works by matching representations to an assumed latent model while maximizing their entropy to avoid collapse.
arxiv:2605.03517 v2 · 2026-05-05 · cs.LG · stat.ML
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{75ARSMLEIPA5N2TTRGROR4HIRE}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We cast SSL as latent distribution matching (LDM): learning representations that maximize their log-probability under an assumed latent model (alignment), while maximizing latent entropy to prevent collapse (uniformity). This view unifies independent component analysis with contrastive, non-contrastive, and predictive SSL methods... We further prove that predictive LDM yields identifiable latent representations under mild assumptions, even with nonlinear predictors.
These claims depend on the existence and suitability of an 'assumed latent model' whose log-probability can be maximized, and on the 'mild assumptions' required for the identifiability proof. Without the full derivations, it is unclear how restrictive these are or whether standard SSL objectives satisfy them.
Self-supervised learning is cast as latent distribution matching that aligns representations to a model while enforcing uniformity, unifying multiple SSL families and proving identifiability for predictive variants even with nonlinear predictors.
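The alignment-plus-uniformity reading in the claims above can be written as a single objective. The following is a sketch under assumed notation, not the paper's own formulation: $f$ is the encoder, $z = f(x)$, $q_f$ is the distribution of $z$ induced by the data, and $p$ is the assumed latent model. The paper's actual objectives (for example, over pairs of views or with a conditional latent model) may differ in form. The block needs `amsmath`.

```latex
% Notation assumed for illustration: f is the encoder, z = f(x),
% q_f is the distribution of z, p is the assumed latent model.
\begin{align}
\max_{f}\;
  \underbrace{\mathbb{E}_{z \sim q_f}\!\left[\log p(z)\right]}_{\text{alignment}}
  + \underbrace{H(q_f)}_{\text{uniformity}}
&= \max_{f}\; \mathbb{E}_{z \sim q_f}\!\left[\log p(z) - \log q_f(z)\right] \\
&= \max_{f}\; -\,\mathrm{KL}\!\left(q_f \,\|\, p\right).
\end{align}
```

Read this way, the two terms together minimize the KL divergence from the latent distribution to the assumed model, which is one sense in which the objective "matches" distributions. Dropping the entropy term leaves only the alignment term, which is maximized by mapping every input to a mode of $p$, that is, a collapsed representation.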
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:05:45.907367Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
ff4119316443c1d6ea7389a2e8f0e889210b655147fbe85974e0bbcd6fd76fb2
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/75ARSMLEIPA5N2TTRGROR4HIRE \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: ff4119316443c1d6ea7389a2e8f0e889210b655147fbe85974e0bbcd6fd76fb2
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a7ce7975143800a7b5b8c6c81fc1a0663a8ffce46868a898538ad651689c5e33",
    "cross_cats_sorted": [
      "stat.ML"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-05T08:53:00Z",
    "title_canon_sha256": "22e38605503f4afc6f2631c9c4851a56388e9d647616cd20ef2234670bf7595d"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.03517",
    "kind": "arxiv",
    "version": 2
  }
}
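The verification one-liner above can also be run offline against the record shown on this page. The sketch below applies the same canonicalization as the `python3 -c` step (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding) and hashes the result with SHA-256. The names `canonical_bytes`, `canonical_hash`, `RECORD`, and `EXPECTED` are illustrative and not part of any Pith API. It assumes the JSON displayed here is exactly the `canonical_record` that the endpoint returns.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no whitespace, as in the one-liner."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# Record copied from the "Canonical record JSON" section of this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "a7ce7975143800a7b5b8c6c81fc1a0663a8ffce46868a898538ad651689c5e33",
        "cross_cats_sorted": ["stat.ML"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-05T08:53:00Z",
        "title_canon_sha256": "22e38605503f4afc6f2631c9c4851a56388e9d647616cd20ef2234670bf7595d",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.03517", "kind": "arxiv", "version": 2},
}

# Published canonical hash from this page.
EXPECTED = "ff4119316443c1d6ea7389a2e8f0e889210b655147fbe85974e0bbcd6fd76fb2"

digest = canonical_hash(RECORD)
print(digest)
print("matches published hash:", digest == EXPECTED)
```

Because keys are sorted and whitespace is stripped before hashing, the indentation and key order of the JSON as displayed do not affect the digest. A mismatch would suggest the displayed record differs from the served one, or that the canonicalization assumed here is not the one the builder used.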