pith:JDVYZVME
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Contrastive learning with standard dropout as the only noise produces sentence embeddings that match or beat prior supervised results.
arxiv:2104.08821 v4 · 2021-04-18 · cs.CL · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{JDVYZVMEZKIKYMTR5Y7GAZ2ICM}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Our unsupervised and supervised models using BERT base achieve an average of 76.3% and 81.6% Spearman's correlation respectively, a 4.2% and 2.2% improvement compared to the previous best results. We also show, both theoretically and empirically, that the contrastive learning objective regularizes pre-trained embeddings' anisotropic space to be more uniform.
Standard dropout is sufficient as data augmentation to prevent representation collapse in the unsupervised contrastive objective, and NLI entailment and contradiction pairs form appropriate positive and hard-negative pairs, respectively, for learning general sentence embeddings.
SimCSE achieves 76.3% unsupervised and 81.6% supervised Spearman's correlation on STS tasks with BERT-base, improving prior best results by 4.2% and 2.2% via simple contrastive learning.
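The contrastive objective behind these claims can be illustrated with a minimal sketch. It assumes an InfoNCE-style loss with in-batch negatives over cosine similarities; the function names, the toy 2-D vectors, and the temperature of 0.05 are illustrative assumptions that are not stated in this record. In the unsupervised setting the "positive" of a sentence would be the same sentence encoded a second time under a different dropout mask; in the supervised setting the positive would be an entailment hypothesis and the optional hard negative a contradiction hypothesis.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchors, positives, hard_negatives=None, temperature=0.05):
    """Mean InfoNCE-style loss with in-batch negatives (illustrative sketch).

    anchors[i] should be close to positives[i]; every other positive in the
    batch, and every hard negative if given, acts as a negative for anchor i.
    The temperature default is an assumption made for this example.
    """
    losses = []
    for i, h in enumerate(anchors):
        logits = [cosine(h, p) / temperature for p in positives]
        if hard_negatives is not None:
            logits += [cosine(h, n) / temperature for n in hard_negatives]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the matching positive.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy embeddings (hypothetical): two "sentences" and their positive views.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
hard_negatives = [[0.8, 0.2], [0.2, 0.8]]

aligned = contrastive_loss(anchors, positives)
shuffled = contrastive_loss(anchors, positives[::-1])
supervised = contrastive_loss(anchors, positives, hard_negatives)
print(aligned, shuffled, supervised)
```

With correctly paired positives the loss is close to zero, mismatched pairs are penalized heavily, and adding hard negatives enlarges the denominator, which raises the loss for the same pairs.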
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:53.095688Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) |
| Schema | pith-number/v1.0 |
Canonical hash
48eb8cd584ca90ac3271ee3e606748130a9b5d5b1e43938a82e86154b8b68519
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/JDVYZVMEZKIKYMTR5Y7GAZ2ICM \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 48eb8cd584ca90ac3271ee3e606748130a9b5d5b1e43938a82e86154b8b68519
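The same check can be written as a short Python script, which may be easier to reuse than the shell pipeline. This is a sketch that mirrors the serialization options in the one-liner above (sorted keys, compact separators, non-ASCII characters preserved, UTF-8 bytes); the function names `canonical_hash` and `verify` are hypothetical, and fetching the record over HTTP is left out so the block stays self-contained.

```python
import hashlib
import json


def canonical_hash(record):
    """SHA-256 hex digest of a record serialized with sorted keys and no whitespace."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(record, expected_hex):
    """Return True when the record's canonical hash equals the expected digest."""
    return canonical_hash(record) == expected_hex.lower()


# Key order in the input does not affect the digest, because keys are sorted
# before hashing. The toy record below is an illustration only.
first = canonical_hash({"b": 1, "a": {"y": 2, "x": 3}})
second = canonical_hash({"a": {"x": 3, "y": 2}, "b": 1})
print(first == second)
```

To check a real record, load the JSON under `canonical_record` into a dict and pass it to `verify` together with the value listed under Canonical hash.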
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "0ba08acab8d5daef298eda40b969a5252fbad3b10cd5421f91a7fa217ca5a3b2",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2021-04-18T11:27:08Z",
    "title_canon_sha256": "521eac38e6d56e5902f623741973ee63232a52df03edcbb497fbb9d6cc2c02e0"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2104.08821",
    "kind": "arxiv",
    "version": 4
  }
}