pith:KPXUNR6T
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
Pretrained language models can generate toxic text from seemingly innocuous prompts, and no current control method prevents it reliably.
arxiv:2009.11462 v2 · 2020-09-24 · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{KPXUNR6THD3NXDGYDRYVC6QNO2}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Using RealToxicityPrompts, we find that pretrained LMs can degenerate into toxic text even from seemingly innocuous prompts... no current method is failsafe against neural toxic degeneration.
Assumes that the automated toxicity classifier produces scores that reliably correspond to human judgments of toxicity across diverse prompts and generations.
Language models produce toxic text from innocuous prompts, and no tested control method fully prevents it, demonstrated via a new 100K-prompt web-derived dataset.
Receipt and verification
| First computed | 2026-05-17T23:38:50.603114Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
53ef46c7d338f6db8cd81c71517a0d768edd406efdeedc909e2b1c1243e8fbf3
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/KPXUNR6THD3NXDGYDRYVC6QNO2 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 53ef46c7d338f6db8cd81c71517a0d768edd406efdeedc909e2b1c1243e8fbf3
Canonical record JSON
{
"metadata": {
"abstract_canon_sha256": "3749c25aaae21dcfecfa070717e38d7d70f9e1fbaa64207045cf52e3f8b4d422",
"cross_cats_sorted": [],
"license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
"primary_cat": "cs.CL",
"submitted_at": "2020-09-24T03:17:19Z",
"title_canon_sha256": "d92c209d59272778bc18f45a0692e5200a239338ee2718041aa4910328593b2b"
},
"schema_version": "1.0",
"source": {
"id": "2009.11462",
"kind": "arxiv",
"version": 2
}
}
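The verification one-liner above can be unpacked into an offline check against the record shown here. The following is a minimal sketch, assuming the canonical form is exactly what that one-liner produces (keys sorted, compact separators, non-ASCII left unescaped, UTF-8 encoded). The names `RECORD_JSON`, `EXPECTED_HASH`, `canonical_bytes`, and `canonical_hash` are illustrative and not part of any Pith tooling.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section of this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "3749c25aaae21dcfecfa070717e38d7d70f9e1fbaa64207045cf52e3f8b4d422",
    "cross_cats_sorted": [],
    "license": "http://arxiv.org/licenses/nonexclusive-distrib/1.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2020-09-24T03:17:19Z",
    "title_canon_sha256": "d92c209d59272778bc18f45a0692e5200a239338ee2718041aa4910328593b2b"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2009.11462",
    "kind": "arxiv",
    "version": 2
  }
}
"""

# Hash published under "Canonical hash" on this page.
EXPECTED_HASH = "53ef46c7d338f6db8cd81c71517a0d768edd406efdeedc909e2b1c1243e8fbf3"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with the same json.dumps settings as the shell one-liner."""
    text = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters unescaped
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


if __name__ == "__main__":
    digest = canonical_hash(json.loads(RECORD_JSON))
    print(digest)
    print("match" if digest == EXPECTED_HASH else "mismatch")
```

Because the serialization sorts keys and strips whitespace, reformatting or reordering the JSON before hashing should not change the digest. A mismatch would suggest either a changed record or a canonicalization rule that differs from the assumption stated above.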