pith:UUGCHFME
XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models
Large language models refuse safe prompts that resemble unsafe requests.
arxiv:2308.01263 v3 · 2023-08-02 · cs.CL · cs.AI
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{UUGCHFMEAGXRULYILDSIDJ35RX}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
We introduce a new test suite called XSTest to identify such eXaggerated Safety behaviours in a systematic way. XSTest comprises 250 safe prompts across ten prompt types that well-calibrated models should not refuse to comply with, and 200 unsafe prompts as contrasts that models, for most applications, should refuse.
The benchmark assumes that the 250 prompts selected by the authors are unambiguously safe, and that model refusals on them reliably indicate exaggerated safety rather than other factors such as capability limits or prompt ambiguity.
XSTest is a benchmark for detecting exaggerated safety refusals in large language models on clearly safe prompts.
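A benchmark with a safe split and an unsafe contrast split is typically read by comparing refusal rates on the two. The sketch below is a minimal illustration only: the `Result` record, its field names, the prompt-type labels, and the sample values are hypothetical, and it assumes each model response has already been labelled as a refusal or not by some separate procedure.

```python
from dataclasses import dataclass


@dataclass
class Result:
    prompt_type: str  # hypothetical label for one of the prompt types
    is_safe: bool     # True for the safe split, False for the unsafe contrasts
    refused: bool     # assumed to be labelled elsewhere (by a person or a classifier)


def refusal_rate(results: list[Result], *, safe: bool) -> float:
    """Fraction of prompts in the chosen split that the model refused."""
    subset = [r for r in results if r.is_safe == safe]
    if not subset:
        return 0.0
    return sum(r.refused for r in subset) / len(subset)


# Made-up illustrative data, not results from the paper.
results = [
    Result("type_a", True, False),
    Result("type_a", True, True),
    Result("type_b", True, False),
    Result("type_b", True, False),
    Result("contrast", False, True),
    Result("contrast", False, False),
]

safe_rate = refusal_rate(results, safe=True)     # refusals on safe prompts
unsafe_rate = refusal_rate(results, safe=False)  # refusals on unsafe contrasts
print(safe_rate, unsafe_rate)
```

Under this reading, a high refusal rate on the safe split would point to exaggerated safety, while a low refusal rate on the unsafe split would point to the opposite failure.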
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-17T23:38:53.209339Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
a50c23958401af1a2f0858e481a77d8dfd1538538de98387ade099064474869e
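Comparing the values on this page suggests that the 26-character Pith Number is the leading portion of the RFC 4648 base32 encoding of the canonical hash, and that the short form `pith:UUGCHFME` is its first eight characters. This is an observation from this one record, not a documented rule of the scheme. The helper name `base32_prefix` and the length of 26 are assumptions made for illustration.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "a50c23958401af1a2f0858e481a77d8dfd1538538de98387ade099064474869e"
PITH_NUMBER = "UUGCHFMEAGXRULYILDSIDJ35RX"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters."""
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


prefix = base32_prefix(CANONICAL_HASH)
print(prefix == PITH_NUMBER)  # True for this record
print(prefix[:8])             # UUGCHFME
```

If the relationship holds in general, the identifier can be recomputed from the hash alone, but other records would need to be checked before relying on it.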
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/UUGCHFMEAGXRULYILDSIDJ35RX \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: a50c23958401af1a2f0858e481a77d8dfd1538538de98387ade099064474869e
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "dcb2f0de1688e0d8877977724715ab901970b310f67a94791a0b805d0afb6017",
    "cross_cats_sorted": [
      "cs.AI"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-08-02T16:30:40Z",
    "title_canon_sha256": "60bdeef85f4cf639f393480b8b495ace355dbbf4deb3d84130db1d1dd184504a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2308.01263",
    "kind": "arxiv",
    "version": 3
  }
}
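The verification command on this page canonicalizes the record by serializing it with sorted keys and compact separators, then hashing the UTF-8 bytes with SHA-256. The same steps can be sketched offline in Python against the record as printed here. The function names `canonical_bytes` and `canonical_sha256` are hypothetical, and the sketch assumes the JSON shown on this page is the complete canonical record.

```python
import hashlib
import json

# Canonical record as printed on this page (assumed complete).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "dcb2f0de1688e0d8877977724715ab901970b310f67a94791a0b805d0afb6017",
    "cross_cats_sorted": ["cs.AI"],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-08-02T16:30:40Z",
    "title_canon_sha256": "60bdeef85f4cf639f393480b8b495ace355dbbf4deb3d84130db1d1dd184504a"
  },
  "schema_version": "1.0",
  "source": {"id": "2308.01263", "kind": "arxiv", "version": 3}
}
"""


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and no extra whitespace, as the page's one-liner does."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode()  # UTF-8 by default


def canonical_sha256(record: dict) -> str:
    """Hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_sha256(record)
print(digest)  # compare against the canonical hash listed on this page
```

Because keys are sorted before hashing, the digest does not depend on the key order or whitespace of the input. If the printed digest differs from the canonical hash, the likely explanation is that the text shown here is not byte-for-byte what the API returns.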