pith:ZGTKTZFW
Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
A survey finds that more aligned LLMs generally achieve higher trustworthiness, though the gains differ across categories.
arxiv:2308.05374 v2 · 2023-08-10 · cs.AI · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{ZGTKTZFW6VAR2IZJADJARJK4TJ}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
The measurement results indicate that, in general, more aligned models tend to perform better in terms of overall trustworthiness. However, the effectiveness of alignment varies across the different trustworthiness categories considered.
The findings rest on the assumption that the seven categories and 29 sub-categories comprehensively capture trustworthiness, and that the eight selected sub-categories and the chosen measurement methods accurately reflect real-world alignment.
The survey organizes LLM trustworthiness into seven categories and 29 sub-categories, measures eight of those sub-categories on popular models, and finds that more aligned models generally score higher, though the effectiveness of alignment varies by category.
Receipt and verification
| First computed | 2026-05-17T23:38:12.820356Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
c9a6a9e4b6f5411d232900d208a55c9a7de412fd7489d4c2e8ab15a9219e1409
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/ZGTKTZFW6VAR2IZJADJARJK4TJ \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c9a6a9e4b6f5411d232900d208a55c9a7de412fd7489d4c2e8ab15a9219e1409
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "f486721c6f283b619343311b946661d598241a74b6d7b31ef1a7c3e8492341d3",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2023-08-10T06:43:44Z",
    "title_canon_sha256": "e4f29685ef9212d331f35b161dfd4efe86e04c62c4d0faf6cdb9dac9031623f4"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2308.05374",
    "kind": "arxiv",
    "version": 2
  }
}
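The verification command above canonicalizes the record (sorted keys, compact separators, no ASCII escaping) and takes the SHA-256 of the resulting UTF-8 bytes. The same steps can be sketched offline in Python against the record shown here. The helper names `canonical_bytes` and `canonical_hash` are illustrative and not part of any Pith tooling. Whether this transcription reproduces the published hash depends on it matching the served record byte for byte, so the script reports a match or mismatch rather than assuming one.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys, compact separators, and unescaped UTF-8."""
    text = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# The canonical record as transcribed from this page (an assumption: it must
# match the served record exactly for the digests to agree).
record = {
    "metadata": {
        "abstract_canon_sha256": "f486721c6f283b619343311b946661d598241a74b6d7b31ef1a7c3e8492341d3",
        "cross_cats_sorted": ["cs.LG"],
        "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
        "primary_cat": "cs.AI",
        "submitted_at": "2023-08-10T06:43:44Z",
        "title_canon_sha256": "e4f29685ef9212d331f35b161dfd4efe86e04c62c4d0faf6cdb9dac9031623f4",
    },
    "schema_version": "1.0",
    "source": {"id": "2308.05374", "kind": "arxiv", "version": 2},
}

# Hash published on this page under "Canonical hash".
EXPECTED = "c9a6a9e4b6f5411d232900d208a55c9a7de412fd7489d4c2e8ab15a9219e1409"

if __name__ == "__main__":
    digest = canonical_hash(record)
    print(digest)
    print("match" if digest == EXPECTED else "mismatch")
```

Because the keys are sorted before hashing, the digest does not depend on the order in which the fields appear or on the indentation used to display the record.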