Pith Number

pith:ZGTKTZFW

pith:2023:ZGTKTZFW6VAR2IZJADJARJK4TJ

not attested · not anchored · not stored · refs resolved

Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment

Hang Li, Hao Cheng, Jean-Francois Ton, Muhammad Faaiz Taufiq, Ruocheng Guo, Xiaoying Zhang, Yang Liu, Yegor Klochkov, Yuanshun Yao

A survey finds that more aligned LLMs generally achieve higher trustworthiness, though the gains differ across categories.

arxiv:2308.05374 v2 · 2023-08-10 · cs.AI · cs.LG

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{ZGTKTZFW6VAR2IZJADJARJK4TJ}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1. Bitcoin timestamp

2. Internet Archive

3. Author claim: open

4. Citations: open

5. Replications: open

✓ Portable graph bundle: live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

The measurement results indicate that, in general, more aligned models tend to perform better in terms of overall trustworthiness. However, the effectiveness of alignment varies across the different trustworthiness categories considered.

C2 · weakest assumption

The survey assumes that its seven categories and 29 sub-categories comprehensively capture trustworthiness, and that the eight selected sub-categories, together with the chosen measurement methods, accurately reflect real-world alignment.

C3 · one-line summary

The survey organizes LLM trustworthiness into seven categories and 29 sub-categories, measures eight of those sub-categories on popular models, and finds that more aligned models generally score higher, though the benefit of alignment varies by category.

References

300 extracted · 300 resolved · 29 Pith anchors

[1] Training language models to follow instructions with human feedback 2022

[2] Alignment of language agents 2021

[3] OpenAI. GPT-4. https://openai.com/research/gpt-4, 2023

[4] On the dangers of stochastic parrots: Can language models be too big? In Proceedings of the 2021 ACM conference on fairness, accountability, and transparency, pages 610–623, 2021

[5] Language models are unsupervised multitask learners 2019

Formal links

2 machine-checked theorem links

Cited by

29 papers in Pith

Data-Centric Foundation Models in Computational Healthcare: A Survey

The AI risk repository: A meta-review, database, and taxonomy of risks from artificial intelligence

Trustworthiness in Retrieval-Augmented Generation Systems: A Survey

Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey

AI Failures in the Eyes of the Downstream Developer: A First Look at Concerns, Practices, and Challenges

Receipt and verification

First computed	2026-05-17T23:38:12.820356Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

c9a6a9e4b6f5411d232900d208a55c9a7de412fd7489d4c2e8ab15a9219e1409

Aliases

arxiv: 2308.05374 · arxiv_version: 2308.05374v2 · doi: 10.48550/arxiv.2308.05374 · pith_short_12: ZGTKTZFW6VAR · pith_short_16: ZGTKTZFW6VAR2IZJ · pith_short_8: ZGTKTZFW
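
For this record, the Pith Number appears to be the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash, and the short aliases are its 8-, 12-, and 16-character prefixes. The sketch below checks that relationship for the values shown on this page only. It is an observation from this one record, not a documented derivation rule, and the function names `base32_prefix` and `short_aliases` are illustrative.

```python
import base64

# Values copied from this page.
CANONICAL_HASH = "c9a6a9e4b6f5411d232900d208a55c9a7de412fd7489d4c2e8ab15a9219e1409"
PITH_NUMBER = "ZGTKTZFW6VAR2IZJADJARJK4TJ"


def base32_prefix(hex_digest: str, length: int = 26) -> str:
    """Base32-encode the raw digest bytes and keep the first `length` characters.

    Assumption: the 26-character length is inferred from this record.
    """
    raw = bytes.fromhex(hex_digest)
    return base64.b32encode(raw).decode("ascii")[:length]


def short_aliases(pith_number: str) -> dict:
    """Build the pith_short_N aliases as plain prefixes (assumed convention)."""
    return {f"pith_short_{n}": pith_number[:n] for n in (8, 12, 16)}


derived = base32_prefix(CANONICAL_HASH)
aliases = short_aliases(PITH_NUMBER)
print("derived matches page:", derived == PITH_NUMBER)
print(aliases)
```

If the relationship holds in general, a mirror could recompute the identifier from the canonical hash alone, without trusting the page that displays it.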

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/ZGTKTZFW6VAR2IZJADJARJK4TJ \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: c9a6a9e4b6f5411d232900d208a55c9a7de412fd7489d4c2e8ab15a9219e1409
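
The same check can be written as a small Python script. This is a sketch that mirrors the one-liner above: it assumes the resolver response is a JSON document with a `canonical_record` field (as the `jq` filter implies) and that canonicalization means sorted keys, compact separators, and UTF-8 output. The function names are illustrative, and `fetch_canonical_record` needs network access.

```python
import hashlib
import json
import urllib.request


def canonical_sha256(record) -> str:
    """Hash a record the way the shell one-liner does: sorted keys, compact
    separators, non-ASCII characters kept as-is, UTF-8 encoded."""
    canonical = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def fetch_canonical_record(pith_number: str):
    """Fetch the resolver document and return its canonical_record field.

    Assumption: the endpoint and response shape match the curl/jq command above.
    """
    url = f"https://pith.science/pith/{pith_number}"
    request = urllib.request.Request(url, headers={"Accept": "application/ld+json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["canonical_record"]


def verify(pith_number: str, expected_hash: str) -> bool:
    """Return True when the recomputed hash equals the published one."""
    return canonical_sha256(fetch_canonical_record(pith_number)) == expected_hash
```

Because the keys are sorted before hashing, the result does not depend on the key order in which the server happens to serialize the record. Calling `verify` with the Pith Number and the canonical hash from this page performs the same comparison as the `# expect:` line.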

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "f486721c6f283b619343311b946661d598241a74b6d7b31ef1a7c3e8492341d3",
    "cross_cats_sorted": [
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-sa/4.0/",
    "primary_cat": "cs.AI",
    "submitted_at": "2023-08-10T06:43:44Z",
    "title_canon_sha256": "e4f29685ef9212d331f35b161dfd4efe86e04c62c4d0faf6cdb9dac9031623f4"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2308.05374",
    "kind": "arxiv",
    "version": 2
  }
}