pith:KOV64XHE
The Internal State of an LLM Knows When It's Lying
The hidden activations inside an LLM can be read by a trained classifier to detect whether a statement is true or false.
arxiv:2304.13734 v2 · 2023-04-26 · cs.CL · cs.AI · cs.LG
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{KOV64XHEEHE6X7VXUTHPFTDZDN}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
The LLM's internal state can be used to reveal the truthfulness of statements, both those provided to the LLM and those the LLM itself generates.
The hidden activations contain a generalizable signal of truthfulness that is not merely an artifact of the particular training sentences or of superficial statistical properties shared with the labels.
Hidden activations in LLMs encode detectable information about statement truthfulness, enabling a classifier to identify true versus false content more reliably than the model's assigned probabilities.
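The claims above describe a classifier that reads hidden activations and predicts whether a statement is true. A minimal sketch of that idea follows, under loud assumptions: the activations are synthetic random vectors (not taken from any LLM), the classifier is a plain logistic-regression probe swapped in for illustration (this record does not describe the paper's actual classifier), and every name here (`make_synthetic_activations`, `train_probe`, `accuracy`) is hypothetical.

```python
import math
import random


def make_synthetic_activations(n, dim, rng):
    """Hypothetical stand-in for real hidden states.

    "True" statements (label 1) are shifted along a fixed direction and
    "false" statements (label 0) are shifted the opposite way.
    """
    direction = [1.0 if i % 2 == 0 else -1.0 for i in range(dim)]
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        sign = 1.0 if label == 1 else -1.0
        vec = [rng.gauss(0.0, 1.0) + 0.8 * sign * d for d in direction]
        data.append((vec, label))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(vec, w, b):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, vec)) + b)


def train_probe(data, dim, epochs=20, lr=0.1):
    """Fit a logistic-regression probe with plain stochastic gradient descent."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for vec, label in data:
            grad = predict_proba(vec, w, b) - label
            w = [wi - lr * grad * xi for wi, xi in zip(w, vec)]
            b -= lr * grad
    return w, b


def accuracy(data, w, b):
    correct = sum((predict_proba(vec, w, b) >= 0.5) == (label == 1) for vec, label in data)
    return correct / len(data)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
train_set = make_synthetic_activations(200, 8, rng)
test_set = make_synthetic_activations(100, 8, rng)
w, b = train_probe(train_set, dim=8)
print(f"held-out accuracy on synthetic data: {accuracy(test_set, w, b):.2f}")
```

The held-out split matters for the second claim: a probe that only memorized its training sentences would not generalize to statements it has not seen. With real activations, the vectors would come from a chosen hidden layer of the model rather than from a random generator.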
Receipt and verification
| First computed | 2026-05-17T23:38:49.657733Z |
|---|---|
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
53abee5ce421c9ebfeb7a4cef2cc791b5c6a8f4035a3370225bb51b68e8f82ff
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/KOV64XHEEHE6X7VXUTHPFTDZDN \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 53abee5ce421c9ebfeb7a4cef2cc791b5c6a8f4035a3370225bb51b68e8f82ff
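The same check can be written as a small standalone script. The sketch below mirrors the serialization used in the one-liner above (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding) and hashes the result with SHA-256. The function names `canonical_sha256` and `verify` are hypothetical, and the two sample dictionaries are a partial fragment of the record used only to show that key order does not affect the hash; they are not the full canonical record.

```python
import hashlib
import json


def canonical_sha256(record):
    """Hash a JSON-compatible object using the serialization from the one-liner above."""
    canonical = json.dumps(
        record,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no whitespace between tokens
        ensure_ascii=False,      # keep non-ASCII characters as-is
    ).encode()                   # UTF-8 by default
    return hashlib.sha256(canonical).hexdigest()


def verify(record, expected_hex):
    """Return True if the record hashes to the expected hex digest."""
    return canonical_sha256(record) == expected_hex.lower()


# Illustrative fragment only: same content, different key order.
record_a = {"schema_version": "1.0", "source": {"id": "2304.13734", "kind": "arxiv", "version": 2}}
record_b = {"source": {"version": 2, "kind": "arxiv", "id": "2304.13734"}, "schema_version": "1.0"}

print(canonical_sha256(record_a) == canonical_sha256(record_b))  # prints True
```

To check the real record, pass the parsed `canonical_record` object fetched by the `curl` command, together with the canonical hash listed above, to `verify`.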
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "5ccf1164771c2d94ac9d1d74ea147cf9ffb616d551ce05fe36a5b7fc0e739517",
    "cross_cats_sorted": [
      "cs.AI",
      "cs.LG"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-04-26T02:49:38Z",
    "title_canon_sha256": "954ff13209d1dc258fabe7058532d9a1bc37bbaf282cae3137636fd652f41ee3"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2304.13734",
    "kind": "arxiv",
    "version": 2
  }
}