Pith Number

pith:3ELOZYL4

pith:2023:3ELOZYL4ZIAQTPWHMW6B6PB5IN

not attested · not anchored · not stored · refs resolved

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models

Adian Liusie, Mark J. F. Gales, Potsawee Manakul

Comparing multiple stochastic samples from a black-box LLM for consistency reveals which generated facts are hallucinated.

arxiv:2303.08896 v3 · 2023-03-15 · cs.CL

Add to your LaTeX paper

\usepackage{pith}
\pithnumber{3ELOZYL4ZIAQTPWHMW6B6PB5IN}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 strongest claim

SelfCheckGPT can detect non-factual and factual sentences and rank passages in terms of factuality, achieving considerably higher AUC-PR scores in sentence-level hallucination detection and higher correlation scores in passage-level factuality assessment compared to grey-box methods.

C2 weakest assumption

That divergence among stochastically sampled responses reliably signals hallucinated facts rather than other sources of output variation such as stylistic differences or partial knowledge.

C3 one-line summary

SelfCheckGPT detects hallucinations by checking consistency across multiple sampled responses from black-box LLMs on WikiBio biography generation tasks.
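
The consistency idea in the claims above can be illustrated with a toy scorer. This is a minimal sketch, not the paper's method: it assumes a crude word-overlap measure as a stand-in for the consistency check, and the names `inconsistency_scores`, `split_sentences`, and `tokens` are hypothetical. Each sentence of the main response gets a score between 0 and 1, where a higher score means the sampled responses support it less.

```python
import re

def tokens(text):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def split_sentences(passage):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]

def inconsistency_scores(response, samples):
    """Return (sentence, score) pairs; a higher score means less support in the samples.

    Assumption: word overlap is only a toy proxy for semantic consistency.
    """
    sample_tokens = [tokens(s) for s in samples]
    scored = []
    for sentence in split_sentences(response):
        words = tokens(sentence)
        if not words or not sample_tokens:
            scored.append((sentence, 0.0))
            continue
        # Fraction of the sentence's words found in each sampled response.
        support = [len(words & st) / len(words) for st in sample_tokens]
        scored.append((sentence, 1.0 - sum(support) / len(support)))
    return scored

# Illustrative inputs (made up for this sketch, not taken from WikiBio).
response = "Ada Lovelace was born in London. She invented the telephone."
samples = [
    "Ada Lovelace was born in London in 1815.",
    "Born in London, Ada Lovelace was a mathematician.",
    "Ada Lovelace was an English mathematician born in London.",
]
scores = inconsistency_scores(response, samples)
for sentence, score in scores:
    print(f"{score:.2f}  {sentence}")
```

A sentence that every sample echoes scores low, while one that no sample repeats scores high. The weakest assumption noted in C2 applies directly here: stylistic variation between samples also lowers overlap, so a surface measure like this one cannot separate paraphrase from hallucination.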

References

288 extracted · 288 resolved · 11 Pith anchors

[3] GPT-NeoX-20B: An open-source autoregressive language model 2022 · doi:10.18653/v1/2022.bigscience-1.9

[4] Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language models are few-shot lear 2020

[6] Jacob Cohen. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20:37-46 1960

[8] A Survey on Automated Fact-Checking 2022 · doi:10.1162/tacl_a_00454

[9] Pengcheng He, Jianfeng Gao, and Weizhu Chen. 2023. https://openreview.net/forum?id=sE7-XhLxHA DeBERTaV3: Improving DeBERTa using ELECTRA-style pre-training with gradient-disentangled embedding sh 2023

Formal links

1 machine-checked theorem link

Cited by

43 papers in Pith

HalluScan: A Systematic Benchmark for Detecting and Mitigating Hallucinations in Instruction-Following LLMs

Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models

Self-Reported Confidence of Large Language Models in Gastroenterology: Analysis of Commercial, Open-Source, and Quantized Models

TokUR: Token-Level Uncertainty Estimation for Large Language Model Reasoning

Why Semantic Entropy Fails: Geometry-Aware and Calibrated Uncertainty for Policy Optimization

Receipt and verification

First computed	2026-05-17T23:38:51.162387Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

d916ece17cca0109bec765bc1f3c3d4355b4cf38d4c18e83a0f0bc24a7771a7d

Aliases

arxiv: 2303.08896 · arxiv_version: 2303.08896v3 · doi: 10.48550/arxiv.2303.08896 · pith_short_12: 3ELOZYL4ZIAQ · pith_short_16: 3ELOZYL4ZIAQTPWH · pith_short_8: 3ELOZYL4
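
The page does not state how the identifier is derived from the hash. For this record, though, the 26-character Pith Number coincides with the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash bytes, and the short aliases are plain prefixes of it. The sketch below checks that observation; `base32_prefix` is a hypothetical helper name, and the derivation rule is an inference from this one record, not a documented specification.

```python
import base64

# Values copied from this record.
CANONICAL_HASH = "d916ece17cca0109bec765bc1f3c3d4355b4cf38d4c18e83a0f0bc24a7771a7d"
PITH_ID = "3ELOZYL4ZIAQTPWHMW6B6PB5IN"

def base32_prefix(hex_digest, length=26):
    """Base32-encode the digest bytes and keep the first `length` characters.

    Assumption: this mirrors how the identifier is formed; the page does not confirm it.
    """
    return base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")[:length]

derived = base32_prefix(CANONICAL_HASH)
print(derived == PITH_ID)

# The short aliases listed above are prefixes of the full identifier.
for n in (8, 12, 16):
    print(n, PITH_ID[:n])
```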

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/3ELOZYL4ZIAQTPWHMW6B6PB5IN \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: d916ece17cca0109bec765bc1f3c3d4355b4cf38d4c18e83a0f0bc24a7771a7d

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "05943fbaa1b7804bec5ee7292f4abfdfd47bb7dc322d85807a25d98066a762a1",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-03-15T19:31:21Z",
    "title_canon_sha256": "5ef7ef8161143d74354342d262c2ce6a2cdbd7bbeb3a33fd76d64210e7f55add"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2303.08896",
    "kind": "arxiv",
    "version": 3
  }
}
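
The Python step of the verification command can also be run offline against the record shown above. This is a minimal sketch under one assumption: that the JSON displayed on this page is the same object the resolver returns as `.canonical_record`. The function name `canonical_hash` is hypothetical; the serialization settings (sorted keys, compact separators, `ensure_ascii=False`) are copied from the command.

```python
import hashlib
import json

EXPECTED = "d916ece17cca0109bec765bc1f3c3d4355b4cf38d4c18e83a0f0bc24a7771a7d"

# Record as displayed on this page (assumed identical to the resolver's output).
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "05943fbaa1b7804bec5ee7292f4abfdfd47bb7dc322d85807a25d98066a762a1",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2023-03-15T19:31:21Z",
    "title_canon_sha256": "5ef7ef8161143d74354342d262c2ce6a2cdbd7bbeb3a33fd76d64210e7f55add"
  },
  "schema_version": "1.0",
  "source": {"id": "2303.08896", "kind": "arxiv", "version": 3}
}
"""

def canonical_hash(record):
    """SHA-256 of the record serialized with sorted keys and no extra whitespace."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()

record = json.loads(RECORD_JSON)
digest = canonical_hash(record)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Because the keys are sorted before hashing, the result does not depend on the order in which a mirror stores or emits the fields, which is what allows an independent host to recompute the same value.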