Pith Number

pith:AH6MEJKA

pith:2026:AH6MEJKABSUSIU7BDTCQKBNC5A

not attested · not anchored · not stored · refs resolved

LLMs as Implicit Imputers: Uncertainty Should Scale with Missing Information

Stef van Buuren

LLMs should become more uncertain as context is removed: response entropy scales with missing information, as it does in multiple imputation, while confidence does not.

arxiv:2605.13188 v1 · 2026-05-13 · stat.ML · cs.CL · cs.LG · stat.ME


Add to your LaTeX paper

\usepackage{pith}
\pithnumber{AH6MEJKABSUSIU7BDTCQKBNC5A}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp

2 Internet Archive

3 Author claim open

4 Citations open

5 Replications open

✓ Portable graph bundle live · download bundle · merged state

The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
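The property described above, that any mirror recomputes the same state from the same bundle, can be illustrated with a small sketch. The merge rule below is an assumption made for illustration, not Pith's actual algorithm: the event fields (`at`, `facet`, `value`), the last-writer-wins rule, and the sample events are all hypothetical, and signature checking is left out.

```python
import hashlib
import json


def event_id(event: dict) -> str:
    """Content hash of an event, used as a stable tiebreaker (hypothetical scheme)."""
    blob = json.dumps(event, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.sha256(blob).hexdigest()


def merge(canonical_record: dict, events: list) -> dict:
    """Fold events into a state that does not depend on arrival order."""
    # Deduplicate by content hash so replayed events have no effect.
    unique = {event_id(e): e for e in events}
    # Total order: timestamp first, content hash as tiebreaker.
    ordered = sorted(unique.items(), key=lambda kv: (kv[1]["at"], kv[0]))
    state = {"record": canonical_record, "facets": {}}
    for _, event in ordered:
        # Hypothetical rule: the latest event for a facet wins.
        state["facets"][event["facet"]] = event["value"]
    return state


# Made-up events, for illustration only.
events = [
    {"at": "2026-05-18T03:08:56Z", "facet": "refs", "value": "resolved"},
    {"at": "2026-05-19T10:00:00Z", "facet": "archive", "value": "stored"},
    {"at": "2026-05-20T12:00:00Z", "facet": "refs", "value": "re-resolved"},
]
record = {"id": "2605.13188", "kind": "arxiv", "version": 1}

state_a = merge(record, events)
# A second mirror sees the same events reversed, with one duplicate.
state_b = merge(record, list(reversed(events)) + events[:1])
print(state_a == state_b)
```

Sorting on a total order and deduplicating by content hash is what makes the fold insensitive to how a mirror received the events; any rule with those two properties would serve the same purpose.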

Claims

C1 · strongest claim

Entropy increases as context is removed, consistent with the multiple imputation (MI) analogy, and explains substantially more variance in accuracy than confidence does at every evidence level (quadratic R² gap up to 0.057).

C2 · weakest assumption

That controlled removal of context segments on SQuAD questions creates a representative proxy for the kinds of missing information LLMs encounter in open-ended real-world use.

C3 · one line summary

Response entropy in LLMs rises with missing context on SQuAD while sampling-based confidence stays high, supporting the multiple imputation criterion and introducing a diagnostic for uncertainty reduction by context level.
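To make the distinction in these claims concrete, here is a minimal sketch of how entropy and a sampling-based confidence can diverge on the same set of sampled answers. This page does not give the paper's estimators, so the definitions below (Shannon entropy of the empirical answer distribution, and the share of the most frequent answer) are common choices assumed for illustration, and the sample answers are made up.

```python
import math
from collections import Counter


def response_entropy(samples):
    """Shannon entropy (in nats) of the empirical distribution of sampled answers."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())


def modal_confidence(samples):
    """Share of samples that agree with the most frequent answer."""
    return Counter(samples).most_common(1)[0][1] / len(samples)


# Toy data: ten sampled answers to one question at two context levels.
with_context = ["A"] * 6 + ["B"] * 4
less_context = ["A"] * 6 + ["B", "C", "D", "E"]

for name, samples in [("with context", with_context), ("less context", less_context)]:
    print(name, round(modal_confidence(samples), 3), round(response_entropy(samples), 3))
```

In this toy case the modal answer keeps the same share of samples, so confidence is unchanged, while the minority answers scatter and entropy rises. That is one way a confidence score can stay flat as information is removed.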

References

12 extracted · 12 resolved · 2 Pith anchors

[1] Bartlett, J. W., Seaman, S. R., White, I. R., and Carpenter, J. R. Statistical Methods in Medical Research, 2015.

[2] International Conference on Machine Learning, 2017.

[3] Language Models (Mostly) Know What They Know · arXiv:2207.05221

[4] Rajpurkar, P., Zhang, J., Lopyrev, K., and Liang, P. 2016.

[5] Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation · arXiv:2302.09664

Receipt and verification

First computed	2026-05-18T03:08:56.189437Z
Builder	pith-number-builder-2026-05-17-v1
Signature	Pith Ed25519 (`pith-v1-2026-05`) · public key
Schema	pith-number/v1.0

Canonical hash

01fcc225400ca92453e11cc50505a2e805615ff309900e044d946cb3cab9aec7

Aliases

arxiv: 2605.13188 · arxiv_version: 2605.13188v1 · doi: 10.48550/arxiv.2605.13188 · pith_short_12: AH6MEJKABSUS · pith_short_16: AH6MEJKABSUSIU7B · pith_short_8: AH6MEJKA
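The values on this page suggest how the identifier relates to the canonical hash: the 26-character Pith Number matches the first 26 characters of the RFC 4648 Base32 encoding of the SHA-256 digest, and the short aliases are prefixes of it. This is inferred from this one record, not from a published specification, so treat the sketch below as an observation. The function name `pith_number` is hypothetical.

```python
import base64

CANONICAL_HASH = "01fcc225400ca92453e11cc50505a2e805615ff309900e044d946cb3cab9aec7"


def pith_number(canonical_sha256_hex: str, length: int = 26) -> str:
    """Base32-encode the digest bytes and keep the leading characters.

    Assumption: the identifier is a Base32 prefix of the canonical hash.
    """
    digest = bytes.fromhex(canonical_sha256_hex)
    return base64.b32encode(digest).decode("ascii")[:length]


full = pith_number(CANONICAL_HASH)
print(full)       # AH6MEJKABSUSIU7BDTCQKBNC5A
print(full[:8])   # AH6MEJKA          (pith_short_8)
print(full[:12])  # AH6MEJKABSUS      (pith_short_12)
print(full[:16])  # AH6MEJKABSUSIU7B  (pith_short_16)
```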

Agent API

Resolver JSON · Graph JSON · Events JSON · Schema · Signing key

Verify this Pith Number yourself

curl -sH 'Accept: application/ld+json' https://pith.science/pith/AH6MEJKABSUSIU7BDTCQKBNC5A \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 01fcc225400ca92453e11cc50505a2e805615ff309900e044d946cb3cab9aec7

Canonical record JSON

{
  "metadata": {
    "abstract_canon_sha256": "d8343a4d402c456398c5031032a0a0a0835170a2dc296fe0a18ced6781180fca",
    "cross_cats_sorted": [
      "cs.CL",
      "cs.LG",
      "stat.ME"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "stat.ML",
    "submitted_at": "2026-05-13T08:43:57Z",
    "title_canon_sha256": "fb4762aaf6ad55680c127267071bd7df8970c1c296c393105322566d9ca09b0a"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13188",
    "kind": "arxiv",
    "version": 1
  }
}
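The verification command can also be run offline against the record shown above. This sketch copies the serialization settings from the one-liner (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 bytes, SHA-256). The record is transcribed from this page, and the final line only reports whether the digest matches; it does not assume that it will. The function name `canonical_hash` is hypothetical.

```python
import hashlib
import json

# Canonical record as shown on this page.
CANONICAL_RECORD = {
    "metadata": {
        "abstract_canon_sha256": "d8343a4d402c456398c5031032a0a0a0835170a2dc296fe0a18ced6781180fca",
        "cross_cats_sorted": ["cs.CL", "cs.LG", "stat.ME"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "stat.ML",
        "submitted_at": "2026-05-13T08:43:57Z",
        "title_canon_sha256": "fb4762aaf6ad55680c127267071bd7df8970c1c296c393105322566d9ca09b0a",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13188", "kind": "arxiv", "version": 1},
}
EXPECTED = "01fcc225400ca92453e11cc50505a2e805615ff309900e044d946cb3cab9aec7"


def canonical_hash(record: dict) -> str:
    """SHA-256 over the sorted-key, whitespace-free JSON form of the record."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


digest = canonical_hash(CANONICAL_RECORD)
print(digest)
print("matches expected hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the result does not depend on the order in which a mirror stores or prints the fields. Any change to a value, including inside the nested `metadata` object, changes the digest.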