Pith Number

pith:ARLQEZMX

pith:2026:ARLQEZMX7JNZ2G5XCLQCQXGA6I
not attested · not anchored · not stored · refs resolved

Discovery of Hidden Miscalibration Regimes

Katarzyna Kobalczyk, Mihaela van der Schaar

Calibration errors in LLMs depend on input type and can be found without predefined groups.

arxiv:2605.13484 v1 · 2026-05-13 · cs.LG · cs.AI · stat.ME

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{ARLQEZMX7JNZ2G5XCLQCQXGA6I}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.

Claims

C1 · strongest claim

Across four real-world LLM benchmarks and twelve LLMs, we find that input-dependent calibration heterogeneity is prevalent. We further show that the discovered fields are actionable: they support local confidence correction and reduce calibration error in systematically miscalibrated regions where confidence-based methods such as isotonic regression and temperature scaling are less effective.
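Temperature scaling, one of the global baselines named in this claim, can be sketched as follows. This is a minimal illustration, not the authors' experimental setup: it assumes each prediction comes with a single scalar confidence (the standard method rescales the full logit vector before the softmax), and it fits the temperature by a plain grid search. The function names and the toy data are hypothetical.

```python
import math

EPS = 1e-12  # keeps logarithms finite at confidences of exactly 0 or 1


def apply_temperature(confidence, temperature):
    """Divide the logit of a scalar confidence by a temperature, then map back."""
    p = min(1.0 - EPS, max(EPS, confidence))
    logit = math.log(p / (1.0 - p))
    return 1.0 / (1.0 + math.exp(-logit / temperature))


def mean_nll(confidences, correct, temperature):
    """Average negative log-likelihood of the correctness labels."""
    total = 0.0
    for c, y in zip(confidences, correct):
        p = min(1.0 - EPS, max(EPS, apply_temperature(c, temperature)))
        total -= math.log(p if y else 1.0 - p)
    return total / len(confidences)


def fit_temperature(confidences, correct):
    """Pick the temperature on a fixed grid (0.25 to 6.0) that minimises the NLL."""
    grid = [0.25 + 0.05 * i for i in range(116)]
    return min(grid, key=lambda t: mean_nll(confidences, correct, t))


# Toy held-out set (assumption): confidence is always 0.9, accuracy is only 60%.
confidences = [0.9] * 10
correct = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

temperature = fit_temperature(confidences, correct)
rescaled = apply_temperature(0.9, temperature)
```

The fitted temperature is a single number applied identically to every input, so it can only fix miscalibration that is uniform across the input space. That is the limitation the claim contrasts with local correction.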

C2 · weakest assumption

A learned calibration-aware representation of the input space exists such that kernel smoothing within its geometry accurately recovers signed local miscalibration, without access to predefined data slices or additional supervision.

C3 · one-line summary

A diagnostic framework discovers prevalent input-dependent calibration heterogeneity in LLMs via a calibration-aware representation and kernel-smoothed signed miscalibration field, enabling local corrections that outperform global methods like temperature scaling in miscalibrated regions.
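The idea of a kernel-smoothed signed miscalibration field can be illustrated with a Nadaraya-Watson estimate of the local mean of (confidence minus correctness). This is a rough sketch under stated assumptions, not the paper's method: the representation vectors are taken as given (the paper learns a calibration-aware representation), the kernel is Gaussian with a hand-picked bandwidth, and the sign convention (positive means overconfident) is chosen for illustration. All names and the toy data are hypothetical.

```python
import math


def gaussian_kernel(u, v, bandwidth):
    """Gaussian kernel weight between two points in representation space."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def signed_miscalibration(query, reps, confidences, correct, bandwidth=0.5):
    """Kernel-weighted mean of (confidence - correctness) around `query`.

    Positive values suggest local overconfidence, negative values local
    underconfidence (a convention assumed for this sketch).
    """
    weights = [gaussian_kernel(query, r, bandwidth) for r in reps]
    total = sum(weights)
    if total == 0.0:
        return 0.0  # no nearby evidence, so report no detectable error
    residuals = [c - y for c, y in zip(confidences, correct)]
    return sum(w * e for w, e in zip(weights, residuals)) / total


def locally_corrected(confidence, query, reps, confidences, correct, bandwidth=0.5):
    """Shift a confidence by the estimated local error and clip to [0, 1]."""
    shift = signed_miscalibration(query, reps, confidences, correct, bandwidth)
    return min(1.0, max(0.0, confidence - shift))


# Toy data (assumption): two well-separated regions with identical confidence.
reps = [(0.0,), (0.1,), (0.2,), (3.0,), (3.1,), (3.2,)]
confidences = [0.9] * 6
correct = [1, 1, 1, 0, 0, 1]  # first region always right, second mostly wrong

field_a = signed_miscalibration((0.1,), reps, confidences, correct)
field_b = signed_miscalibration((3.1,), reps, confidences, correct)
fixed_b = locally_corrected(0.9, (3.1,), reps, confidences, correct)
```

In the toy data both regions report the same confidence, so a method that looks only at confidence cannot tell them apart, while the field takes opposite signs in the two regions and the correction lowers confidence only where it is too high.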

References

31 extracted · 31 resolved · 0 Pith anchors

[1] Y. Bai, A. Jones, K. Ndousse, A. Askell, A. Chen, N. DasSarma, D. Drain, S. Fort, D. Ganguli, T. Henighan, N. Joseph, S. Kadavath, J. Kernion, T. Conerly, S. El-Showk, N. Elhage, Z. Hatfield-Dodds, 2022
[2] J. Blasiok and P. Nakkiran. Smooth ECE: Principled reliability diagrams via kernel smoothing. In The Twelfth International Conference on Learning Representations, 2024
[3] Chouldechova 2017
[4] Y. Chung, T. Kraska, N. Polyzotis, K. H. Tae, and S. E. Whang. Automated Data Slicing for Model Validation: A Big Data - AI Integration Approach. IEEE Transactions on Knowledge & Data Engineering, 32, 2020
[5] A. P. Dawid. The well-calibrated Bayesian. Journal of the American Statistical Association, 77(379):605–610, 1982

Formal links

2 machine-checked theorem links

Receipt and verification
First computed 2026-05-18T02:44:41.300674Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05)
Schema pith-number/v1.0

Canonical hash

0457026597fa5b9d1bb712e0285cc0f233f4a47b0ac4cfc09428665194fb85b6

Aliases

arxiv: 2605.13484 · arxiv_version: 2605.13484v1 · doi: 10.48550/arxiv.2605.13484 · pith_short_12: ARLQEZMX7JNZ · pith_short_16: ARLQEZMX7JNZ2G5X · pith_short_8: ARLQEZMX
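The short aliases above are prefixes of the full Pith Number, which can be checked mechanically. The sketch below also records an observation about this particular record rather than a documented rule: the 26-character identifier coincides with the unpadded Base32 encoding of the first 16 bytes of the canonical hash. Whether the builder derives identifiers this way in general is an assumption, and the helper names are hypothetical.

```python
import base64

PITH_NUMBER = "ARLQEZMX7JNZ2G5XCLQCQXGA6I"
CANONICAL_HASH = "0457026597fa5b9d1bb712e0285cc0f233f4a47b0ac4cfc09428665194fb85b6"

# Alias values copied from the record above.
ALIASES = {
    8: "ARLQEZMX",
    12: "ARLQEZMX7JNZ",
    16: "ARLQEZMX7JNZ2G5X",
}


def short_alias(pith_number, length):
    """Return the leading `length` characters of a Pith Number."""
    return pith_number[:length]


def base32_of_hash_prefix(hex_digest, n_bytes=16):
    """Base32-encode the first `n_bytes` of a hex digest, without '=' padding."""
    raw = bytes.fromhex(hex_digest)[:n_bytes]
    return base64.b32encode(raw).decode("ascii").rstrip("=")


for length, alias in ALIASES.items():
    print(length, short_alias(PITH_NUMBER, length) == alias)

print(base32_of_hash_prefix(CANONICAL_HASH))  # prints ARLQEZMX7JNZ2G5XCLQCQXGA6I
```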
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/ARLQEZMX7JNZ2G5XCLQCQXGA6I \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 0457026597fa5b9d1bb712e0285cc0f233f4a47b0ac4cfc09428665194fb85b6
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "f486e025df21c182efa4c4871109123568187b4f4bb78144ea988aa99f974234",
    "cross_cats_sorted": [
      "cs.AI",
      "stat.ME"
    ],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T13:07:50Z",
    "title_canon_sha256": "d0a5018791a536ade7c8f0efc37beedd55fd983d71646c6fc979ae428be4a554"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13484",
    "kind": "arxiv",
    "version": 1
  }
}
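The same check as the shell command above can be run offline against a saved copy of the record. The sketch below mirrors the one-liner's serialization (sorted keys, compact separators, no ASCII escaping, UTF-8) and compares the digest with the expected hash. It assumes the JSON shown on this page is identical to what the API returns under `canonical_record`; the function name is hypothetical.

```python
import hashlib
import json

EXPECTED = "0457026597fa5b9d1bb712e0285cc0f233f4a47b0ac4cfc09428665194fb85b6"


def canonical_sha256(record):
    """Hash a record after serialising it in the same form as the shell one-liner."""
    payload = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# The canonical record as displayed above.
record = {
    "metadata": {
        "abstract_canon_sha256": "f486e025df21c182efa4c4871109123568187b4f4bb78144ea988aa99f974234",
        "cross_cats_sorted": ["cs.AI", "stat.ME"],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-13T13:07:50Z",
        "title_canon_sha256": "d0a5018791a536ade7c8f0efc37beedd55fd983d71646c6fc979ae428be4a554",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13484", "kind": "arxiv", "version": 1},
}

digest = canonical_sha256(record)
print(digest)
print("matches expected hash:", digest == EXPECTED)
```

Because keys are sorted before hashing, the order in which a mirror stores the fields does not affect the digest. Changes to any value, or to list order inside `cross_cats_sorted`, do.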