pith:SMUSK7IX
Grounded or Guessing? LVLM Confidence Estimation via Blind-Image Contrastive Ranking
Training probes to prefer real images over blacked-out ones lets them detect when vision-language models actually use visual input for their answers.
arxiv:2605.10893 v2 · 2026-05-11 · cs.CL
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{SMUSK7IXTVJTAMSNRIWZLWYFP6}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
BICR achieves the best cross-LVLM average on both calibration and discrimination simultaneously, with statistically significant discrimination gains robust to cluster-aware analysis at 4-18x fewer parameters than the strongest probing baseline.
Assumes that penalizing higher confidence on the blacked-out image view, via a ranking loss on hidden states, will cause the probe to reliably treat the presence of visual information as a signal of prediction reliability (abstract, paragraph describing the training objective).
BICR trains a lightweight probe on contrastive hidden states from real versus blind images to detect visual grounding in LVLM predictions, outperforming baselines on calibration and discrimination with fewer parameters.
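The ranking idea in the claims above can be illustrated with a small sketch. This is not the paper's implementation: the linear-plus-sigmoid probe, the hinge form of the loss, the `margin` value, and the toy hidden states are all assumptions introduced here purely for illustration. It only shows the general shape of a penalty that fires when the probe is not sufficiently more confident on the real image than on the blacked-out one.

```python
import math

def probe_confidence(hidden, weights, bias):
    """Hypothetical linear probe: sigmoid(w . h + b) over one hidden-state vector."""
    z = sum(w * h for w, h in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def blind_ranking_loss(conf_real, conf_blind, margin=0.1):
    """Hinge-style ranking penalty (assumed form, not taken from the paper).

    The loss is zero once the real-image confidence exceeds the
    blind-image confidence by at least `margin`; otherwise it grows
    linearly with the shortfall.
    """
    return max(0.0, margin - (conf_real - conf_blind))

# Toy hidden states, made up for illustration only.
weights, bias = [0.8, -0.3, 0.5], 0.0
h_real = [1.2, 0.1, 0.9]    # hidden state with the real image
h_blind = [0.2, 0.4, 0.1]   # hidden state with the blacked-out image

c_real = probe_confidence(h_real, weights, bias)
c_blind = probe_confidence(h_blind, weights, bias)
loss = blind_ranking_loss(c_real, c_blind)
print(c_real, c_blind, loss)
```

In a real training loop this term would presumably be combined with a correctness objective and minimized over the probe parameters; how the paper weights or combines the terms is not stated on this page.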
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:00:42.569712Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
9329257d179d5330324d8a2d95db057fb2ac517f81a169b7324ac6e87e36a8c8
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/SMUSK7IXTVJTAMSNRIWZLWYFP6 \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 9329257d179d5330324d8a2d95db057fb2ac517f81a169b7324ac6e87e36a8c8
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "a387993af524cff36faabb839a651d8c05bc86143c44b1cce94dc3d3dcf8da7a",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.CL",
    "submitted_at": "2026-05-11T17:35:10Z",
    "title_canon_sha256": "520b3966640d7ace8afa9a7ea8c97592ce5ac4545ecd0961391f3fd8adb4001e"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.10893",
    "kind": "arxiv",
    "version": 2
  }
}
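For readers who prefer a script to the one-liner, the same canonicalization can be sketched offline in Python. The function name `canonical_hash` is hypothetical; the serialization options (sorted keys, compact separators, `ensure_ascii=False`, UTF-8 encoding) mirror the verification command above. Whether the digest equals the canonical hash listed on this page depends on the served `canonical_record` matching the JSON shown here exactly, so no expected output is asserted.

```python
import hashlib
import json

def canonical_hash(record: dict) -> str:
    """SHA-256 over a compact, key-sorted JSON serialization of the record."""
    canonical = json.dumps(
        record,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Canonical record as displayed on this page.
record = {
    "metadata": {
        "abstract_canon_sha256": "a387993af524cff36faabb839a651d8c05bc86143c44b1cce94dc3d3dcf8da7a",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.CL",
        "submitted_at": "2026-05-11T17:35:10Z",
        "title_canon_sha256": "520b3966640d7ace8afa9a7ea8c97592ce5ac4545ecd0961391f3fd8adb4001e",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.10893", "kind": "arxiv", "version": 2},
}

digest = canonical_hash(record)
print(digest)  # compare by eye against the canonical hash listed above
```

Because keys are sorted before hashing, the digest does not depend on the order in which fields appear in the input, which is what makes the hash reproducible across clients.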