A clinical validity protocol uses indices derived from 2x2 contingency tables to classify LLM confidence signals as Valid, Indeterminate, or Invalid. Valid profiles correlate positively with accuracy, and Invalid profiles correlate negatively.
Title resolution pending
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Screen Before You Interpret: A Portable Validity Protocol for Benchmark-Based LLM Confidence Signals