Uncertainty quantification for multimodal large language models with incoherence-adjusted semantic volume. arXiv preprint arXiv:2602.24195, 2026.
1 Pith paper cites this work. Polarity classification is still indexing.
Citing papers by field: cs.AI (1); by year: 2026 (1); by verdict: UNVERDICTED (1).
Representative citing papers:
A Systematic Evaluation of Black-Box Uncertainty Estimation Methods for Large Language Models
A unified benchmark of 24 black-box UE methods for LLMs finds no universal winner but favors methods that reason over answer candidates and hybrid combinations of signals.