White-box method ReXTrust achieves highest AUC (peak 93.0) on Gut-VLM across five VLMs, outperforming alternatives by statistically significant margins while black-box and some gray-box methods collapse on certain models.
A survey on uncertainty quantification of large language models: Taxonomy, open research challenges, and future directions
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 5verdicts
UNVERDICTED 5roles
background 1polarities
background 1representative citing papers
VIG-TUQ improves token-level uncertainty estimation in LVLMs by weighting language uncertainty with visual grounding scores based on the observation that confident predictions rely more on visual content.
DPUA is a two-phase framework that aligns LLM uncertainty expressions with human disagreement distributions in subjectivity analysis while preserving task performance.
SWAY quantifies sycophancy in LLMs via shifts under linguistic pressure and a counterfactual chain-of-thought mitigation reduces it to near zero while preserving responsiveness to genuine evidence.
Chatbot AI systems often fail complex needs while projecting authority, contributing to deskilling, labor displacement, economic concentration, and high environmental costs, so alternative pluralistic and task-specific designs are needed.
citing papers explorer
-
A Benchmark for Hallucination Detection in VLMs for Gastrointestinal Endoscopy
White-box method ReXTrust achieves highest AUC (peak 93.0) on Gut-VLM across five VLMs, outperforming alternatives by statistically significant margins while black-box and some gray-box methods collapse on certain models.
-
Leveraging Visual Signals for Robust Token-Level Uncertainty in Vision-Language Generation
VIG-TUQ improves token-level uncertainty estimation in LVLMs by weighting language uncertainty with visual grounding scores based on the observation that confident predictions rely more on visual content.
-
Aligning LLM Uncertainty with Human Disagreement in Subjectivity Analysis
DPUA is a two-phase framework that aligns LLM uncertainty expressions with human disagreement distributions in subjectivity analysis while preserving task performance.
-
SWAY: A Counterfactual Computational Linguistic Approach to Measuring and Mitigating Sycophancy
SWAY quantifies sycophancy in LLMs via shifts under linguistic pressure and a counterfactual chain-of-thought mitigation reduces it to near zero while preserving responsiveness to genuine evidence.
-
What if AI systems weren't chatbots?
Chatbot AI systems often fail complex needs while projecting authority, contributing to deskilling, labor displacement, economic concentration, and high environmental costs, so alternative pluralistic and task-specific designs are needed.