arXiv preprint arXiv:2506.09684
2 Pith papers cite this work.
fields: cs.CL (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- Disentangling Ambiguity from Instability in Large Language Models: A Clinical Text-to-SQL Case Study
  CLUES decomposes semantic uncertainty into separate ambiguity and instability scores for clinical Text-to-SQL, obtaining the instability score via a Schur complement. It outperforms Kernel Language Entropy on failure prediction while enabling diagnostic triage.
- Position: Uncertainty Quantification in LLMs is Just Unsupervised Clustering
  Mainstream UQ for LLMs reduces to unsupervised clustering of internal generation consistency and therefore cannot detect confident hallucinations or provide reliable safety signals.