LLMs hallucinate in 19.7% of textbook-grounded medical QA answers despite high plausibility scores, indicating they remain unfit for unsupervised clinical use.
Multi-model assurance analysis showing large language models are highly vulnerable to adversarial hallucination attacks during clinical decision support. Commun
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
Quantifying Hallucinations in Language Models on Medical Textbooks