Pre-trained multimodal large language model enhances dermatological diagnosis using SkinGPT-4
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: ACCEPT (1)
Representative citing papers
Are Multimodal LLMs Ready for Clinical Dermatology? A Real-World Evaluation in Dermatology
Multimodal LLMs achieve far lower diagnostic accuracy on real hospital dermatology cases than on public benchmarks, with added clinical context helping but not enough for reliable deployment.