arXiv preprint arXiv:2409.19492 (2024). https://arxiv
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Do No Harm? Hallucination and Actor-Level Abuse in Web-Deployed Medical Large Language Models
  Evaluation of 6233 MedGPTs finds 25-30% with low factual accuracy, 33.6-54.3% violating operational thresholds, and 57% of action-enabled models lacking privacy disclosures.
- MIRAGE: Retrieval and Generation of Multimodal Images and Texts for Medical Education
  MIRAGE combines a medical CLIP model, a diffusion generator, and an LLM into an accessible interface for retrieving and creating educational medical images and texts.