arXiv: 2405.09806 · v7 · submitted 2024-05-16 · cs.CV · cs.AI · cs.CL · cs.LG

A Generalist Model for Diverse Text-Guided Medical Image Synthesis

Joseph Cho, Mrudang Mathur, Cyril Zakka, Dhamanpreet Kaur, Matthew Leipzig, Alex Dalal, Aravind Krishnan, Eubee Koo, Karen Wai, Cindy S. Zhao, Akshay Chaudhari, Matthew Duda, Ashley Choi, Ehsan Rahimy, Lyna Azzouz, Robyn Fong, Rohan Shad, William Hiesinger

keywords: generalist, images, medical, synthetic, data, image, model, models

Deep learning algorithms require extensive data to achieve robust performance. However, data availability is often restricted in the medical domain due to patient privacy concerns. Synthetic data presents a possible solution to these challenges. Image generative models have found increasing use for medical applications, but are often task-specific, which limits their scalability. Moreover, existing models frequently rely on private datasets for training, which constrains their reproducibility. To address these limitations, we introduce MediSyn: an open-access, generalist, text-guided latent diffusion model capable of generating synthetic images across 6 medical specialties and 10 imaging modalities, while being trained exclusively on publicly available data. Through extensive experimentation, we provide several key contributions. First, we demonstrate that training a generative model on visually diverse medical images does not degrade synthetic image quality. Second, we show that this generalist approach is substantially more computationally efficient than a coordinated suite of task-specific models. Third, we establish that a generalist model can produce realistic, text-aligned synthetic images across visually and medically distinct modalities, as validated by expert physicians. Fourth, we provide empirical evidence that these synthetic images are visually distinct from their corresponding real patient images, alleviating concerns about data memorization in image generative models. Finally, we demonstrate that a generalist model can produce synthetic images that improve classifier performance in data-limited settings across multiple medical specialties. Altogether, our findings highlight the potential of generalist image generative models to accelerate algorithmic research and development in medicine.
