C.; Schwaighofer, A.; Lungren, M
3 Pith papers cite this work.
citation-role and citation-polarity summary
fields: cs.CV 3
roles: baseline 1
polarities: baseline 1
citing papers explorer
- Domain-Specific Latent Representations Improve the Fidelity of Diffusion-Based Medical Image Super-Resolution
  Replacing the generic Stable Diffusion VAE with domain-specific MedVAE pretrained on 1.6M medical images improves diffusion-based SR PSNR by 2.91-3.29 dB on knee/brain MRI and chest X-ray, with gains in fine details and VAE quality predicting SR performance (R²=0.67).
- RA-RRG: Multimodal Retrieval-Augmented Radiology Report Generation with Key Phrase Extraction
  RA-RRG extracts key phrases with LLMs, retrieves them via multimodal similarity, and conditions report generation on them to achieve SOTA CheXbert scores and competitive RadGraph F1 on MIMIC-CXR and IU X-ray while supporting multi-view inputs.
- M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation
  M4CXR is a multi-modal large language model that performs multiple tasks in chest X-ray analysis including report generation with claimed SOTA clinical accuracy using chain-of-thought prompting.