Introduces the UCSF-PDGM-VQA dataset of 2387 QA pairs from 473 glioma MRI studies and demonstrates that state-of-the-art VLMs exhibit modality collapse on multi-sequence 3D medical images.
Generalist foundation models from a multimodal dataset for 3D computed tomography
8 Pith papers cite this work.
Representative citing papers (2026: 8)
HealthAgentBench is a new benchmark of 54 healthcare agent tasks on which even the strongest frontier AI agent reaches a success rate of only about 42% on end-to-end clinical workflows.
DALE-CT, a 2D LeJEPA model with depth-aware dual supervision, reaches 0.833 Macro AUROC on multi-abnormality detection in CT and approaches 3D SOTA performance using less data and no textual supervision.
CheXanatomy trains VLMs to generate 2D anatomical masks via next-token prediction on synthetic CXRs from CT, matching U-Net performance with better domain-shift robustness and sample efficiency.
LoRA fine-tuning of 3-4B SLMs on 162K multi-task radiology data yields strong performance, with models deployable on consumer CPUs at 4-8 tokens/second.
EXACT pre-trains a vision model on 25k CT-report pairs with anatomy-aware weak supervision to output explainable anomaly-aware maps that improve diagnosis, localization, and report generation over prior 3D medical models.
Using GPT-5.4 to clean labels in the CT-RATE chest CT dataset revealed 3.6% discordance with original labels, with radiologists supporting the LLM labels in 74-92% of reviewed cases.