OpenMedQ: Broad Open Pretraining for Medical Vision-Language Models

Ibrahim Gulluk; Max Van Puyvelde; Olivier Gevaert

arXiv: 2606.12953 · v1 · pith:4PSDTXTQnew · submitted 2026-06-11 · cs.AI · cs.CV · cs.LG · eess.IV

keywords: medical, openmedq, baseline, bleu-1, pretraining, vision-language, available, average

We present OpenMedQ, a medical vision-language model pretrained on the broadest fully open medical data mix to date: 14 datasets totaling ~3.35M pretraining samples spanning pathology, radiology, microscopy, and text-only clinical QA. OpenMedQ reaches state-of-the-art BLEU-1 on PathVQA (75.9), beating Med-PaLM M variants of up to 562B parameters (~80x larger), and matches the best reported VQA-MED BLEU-1 (64.5). Its vision encoder, transferred to 8 unseen medical classification benchmarks under an identical downstream recipe, obtains the highest average macro-F1 (0.757), ahead of BiomedCLIP (0.745), PMC-CLIP (0.745), PubMedCLIP (0.746), and a from-scratch baseline (0.616). We release our code as a reproducible baseline for the community, and an interactive demo is publicly available.
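
For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how BLEU-1 and macro-F1 are commonly computed. It is not the authors' evaluation code: the function names `bleu1` and `macro_f1`, the lowercased whitespace tokenization, and the single-reference setup are all assumptions made for illustration, and the paper's exact protocol may differ. The functions return values in [0, 1]; the abstract's BLEU-1 figures appear to be on a 0-100 scale, which is also an assumption here.

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Unigram BLEU for one candidate against one reference.

    Assumption: lowercased whitespace tokenization. Real evaluations
    often use a dedicated tokenizer and multiple references.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    ref_counts = Counter(ref)
    cand_counts = Counter(cand)
    # Clipped precision: a candidate token is credited at most as many
    # times as it occurs in the reference.
    clipped = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = clipped / len(cand)
    # Brevity penalty discourages candidates shorter than the reference.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1.0 - len(ref) / len(cand))
    return bp * precision


def macro_f1(y_true, y_pred) -> float:
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because macro-F1 weights every class equally regardless of its frequency, it penalizes models that do well only on majority classes, which is a common reason to prefer it on imbalanced medical classification benchmarks.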
