BiomedCLIP, pretrained on the new 15-million-pair PMC-15M dataset, achieves state-of-the-art performance on diverse biomedical vision-language tasks and even outperforms radiology-specific models on chest X-ray pneumonia detection.
arXiv preprint arXiv:2010.00747
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
roles: background 3
citation-polarity summary
polarities: background 3
representative citing papers
A hierarchical prior-decoder model using CLIP latents generates more diverse text-conditional images than direct methods while preserving photorealism and caption fidelity.
Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.
MetaCLIP curates balanced 400M-pair subsets from CommonCrawl that outperform CLIP data, reaching 70.8% zero-shot ImageNet accuracy on ViT-B versus CLIP's 68.3%.
CheXanatomy trains VLMs to generate 2D anatomical masks via next-token prediction on synthetic CXRs from CT, matching U-Net performance with better domain-shift robustness and sample efficiency.
PaCX-MAE augments masked autoencoding of chest X-rays with dual contrastive-predictive alignment to ECG and laboratory embeddings, reporting gains on physiology-dependent tasks while remaining unimodal at test time.
EchoCare-CLIP achieves 0.682 paired alignment on a 16K ultrasound image-text corpus but downstream zero-shot classification peaks at 0.709 on BUSI only with partial fine-tuning, while full fine-tuning overfits.
PaliGemma is an open 3B VLM based on SigLIP and Gemma that achieves strong performance on nearly 40 diverse open-world tasks including benchmarks, remote-sensing, and segmentation.
citing papers explorer
- Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis
- CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs
- PaCX-MAE: Physiology-Augmented Chest X-Ray Masked Autoencoder
- PaliGemma: A versatile 3B VLM for transfer