SonoCLIP presents a mask-guided region-aware vision-language foundation model pretrained on 1.44M fetal ultrasound images, demonstrating superior zero-shot performance.
2 Pith papers cite this work.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
FetUSAgents uses tool-augmented multi-agent collaboration and Dual-Path Evidence Arbitration to exceed prior MLLMs by over 25% on a new fetal ultrasound VQA benchmark.
Citing papers
- SonoCLIP: Mask-Guided Region-Aware Vision-Language Pretraining for Fetal Ultrasound Analysis
  SonoCLIP presents a mask-guided region-aware vision-language foundation model pretrained on 1.44M fetal ultrasound images, demonstrating superior zero-shot performance.
- Towards Reliable Fetal Ultrasound Interpretation with Multi-Agent Collaboration
  FetUSAgents uses tool-augmented multi-agent collaboration and Dual-Path Evidence Arbitration to exceed prior MLLMs by over 25% on a new fetal ultrasound VQA benchmark.