Rad-VLSM is a cross-modal two-stage framework that converts semantic guidance from BLIP-2 into box prompts for SAM-based lesion segmentation and then uses the resulting masks as spatial priors in a visual-radiomics fusion head for diagnosis.
MIAFEx: An Attention-based Feature Extraction Method for Medical Image Classification
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CV (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Rad-VLSM: A Cross-Modal Framework with Semantics-Assisted Prompting for Medical Segmentation and Diagnosis
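The two-stage flow in the Rad-VLSM summary (guidance to box prompt, then mask as spatial prior) can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration: the `heatmap` is a hypothetical stand-in for whatever localisation signal the semantic guidance yields, thresholding is a swapped-in way to derive a box, and masked average pooling is one common way to use a mask as a spatial prior. None of it is taken from the paper, and no BLIP-2 or SAM API is called.

```python
import numpy as np


def heatmap_to_box(heatmap, threshold=0.5):
    """Hypothetical helper: tightest (x_min, y_min, x_max, y_max) box
    around all pixels whose score is >= threshold, or None if there are none."""
    ys, xs = np.nonzero(heatmap >= threshold)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def masked_average_pool(features, mask):
    """Hypothetical helper: average a (C, H, W) feature map over the pixels
    selected by a binary (H, W) mask, giving one value per channel."""
    weights = mask.astype(features.dtype)
    area = weights.sum()
    if area == 0:
        # Empty mask: fall back to a plain global average.
        return features.mean(axis=(1, 2))
    return (features * weights).sum(axis=(1, 2)) / area


# Toy inputs: an 8x8 relevance map with one bright rectangle.
heatmap = np.zeros((8, 8))
heatmap[2:5, 3:7] = 1.0

# Stage 1 (assumed): turn the guidance map into a box prompt.
box = heatmap_to_box(heatmap)

# Stage 2 (assumed): a segmentation mask, here simply the thresholded
# map, restricts which feature locations are pooled for diagnosis.
mask = heatmap >= 0.5
features = np.arange(2 * 8 * 8, dtype=float).reshape(2, 8, 8)
pooled = masked_average_pool(features, mask)
```

The box uses pixel coordinates in (x_min, y_min, x_max, y_max) order, which is the layout a SAM-style box prompt typically expects. In a real pipeline the mask would come from the segmenter rather than from the thresholded map.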