21 Pith papers cite this work. Polarity classification is still indexing.
citing papers explorer
-
MEDLAYXPLAIN: Benchmarking the Expert-Lay Gap in Medical Vision-Language Models
Introduces the first large-scale multimodal benchmark MedLayXPlain-122K showing medical VLMs suffer significant lay-register degradation while general VLMs lack clinical precision.
-
Seeing Through the Tool: A Controlled Benchmark for Occlusion Robustness in Foundation Segmentation Models
SAM-family models split into occluder-aware types that avoid predicting into occluded regions and occluder-agnostic types that confidently segment hidden areas, shown via a new benchmark on polyp datasets.
-
Prefer-DAS: Learning from Local Preferences and Sparse Prompts for Domain Adaptive Segmentation of Electron Microscopy
Prefer-DAS integrates self-training, prompt-guided contrastive learning, local direct preference optimization (LPO), and unsupervised preference optimization (UPO) to achieve effective domain adaptive segmentation in electron microscopy using sparse prompts and local preferences.
-
DC-TTA: Divide-and-Conquer Framework for Test-Time Adaptation of Interactive Segmentation
DC-TTA improves interactive segmentation accuracy by partitioning user clicks into subsets for independent test-time adaptation of SAM models and merging the specialized predictors.
-
Towards Voxel Spacing Consistency for Medical Image Segmentation
Consispace is a semantic-aware resampling method that uses an implicit neural network with ODE constraints and feature reweighting to achieve consistent axial voxel spacing while preserving anatomy and semantics, improving downstream segmentation.
-
Zero-Shot Learning in Industrial Scenarios: New Large-Scale Benchmark, Challenges and Baseline
Presents MMIO benchmark and RTVP method achieving state-of-the-art 42.2% AP in zero-shot industrial defect detection.
-
MedSIGHT: Towards Grounded Visual Comprehension in Medical Large Vision-Language Models
MedSIGHT unifies medical image comprehension and segmentation in Med-LVLMs via a Region Perceiver module and region codebook, trained progressively on 72K pairs to reach SOTA on both tasks across modalities.
-
MS-DKC: A Dataset Knowledge Card Framework for Designing and Adapting Medical Image Segmentation Models
MS-DKC is a dataset knowledge card framework that maps image, morphology, supervision, context, and risk descriptors to design priors and failure modes, shown to produce dataset-specific model adaptations with improved metrics on DRIVE, ISIC2018, and ACDC.
-
MeniOmni: A Structured Multimodal Benchmark for Holistic Meniscus Injury Assessment
MeniOmni is a new structured multimodal benchmark dataset and evaluation framework for fine-grained Stoller grading and diagnostic report generation from knee MRI combined with clinical priors.
-
MedVol-R1: Reward-Driven Evidence Grounding for Volumetric Reasoning Segmentation
MedVol-R1 is an RL framework that decouples 2D evidence grounding from 3D mask generation for volumetric reasoning segmentation and reports SOTA results on M3D-Seg benchmarks.
-
Weakly Supervised Segmentation as Semantic-Based Regularization
A neurosymbolic approach uses fuzzy logic constraints to refine SAM under weak supervision, producing improved pseudo-labels that enable state-of-the-art segmentation on Pascal VOC and REFUGE2.
-
Leveraging Image Editing Foundation Models for Data-Efficient CT Metal Artifact Reduction
Adapting image editing foundation models via LoRA with multi-reference conditioning achieves state-of-the-art CT metal artifact reduction using two orders of magnitude less paired training data than prior methods.
-
RABC-Net: Reliability-Aware Annotation-Free Skin Lesion Segmentation for Low-Resource Dermoscopy
RABC-Net achieves 86.58% DICE and 79.47% JAC on skin lesion segmentation across ISIC-2017, ISIC-2018, and PH2 using only pseudo-labels and no manual masks for training or adaptation.
-
AHCQ-SAM: Toward Accurate and Hardware-Compatible Post-Training Segment Anything Model Quantization
AHCQ-SAM introduces ACNR, HLUQ, CAG, and LNQ quantization techniques that deliver 15.2% mAP gain on 4-bit SAM-B and 14.01% J&F gain on 4-bit SAM2-Tiny versus prior PTQ methods.
-
MorVess: Morphology-Aware Pulmonary Vessel Segmentation Network
MorVess improves pulmonary vessel segmentation by jointly predicting vessel masks, distance maps, and thickness maps using a 2.5D SAM adapter and global-local fusion for better small-vessel recovery and connectivity.
-
RoiMAM: Region-of-Interest Medical Attention Model for Efficient Vision-Language Understanding
RoiMAM integrates a training-free ROI Generation Module with Semantic Selective Suppression and a Text Prompt Enhancer to produce a compact VLM that reports accuracy gains of 2 percent on SLAKE and 4.6 percent on PMC-VQA at less than 20 percent of the size of MedVInT-TD.
-
Weight Group-wise Post-Training Quantization for Medical Foundation Model
Permutation-COMQ is a new post-training quantization algorithm that reorders weights within layers and uses only dot-product and rounding steps to deliver the highest reported accuracy for 2-, 4-, and 8-bit medical foundation models.
-
APRIL-MedSeg: A Modular Medical Image Segmentation Toolbox Embracing Modern Paradigms
Presents APRIL-MedSeg, a modular YAML-configurable toolbox for 2D medical image segmentation integrating semi-supervised, domain adaptation, distillation, weakly supervised, text-guided, and foundation model paradigms with unified dataset and deployment interfaces.
-
Synergistic Foundation Models for Semi-Supervised Fetal Cardiac Ultrasound Analysis: SAM-Med2D Boundary Refinement and DINOv3 Semantic Enhancement
Semi-supervised fetal cardiac ultrasound analysis using SAM-Med2D boundary refinement and DINOv3 semantic enhancement on the EchoCare backbone reports 79.99% Dice, 61.62% NSD, and 41.20% F1 on the FETUS 2026 leaderboard.
-
MedSynapse-V: Bridging Visual Perception and Clinical Intuition via Latent Memory Evolution
MedSynapse-V proposes a latent memory evolution framework with meta-query prior retrieval, causal counterfactual refinement via RL, and intrinsic memory transition to improve diagnostic accuracy over chain-of-thought baselines in medical VLMs.
-
MAE-SAM2: Mask Autoencoder-Enhanced SAM2 for Clinical Retinal Vascular Leakage Segmentation
MAE-SAM2 integrates MAE self-supervised learning with SAM2 to improve segmentation of retinal vascular leakage on fluorescein angiography images, reporting the highest Dice/IoU scores and a 5% improvement over the original SAM2.