MedFlowBench evaluates VLM agents on full radiology and pathology studies by requiring both task answers and verifiable evidence like key slices and regions of interest, revealing that answer-only scores overestimate performance.
MONAI: An open-source framework for deep learning in healthcare
Mixed citation behavior. Most common role is background (40%).
abstract
Artificial Intelligence (AI) is having a tremendous impact across most areas of science. Applications of AI in healthcare have the potential to improve our ability to detect, diagnose, prognose, and intervene on human disease. For AI models to be used clinically, they need to be made safe, reproducible and robust, and the underlying software framework must be aware of the particularities (e.g. geometry, physiology, physics) of medical data being processed. This work introduces MONAI, a freely available, community-supported, and consortium-led PyTorch-based framework for deep learning in healthcare. MONAI extends PyTorch to support medical data, with a particular focus on imaging, and provides purpose-specific AI model architectures, transformations and utilities that streamline the development and deployment of medical AI models. MONAI follows best practices for software development, providing an easy-to-use, robust, well-documented, and well-tested software framework. MONAI preserves the simple, additive, and compositional approach of its underlying PyTorch libraries. MONAI is being used by and receiving contributions from research, clinical and industrial teams from around the world, who are pursuing applications spanning nearly every aspect of healthcare.
representative citing papers
VISTA is a source-free TTA framework for multi-sequence MRI segmentation that uses inter-sequence spectral/patch interventions and cross-view variance gating to handle modality-interaction shifts, reporting Dice gains of 1.89% and 2.82% on SSA and PED cohorts.
SegWithU treats uncertainty as perturbation energy via rank-1 probes in a post-hoc head for efficient single-pass risk-aware medical image segmentation, outperforming other single-forward-pass methods on ACDC, BraTS2024, and LiTS.
A sequential diffusion framework generates controllable abdominal anatomies with a Volume Control Scalar that decouples organ size from body habitus, achieving Dice scores around 0.83 and reducing distributional mismatch by 73.6% in a hepatomegaly example.
Camyla autonomously generates research proposals, experiments, and manuscripts in medical image segmentation, outperforming baselines on 24 of 31 recent datasets while producing 40 human-reviewed papers.
MP-ViT uses dual transformers and cross-attention on axial and sagittal MRI to classify hemorrhages, reporting 5.5% higher AUC than standard ViT and 1.8% higher than CNNs on a dataset of 12,869 subjects.
Tabular clinical data guides contrastive learning on cardiac MR images to build better visual representations by identifying patient similarities, outperforming image-only augmentation on downstream disease prediction tasks.
Proposes a cyclic 2.5D perceptual loss with manufacturer SUVR standardization for T1w MRI to tau PET synthesis, reporting improved regional agreement on ADNI and SCAN cohorts across U-Net, UNETR, SwinUNETR, CycleGAN, and Pix2Pix.
A uniform benchmark across 77 experiments finds SRGAN superior to latent diffusion models for 3D medical image translation, with synthetic volumes indistinguishable from real ones in a 17-physician Turing test.
Tumor-aware augmentation and anisotropic cropping improve CT-to-MRI transfer for rectal cancer segmentation in hierarchical transformers by reducing attention dilution from padding and enhancing feature adaptation.
SIAM achieves state-of-the-art whole-head MRI segmentation of 16 structures including extra-cerebral tissues by training on synthetic data from just six manual templates, matching or exceeding prior methods on 301 scans across eight heterogeneous datasets.
GeoSAE extracts a compact, interpretable feature set from frozen brain MRI foundation models that predicts MCI-to-AD conversion (AUC 0.746) with age-deconfounded annotations and replicates across cohorts.
ESICA delivers state-of-the-art accuracy on a five-modality 3D medical segmentation benchmark while offering a compact variant with far fewer parameters.
A 4D diffusion generative model learns topology-preserving spatiotemporal deformations to synthesize realistic longitudinal brain anatomy trajectories in neurodegenerative diseases from sparse follow-up scans.
DAGMaN uses co-distilled attention-guided masked image modeling with a noisy teacher to enable effective self-supervised pretraining on medical images by selective masking of co-occurring patches and maintenance of attention head diversity, with demonstrations on nodule classification, immunotherapy
Neuro-Oracle distills longitudinal MRI changes into trajectory vectors via a 3D Siamese encoder, retrieves similar cases, and generates LLM-based prognoses, achieving AUC 0.834-0.905 on a resection-type proxy task versus 0.793 for a single-timepoint baseline.
SUMI distills photon-counting CT quality into routine chest CT by learning to reverse clinically validated acquisition degradations, yielding 15-20% gains in image metrics, better radiologist utility, and up to 15% higher lesion detection sensitivity.
The paper reports a new annotated 7T ToF MRA dataset for small vessel segmentation and shows that top deep learning methods reach Dice scores of 0.838 on internal test data and 0.716 on an external secret dataset.
FlexiCT provides CT foundation models via agglomerative pretraining on 266227 volumes from 56 datasets that match or exceed task-specific models on five task families while organizing embeddings along tumor-stage gradients.
Semi-MedRef introduces T-PatchMix, PosAug, and ITCL within a teacher-student SSL setup to preserve image-text alignment under augmentation for medical referring segmentation on QaTa-COV19 and MosMedData+.
NeuroAgent uses a hierarchical LLM agent framework with Generate-Execute-Validate loops to automate neuroimaging preprocessing, reaching 84.8% end-to-end correctness and 0.9518 AUC for Alzheimer's classification on 1470 ADNI subjects using four modalities.
The autoPET3 challenge finds that leading AI models reach a mean Dice score of 0.66 for multitracer PET/CT lesion segmentation, with compositional generalization to unseen tracer-center pairs remaining an open problem driven by volume overestimation and case heterogeneity.
A latent diffusion model jointly synthesizes MRI volumes and mixed-type tabular clinical data in a shared space via cross-attention and separate decoders after VAE fusion.
MIGF improves multi-modal prostate MRI segmentation robustness via modality-isolated streams and dropout training, yielding ranking score gains of 2.8-13.4% across backbones and better tolerance to degraded diffusion sequences on PI-CAI and Prostate158.
citing papers explorer
- Multimodal synthesis of MRI and tabular data with diffusion in a joint latent space via cross-attention
A latent diffusion model jointly synthesizes MRI volumes and mixed-type tabular clinical data in a shared space via cross-attention and separate decoders after VAE fusion.
- Why Invariance is Not Enough for Biomedical Domain Generalization and How to Fix It
MaskGen improves domain generalization for biomedical image segmentation by using source intensities plus domain-stable foundation model representations with minimal added complexity.