Weakly-Supervised Lung Nodule Segmentation via Training-Free Guidance of 3D Rectified Flow

Fredrik Kahl; Jennifer Alvén; Richard Petersen

arxiv: 2604.08313 · v2 · submitted 2026-04-09 · 💻 cs.CV


Richard Petersen, Fredrik Kahl, Jennifer Alvén

Pith reviewed 2026-05-10 18:17 UTC · model grok-4.3

classification 💻 cs.CV

keywords weakly-supervised segmentation · lung nodule · 3D rectified flow · training-free guidance · medical image segmentation · image-level labels · LUNA16


The pith

A frozen 3D rectified flow model produces accurate lung nodule segmentations when guided by a predictor fine-tuned only on image-level labels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Creating detailed 3D segmentation masks for medical images requires expensive expert voxel labeling. The paper demonstrates that a pretrained rectified flow model can be kept frozen and steered via training-free guidance from a separate predictor that needs only image-level labels for its fine-tuning. This plug-and-play combination yields higher-quality nodule segmentations than standard weakly-supervised baselines. The method works for multiple predictor architectures and reliably locates nodules of different sizes and shapes. Results on the LUNA16 dataset support the claim that generative models can serve as reusable backbones for weakly-supervised 3D medical segmentation.

Core claim

By pairing a pretrained 3D rectified flow model with a predictor fine-tuned solely on image-level labels and applying training-free guidance, the method achieves improved weakly-supervised segmentation of lung nodules without any retraining of the generative model. The approach produces better segmentations than baseline methods on LUNA16 and detects nodules consistently across varying sizes and shapes when tested with two different predictors.

What carries the argument

Training-free guidance of a frozen pretrained 3D rectified flow model directed by signals from a predictor fine-tuned only on image-level labels.
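As a rough orientation, gradient-based guidance of a rectified flow is often written as follows. The first line is the standard rectified flow setup from [8], with x_0 a noise sample and x_1 a data sample. The guided velocity in the last line is a generic assumption for illustration, not the paper's exact update: the predictor p_φ, the target label y (for example "nodule absent"), and the guidance strength λ are placeholder symbols that do not appear in the summary above.

```latex
\begin{align}
x_t &= (1 - t)\,x_0 + t\,x_1, & \frac{\mathrm{d}x_t}{\mathrm{d}t} &= v_\theta(x_t, t), \quad t \in [0, 1] \\
\hat{x}_1(x_t) &= x_t + (1 - t)\,v_\theta(x_t, t) \\
\tilde{v}(x_t, t) &= v_\theta(x_t, t) + \lambda\,\nabla_{x_t} \log p_\phi\bigl(y \mid \hat{x}_1(x_t)\bigr)
\end{align}
```

Under this reading, the weights θ of the flow stay frozen and only the predictor p_φ sees image-level labels, which is what makes the combination plug-and-play.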

If this is right

The plug-and-play combination improves segmentation quality over attribution-based baselines for lung nodules.
The same guidance strategy works with at least two distinct predictor models without retraining the flow component.
Nodules of varying sizes and shapes are detected more reliably than with prior weakly-supervised techniques.
Generative foundation models can be reused across different weakly-supervised 3D medical segmentation tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Annotation costs in clinical imaging pipelines could drop if only image-level labels are needed to adapt existing generative models.
The training-free guidance pattern might extend to other 3D structures such as organs or lesions in CT or MRI volumes.
Pretrained generative models could become standard reusable components for multiple downstream medical vision tasks that currently require dense labels.

Load-bearing premise

A predictor fine-tuned solely on image-level labels supplies sufficiently precise guidance signals to steer the frozen rectified flow model toward accurate voxel-level boundaries for small and variable lung nodules.

What would settle it

Evaluating the full pipeline on the LUNA16 test set and observing that segmentation metrics such as Dice scores fall below those of standard weakly-supervised baselines, especially on small nodules.
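For concreteness, the overlap metrics such a test would rely on can be sketched in a few lines. Dice is the metric named above; IoU and sensitivity are included because the referee report below asks for them. The masks here are made-up toy data on a flattened volume, not LUNA16 results.

```python
def confusion_counts(pred, truth):
    """Count true positives, false positives, and false negatives over flat 0/1 masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fp = sum(1 for p, t in zip(pred, truth) if p and not t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp, fp, fn


def dice(pred, truth):
    """Dice coefficient: 2*TP / (2*TP + FP + FN); defined as 1.0 when both masks are empty."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def iou(pred, truth):
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion_counts(pred, truth)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def sensitivity(pred, truth):
    """Fraction of ground-truth voxels that the prediction recovers."""
    tp, _, fn = confusion_counts(pred, truth)
    return 1.0 if tp + fn == 0 else tp / (tp + fn)


# Toy flattened volume: a 3-voxel nodule, one tight and one over-segmented prediction.
truth = [0, 1, 1, 1, 0, 0, 0, 0]
tight = [0, 1, 1, 0, 0, 0, 0, 0]
loose = [1, 1, 1, 1, 1, 1, 0, 0]

for name, pred in (("tight", tight), ("loose", loose)):
    print(name, round(dice(pred, truth), 3), round(iou(pred, truth), 3),
          round(sensitivity(pred, truth), 3))
```

In this toy case the over-segmented mask reaches perfect sensitivity but a lower Dice score than the tight mask, which is why sensitivity alone would not settle the comparison for small nodules.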

Figures

Figures reproduced from arXiv: 2604.08313 by Fredrik Kahl, Jennifer Alvén, Richard Petersen.

**Figure 1.** Overview of the weakly-supervised segmentation (WSS) framework. A predictor-guided rectified flow model generates a counterfactual reconstruction, and the residual image with respect to the input yields the segmentation mask. Surrounding text (Section 2, Method): Our method leverages pretrained foundation models in a plug-and-play manner to extract weakly-supervised segmentations of lung nodules. Specifically, we combine MAISI-v2, a …

**Figure 2.** Overview of the proposed training-free guidance (TFG) framework for predictor-guided rectified flow in latent space, performed at inference. The symbol indicates that the models are frozen. Surrounding text (Training-free guidance): In order to avoid costly retraining of the generative model, we leverage the TFG framework [2,26,27], which enables guiding an arbitrary generative model using a predictor model, rather than tra…

**Figure 3.** Visual comparisons of WSS on LUNA16 for the MedSAM TinyViT predictor. Success and failure cases are shown in green and red frames, respectively. The proposed method suppresses the lung nodules in the guided reconstruction (Columns 1-2), resulting in WSS that closely match the shape and size of the ground-truth masks (Columns 3-4). The CAM-based methods generally over-segment nodules, producing masks that extend bey…

**Figure 4.** Visual comparisons of the WSS on LUNA16 for the RadImgNet ResNet50 predictor. Similar trends can be observed with a CNN-based predictor, where the proposed method produces masks that more closely follow the ground-truth nodule boundaries compared to the baseline methods. Surrounding text (Experimental results): We evaluate the proposed WSS method in a plug-and-play setting where the predictor is pretrained and kept fixed. F…
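The Figure 1 caption describes the mask as coming from the residual between the input and the counterfactual reconstruction. A minimal sketch of that last step, assuming a plain absolute-difference threshold, is given below. The threshold value and the toy intensities are hypothetical, and the captions do not say what post-processing the paper actually applies.

```python
def residual_mask(volume, counterfactual, threshold):
    """Threshold the voxel-wise absolute residual between two equally shaped 3D volumes.

    Both inputs are nested lists indexed as [slice][row][column]. The fixed
    threshold is an assumption made for this illustration.
    """
    return [
        [
            [1 if abs(v - c) > threshold else 0 for v, c in zip(row_v, row_c)]
            for row_v, row_c in zip(slice_v, slice_c)
        ]
        for slice_v, slice_c in zip(volume, counterfactual)
    ]


# Toy 2x3x3 volume with a bright "nodule" at the centre of each slice.
volume = [
    [[0.1, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.1]],
    [[0.1, 0.1, 0.1], [0.1, 0.8, 0.2], [0.1, 0.1, 0.1]],
]
# Hypothetical counterfactual in which the nodule has been suppressed.
counterfactual = [
    [[0.1, 0.12, 0.1], [0.1, 0.15, 0.1], [0.1, 0.1, 0.1]],
    [[0.1, 0.1, 0.1], [0.1, 0.1, 0.15], [0.1, 0.1, 0.1]],
]

mask = residual_mask(volume, counterfactual, threshold=0.3)
n_voxels = sum(v for slice_ in mask for row in slice_ for v in row)
print(n_voxels)
```

Small reconstruction differences away from the nodule stay below the threshold, so only the two centre voxels end up in the mask.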

Original abstract

Dense annotations, such as segmentation masks, are expensive and time-consuming to obtain, especially for 3D medical images where expert voxel-wise labeling is required. Weakly supervised approaches aim to address this limitation, but often rely on attribution-based methods that struggle to accurately capture small structures such as lung nodules. In this paper, we propose a weakly-supervised segmentation method for lung nodules by combining pretrained state-of-the-art rectified flow and predictor models in a plug-and-play manner. Our approach uses training-free guidance of a 3D rectified flow model, requiring only fine-tuning of the predictor using image-level labels and no retraining of the generative model. The proposed method produces improved-quality segmentations for two separate predictors, consistently detecting lung nodules of varying size and shapes. Experiments on LUNA16 demonstrate improvements over baseline methods, highlighting the potential of generative foundation models as tools for weakly supervised 3D medical image segmentation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper shows a plug-and-play way to guide a frozen 3D rectified flow model with an image-level fine-tuned predictor for lung nodule segmentation, but the abstract gives no numbers so the gains are hard to judge.


The main point is that this work uses training-free guidance on a pretrained 3D rectified flow model, where only a separate predictor is fine-tuned on image-level labels for nodule presence. No retraining of the generative model is needed, and the claim is that this yields better segmentations than baselines on LUNA16 while handling nodules of different sizes and shapes with two different predictors. That combination is the new piece, as it avoids the usual costs of dense 3D annotations or full model retraining.

The paper does a reasonable job of framing the practical problem of annotation burden in medical imaging and points to generative models as a possible tool for weakly supervised tasks. The plug-and-play design is straightforward and could be tried by others without heavy compute.

The soft spots are mostly around evidence. The abstract asserts improvements and consistent detection but supplies no metrics, error bars, baseline details, or ablation results, which makes it impossible to tell how large the gains are or whether they are reliable. The central assumption that image-level supervision alone can produce accurate voxel-level guidance for small, variable nodules is worth testing carefully; the paper notes that attribution methods already struggle with small structures, so the guidance step has to do the heavy lifting without voxel-level supervision. If the full results section shows only marginal or inconsistent gains on LUNA16, the practical advance shrinks. The method description and citation of rectified flow work look solid on the surface, with no obvious circularity.

This paper is for researchers in medical image segmentation who want to experiment with generative guidance under weak labels. A reader already working on LUNA16 or similar datasets would get the most out of the method details and any visualizations. It deserves a serious referee because the idea is concrete, the setup is reproducible in principle, and the problem it targets is real, even if the current write-up needs more quantitative backing to stand on its own.

Referee Report

2 major / 2 minor

Summary. The paper proposes a weakly-supervised 3D lung nodule segmentation method that combines a pretrained rectified flow generative model with a predictor fine-tuned solely on image-level labels. It uses training-free guidance of the frozen 3D flow model in a plug-and-play manner, without retraining the generative component, and claims that this produces improved segmentations on the LUNA16 dataset for two separate predictors while consistently detecting nodules of varying sizes and shapes.

Significance. If the central claim holds, the work has moderate significance by showing how pretrained generative foundation models can support weakly-supervised medical image segmentation with reduced annotation and retraining costs. The training-free guidance approach is a potential strength for practical deployment. However, the lack of quantitative metrics, ablations, or detailed baseline comparisons in the experiments makes the practical impact difficult to assess at present.

major comments (2)

[§4] Experiments: The results on LUNA16 are described only qualitatively as 'improvements over baseline methods' and 'improved-quality segmentations' without reporting standard metrics (e.g., Dice score, IoU, sensitivity), error bars, statistical significance, or specific baseline implementations. This is load-bearing for the central claim of demonstrated improvements and consistent detection across nodule sizes/shapes.
[§3] Method: The training-free guidance mechanism is not specified in sufficient detail to show how an image-level predictor (fine-tuned only on presence/absence labels) produces reliable voxel-level guidance signals for the frozen 3D rectified flow model, especially for small, variable lung nodules where attribution methods are noted to struggle. This transfer is the core assumption and requires explicit formulation or pseudocode.

minor comments (2)

The abstract and introduction would benefit from a brief qualitative figure or example showing the guidance process and resulting segmentations to illustrate the plug-and-play claim.
[§3] Notation for the rectified flow model and predictor components should be introduced consistently with equation numbers or definitions in §3 to improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comments point by point below and will revise the paper to incorporate additional details and quantitative results where needed.


Referee: [§4] Experiments: The results on LUNA16 are described only qualitatively as 'improvements over baseline methods' and 'improved-quality segmentations' without reporting standard metrics (e.g., Dice score, IoU, sensitivity), error bars, statistical significance, or specific baseline implementations. This is load-bearing for the central claim of demonstrated improvements and consistent detection across nodule sizes/shapes.

Authors: We agree that quantitative evaluation is necessary to fully support the claims of improvement. The current version focuses on qualitative visualization to highlight consistency across nodule sizes and shapes, but we will revise Section 4 to report Dice scores, IoU, and sensitivity with error bars, statistical significance tests, and explicit descriptions of the baseline methods and their implementations on LUNA16. These metrics will be added from our existing experiments to demonstrate the gains.
revision: yes

Referee: [§3] Method: The training-free guidance mechanism is not specified in sufficient detail to show how an image-level predictor (fine-tuned only on presence/absence labels) produces reliable voxel-level guidance signals for the frozen 3D rectified flow model, especially for small, variable lung nodules where attribution methods are noted to struggle. This transfer is the core assumption and requires explicit formulation or pseudocode.

Authors: We will expand Section 3 with an explicit formulation of the training-free guidance. The image-level predictor outputs a scalar probability that is converted into a guidance gradient applied in the latent space of the frozen 3D rectified flow at each sampling step; this steers the flow trajectory toward label-consistent regions without retraining the generative model. The 3D nature of the flow provides the voxel-level resolution, bypassing direct attribution limitations for small nodules by leveraging the pretrained generative prior. We will include the mathematical equations and pseudocode for the full guidance procedure in the revision.
revision: yes
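To make the idea of a scalar probability turned into a guidance gradient concrete, here is a toy sketch in pure Python. Everything in it is a stand-in chosen for illustration: the "frozen flow" is the exact rectified flow velocity for a data distribution with a single point, the predictor is a two-weight logistic model, and the guidance term is the analytic gradient of log(1 - P(nodule)) evaluated at the one-step endpoint estimate. It is not the paper's TFG update, and none of the names or values come from the paper.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def predictor_prob(z, w, b):
    """Hypothetical image-level predictor: P(nodule present | z) as a logistic model."""
    return sigmoid(sum(wi * zi for wi, zi in zip(w, z)) + b)


def grad_log_absent(z, w, b):
    """Analytic gradient of log(1 - P(nodule | z)) with respect to z, which equals -P * w."""
    p = predictor_prob(z, w, b)
    return [-p * wi for wi in w]


def frozen_velocity(z, t, target):
    """Toy stand-in for the frozen flow: exact velocity when the data is a single point."""
    return [(xi - zi) / (1.0 - t) for xi, zi in zip(target, z)]


def sample(z0, target, w, b, scale, n_steps=10):
    """Euler integration from t=0 to t=1; scale=0 gives the unguided sample."""
    z = list(z0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k / n_steps
        v = frozen_velocity(z, t, target)
        # One-step estimate of the clean endpoint, where the predictor is evaluated.
        x_hat = [zi + (1.0 - t) * vi for zi, vi in zip(z, v)]
        g = grad_log_absent(x_hat, w, b)
        # Frozen velocity plus scaled predictor gradient; no model weights change.
        z = [zi + dt * (vi + scale * gi) for zi, vi, gi in zip(z, v, g)]
    return z


z0 = [0.0, 0.0]          # starting latent (assumed)
target = [2.0, 1.0]      # "nodule-like" latent that the toy flow reproduces
w, b = [1.5, 0.5], -1.0  # made-up predictor weights

unguided = sample(z0, target, w, b, scale=0.0)
guided = sample(z0, target, w, b, scale=5.0)
print(predictor_prob(unguided, w, b) > predictor_prob(guided, w, b))
```

The unguided sample lands on the nodule-like latent, while the guided one is pushed toward a region the predictor scores as less likely to contain a nodule. In the described pipeline, the analogous guided output would be decoded and compared with the input to obtain the residual.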

Circularity Check

0 steps flagged

No circularity: method uses external pretrained models and standard image-level fine-tuning

full rationale

The paper proposes a plug-and-play combination of a frozen pretrained 3D rectified flow model with a predictor that is fine-tuned only on image-level labels. No equation or derivation step reduces the claimed voxel-level segmentation output to a quantity defined by the target segmentation itself, nor does any central claim rest on a self-citation chain that is unverified or tautological. The reported improvements on LUNA16 are presented as empirical outcomes of this external-model guidance procedure rather than as a mathematical identity or a fitted prediction by construction. The approach is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that pretrained rectified flow models are sufficiently general to be guided by a lightly fine-tuned predictor for accurate small-structure segmentation; no free parameters or new entities are explicitly introduced in the abstract.

axioms (1)

domain assumption: Pretrained state-of-the-art rectified flow models can be effectively steered for segmentation tasks using only a fine-tuned predictor without retraining the generative model.
Invoked when the method is described as plug-and-play with no retraining of the flow model.



Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 1 internal anchor

[1] Prafulla Dhariwal and Alexander Nichol. Diffusion models beat GANs on image synthesis. Advances in Neural Information Processing Systems, 34:8780–8794, 2021.
[2] Bansal et al. Universal guidance for diffusion models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 843–852, 2023.
[3] Di Mattia et al. A survey on GANs for anomaly detection. arXiv preprint arXiv:1906.11632, 2019.
[4] Dutande et al. Deep residual separable convolutional neural network for lung tumor segmentation. Computers in Biology and Medicine, 141:105161, 2022.
[5] Esser et al. Scaling rectified flow transformers for high-resolution image synthesis. In Forty-first International Conference on Machine Learning, 2024.
[6] Guo et al. MAISI: Medical AI for synthetic imaging. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 4430–4441. IEEE, 2025.
[7] Kumar et al. A flexible 2.5D medical image segmentation approach with in-slice and cross-slice attention. Computers in Biology and Medicine, 182:109173, 2024.
[8] Liu et al. Flow straight and fast: Learning to generate and transfer data with rectified flow. arXiv preprint arXiv:2209.03003, 2022.
[9] Ma et al. Segment anything in medical images. Nature Communications, 15(1):654, 2024.
[10] Mei et al. RadImageNet: An open radiologic deep learning research dataset for effective transfer learning. Radiology: Artificial Intelligence, 4(5):e210315, 2022.
[11] Mousakhan et al. Anomaly detection with conditioned denoising diffusion models. In DAGM German Conference on Pattern Recognition, pages 181–195. Springer, 2024.
[12] Patel et al. FlowChef: Steering of rectified flow models for controlled generations. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 15308–15318, 2025.
[13] Sanchez et al. What is healthy? Generative counterfactual diffusion for lesion localization. In MICCAI Workshop on Deep Generative Models, pages 34–44. Springer, 2022.
[14] Schlegl et al. f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks. Medical Image Analysis, 54:30–44, 2019.
[15] Selvaraju et al. Grad-CAM: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE International Conference on Computer Vision, pages 618–626, 2017.
[16] Setio et al. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: The LUNA16 challenge. Medical Image Analysis, 42:1–13, 2017.
[17] Sundararajan et al. Axiomatic attribution for deep networks. In International Conference on Machine Learning, pages 3319–3328. PMLR, 2017.
[18] Wang et al. Score-CAM: Score-weighted visual explanations for convolutional neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 24–25, 2020.
[19] Wang et al. 3D MedDiffusion: A 3D medical latent diffusion model for controllable and high-quality medical image generation. IEEE Transactions on Medical Imaging, 2025.
[20] Wang et al. WeakMedSAM: Weakly-supervised medical image segmentation via SAM with sub-class exploration and prompt affinity mining. IEEE Transactions on Medical Imaging, 2025.
[21] Wolleb et al. DeScarGAN: Disease-specific anomaly detection with weak supervision. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 14–24. Springer, 2020.
[22] Wolleb et al. Diffusion models for medical anomaly detection. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 35–45. Springer, 2022.
[23] Wyatt et al. AnoDDPM: Anomaly detection with denoising diffusion probabilistic models using simplex noise. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 650–656, 2022.
[24] Xing et al. Diff-UNet: A diffusion embedded network for volumetric segmentation. arXiv preprint arXiv:2303.10326, 2023.
[25] Xu et al. MedSyn: Text-guided anatomy-aware synthesis of high-fidelity 3-D CT images. IEEE Transactions on Medical Imaging, 43(10):3648–3660, 2024.
[26] Ye et al. TFG: Unified training-free guidance for diffusion models. Advances in Neural Information Processing Systems, 37:22370–22417, 2024.
[27] Yu et al. FreeDoM: Training-free energy-guided conditional diffusion model. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 23174–23184, 2023.
[28] Zhang et al. Multiple sclerosis lesion segmentation with Tiramisu and 2.5D stacked slices. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 338–346. Springer, 2019.
[29] Zhang et al. Adding conditional control to text-to-image diffusion models. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 3836–3847, 2023.
[30] Zhao et al. MAISI-v2: Accelerated 3D high-resolution medical image synthesis with rectified flow and region-specific contrastive loss. arXiv preprint arXiv:2508.05772, 2025.
[31] Zhou et al. Learning deep features for discriminative localization. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2921–2929, 2016.
[32] American Cancer Society. Cancer facts and figures 2016, 2016.
[33] National Lung Screening Trial Research Team. Reduced lung-cancer mortality with low-dose computed tomographic screening. New England Journal of Medicine, 365(5):395–409, 2011.
