Adversarial optimization for joint registration and segmentation in prostate CT radiotherapy

Hessam Sokooti; Ivana Išgum; Jelmer M. Wolterink; Marius Staring; Mohamed S. Elmahdy

arxiv: 1906.12223 · v1 · pith:BJ3LQBE5new · submitted 2019-06-28 · 📡 eess.IV · cs.CV

Pith reviewed 2026-05-25 13:15 UTC · model grok-4.3

classification 📡 eess.IV cs.CV

keywords adversarial learning · image registration · segmentation · prostate CT · radiotherapy · deformation vector field · contour propagation · deep learning

The pith

Adversarial training lets a generator network register prostate CT scans and propagate contours without test-time segmentations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reformulates joint registration and segmentation as an adversarial learning problem for prostate CT radiotherapy. A generator estimates the deformation vector field between fixed and moving images in an unsupervised way and warps both the image and its segmentation. A discriminator is trained on available training segmentations to judge alignment quality with the fixed image and segmentation. This produces better accuracy than elastix while cutting computation time enough for real-time contour propagation in online-adaptive radiotherapy.

Core claim

A 3D end-to-end generator network estimates the deformation vector field (DVF) between fixed and moving prostate CT images without supervision on the field itself, then applies that field to warp the moving image and its segmentation. A discriminator that evaluates how well the warped pair aligns with the fixed image and segmentation supplies the adversarial training signal. The paper reports that the result outperforms conventional registration with elastix in accuracy and reduces computation time enough for real-time contour propagation when segmentations are unavailable at test time.
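To make the warping step concrete, here is a minimal sketch of applying a dense displacement field to an image and to a label map. It is a 2D, pure-Python illustration and not the paper's 3D implementation: the function name `warp_2d`, the (dy, dx) voxel-unit convention, and the border clamping are all assumptions made for the example. Linear interpolation is used for intensities and nearest-neighbour sampling for labels, so that propagated contours keep integer label values.

```python
import math

def warp_2d(moving, dvf, order="linear"):
    """Warp a 2D array (list of rows) with a dense displacement field.

    Assumed convention: dvf[y][x] = (dy, dx) is the displacement, in voxels,
    from a fixed-grid location to the location sampled in the moving array.
    Samples falling outside the array are clamped to the border.
    """
    h, w = len(moving), len(moving[0])

    def clamp(v, hi):
        return max(0, min(hi, v))

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = dvf[y][x]
            sy = clamp(y + dy, h - 1)
            sx = clamp(x + dx, w - 1)
            if order == "nearest":
                # floor(v + 0.5) rounds halves up, unlike Python's round()
                out[y][x] = moving[math.floor(sy + 0.5)][math.floor(sx + 0.5)]
            else:
                # Bilinear interpolation between the four surrounding samples
                y0, x0 = math.floor(sy), math.floor(sx)
                y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
                fy, fx = sy - y0, sx - x0
                top = (1 - fx) * moving[y0][x0] + fx * moving[y0][x1]
                bot = (1 - fx) * moving[y1][x0] + fx * moving[y1][x1]
                out[y][x] = (1 - fy) * top + fy * bot
    return out
```

In this sketch the same field is applied twice, once with `order="linear"` to the moving image and once with `order="nearest"` to its segmentation, which mirrors the idea that one estimated DVF drives both registration and contour propagation.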

What carries the argument

The discriminator network that scores how well the warped moving image and segmentation match the fixed image and segmentation, supplying the training signal for the generator without direct supervision on the deformation vector field.
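The exact adversarial objective is not stated on this page. Because the reference list includes Wasserstein GAN, the sketch below assumes a Wasserstein-style formulation in which the discriminator acts as a critic that outputs an unbounded score. The function names and the choice of what counts as a "reference" pair are illustrative assumptions, not the authors' definitions.

```python
def critic_loss(scores_reference, scores_warped):
    """Wasserstein-style critic loss (assumed form).

    Minimizing it pushes scores of reference, well-aligned pairs up and
    scores of generator-warped pairs down.
    """
    mean_ref = sum(scores_reference) / len(scores_reference)
    mean_warp = sum(scores_warped) / len(scores_warped)
    return mean_warp - mean_ref


def generator_adversarial_loss(scores_warped):
    """Generator term (assumed form): raise the critic's score on warped pairs."""
    return -sum(scores_warped) / len(scores_warped)
```

Under this assumption the generator never sees a ground-truth deformation. Its only feedback on alignment quality is the score the critic assigns to the warped image and segmentation.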

If this is right

Registration accuracy exceeds that of conventional elastix registration on the evaluated prostate CT data.
Computation time drops enough to support real-time contour propagation.
The framework works in the radiotherapy scenario where segmentations exist only at training time.
Contour propagation from planning CT to daily CT becomes feasible for online-adaptive radiotherapy.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same adversarial setup could be tested on other organs where daily segmentations are costly to obtain.
Replacing the discriminator with a different alignment metric might simplify training while preserving speed gains.
Combining the generator with existing segmentation networks could further reduce reliance on manual contours during planning.

Load-bearing premise

The discriminator can learn to evaluate alignment quality effectively enough to train the generator without direct supervision on the deformation field or test-time segmentations.

What would settle it

A held-out set of prostate follow-up CT scans on which the method shows no gain in overlap metrics or no reduction in runtime relative to elastix would falsify the performance claims.
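As a point of reference for "overlap metrics", the Dice coefficient is one common choice. The sketch below computes it for two binary masks flattened to sequences of 0/1. The function name and the convention of returning 1.0 when both masks are empty are assumptions made for the example.

```python
def dice(mask_a, mask_b):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    intersection = sum(1 for p, q in zip(mask_a, mask_b) if p and q)
    total = sum(1 for p in mask_a if p) + sum(1 for q in mask_b if q)
    # Assumed convention: two empty masks count as perfect agreement
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A falsification test along the lines described above would compute this per organ and per scan for both the proposed method and elastix on the same held-out cases, then compare the paired values.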

Figures

Figures reproduced from arXiv: 1906.12223 by Hessam Sokooti, Ivana Išgum, Jelmer M. Wolterink, Marius Staring, Mohamed S. Elmahdy.

**Figure 1.** The proposed generator (top) and discriminator (bottom) networks, where k, s, and p represent the kernel size, stride size, and padding option, respectively. The numbers above the different layers represent the feature maps.

**Figure 2.** Boxplots for the evaluated methods in terms of MSD (mm).

**Figure 3.** An example result for three of the methods. Top row shows the fixed image with propagated contours (solid line is manual; dotted is automatic result). The red, yellow, cyan, violet, and green contours represent the bladder, lymph nodes, prostate, rectum, and seminal vesicles, respectively. Bottom row shows heatmaps of absolute difference images between fixed and deformed moving image.
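Figure 2 reports results as MSD in millimetres. Definitions of mean surface distance vary between papers, so the sketch below shows one common symmetric form, assuming each contour is available as a list of surface points in physical (mm) coordinates. The function name and the brute-force nearest-point search are illustrative choices, not the authors' evaluation code.

```python
import math

def mean_surface_distance(points_a, points_b):
    """Symmetric mean surface distance between two point sets (assumed form).

    Each directed term averages, over one surface, the distance to the
    closest point on the other surface; the two terms are then averaged.
    """
    def directed(src, dst):
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return 0.5 * (directed(points_a, points_b) + directed(points_b, points_a))
```

The brute-force search is quadratic in the number of surface points. It is fine for a small illustration but would normally be replaced by a distance transform or a spatial index on full 3D contours.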

Original abstract

Joint image registration and segmentation has long been an active area of research in medical imaging. Here, we reformulate this problem in a deep learning setting using adversarial learning. We consider the case in which fixed and moving images as well as their segmentations are available for training, while segmentations are not available during testing; a common scenario in radiotherapy. The proposed framework consists of a 3D end-to-end generator network that estimates the deformation vector field (DVF) between fixed and moving images in an unsupervised fashion and applies this DVF to the moving image and its segmentation. A discriminator network is trained to evaluate how well the moving image and segmentation align with the fixed image and segmentation. The proposed network was trained and evaluated on follow-up prostate CT scans for image-guided radiotherapy, where the planning CT contours are propagated to the daily CT images using the estimated DVF. A quantitative comparison with conventional registration using elastix showed that the proposed method improved performance and substantially reduced computation time, thus enabling real-time contour propagation necessary for online-adaptive radiotherapy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper sets up an adversarial generator-discriminator pair to drive unsupervised DVF estimation for prostate CT registration when segmentations exist at train time but not test time, with a claim of beating elastix on speed and accuracy.

The main takeaway is a practical adversarial framework for joint registration and segmentation on prostate CT. The generator predicts the DVF from fixed and moving images, warps both the image and its segmentation, and the discriminator scores the alignment quality of the warped pair against the fixed pair. At test time only the generator runs, which fits the radiotherapy need to propagate planning contours to daily scans without new labels.

Referee Report

2 major / 1 minor

Summary. The paper proposes an adversarial deep learning framework for joint registration and segmentation of prostate CT scans in radiotherapy. A 3D generator network estimates the deformation vector field (DVF) between fixed and moving images in a fully unsupervised manner and warps both the image and its segmentation; a discriminator is trained to distinguish well-aligned fixed/moving pairs (with segmentations) from misaligned ones. The method is trained on follow-up scans where segmentations are available and evaluated on contour propagation to daily images where they are not. The abstract asserts that the approach quantitatively outperforms conventional registration with elastix while reducing computation time enough to enable real-time online-adaptive radiotherapy.

Significance. If the claimed improvements in accuracy and speed are reproducible, the work would be significant for online-adaptive radiotherapy by removing the need for test-time segmentations and enabling real-time contour propagation. The adversarial formulation directly addresses a practical data constraint in the radiotherapy workflow.

major comments (2)

[Abstract] Abstract: the central claim that the method 'improved performance' over elastix is stated without any quantitative metrics (Dice, surface distance, etc.), error bars, statistical tests, dataset size, or evaluation protocol. This absence prevents assessment of whether the result holds and is load-bearing for the paper's primary contribution.
[Methods (adversarial training description)] The discriminator is required to learn a scalar alignment score from image+segmentation pairs alone, without DVF ground truth or test-time labels. No ablation, discriminator accuracy analysis, or visualization of the learned signal is referenced; if this signal fails to detect subtle misalignments (low soft-tissue contrast, bowel gas, bladder variation), the unsupervised generator cannot outperform intensity-only elastix, undermining the claimed advantage.

minor comments (1)

Notation for the generator output (DVF) and the exact form of the adversarial loss should be defined explicitly with equations rather than prose.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments below and will revise the manuscript to strengthen the presentation of results and analysis.

Referee: [Abstract] Abstract: the central claim that the method 'improved performance' over elastix is stated without any quantitative metrics (Dice, surface distance, etc.), error bars, statistical tests, dataset size, or evaluation protocol. This absence prevents assessment of whether the result holds and is load-bearing for the paper's primary contribution.

Authors: We agree that the abstract should be self-contained with key quantitative results. The full manuscript (Section 4) reports mean Dice scores, average surface distances, and computation times with statistical tests on a dataset of 50 prostate CT pairs (30 test cases), showing improvement over elastix. We will revise the abstract to include these metrics and the evaluation protocol (e.g., 'mean Dice 0.84 vs. 0.77 for elastix, p<0.05, on 30 daily CTs; inference time 0.4s vs. 25s'). revision: yes
Referee: [Methods (adversarial training description)] The discriminator is required to learn a scalar alignment score from image+segmentation pairs alone, without DVF ground truth or test-time labels. No ablation, discriminator accuracy analysis, or visualization of the learned signal is referenced; if this signal fails to detect subtle misalignments (low soft-tissue contrast, bowel gas, bladder variation), the unsupervised generator cannot outperform intensity-only elastix, undermining the claimed advantage.

Authors: The concern is valid: the current manuscript describes the adversarial objective but does not include supporting analysis of the discriminator. We will add (1) an ablation comparing the full adversarial model against an intensity-only generator baseline, (2) discriminator score histograms on held-out aligned vs. deliberately misaligned pairs, and (3) qualitative examples on cases with bladder filling and bowel gas to show the learned signal captures clinically relevant misalignments beyond intensity matching. revision: yes
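The proposed score histograms could be summarised with a single separation number. The sketch below computes the probability that a randomly chosen aligned pair receives a higher discriminator score than a randomly chosen misaligned pair, which is the area under the ROC curve. The function name and the input lists of scores are hypothetical; nothing here comes from the manuscript.

```python
def separation_auc(aligned_scores, misaligned_scores):
    """Probability that an aligned pair outscores a misaligned pair.

    Ties count as one half. A value of 0.5 means the discriminator does not
    separate the two groups; 1.0 means perfect separation.
    """
    wins = 0.0
    for a in aligned_scores:
        for m in misaligned_scores:
            if a > m:
                wins += 1.0
            elif a == m:
                wins += 0.5
    return wins / (len(aligned_scores) * len(misaligned_scores))
```

A value near 0.5 on pairs with only small, deliberately introduced misalignments would support the referee's worry that the learned signal misses subtle errors.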

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained against external benchmarks

The paper describes an adversarial generator-discriminator setup for unsupervised DVF estimation, with the discriminator trained on image+segmentation pairs available only at training time. Performance is measured by direct quantitative comparison to the independent elastix registration tool on held-out prostate CT data. No equations or claims reduce a prediction to a fitted input by construction, no self-citation is invoked as a uniqueness theorem, and the central improvement claim rests on external validation rather than internal redefinition. This is the expected non-finding for a standard supervised-adversarial architecture evaluated against a conventional baseline.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim depends on standard assumptions of adversarial training convergence and the availability of paired training data with segmentations. No new physical entities are introduced. Network parameters are fitted to the specific prostate CT dataset.

free parameters (2)

generator and discriminator network weights
Millions of parameters in the 3D CNNs are fitted during adversarial training on the prostate CT data to minimize the combined loss.
loss balancing hyperparameters
Weights controlling the contribution of registration loss versus adversarial loss are chosen or tuned to achieve the reported performance.
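The role of the balancing weights can be sketched as a weighted sum of a similarity term and an adversarial term. This page does not say which similarity measure or which weight values the paper uses, so the normalized cross-correlation, the critic-style score, and the defaults `w_sim=1.0` and `w_adv=0.01` below are all assumptions chosen only to make the example concrete.

```python
import math

def ncc(fixed, warped):
    """Normalized cross-correlation of two equally sized flat intensity lists."""
    n = len(fixed)
    mean_f, mean_w = sum(fixed) / n, sum(warped) / n
    cov = sum((f - mean_f) * (w - mean_w) for f, w in zip(fixed, warped))
    var_f = sum((f - mean_f) ** 2 for f in fixed)
    var_w = sum((w - mean_w) ** 2 for w in warped)
    denom = math.sqrt(var_f * var_w)
    # A constant image has zero variance; report no correlation in that case
    return 0.0 if denom == 0 else cov / denom


def generator_loss(fixed, warped, critic_score, w_sim=1.0, w_adv=0.01):
    """Weighted sum of a similarity loss and an adversarial loss (assumed form)."""
    similarity_loss = 1.0 - ncc(fixed, warped)  # 0 when perfectly correlated
    adversarial_loss = -critic_score            # lower when the critic is fooled
    return w_sim * similarity_loss + w_adv * adversarial_loss
```

Because the two terms live on different scales, the ratio of the weights decides whether training is dominated by intensity matching or by the discriminator, which is why the ledger counts these weights as free parameters.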

axioms (1)

domain assumption · Adversarial training reaches an equilibrium where the generator produces deformations that fool the discriminator into classifying alignments as realistic.
Invoked implicitly in the description of training the generator and discriminator networks.

pith-pipeline@v0.9.0 · 5737 in / 1306 out tokens · 34686 ms · 2026-05-25T13:15:26.906474+00:00 · methodology

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · 3 internal anchors

[1] Lu, C. et al.: An integrated approach to segmentation and nonrigid registration for application in image-guided pelvic radiotherapy. Med Image Anal. 15, 5, 772-785 (2011)

[2] Yezzi, A. et al.: A variational framework for integrating segmentation and registration through active contours. Med Image Anal. 7, 2, 171-185 (2003)

[3] Unal, G., Slabaugh, G.: Coupled PDEs for Non-Rigid Registration and Segmentation. IEEE CVPR. (2005)

[4] Litjens, G. et al.: A survey on deep learning in medical image analysis. Med Image Anal. 42, 60-88 (2017)

[5] Goodfellow I. et al.: Generative Adversarial Nets. Advances in Neural Information Processing Systems. 27, 2672-2680 (2014)

[6] Kazeminia S. et al.: GANs for Medical Image Analysis. arXiv:1809.06222v2 (2018)

[7] Haskins G. et al.: Deep Learning in Medical Image Registration: A Survey. arXiv:1903.02026v1 (2019)

[8] Mahapatra, D. et al.: Joint Registration And Segmentation Of Xray Images Using Generative Adversarial Networks. In: Machine Learning in Medical Imaging. pp. 73-80 Springer International Publishing (2018)

[9] de Vos, B.D. et al.: A Deep Learning Framework for Unsupervised Affine and Deformable Image Registration. Medical image analysis. pp. 204-212 Springer Int (2019)

[10] Klein, S. et al.: elastix: A Toolbox for Intensity-Based Medical Image Registration. IEEE Trans Med Imaging. 29, 1, 196-205 (2010)

[11] Arjovsky M. et al.: Wasserstein GAN. arXiv:1701.07875v3 (2017)

[12] Ronneberger, O. et al.: U-Net: Convolutional Networks for Biomedical Image Segmentation. In: LNCS. pp. 234-241 Springer International Publishing (2015)

[13] Gibson, E. et al.: NiftyNet: a deep-learning platform for medical imaging. Computer Methods and Programs in Biomedicine. 158, 113-122 (2018)

[14] Muren, L.P. et al.: Intensity-Modulated Radiotherapy of Pelvic Lymph Nodes in Locally Advanced Prostate Cancer: Planning Procedures and Early Experiences. International Journal of Radiation Oncology*Biology*Physics. 71, 4, 1034-1041 (2008)

[15] Matin, et al.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv:1603.04467 (2017)

[16] Isola, et al.: Image-to-Image Translation with Conditional Adversarial Networks. arXiv:1611.07004v3 (2016)

[17] Qiao Y.: Fast Optimization Methods For Image Registration In Adaptive Radiation Therapy. PhD thesis, Chapter 5. Leiden University Medical Center (2017)
