arXiv: 1907.07676 · v1 · pith:WDRD74FVnew · submitted 2019-07-17 · eess.IV · cs.CV · cs.LG

Lung Nodules Detection and Segmentation Using 3D Mask-RCNN

Pith reviewed 2026-05-24 20:16 UTC · model grok-4.3

classification: eess.IV, cs.CV, cs.LG
keywords: lung nodule detection, 3D segmentation, Mask-RCNN, CT scans, LUNA16, object detection, medical imaging

The pith

A 3D version of Mask-RCNN detects lung nodules in CT scans and produces their 3D segmentations at competitive accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper adapts the Mask-RCNN model from 2D images to 3D CT volumes so that one network can both locate lung nodules across a full scan and generate 3D masks for each one. It demonstrates this on the LUNA16 dataset and reports detection performance that matches existing methods. A reader would care because the work merges two separate tasks—detection from whole scans and segmentation inside regions of interest—into a single automated step. This could cut the manual effort radiologists spend outlining nodules to assess their size and shape.

Core claim

We adapt the state of the art architecture for 2D object detection and segmentation, MaskRCNN, to handle 3D images and employ it to detect and segment lung nodules from CT scans. We report on competitive results for the lung nodule detection on LUNA16 data set. The added value of our method is that in addition to lung nodule detection, our framework produces 3D segmentations of the detected nodules.

What carries the argument

A 3D Mask-RCNN, obtained by replacing the 2D convolutional and pooling operations of the original Mask-RCNN with their 3D counterparts, so that it can process volumetric CT data for joint detection and segmentation.
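
To make the 2D-to-3D swap concrete, here is a minimal NumPy sketch of a 3D convolution and a 3D max pooling applied to a toy volume. It is an illustration only, not the paper's implementation: the function names `conv3d_valid` and `max_pool3d`, the kernel, and the volume sizes are all hypothetical, and a real model would use a deep learning framework's 3D layers instead of Python loops.

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D cross-correlation of a (D, H, W) volume with a (kd, kh, kw) kernel."""
    kd, kh, kw = kernel.shape
    d, h, w = volume.shape
    out = np.zeros((d - kd + 1, h - kh + 1, w - kw + 1), dtype=np.float64)
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                # The window now spans depth as well as height and width.
                out[z, y, x] = np.sum(volume[z:z + kd, y:y + kh, x:x + kw] * kernel)
    return out

def max_pool3d(volume, size=2):
    """Non-overlapping 3D max pooling; voxels that do not fill a window are dropped."""
    d, h, w = (s // size for s in volume.shape)
    trimmed = volume[:d * size, :h * size, :w * size]
    blocks = trimmed.reshape(d, size, h, size, w, size)
    return blocks.max(axis=(1, 3, 5))

# Toy volume (hypothetical sizes): 4 slices of 6 x 6 voxels.
volume = np.arange(4 * 6 * 6, dtype=np.float64).reshape(4, 6, 6)
kernel = np.ones((3, 3, 3)) / 27.0        # 3 x 3 x 3 mean filter
features = conv3d_valid(volume, kernel)   # shape (2, 4, 4)
pooled = max_pool3d(features, size=2)     # shape (1, 2, 2)
```

The point of the sketch is the shape bookkeeping: every operation gains a depth axis, so memory and compute grow with the number of slices, which is one reason a 3D adaptation is not automatically a drop-in replacement.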

If this is right

  • The single model outputs both nodule detections and 3D segmentations from full CT volumes.
  • Detection performance on the LUNA16 benchmark remains competitive with prior methods.
  • The approach addresses both whole-scan detection and ROI segmentation inside one framework.
  • Automating nodule outlining could reduce the time and error in radiologist interpretation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same 3D extension could be tried on other volumetric medical tasks such as tumor segmentation in MRI.
  • Performance on CT scans from scanners not represented in LUNA16 would test generalization.
  • The outputs could feed directly into downstream volume-based measurements of nodule growth.
  • Combining the 3D detections with existing 2D slice review tools might create hybrid clinical workflows.
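
As a sketch of the volume-based measurements mentioned in the list above, the snippet below turns a binary 3D mask into a physical volume and a volume doubling time. Everything here is an assumption for illustration: the function names, the voxel spacing, and the 90-day interval are hypothetical, and the paper does not describe such a downstream step. The doubling-time formula is the standard exponential-growth one, VDT = t · ln 2 / ln(V2 / V1).

```python
import numpy as np

def nodule_volume_mm3(mask, spacing_mm):
    """Volume of a binary mask, given voxel spacing (z, y, x) in millimetres."""
    voxel_volume = float(np.prod(spacing_mm))
    return float(np.count_nonzero(mask)) * voxel_volume

def equivalent_diameter_mm(volume_mm3):
    """Diameter of the sphere that has the same volume."""
    return (6.0 * volume_mm3 / np.pi) ** (1.0 / 3.0)

def volume_doubling_time_days(v1_mm3, v2_mm3, days_between):
    """Doubling time under an exponential-growth assumption."""
    return days_between * np.log(2.0) / np.log(v2_mm3 / v1_mm3)

# Hypothetical segmentation: a 4 x 4 x 4 block of voxels at 1.0 x 0.5 x 0.5 mm spacing.
mask = np.zeros((10, 10, 10), dtype=bool)
mask[3:7, 3:7, 3:7] = True
volume = nodule_volume_mm3(mask, (1.0, 0.5, 0.5))          # 64 voxels * 0.25 mm^3
diameter = equivalent_diameter_mm(volume)
vdt = volume_doubling_time_days(volume, 2.0 * volume, 90)  # volume doubled in 90 days
```

Note that the result depends on the scan's voxel spacing, so masks from scans with different spacing have to be converted to physical units before they are compared.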

Load-bearing premise

That replacing 2D operations in Mask-RCNN with their 3D counterparts will preserve detection accuracy and produce usable segmentations when trained on the LUNA16 dataset.

What would settle it

Training and testing the 3D Mask-RCNN on the LUNA16 dataset and finding that its detection sensitivity falls below published 2D baselines or that its 3D segmentations deviate substantially from the provided ground-truth masks.
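
One common way to quantify how far a predicted 3D mask deviates from a ground-truth mask is the Dice coefficient. The sketch below is a generic illustration, not the paper's evaluation code: the source does not say which segmentation metric the authors use, and the function name and toy masks are hypothetical.

```python
import numpy as np

def dice_coefficient(pred, truth):
    """Dice overlap between two binary volumes: 2|A and B| / (|A| + |B|)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        return 1.0  # both masks empty: treated as perfect agreement (a convention)
    return float(2.0 * np.logical_and(pred, truth).sum() / denom)

# Toy example: a 4 x 4 x 4 ground-truth cube and a prediction shifted by one slice.
truth = np.zeros((8, 8, 8), dtype=bool)
truth[2:6, 2:6, 2:6] = True
pred = np.zeros((8, 8, 8), dtype=bool)
pred[3:7, 2:6, 2:6] = True
score = dice_coefficient(pred, truth)  # overlap of 48 voxels out of 64 + 64
```

A one-voxel shift on a small object already costs a quarter of the score here, which is why Dice values for small nodules should be read together with nodule size.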

Figures

Figures reproduced from arXiv: 1907.07676 by Evi Kopelowitz, Guy Engelhard.

Figure 1: Diagram of the 3DMaskRCNN model [PITH_FULL_IMAGE:figures/full_fig_p007_1.png]
Figure 2: Examples of Nodule segmentation with 3DMaskRCNN. Box size is [...] [PITH_FULL_IMAGE:figures/full_fig_p008_2.png]
Figure 3: Examples of nodules detected by 3DMaskRCNN [PITH_FULL_IMAGE:figures/full_fig_p008_3.png]
Original abstract

Accurate assessment of Lung nodules is a time consuming and error prone ingredient of the radiologist interpretation work. Automating 3D volume detection and segmentation can improve workflow as well as patient care. Previous works have focused either on detecting lung nodules from a full CT scan or on segmenting them from a small ROI. We adapt the state of the art architecture for 2D object detection and segmentation, MaskRCNN, to handle 3D images and employ it to detect and segment lung nodules from CT scans. We report on competitive results for the lung nodule detection on LUNA16 data set. The added value of our method is that in addition to lung nodule detection, our framework produces 3D segmentations of the detected nodules.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The manuscript adapts the 2D Mask R-CNN architecture to 3D operations and applies it to detect and segment lung nodules in CT volumes. It asserts competitive detection performance on the LUNA16 benchmark while noting that the framework additionally outputs 3D segmentations of detected nodules.

Significance. If the empirical claims hold with proper validation, the work would supply a single model for both detection and 3D segmentation, addressing a practical gap in automated lung-nodule analysis. The absence of any quantitative results, baselines, or implementation details in the available text prevents assessment of whether this contribution is realized.

major comments (2)
  1. [Abstract] Abstract: the assertion of 'competitive results' on LUNA16 supplies no metrics, baselines, error bars, or description of the 3D modifications, so the central empirical claim cannot be evaluated.
  2. [Abstract] Abstract, first paragraph: the assumption that direct replacement of 2D operations by 3D counterparts will preserve detection accuracy on LUNA16 is stated without supporting experiments, ablation studies, or training details, leaving the soundness of the adaptation unverified.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their review. The comments focus on the abstract; we address them point-by-point below and will revise the abstract accordingly while preserving the manuscript's existing experimental content.

  1. Referee: [Abstract] Abstract: the assertion of 'competitive results' on LUNA16 supplies no metrics, baselines, error bars, or description of the 3D modifications, so the central empirical claim cannot be evaluated.

    Authors: We agree the abstract is too terse on this point. The manuscript body reports quantitative detection results on LUNA16 (including sensitivity at specified false-positive rates) together with comparisons to published baselines and a description of the 3D convolutional and pooling replacements. We will revise the abstract to state the key metrics, note the baselines, and briefly indicate the 3D modifications.
    revision: yes

  2. Referee: [Abstract] Abstract, first paragraph: the assumption that direct replacement of 2D operations by 3D counterparts will preserve detection accuracy on LUNA16 is stated without supporting experiments, ablation studies, or training details, leaving the soundness of the adaptation unverified.

    Authors: The abstract is a high-level summary; the methods and results sections supply the training protocol on LUNA16 and the empirical outcomes that validate the 3D adaptation. Explicit ablation studies isolating only the 2D-to-3D swap are not present. We will add a short clause in the revised abstract that points to the supporting experiments already contained in the paper.
    revision: partial
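
The "sensitivity at specified false-positive rates" mentioned in the first response can be illustrated with a small FROC-style calculation. This is a simplified sketch under stated assumptions, not the LUNA16 evaluation script: it assumes each detection has already been labelled as a hit or a false positive, that every hit corresponds to a distinct nodule, and that there are no tied scores. The function name and all numbers are hypothetical.

```python
import numpy as np

def sensitivity_at_fp_rates(scores, is_true_positive, n_nodules, n_scans, fp_rates):
    """Sensitivity when the score threshold allows at most `rate` false positives per scan."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(is_true_positive, dtype=bool)[order]
    tp_cum = np.cumsum(hits)    # true positives among the top-k detections
    fp_cum = np.cumsum(~hits)   # false positives among the top-k detections
    result = {}
    for rate in fp_rates:
        allowed_fp = rate * n_scans
        # Largest k such that the top-k detections contain at most allowed_fp false positives.
        k = int(np.searchsorted(fp_cum, allowed_fp, side="right"))
        tp = int(tp_cum[k - 1]) if k > 0 else 0
        result[rate] = tp / n_nodules
    return result

# Hypothetical detections pooled over 2 scans that contain 5 nodules in total.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [True, False, True, True, False, True]
froc = sensitivity_at_fp_rates(scores, labels, n_nodules=5, n_scans=2,
                               fp_rates=[0.125, 0.5, 1.0])
```

Reporting a handful of such operating points, alongside the same points for each baseline, is the kind of detail the referee asks the abstract to carry.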

Circularity Check

0 steps flagged

No significant circularity

The paper describes an empirical adaptation of the existing Mask-RCNN architecture by replacing 2D operations with 3D counterparts and evaluates it on the public LUNA16 benchmark for detection and segmentation performance. No derivation chain, equations, fitted parameters presented as predictions, or load-bearing self-citations appear in the provided text. The central claim is supported by reported results on an external dataset rather than any internal reduction to inputs by construction, making the work self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work rests on the standard assumption that 2D convolutional architectures can be extended to 3D by direct replacement of operations and that the LUNA16 benchmark is representative for training and evaluation.

axioms (1)
  • domain assumption: Mask-RCNN can be extended to 3D volumes by replacing 2D convolutions, RoIAlign, and other layers with 3D equivalents while preserving training stability and performance.
    This premise is required for the adaptation described in the abstract and is not derived in the provided text.
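
To show what a 3D equivalent of RoIAlign could look like, the sketch below resamples a box-shaped region of a volume onto a fixed grid with trilinear interpolation. It is a deliberately simplified illustration and an assumption throughout: it takes one sample per output bin (the original 2D RoIAlign averages several), it assumes voxel centres sit at integer coordinates, and the names `trilinear_sample` and `roi_align_3d` are hypothetical rather than taken from the paper.

```python
import numpy as np

def trilinear_sample(volume, z, y, x):
    """Trilinearly interpolate `volume` at continuous coordinates (arrays of equal shape)."""
    d, h, w = volume.shape
    z = np.clip(z, 0, d - 1)
    y = np.clip(y, 0, h - 1)
    x = np.clip(x, 0, w - 1)
    z0, y0, x0 = (np.floor(c).astype(int) for c in (z, y, x))
    z1 = np.minimum(z0 + 1, d - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    fz, fy, fx = z - z0, y - y0, x - x0
    out = 0.0
    # Weighted sum over the 8 voxels that surround each sample point.
    for zi, wz in ((z0, 1 - fz), (z1, fz)):
        for yi, wy in ((y0, 1 - fy), (y1, fy)):
            for xi, wx in ((x0, 1 - fx), (x1, fx)):
                out = out + wz * wy * wx * volume[zi, yi, xi]
    return out

def roi_align_3d(volume, box, out_size):
    """Resample box = (z0, y0, x0, z1, y1, x1) to an out_size^3 grid, one sample per bin."""
    def centers(lo, hi):
        step = (hi - lo) / out_size
        return lo + step * (np.arange(out_size) + 0.5)
    zz, yy, xx = np.meshgrid(centers(box[0], box[3]),
                             centers(box[1], box[4]),
                             centers(box[2], box[5]), indexing="ij")
    return trilinear_sample(volume, zz, yy, xx)

# A linear ramp is reproduced exactly by trilinear interpolation, which makes it easy to check.
gz, gy, gx = np.indices((8, 8, 8))
ramp = 2.0 * gz + 3.0 * gy + 5.0 * gx
crop = roi_align_3d(ramp, (1.0, 1.5, 2.0, 5.0, 5.5, 6.0), out_size=2)
```

The design point is that box coordinates stay continuous and are never rounded to the voxel grid, which is the property RoIAlign was introduced to preserve in 2D.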

pith-pipeline@v0.9.0 · 5654 in / 1293 out tokens · 25516 ms · 2026-05-24T20:16:54.450292+00:00 · methodology

A. 3DMaskRCNN Model Architecture

The 3DMaskRCNN is composed of four parts: backbone, RPN, RCNN for classification and bounding ...
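
The RPN and RCNN stages named in the architecture fragment above both work with candidate boxes. In the original 2D Mask R-CNN, boxes are compared by intersection over union (IoU) when anchors are matched to ground truth and when overlapping detections are suppressed. The sketch below extends that measure to axis-aligned 3D boxes. It is an assumption about how the 3D variant might do it, not something the truncated source confirms, and the function name and box layout are hypothetical.

```python
import numpy as np

def box_iou_3d(box_a, box_b):
    """IoU of two axis-aligned 3D boxes given as (z1, y1, x1, z2, y2, x2)."""
    a = np.asarray(box_a, dtype=float)
    b = np.asarray(box_b, dtype=float)
    lo = np.maximum(a[:3], b[:3])   # lower corner of the intersection
    hi = np.minimum(a[3:], b[3:])   # upper corner of the intersection
    inter = float(np.prod(np.clip(hi - lo, 0.0, None)))
    vol_a = float(np.prod(a[3:] - a[:3]))
    vol_b = float(np.prod(b[3:] - b[:3]))
    union = vol_a + vol_b - inter
    return inter / union if union > 0 else 0.0

# Two 4 x 4 x 4 boxes offset by 2 voxels along z share half of their volume.
iou_overlap = box_iou_3d((0, 0, 0, 4, 4, 4), (2, 0, 0, 6, 4, 4))
iou_disjoint = box_iou_3d((0, 0, 0, 2, 2, 2), (5, 5, 5, 7, 7, 7))
```

IoU falls faster with misalignment in 3D than in 2D because the overlap shrinks along three axes, so matching thresholds carried over from 2D settings may need retuning.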