arxiv: 1907.11587 · v1 · pith:NZBOK3YOnew · submitted 2019-07-26 · 📡 eess.IV · cs.CV

Self-Adaptive 2D-3D Ensemble of Fully Convolutional Networks for Medical Image Segmentation

Pith reviewed 2026-05-24 15:09 UTC · model grok-4.3

classification 📡 eess.IV cs.CV
keywords medical image segmentation · fully convolutional networks · 2D-3D ensemble · evolutionary algorithm · prostate segmentation · self-adaptive architecture · volumetric segmentation · multiobjective optimization

The pith

An automatically evolved 2D-3D FCN ensemble reaches top-10 ranking on prostate segmentation while using far fewer parameters than other auto-designed models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to demonstrate that medical volume segmentation improves when a 2D network extracting slice-internal features is paired with a 3D network capturing relations between slices. Neither network is hand-designed; instead, a multiobjective evolutionary search adapts both structures to the target dataset, reducing segmentation error and parameter count at the same time. The resulting ensemble is evaluated on the PROMISE12 prostate MRI challenge, where it places in the top ten submissions. A reader should care because manual architecture tuning for each new medical task is slow and because large 3D models quickly exhaust memory on volumetric data. If the claim holds, compact yet competitive segmenters can be produced for new datasets without expert redesign.

Core claim

The central claim is that a self-adaptive ensemble formed by one 2D FCN and one 3D FCN, whose architectures are jointly evolved by a multiobjective algorithm that minimizes both segmentation error and network size, produces a model that ranks in the top ten on the PROMISE12 challenge and surpasses other automatically designed networks while remaining considerably smaller.

What carries the argument

The multiobjective evolutionary algorithm that searches for 2D and 3D FCN architectures minimizing segmentation error together with parameter count on the given medical dataset.
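
To make the two-objective trade-off concrete, the following is a minimal sketch of Pareto filtering over candidate architectures scored by segmentation error and parameter count. It is not the authors' search procedure: the `Candidate` type, the candidate names, and all numeric values are hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    name: str       # hypothetical label for one candidate architecture
    error: float    # segmentation error on held-out data (lower is better)
    n_params: int   # trainable parameter count (lower is better)


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.error <= b.error and a.n_params <= b.n_params
    strictly_better = a.error < b.error or a.n_params < b.n_params
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in population
            if not any(dominates(other, c) for other in population)]


# Illustrative values only; none of these numbers come from the paper.
population = [
    Candidate("A", 0.12, 9_000_000),
    Candidate("B", 0.10, 20_000_000),
    Candidate("C", 0.15, 3_000_000),
    Candidate("D", 0.13, 12_000_000),  # worse than A on both objectives
]
front = pareto_front(population)
print([c.name for c in front])
```

In this toy population, candidate D is discarded because A has both lower error and fewer parameters, while A, B, and C each represent a different accuracy-versus-size compromise. A multiobjective search returns such a set of compromises rather than a single winner.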

If this is right

  • The 2D-3D split lets intra-slice detail and inter-slice context be optimized separately within one model.
  • Evolutionary search can replace manual trial-and-error when creating segmentation networks for new medical volumes.
  • High leaderboard placement remains possible even after the search explicitly penalizes large parameter counts.
  • Volumetric medical tasks become feasible on hardware with tighter memory limits.
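
As a rough illustration of why 3D models grow quickly, the sketch below counts the weights and biases of a single convolution layer in 2D and in 3D. The helper `conv_params` and the layer sizes are assumptions for illustration, not values from the paper. Parameter count is also only one part of the memory budget, since activations grow with the input volume as well.

```python
def conv_params(kernel: int, c_in: int, c_out: int, dims: int) -> int:
    """Weights plus biases of one convolution with a kernel of side `kernel`
    in `dims` spatial dimensions."""
    return (kernel ** dims) * c_in * c_out + c_out


# Hypothetical layer: 3-wide kernel, 64 input channels, 64 output channels.
p2d = conv_params(3, 64, 64, dims=2)
p3d = conv_params(3, 64, 64, dims=3)
print(p2d, p3d, round(p3d / p2d, 2))
```

For the same kernel width and channel counts, the 3D layer carries roughly three times the parameters of the 2D layer, which is why a size penalty in the search objective matters most for the 3D component.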

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same search process could be applied to other organs or modalities to test whether the discovered 2D-3D balance is dataset-specific.
  • Reduced model size may allow the networks to run on clinical workstations that cannot host full 3D models.
  • Extending the evolutionary objective to include inference speed would further tighten the link between accuracy and practical deployment.

Load-bearing premise

The evolutionary search discovers network structures that generalize from the training and validation data to the hidden test cases rather than overfitting to the challenge split.

What would settle it

Running the identical evolutionary procedure on the PROMISE12 training set and finding that the produced network falls outside the top ten or loses its size advantage on the official test set would falsify the central claim.

Figures

Figures reproduced from arXiv: 1907.11587 by Maria G. Baldeon Calisto, Susana K. Lai-Yuen.

Figure 2: Example of a five-residual block FCN. Once the overall structure of the FCN is defined, nine hyperparameters that are encoded into nine decision variables need to be set to construct the final architecture. As shown in …
Figure 3: Example of our segmentation results on the PROMISE12 dataset. The network produces spatially consistent segmentation with smooth boundaries.

3.2 Comparison with State-of-the-art: The evaluation of the test cases was carried out via an online submission to the PROMISE12 challenge. Four evaluation metrics are used to assess the volumetric segmentations of the whole prostate, apex and base parts of the prostate. …

Original abstract

Segmentation is a critical step in medical image analysis. Fully Convolutional Networks (FCNs) have emerged as powerful segmentation models achieving state-of-the-art results in various medical image datasets. Network architectures are usually designed manually for a specific segmentation task so applying them to other medical datasets requires extensive experience and time. Moreover, the segmentation requires handling large volumetric data that results in big and complex architectures. Recently, methods that automatically design neural networks for medical image segmentation have been presented; however, most approaches either do not fully consider volumetric information or do not optimize the size of the network. In this paper, we propose a novel self-adaptive 2D-3D ensemble of FCNs for medical image segmentation that incorporates volumetric information and optimizes both the model's performance and size. The model is composed of an ensemble of a 2D FCN that extracts intra-slice information, and a 3D FCN that exploits inter-slice information. The architectures of the 2D and 3D FCNs are automatically adapted to a medical image dataset using a multiobjective evolutionary based algorithm that minimizes both the segmentation error and number of parameters in the network. The proposed 2D-3D FCN ensemble was tested on the task of prostate segmentation on the image dataset from the PROMISE12 Grand Challenge. The resulting network is ranked in the top 10 submissions, surpassing the performance of other automatically-designed architectures while being considerably smaller in size.
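
The abstract describes an ensemble of a 2D and a 3D FCN but does not state here how their outputs are combined. The sketch below assumes the simplest common rule, averaging per-voxel foreground probabilities and thresholding at 0.5. The function name, the threshold, and the probability values are all hypothetical.

```python
from typing import List, Sequence


def ensemble_mask(prob_2d: Sequence[float],
                  prob_3d: Sequence[float],
                  threshold: float = 0.5) -> List[int]:
    """Average two per-voxel probability maps (flattened) and binarize.

    Assumption: equal-weight averaging; the paper may use a different rule.
    """
    if len(prob_2d) != len(prob_3d):
        raise ValueError("probability maps must cover the same voxels")
    return [1 if (a + b) / 2 >= threshold else 0
            for a, b in zip(prob_2d, prob_3d)]


# Toy flattened probability maps for four voxels (illustrative values).
p2d = [0.9, 0.4, 0.2, 0.6]   # from the slice-wise 2D network
p3d = [0.7, 0.7, 0.1, 0.3]   # from the volumetric 3D network
mask = ensemble_mask(p2d, p3d)
print(mask)
```

In the second voxel the 2D network alone would predict background, yet the averaged probability crosses the threshold. This is the kind of disagreement an ensemble is meant to resolve.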

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes a self-adaptive 2D-3D ensemble of FCNs for medical image segmentation. Architectures for the 2D (intra-slice) and 3D (inter-slice) components are discovered by a multiobjective evolutionary algorithm that jointly minimizes segmentation error and parameter count. On the PROMISE12 prostate segmentation challenge the resulting ensemble is reported to rank in the top 10 submissions while being considerably smaller than other automatically designed networks.

Significance. If the evolutionary search protocol demonstrably avoids overfitting to the fitness validation data and the reported ranking is on the hidden test set, the work would provide a concrete example of automatically producing compact, high-performing 2D-3D ensembles for volumetric medical segmentation, a useful contribution given the tension between model size and accuracy in clinical applications.

major comments (2)
  1. [Methods] Methods section (evolutionary algorithm description): the manuscript supplies no protocol details on how the validation split used for fitness evaluation is isolated from the official PROMISE12 validation or hidden test sets, nor on the total number of fitness evaluations performed. Without this information the central claim that the discovered architecture achieves a genuine top-10 ranking without inflation from repeated validation exposure cannot be verified.
  2. [Experiments] Experiments section (ranking and size comparison): the reported top-10 ranking and size advantage over other automatically designed networks are presented without an ablation that isolates the contribution of the 2D-3D ensemble versus the individual 2D or 3D networks, or versus a single-objective search; this leaves open whether the multiobjective formulation is load-bearing for the claimed performance-size trade-off.
minor comments (2)
  1. [Abstract] Abstract and introduction: the phrase 'self-adaptive' is used without a precise definition or pointer to the evolutionary operators that implement the adaptation.
  2. [Figures] Figure captions and tables: several figures lack error bars or statistical significance markers on the reported Dice scores, making direct comparison with challenge submissions harder to interpret.
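
For reference, the Dice score mentioned in the last comment is 2|A∩B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. The sketch below computes it on flattened binary masks and then reports a mean and sample standard deviation across cases, which is one simple way to provide the spread the referee asks for. The toy masks are hypothetical, and returning 1.0 for two empty masks is a convention assumed here.

```python
from statistics import mean, stdev
from typing import Sequence


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * intersection / total


# Two toy cases as (prediction, ground truth); values are illustrative only.
cases = [
    ([1, 1, 0, 0, 1], [1, 0, 0, 1, 1]),
    ([1, 1, 1, 0], [1, 1, 0, 0]),
]
scores = [dice(p, t) for p, t in cases]
print(f"Dice: {mean(scores):.3f} +/- {stdev(scores):.3f} over {len(scores)} cases")
```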

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments point-by-point below, indicating planned revisions where appropriate.

  1. Referee: [Methods] Methods section (evolutionary algorithm description): the manuscript supplies no protocol details on how the validation split used for fitness evaluation is isolated from the official PROMISE12 validation or hidden test sets, nor on the total number of fitness evaluations performed. Without this information the central claim that the discovered architecture achieves a genuine top-10 ranking without inflation from repeated validation exposure cannot be verified.

    Authors: We agree that these protocol details are important for verifying the integrity of the search. In the revised manuscript we will add an explicit subsection describing the fitness validation split construction (random 80/20 split within the official training cases only, with no overlap to the challenge validation or hidden test sets) and will report the total number of fitness evaluations performed during the multiobjective evolutionary search. revision: yes

  2. Referee: [Experiments] Experiments section (ranking and size comparison): the reported top-10 ranking and size advantage over other automatically designed networks are presented without an ablation that isolates the contribution of the 2D-3D ensemble versus the individual 2D or 3D networks, or versus a single-objective search; this leaves open whether the multiobjective formulation is load-bearing for the claimed performance-size trade-off.

    Authors: The multiobjective formulation is integral to the method because it directly produces the compact high-performing ensemble reported; the size-performance trade-off is the explicit objective of the search. While an explicit ablation against single-objective search or isolated 2D/3D components was not included, the comparison against other automatically designed networks already demonstrates the practical advantage of the resulting model. We will add a short discussion paragraph clarifying the design rationale for the multiobjective approach but do not believe a full new ablation study is required to support the stated claims. revision: partial
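
The 80/20 fitness split described in the first response can be sketched as below. This is a minimal illustration, assuming a seeded shuffle over case identifiers. The function name, the case IDs, and the count of 50 cases are hypothetical placeholders rather than details taken from the manuscript.

```python
import random
from typing import List, Tuple


def split_cases(case_ids: List[str],
                val_fraction: float = 0.2,
                seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle case IDs with a fixed seed and hold out a validation fraction."""
    rng = random.Random(seed)      # local RNG keeps the split reproducible
    shuffled = case_ids[:]         # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical identifiers for the official training cases only.
training_cases = [f"Case{i:02d}" for i in range(50)]
fit_train, fit_val = split_cases(training_cases)
print(len(fit_train), len(fit_val))
```

Because only official training cases enter `split_cases`, the fitness validation subset cannot overlap the challenge's hidden test set by construction. Logging the seed alongside the number of fitness evaluations would make the protocol auditable.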

Circularity Check

0 steps flagged

No circularity; empirical ranking on external challenge dataset

The paper describes an empirical pipeline: a multiobjective evolutionary algorithm designs 2D and 3D FCN architectures by minimizing segmentation error and parameter count on the PROMISE12 dataset, followed by ensemble evaluation that yields a measured top-10 ranking. No equations, fitted parameters renamed as predictions, or self-citation chains are present in the provided text. The central claim reduces to an externally verifiable ranking on a public challenge test set rather than any derivation that collapses to its own inputs by construction. The method is self-contained against the external benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the domain assumption that evolutionary search over network hyperparameters will locate high-performing yet compact 2D-3D ensembles; no new physical entities or free parameters beyond standard evolutionary hyperparameters are introduced in the abstract.

axioms (1)
  • domain assumption: A multiobjective evolutionary algorithm can effectively explore the joint space of 2D and 3D FCN architectures while trading off segmentation accuracy against parameter count.
    Invoked when the paper states that the architectures are automatically adapted using the evolutionary algorithm.

pith-pipeline@v0.9.0 · 5800 in / 1310 out tokens · 22790 ms · 2026-05-24T15:09:01.220824+00:00 · methodology

