Self-Adaptive 2D-3D Ensemble of Fully Convolutional Networks for Medical Image Segmentation
Pith reviewed 2026-05-24 15:09 UTC · model grok-4.3
The pith
An automatically evolved 2D-3D FCN ensemble reaches top-10 ranking on prostate segmentation while using far fewer parameters than other auto-designed models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a self-adaptive ensemble formed by one 2D FCN and one 3D FCN, whose architectures are jointly evolved by a multiobjective algorithm that minimizes both segmentation error and network size, produces a model that ranks in the top ten on the PROMISE12 challenge and surpasses other automatically designed networks while remaining considerably smaller.
What carries the argument
The argument rests on the multiobjective evolutionary algorithm, which searches for 2D and 3D FCN architectures that minimize both segmentation error and parameter count on the given medical dataset.
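To make the two-objective trade-off concrete, the sketch below filters a set of candidate architectures down to its non-dominated (Pareto) subset. It is a generic illustration of multiobjective selection, not the paper's search procedure; the names `dominates` and `pareto_front` and all numbers are hypothetical.

```python
from typing import List, Tuple

# Each candidate is (segmentation_error, parameter_count); both are minimized.
Candidate = Tuple[float, int]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical (error, parameter count) pairs, not results from the paper.
population = [(0.12, 900_000), (0.10, 2_500_000), (0.15, 400_000),
              (0.13, 1_200_000), (0.10, 3_000_000)]
front = pareto_front(population)
print(front)
```

A search of this kind returns a set of trade-off solutions rather than a single winner, so a final architecture still has to be picked from the front by some additional rule.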
If this is right
- The 2D-3D split lets intra-slice detail and inter-slice context be optimized separately within one model.
- Evolutionary search can replace manual trial-and-error when creating segmentation networks for new medical volumes.
- High leaderboard placement remains possible even after the search explicitly penalizes large parameter counts.
- Volumetric medical tasks become feasible on hardware with tighter memory limits.
Where Pith is reading between the lines
- The same search process could be applied to other organs or modalities to test whether the discovered 2D-3D balance is dataset-specific.
- Reduced model size may allow the networks to run on clinical workstations that cannot host full 3D models.
- Extending the evolutionary objective to include inference speed would further tighten the link between accuracy and practical deployment.
Load-bearing premise
The evolutionary search discovers network structures that generalize from the training and validation data to the hidden test cases rather than overfitting to the challenge split.
What would settle it
Running the identical evolutionary procedure on the PROMISE12 training set and finding that the produced network falls outside the top ten or loses its size advantage on the official test set would falsify the central claim.
Original abstract
Segmentation is a critical step in medical image analysis. Fully Convolutional Networks (FCNs) have emerged as powerful segmentation models achieving state-of-the-art results in various medical image datasets. Network architectures are usually designed manually for a specific segmentation task so applying them to other medical datasets requires extensive experience and time. Moreover, the segmentation requires handling large volumetric data that results in big and complex architectures. Recently, methods that automatically design neural networks for medical image segmentation have been presented; however, most approaches either do not fully consider volumetric information or do not optimize the size of the network. In this paper, we propose a novel self-adaptive 2D-3D ensemble of FCNs for medical image segmentation that incorporates volumetric information and optimizes both the model's performance and size. The model is composed of an ensemble of a 2D FCN that extracts intra-slice information, and a 3D FCN that exploits inter-slice information. The architectures of the 2D and 3D FCNs are automatically adapted to a medical image dataset using a multiobjective evolutionary based algorithm that minimizes both the segmentation error and number of parameters in the network. The proposed 2D-3D FCN ensemble was tested on the task of prostate segmentation on the image dataset from the PROMISE12 Grand Challenge. The resulting network is ranked in the top 10 submissions, surpassing the performance of other automatically-designed architectures while being considerably smaller in size.
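The abstract does not say how the 2D and 3D outputs are combined. As one plausible reading, the sketch below averages the per-voxel foreground probabilities of the two networks and thresholds the mean. The averaging rule, the 0.5 threshold, and the name `ensemble_mask` are assumptions made for illustration.

```python
from typing import List

# A volume is indexed as [slice][row][column].
Volume = List[List[List[float]]]
Mask = List[List[List[int]]]


def ensemble_mask(prob_2d: Volume, prob_3d: Volume, threshold: float = 0.5) -> Mask:
    """Average two foreground-probability volumes voxel by voxel, then threshold.

    Assumption: simple averaging; the paper may combine the networks differently.
    """
    return [[[1 if (p2 + p3) / 2.0 >= threshold else 0
              for p2, p3 in zip(row2, row3)]
             for row2, row3 in zip(slice2, slice3)]
            for slice2, slice3 in zip(prob_2d, prob_3d)]


# Toy volumes with 2 slices, 1 row, 3 columns (hypothetical values).
prob_2d = [[[0.9, 0.4, 0.2]], [[0.8, 0.6, 0.1]]]
prob_3d = [[[0.7, 0.7, 0.1]], [[0.1, 0.5, 0.3]]]
mask = ensemble_mask(prob_2d, prob_3d)
print(mask)
```

In the toy input, the first voxel of the second slice is confidently foreground for the 2D output but not for the 3D output, and the averaged probability falls below the threshold. This is the kind of disagreement an ensemble is meant to arbitrate.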
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a self-adaptive 2D-3D ensemble of FCNs for medical image segmentation. Architectures for the 2D (intra-slice) and 3D (inter-slice) components are discovered by a multiobjective evolutionary algorithm that jointly minimizes segmentation error and parameter count. On the PROMISE12 prostate segmentation challenge the resulting ensemble is reported to rank in the top 10 submissions while being considerably smaller than other automatically designed networks.
Significance. If the evolutionary search protocol demonstrably avoids overfitting to the fitness validation data and the reported ranking is on the hidden test set, the work would provide a concrete example of automatically producing compact, high-performing 2D-3D ensembles for volumetric medical segmentation, a useful contribution given the tension between model size and accuracy in clinical applications.
major comments (2)
- [Methods] Methods section (evolutionary algorithm description): the manuscript supplies no protocol details on how the validation split used for fitness evaluation is isolated from the official PROMISE12 validation or hidden test sets, nor on the total number of fitness evaluations performed. Without this information the central claim that the discovered architecture achieves a genuine top-10 ranking without inflation from repeated validation exposure cannot be verified.
- [Experiments] Experiments section (ranking and size comparison): the reported top-10 ranking and size advantage over other auto-designed nets are presented without an ablation that isolates the contribution of the 2D-3D ensemble versus the individual 2D or 3D networks, or versus a single-objective search; this leaves open whether the multiobjective formulation is load-bearing for the claimed performance-size trade-off.
minor comments (2)
- [Abstract] Abstract and introduction: the phrase 'self-adaptive' is used without a precise definition or pointer to the evolutionary operators that implement the adaptation.
- [Figures] Figure captions and tables: several figures lack error bars or statistical significance markers on the reported Dice scores, making direct comparison with challenge submissions harder to interpret.
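For reference, the Dice score and the kind of spread statistic the last comment asks for can be computed as in the sketch below. The masks are hypothetical flattened binary arrays, and returning 1.0 when both masks are empty is a convention chosen here, not something taken from the paper.

```python
from statistics import mean, stdev
from typing import Sequence


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2|P ∩ T| / (|P| + |T|) for flattened binary masks."""
    intersection = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention (assumption): two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical (prediction, ground truth) pairs for three cases.
cases = [
    ([1, 1, 0, 0], [1, 1, 0, 0]),
    ([1, 1, 1, 0], [1, 1, 0, 0]),
    ([1, 0, 0, 0], [1, 1, 0, 0]),
]
scores = [dice(p, t) for p, t in cases]
print(f"Dice: {mean(scores):.3f} +/- {stdev(scores):.3f} over {len(scores)} cases")
```

Reporting the per-case standard deviation alongside the mean is one simple way to supply the missing error bars.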
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the two major comments point-by-point below, indicating planned revisions where appropriate.
- Referee: [Methods] Methods section (evolutionary algorithm description): the manuscript supplies no protocol details on how the validation split used for fitness evaluation is isolated from the official PROMISE12 validation or hidden test sets, nor on the total number of fitness evaluations performed. Without this information the central claim that the discovered architecture achieves a genuine top-10 ranking without inflation from repeated validation exposure cannot be verified.
  Authors: We agree that these protocol details are important for verifying the integrity of the search. In the revised manuscript we will add an explicit subsection describing the fitness validation split construction (random 80/20 split within the official training cases only, with no overlap with the challenge validation or hidden test sets) and will report the total number of fitness evaluations performed during the multiobjective evolutionary search.
  Revision: yes
- Referee: [Experiments] Experiments section (ranking and size comparison): the reported top-10 ranking and size advantage over other auto-designed nets are presented without an ablation that isolates the contribution of the 2D-3D ensemble versus the individual 2D or 3D networks, or versus a single-objective search; this leaves open whether the multiobjective formulation is load-bearing for the claimed performance-size trade-off.
  Authors: The multiobjective formulation is integral to the method because it directly produces the compact high-performing ensemble reported; the size-performance trade-off is the explicit objective of the search. While an explicit ablation against single-objective search or isolated 2D/3D components was not included, the comparison against other automatically designed networks already demonstrates the practical advantage of the resulting model. We will add a short discussion paragraph clarifying the design rationale for the multiobjective approach but do not believe a full new ablation study is required to support the stated claims.
  Revision: partial
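The random 80/20 split described in the first response could be built along the following lines. This is a minimal sketch under stated assumptions: the case identifiers, the seed, and the name `split_cases` are hypothetical, and the authors' actual procedure is not given in the text.

```python
import random
from typing import List, Tuple


def split_cases(case_ids: List[str], val_fraction: float = 0.2,
                seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle case IDs with a fixed seed and hold out a fraction for fitness validation."""
    shuffled = list(case_ids)              # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded generator for a reproducible split
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical identifiers standing in for the official training cases only.
case_ids = [f"Case{i:02d}" for i in range(50)]
fit_train, fit_val = split_cases(case_ids)

# The two subsets must be disjoint and together cover every training case.
assert not set(fit_train) & set(fit_val)
assert set(fit_train) | set(fit_val) == set(case_ids)
print(len(fit_train), len(fit_val))
```

Splitting at the case level rather than the slice level keeps all slices of a volume on the same side of the split. Publishing the seed or the resulting ID lists would let readers check the isolation the referee asks about.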
Circularity Check
No circularity; empirical ranking on external challenge dataset
The paper describes an empirical pipeline: a multiobjective evolutionary algorithm designs 2D and 3D FCN architectures by minimizing segmentation error and parameter count on the PROMISE12 dataset, followed by ensemble evaluation that yields a measured top-10 ranking. No equations, fitted parameters renamed as predictions, or self-citation chains are present in the provided text. The central claim reduces to an externally verifiable ranking on a public challenge test set rather than any derivation that collapses to its own inputs by construction. The method is self-contained against the external benchmark.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: A multiobjective evolutionary algorithm can effectively explore the joint space of 2D and 3D FCN architectures while trading off segmentation accuracy against parameter count.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean: reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "The architectures of the 2D and 3D FCNs are automatically adapted ... using a multiobjective evolutionary based algorithm that minimizes both the segmentation error and number of parameters"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.