An Efficient Solution for Breast Tumor Segmentation and Classification in Ultrasound Images Using Deep Adversarial Learning

Domenec Puig; Farhan Akram; Hatem A. Rashwan; Md. Mostafa Kamal Sarker; Mohamed Abdel-Nasser; Nidhi Pandey; Santiago Romani; Vivek Kumar Singh

arxiv: 1907.00887 · v1 · pith:GMAX5Q4Enew · submitted 2019-07-01 · 📡 eess.IV · cs.CV

Vivek Kumar Singh, Hatem A. Rashwan, Mohamed Abdel-Nasser, Md. Mostafa Kamal Sarker, Farhan Akram, Nidhi Pandey, Santiago Romani, Domenec Puig

Pith reviewed 2026-05-25 11:24 UTC · model grok-4.3

keywords: breast ultrasound, tumor segmentation, conditional GAN, adversarial learning, image classification, Dice coefficient, IoU metric

The pith

Adding atrous convolution and channel weighting to a cGAN yields 93.76% Dice and 88.82% IoU on breast tumor segmentation in ultrasound images, with boundary shape statistics classifying tumors at 85% accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces modifications to conditional generative adversarial networks for segmenting breast tumors in ultrasound images. An atrous convolution layer is added to capture features at multiple resolutions, and a channel-wise weighting block rebalances the impact of encoded features. The network is trained with a combination of SSIM, L1-norm, and adversarial losses. These changes yield better segmentation results than existing models. Extracted shape statistics from the resulting masks classify tumors as benign or malignant with 85% accuracy.
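The three-part training objective described above can be sketched in NumPy. This is only an illustration under stated assumptions: the weights `lam_l1` and `lam_ssim`, the single-window `ssim_global` helper, and the use of `1 - SSIM` as the penalty are all hypothetical choices, not values or formulations taken from the paper.

```python
import numpy as np

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-window SSIM over whole images with values in [0, 1] (simplification)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def generator_loss(d_fake, pred, target, lam_l1=10.0, lam_ssim=1.0):
    """Adversarial + L1 + (1 - SSIM); the weights are hypothetical, not from the paper."""
    adv = -np.mean(np.log(np.clip(d_fake, 1e-7, 1.0)))  # generator wants D(fake) near 1
    l1 = np.mean(np.abs(pred - target))                 # pixel-wise mask error
    ssim_term = 1.0 - ssim_global(pred, target)         # structural mismatch
    return float(adv + lam_l1 * l1 + lam_ssim * ssim_term)

# Toy masks: a perfect prediction and one shifted two pixels to the right.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0
good = target.copy()
bad = np.roll(target, 2, axis=1)
loss_good = generator_loss(np.array([0.9]), good, target)
loss_bad = generator_loss(np.array([0.9]), bad, target)
```

With the discriminator score held fixed, the shifted mask is penalized by both the L1 and the SSIM terms, so `loss_bad` exceeds `loss_good`.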

Core claim

The authors claim that their enhanced cGAN model, with atrous convolutions for multi-resolution feature learning and channel-wise weighting, achieves state-of-the-art segmentation performance on breast ultrasound images with Dice score of 93.76% and IoU of 88.82%. Furthermore, statistical features from the boundaries of these segmented masks enable classification of benign and malignant tumors at 85% accuracy.
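For readers less familiar with the two headline metrics, the sketch below computes Dice and IoU for a pair of binary masks. The function name `dice_iou` and the toy masks are illustrative assumptions; the paper's own evaluation code is not shown on this page.

```python
import numpy as np

def dice_iou(pred, truth):
    """Return (Dice, IoU) for two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    inter = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    total = pred.sum() + truth.sum()
    dice = 2.0 * inter / total if total else 1.0  # two empty masks count as a match
    iou = inter / union if union else 1.0
    return float(dice), float(iou)

# Toy example: two 2x3 rectangles that overlap in a 2x2 region.
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 0:3] = True
dice, iou = dice_iou(pred, truth)  # Dice = 8/12, IoU = 4/8
```

For a single image the two metrics are tied by Dice = 2·IoU / (1 + IoU). That identity need not hold for numbers aggregated over a test set, so the reported pair cannot be checked against each other from the summary alone.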

What carries the argument

The key machinery is the integration of an atrous convolution layer and a channel-wise weighting block into the cGAN architecture, which enables learning of tumor features at different scales and automatic rebalancing of feature channels.
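A small NumPy sketch can make the receptive-field argument concrete. A dilated kernel of size k with rate r spans k + (k - 1)(r - 1) pixels while keeping the same number of weights. The rates 1, 6 and 9 with kernel size 3 come from the Figure 1 caption reproduced on this page; the helper names and the single-channel, no-padding setup are assumptions made for illustration.

```python
import numpy as np

def effective_kernel(k, rate):
    """Span in pixels covered by a size-k kernel whose taps are `rate` apart."""
    return k + (k - 1) * (rate - 1)

def atrous_conv2d(image, kernel, rate):
    """'Valid' 2D cross-correlation with dilated taps (single channel, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - effective_kernel(kh, rate) + 1
    out_w = image.shape[1] - effective_kernel(kw, rate) + 1
    out = np.zeros((out_h, out_w))
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap reads a shifted view of the image, offset by rate.
            out += kernel[i, j] * image[i * rate:i * rate + out_h,
                                        j * rate:j * rate + out_w]
    return out

sizes = [effective_kernel(3, r) for r in (1, 6, 9)]  # [3, 13, 19]
image = np.ones((20, 20))
kernel = np.ones((3, 3))                             # always 9 weights
response = atrous_conv2d(image, kernel, rate=6)      # shape (8, 8)
```

The kernel holds nine weights at every rate, which is the sense in which context grows without adding parameters.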

If this is right

The enhanced model outperforms prior segmentation approaches on Dice and IoU metrics.
Simple boundary shape statistics suffice to discriminate tumor types.
The combined loss functions support effective training of the segmentation model.
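To illustrate what boundary shape statistics can look like, the sketch below derives a few descriptors from a binary mask. The exact features used in the paper are not named on this page, so `area`, `perimeter`, `compactness` and `radial_cv` are hypothetical stand-ins, and the function assumes a non-empty mask.

```python
import numpy as np

def shape_stats(mask):
    """Hypothetical shape descriptors for a non-empty binary mask."""
    m = np.pad(np.asarray(mask, dtype=bool), 1)
    area = int(m.sum())
    # Perimeter: number of pixel edges separating foreground from background.
    perim = int((m[1:, :] != m[:-1, :]).sum() + (m[:, 1:] != m[:, :-1]).sum())
    compactness = 4.0 * np.pi * area / perim ** 2
    inner = m[1:-1, 1:-1]
    # A boundary pixel is foreground with at least one background 4-neighbour.
    all_nb = m[:-2, 1:-1] & m[2:, 1:-1] & m[1:-1, :-2] & m[1:-1, 2:]
    ys, xs = np.nonzero(inner & ~all_nb)
    fy, fx = np.nonzero(inner)
    r = np.hypot(ys - fy.mean(), xs - fx.mean())  # boundary distance to centroid
    radial_cv = float(r.std() / r.mean()) if r.mean() > 0 else 0.0
    return {"area": area, "perimeter": perim,
            "compactness": float(compactness), "radial_cv": radial_cv}

# A smooth blob (square) versus a spiky one (plus sign).
square = np.zeros((8, 8), dtype=bool)
square[2:6, 2:6] = True
plus = np.zeros((9, 9), dtype=bool)
plus[4, 1:8] = True
plus[1:8, 4] = True
s_square, s_plus = shape_stats(square), shape_stats(plus)
```

On these toy masks the spiky outline scores a much lower compactness than the square, which is the kind of contrast a shape-based classifier would rely on.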

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

If the modifications are the true source of gains, similar additions could benefit other segmentation tasks in medical imaging.
Classification accuracy might increase with integration of texture or intensity features alongside shape.
Validation on larger, more diverse datasets would strengthen the claims.

Load-bearing premise

The reported gains in segmentation accuracy stem specifically from the atrous convolution and channel-wise weighting additions rather than dataset properties or training procedures.

What would settle it

Running an ablation of the model without the atrous convolution and channel-wise weighting on the same data and observing no drop in Dice or IoU scores would falsify the claim that those elements drive the performance.
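One way to judge whether an ablation shows "no drop" is a paired sign-flip permutation test on per-image scores. The sketch below is an assumption about how such a comparison could be run; the Dice values are made up for illustration and are not results from the paper.

```python
import random

def paired_permutation_pvalue(full, ablated, n_perm=2000, seed=0):
    """Two-sided sign-flip test on the mean of paired per-image differences."""
    rng = random.Random(seed)
    diffs = [f - a for f, a in zip(full, ablated)]
    observed = abs(sum(diffs) / len(diffs))
    hits = 0
    for _ in range(n_perm):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if abs(flipped) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical per-image Dice scores for 20 test images.
ablated = [0.85 + 0.005 * i for i in range(20)]
full = [a + 0.03 for a in ablated]
p_shift = paired_permutation_pvalue(full, ablated)    # consistent gain: small p
p_same = paired_permutation_pvalue(ablated, ablated)  # no difference: p = 1.0
```

A large p-value for the full model against its ablated variant on the same test images would support the falsifying outcome described above.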

Figures

Figures reproduced from arXiv: 1907.00887 by Domenec Puig, Farhan Akram, Hatem A. Rashwan, Md. Mostafa Kamal Sarker, Mohamed Abdel-Nasser, Nidhi Pandey, Santiago Romani, Vivek Kumar Singh.

**Figure 1.** The architecture of the proposed segmentation model for BUS images. En4, the generator network is enabled to characterize features at different scales and also to expand the actual receptive field of the filters. As a consequence, the network is more aware of contextual information without increasing the number of parameters or the amount of computation. We use 1, 6 and 9 dilation rates with kernel size 3 …

**Figure 2.** Boxplots of Dice and IoU values obtained for the 50 testing samples using FCN, SegNet, ERFNet, UNet and the proposed model. The two models based on cGAN provided small ranges of Dice and IoU values. For instance, our model is in the range 88% to 94% for Dice coefficient and 80% to 89% for IoU, while other deep segmentation methods, FCN, SegNet, ERFNet and UNet show a wider range of values. Moreover,…

**Figure 3.** Segmentation results on four samples of the BUS dataset. Rows (a) and (b) show benign samples while rows (c) and (d) show malignant samples. Breast tumor classification results: To test our classification strategy, we have checked our method on the outputs of different segmentation methods with the leave-one-out cross-validation technique and calculated the precision, recall, accuracy and F1-score metri…
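The leave-one-out protocol mentioned in the Figure 3 caption can be sketched as follows. The classifier is not named on this page, so the nearest-centroid rule here is a placeholder, and the one-dimensional feature values and labels (0 for benign, 1 for malignant) are made-up illustrations.

```python
import numpy as np

def loo_predict(X, y):
    """Leave-one-out predictions from a placeholder nearest-centroid classifier."""
    X, y = np.asarray(X, dtype=float), np.asarray(y, dtype=int)
    preds = np.zeros_like(y)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i            # hold out sample i
        c0 = X[keep & (y == 0)].mean(axis=0)     # centroid of class 0
        c1 = X[keep & (y == 1)].mean(axis=0)     # centroid of class 1
        preds[i] = int(np.linalg.norm(X[i] - c1) < np.linalg.norm(X[i] - c0))
    return preds

def binary_metrics(y, preds):
    """Precision, recall, accuracy and F1 with class 1 as the positive class."""
    y, preds = np.asarray(y), np.asarray(preds)
    tp = int(np.sum((preds == 1) & (y == 1)))
    fp = int(np.sum((preds == 1) & (y == 0)))
    fn = int(np.sum((preds == 0) & (y == 1)))
    tn = int(np.sum((preds == 0) & (y == 0)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": (tp + tn) / len(y), "f1": f1}

# Hypothetical single shape feature per mask.
X = [[0.80], [0.75], [0.78], [0.30], [0.35], [0.25]]
y = [0, 0, 0, 1, 1, 1]
metrics = binary_metrics(y, loo_predict(X, y))
```

Each sample is classified by a model that never saw it, which is why the protocol suits small datasets where a fixed held-out split would be too small to trust.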

Original abstract

This paper proposes an efficient solution for tumor segmentation and classification in breast ultrasound (BUS) images. We propose to add an atrous convolution layer to the conditional generative adversarial network (cGAN) segmentation model to learn tumor features at different resolutions of BUS images. To automatically re-balance the relative impact of each of the highest level encoded features, we also propose to add a channel-wise weighting block in the network. In addition, the SSIM and L1-norm loss with the typical adversarial loss are used as a loss function to train the model. Our model outperforms the state-of-the-art segmentation models in terms of the Dice and IoU metrics, achieving top scores of 93.76% and 88.82%, respectively. In the classification stage, we show that few statistics features extracted from the shape of the boundaries of the predicted masks can properly discriminate between benign and malignant tumors with an accuracy of 85%$

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes enhancements to a conditional GAN (cGAN) segmentation model for breast ultrasound (BUS) images: an atrous convolution layer to capture multi-resolution tumor features and a channel-wise weighting block to rebalance high-level encoded features. Training combines adversarial loss with SSIM and L1-norm losses. The work claims state-of-the-art Dice (93.76%) and IoU (88.82%) scores on segmentation and shows that a small set of boundary shape statistics from the predicted masks can classify benign vs. malignant tumors at 85% accuracy.

Significance. If the reported metrics prove reproducible and the gains are shown to stem from the atrous and weighting additions rather than dataset or training choices, the approach could offer a practical pipeline for BUS tumor analysis that combines accurate segmentation with lightweight shape-based classification. No machine-checked proofs or parameter-free derivations are present; the contribution is empirical.

major comments (3)

[Abstract] Abstract: the headline segmentation claim (Dice 93.76%, IoU 88.82%) and the assertion that the model 'outperforms the state-of-the-art' cannot be evaluated because the abstract supplies no dataset size, patient demographics, train/test split, augmentation protocol, or list of the specific SOTA baselines and their re-implementations. Without these controls the attribution of performance to atrous convolution plus channel weighting has no empirical anchor.
[Abstract] Abstract: the classification result (85% accuracy from 'few statistics features extracted from the shape of the boundaries') is presented without any ablation linking mask quality to classification performance, without naming the exact features or selection method, and without error analysis showing how segmentation errors propagate to the 85% figure.
[Abstract] Abstract: no statistical tests, confidence intervals, or cross-validation details accompany the reported metrics, so it is impossible to determine whether the claimed improvements are significant or merely within the variability of an unspecified experimental setup.
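As an illustration of the interval estimates the third comment asks for, a percentile bootstrap over per-image scores could be computed roughly as below. The function name, the number of resamples and the sample Dice values are assumptions; none of them come from the paper.

```python
import random

def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(sum(rng.choices(values, k=n)) / n for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical per-image Dice scores.
scores = [0.88, 0.90, 0.91, 0.93, 0.94]
ci_low, ci_high = bootstrap_ci(scores)
```

Reporting such an interval next to each mean would let a reader see whether the gaps between methods exceed the sampling variability of a 50-image test set.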

minor comments (1)

[Abstract] Abstract ends with the malformed token '85%$' (likely a LaTeX artifact).

Simulated Author's Rebuttal

3 responses · 0 unresolved

We appreciate the referee's comments regarding the abstract. We will revise the abstract to address the concerns about missing experimental details, classification specifics, and statistical information. Point-by-point responses follow.

Referee: [Abstract] Abstract: the headline segmentation claim (Dice 93.76%, IoU 88.82%) and the assertion that the model 'outperforms the state-of-the-art' cannot be evaluated because the abstract supplies no dataset size, patient demographics, train/test split, augmentation protocol, or list of the specific SOTA baselines and their re-implementations. Without these controls the attribution of performance to atrous convolution plus channel weighting has no empirical anchor.

Authors: We agree with the referee that the abstract would benefit from additional details on the experimental setup to support the performance claims. We will revise the abstract to include information on the dataset size, train/test split, augmentation protocol, and the specific state-of-the-art methods used for comparison. This will provide the necessary context for evaluating the contribution of the atrous convolution and channel weighting blocks.
Revision: yes

Referee: [Abstract] Abstract: the classification result (85% accuracy from 'few statistics features extracted from the shape of the boundaries') is presented without any ablation linking mask quality to classification performance, without naming the exact features or selection method, and without error analysis showing how segmentation errors propagate to the 85% figure.

Authors: We acknowledge that the abstract does not detail the specific shape features or provide an ablation study for the classification stage. We will revise the abstract to name the features used and indicate that the classification is performed on the predicted masks. We will also add a brief note on the relationship between segmentation and classification performance.
Revision: yes

Referee: [Abstract] Abstract: no statistical tests, confidence intervals, or cross-validation details accompany the reported metrics, so it is impossible to determine whether the claimed improvements are significant or merely within the variability of an unspecified experimental setup.

Authors: We agree that including statistical tests and confidence intervals would help assess the significance of the results. We will revise the abstract to include cross-validation details and confidence intervals for the reported metrics.
Revision: yes

Circularity Check

0 steps flagged

No circularity detected; purely empirical architecture proposal with reported metrics

The paper proposes adding atrous convolution and channel-wise weighting to a cGAN, combines SSIM+L1 with adversarial loss, and reports Dice/IoU/accuracy numbers on segmentation and classification. No equations, derivations, or self-citations are presented that reduce any claimed result to a fitted parameter or prior result by construction. The work is self-contained as an empirical demonstration; performance figures are not forced by internal definitions or renamings.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

Only the abstract is available; network hyperparameters, exact atrous rates, and training details are unspecified, so the ledger reflects typical deep-learning assumptions rather than paper-specific items.

free parameters (2)

atrous convolution dilation rates
Chosen to capture multi-resolution tumor features; values not stated in abstract.
channel weighting parameters
Learned weights for re-balancing encoder features; training details absent.
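A channel-wise weighting block is often built in the squeeze-and-excitation style: pool each channel to a scalar, pass the result through a small gating network, and rescale the channels. The sketch below assumes that form; the paper's actual block, its reduction ratio and its placement are not specified on this page, and `w1` and `w2` stand in for learned weights.

```python
import numpy as np

def channel_weighting(features, w1, w2):
    """Squeeze-and-excitation style gating (assumed form, not the paper's exact block).

    features: (C, H, W); w1: (C // r, C); w2: (C, C // r) for a reduction ratio r.
    """
    squeezed = features.mean(axis=(1, 2))            # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)          # bottleneck with ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate per channel
    return features * weights[:, None, None], weights

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 2, 2))
w1 = rng.normal(size=(2, 4))
w2 = rng.normal(size=(4, 2))
reweighted, gate = channel_weighting(features, w1, w2)
```

Each gate value lies strictly between 0 and 1, so the block can only attenuate channels relative to one another; which channels are kept is what training decides.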

axioms (1)

domain assumption: Adversarial training reaches a stable equilibrium useful for medical image segmentation
Core premise of cGAN-based segmentation models.

pith-pipeline@v0.9.0 · 5721 in / 1345 out tokens · 38027 ms · 2026-05-25T11:24:29.520684+00:00 · methodology

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages · 2 internal anchors

[1] Abdel-Nasser, M., Melendez, J., Moreno, A., Omer, O.A., Puig, D.: Breast tumor classification in ultrasound images using texture analysis and super-resolution methods. Engineering Applications of Artificial Intelligence 59, 84–92 (2017)

[2] Badrinarayanan, V., Kendall, A., Cipolla, R.: Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE transactions on pattern analysis and machine intelligence 39(12), 2481–2495 (2017)

[3] Fu, J., Liu, J., Tian, H., Fang, Z., Lu, H.: Dual attention network for scene segmentation. arXiv preprint arXiv:1809.02983 (2018)

[4] Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 7132–7141 (2018)

[5] Hu, Y., Guo, Y., Wang, Y., Yu, J., Zhou, S., Chang, C.: Automatic tumor segmentation in breast ultrasound images using a dilated fully convolutional network combined with an active contour model. Medical physics 46(1), 215–228 (2019)

[6] Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 1125–1134 (2017)

[7] Kim, T., Cha, M., Kim, H., Lee, J.K., Kim, J.: Learning to discover cross-domain relations with generative adversarial networks. In: Proceedings of the 34th International Conference on Machine Learning-Volume 70. pp. 1857–1865 (2017)

[8] Lauby-Secretan, B., Scoccianti, C., Loomis, D., Benbrahim-Tallaa, L., Bouvard, V., Bianchini, F., Straif, K.: Breast-cancer screening - viewpoint of the IARC working group. New England journal of medicine 372(24), 2353–2358 (2015)

[9] Lee, C.Y., Chen, G.L., Zhang, Z.X., Chou, Y.H., Hsu, C.C.: Is intensity inhomogeneity correction useful for classification of breast cancer in sonograms using deep neural network? Journal of healthcare engineering 2018 (2018)

[10] Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 3431–3440 (2015)

[11] Rodrigues, P.S.: Breast ultrasound image. Mendeley Data (2017)

[12] Romera, E., Alvarez, J.M., Bergasa, L.M., Arroyo, R.: Erfnet: Efficient residual factorized convnet for real-time semantic segmentation. IEEE Transactions on Intelligent Transportation Systems 19(1), 263–272 (2018)

[13] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. pp. 234–241. Springer (2015)

[14] Siegel, R.L., Miller, K.D., Jemal, A.: Cancer statistics, 2017. CA: a cancer journal for clinicians 67(1), 7–30 (2017)

[15] Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P., et al.: Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing 13(4), 600–612 (2004)

[16] Xu, Y., Wang, Y., Yuan, J., Cheng, Q., Wang, X., Carson, P.L.: Medical breast ultrasound image segmentation by machine learning. Ultrasonics 91, 1–9 (2019)

[17] Yang, W., Zhang, S., Chen, Y., Li, W., Chen, Y.: Measuring shape complexity of breast lesions on ultrasound images. In: Medical Imaging 2008: Ultrasonic Imaging and Signal Processing. vol. 6920, p. 69200J. International Society for Optics and Photonics (2008)

[18] Yu, F., Koltun, V.: Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122 (2015)
