arxiv: 1907.04424 · v1 · pith:TJH34TMSnew · submitted 2019-07-09 · 💻 cs.CV

Automatic Mass Detection in Breast Using Deep Convolutional Neural Network and SVM Classifier

Pith reviewed 2026-05-25 00:44 UTC · model grok-4.3

classification 💻 cs.CV
keywords mass detection, mammography, VGG19, SVM classifier, transfer learning, breast cancer, INbreast, deep learning

The pith

Pre-trained VGG19 extracts mammogram features for SVM to detect masses at 0.994 AUC

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates an automated system for detecting masses in breast mammograms by extracting features with a pre-trained VGG19 network. Selected features then train an SVM classifier to separate mass from non-mass regions. Experiments on the INbreast dataset yield an AUC of 0.994 with a narrow confidence interval. This setup addresses the difficulty of manual detection given large variations in mass appearance. A reader would value it for showing how transfer learning can achieve strong results on a standard public dataset without custom network training.

Core claim

A pre-trained VGG19 network extracts features, a bagged decision tree selects among them, and a Support Vector Machine (SVM) classifier is then trained to separate mass from non-mass. The best AUC obtained is 0.994 ± 0.003 on the INbreast dataset. The authors conclude that high-level distinctive features can be extracted from mammograms and that, combined with the proposed SVM classifier, these features robustly distinguish mass from non-mass regions in the breast.

What carries the argument

The combination of VGG19 feature extraction, bagged decision tree feature selection, and SVM classification for mass versus non-mass in mammograms

If this is right

  • High AUC performance is achieved with the selected SVM after feature selection.
  • Both C-SVM and nu-SVM classifiers were evaluated for robustness before choosing the best.
  • Pre-trained networks enable distinctive feature extraction from mammograms without training from scratch.
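The C-SVM versus ν-SVM comparison mentioned above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the data are synthetic (generated with scikit-learn's `make_classification` as a stand-in for the selected VGG19 features), and the RBF kernel and the values of `C` and `nu` are assumptions chosen only for demonstration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, NuSVC

# Synthetic stand-in for the selected CNN features (assumption: not INbreast data).
X, y = make_classification(n_samples=400, n_features=50, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# Kernel, C and nu are illustrative assumptions, not the paper's tuned settings.
models = {
    "C-SVM": SVC(C=10.0, kernel="rbf", gamma="scale"),
    "nu-SVM": NuSVC(nu=0.5, kernel="rbf", gamma="scale"),
}

aucs = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # decision_function returns a signed margin, which is enough to rank samples for AUC.
    aucs[name] = roc_auc_score(y_test, model.decision_function(X_test))

for name, value in aucs.items():
    print(f"{name}: AUC = {value:.3f}")
```

The two formulations differ mainly in how the penalty is expressed: `C` is an unbounded cost on margin violations, while `nu` lies in (0, 1] and bounds the fraction of margin errors and support vectors.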

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Performance on other datasets would test whether the reported AUC holds outside the single collection used here.
  • The pipeline could be adapted to detect other abnormalities if the feature extraction captures general lesion properties.
  • Clinical deployment would require checking sensitivity to different mammography machines and patient populations.

Load-bearing premise

The INbreast dataset sufficiently represents the variations in mammogram appearance and mass characteristics found across different clinical settings and equipment.

What would settle it

Running the identical pipeline on a separate collection of mammograms and finding substantially lower AUC would show the distinction is not as robust as claimed.

Figures

Figures reproduced from arXiv: 1907.04424 by Md. Kamrul Hasan, Tajwar Abrar Aleef.

Figure 1
Presence of malignant tumor in the breast [4]. There are several imaging techniques available that help clinicians to analyze and pinpoint suspicious regions. Modalities such as X-ray (mammography, digital breast tomosynthesis, xeromammography, galactography), MRI, CT, PET, ultrasound, and scintimammography are some of the non-invasive techniques used to detect mass in the breast. Among all these methods, ma…
Figure 2
Two major views of mammograms available in the INbreast database.
Figure 3
Overall pipeline for the proposed breast mass-detection system. Patch Extraction: Patch-based classification increases the robustness of the classifier by increasing the number of training samples [20] and reducing the computational complexity. In this study, patches having dimensions of 454×454 pixels are extracted from the original mammograms. To crea…
Figure 4
Details of the 19 layers of the VGG19 network [21] used for feature extraction. Patches were normalized to zero mean and unit variance before being fed into the network for feature extraction. Two sets of features were extracted from the VGG19 model: the first after fully connected layer 2 (FC2), which gives 4096 features, and the second after the flatten layer (FL), which gives 25088 features. These two sets…
Figure 5
Flow diagram of the implemented feature selector. Equation (1): Selected feature importance = 0.95 × Total importance. Train, Validation and Test Data Selection: After selecting the optimum set of features, this feature vector was split into train, validation and test data. To do so, the whole data was first divided into five equal parts (5-fold cross validation [24]) as shown in …
Figure 6
Mass and non-mass data split for the cross validation.
Figure 7
Block diagram for tuning the best hyper-parameters and predicting results on the test set.
Figure 8
ROC curves for test data of Experiment 1.
Figure 9
ROC curves for test data of Experiment 2. Experiment 3: In this experiment, two geometric augmentations (flipping and rotation) were applied to the extracted patches, and the features were extracted from fully connected layer 2 (FC2) of the VGG19 network, which gave 4096 features. With augmentation, there were 6000 observations for this experiment. The number of selected features is 97 in this experi…
Figure 10
ROC curves for test data of Experiment 3. Experiment 4: In this experiment, the same two geometric augmentations (flipping and rotation) were again applied to the extracted patches, and the features were extracted from the flatten layer (FL) of the network, which provides 25088 features. A total of 205 features were selected. From the validation test, it is seen that maximum AUC is at C=10 a…
Figure 11
ROC curves for test data of Experiment 4. Conclusions: In this literature, the robustness of VGG19 along with SVM (C-SVM and ν-SVM) for breast mass and non-mass classification was analyzed. All the parameters and hyperparameters of the C-SVM and ν-SVM (except the penalization parameters), the number of observations, and the dimensions of the features vector were kept the same during each of the exp…
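The feature counts quoted in the Figure 4 caption (4096 after FC2, 25088 after the flatten layer) can be checked with a little arithmetic on the standard VGG19 layout. The sketch below assumes the usual architecture: five convolutional blocks, each ending in a 2×2 max-pool, followed by three fully connected layers. Under that assumption, 25088 corresponds to a 224×224 input, which suggests the 454×454 patches were resized before feature extraction. The excerpt does not state this, so treat it as an inference.

```python
# VGG19 convolutional blocks: (number of 3x3 conv layers, output channels).
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]
FC_SIZES = [4096, 4096, 1000]  # FC1, FC2, and the ImageNet output layer


def flatten_size(side: int) -> int:
    """Length of the flattened feature vector for a square input of `side` pixels."""
    for _ in VGG19_BLOCKS:
        side //= 2  # each block ends with a 2x2 max-pool of stride 2
    channels = VGG19_BLOCKS[-1][1]
    return side * side * channels


# 16 convolutional layers plus 3 fully connected layers.
weight_layers = sum(n for n, _ in VGG19_BLOCKS) + len(FC_SIZES)
print(weight_layers)       # 19
print(flatten_size(224))   # 25088
print(flatten_size(454))   # 100352
```

A 454×454 input would flatten to 100352 values and could not feed the pre-trained FC layers, which expect 25088 inputs.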
Original abstract

Mammography is the most widely used gold standard for breast cancer screening, where mass detection is considered the prominent step. Detecting masses in the breast is, however, an arduous problem, as they usually vary widely in shape, size, boundary, and texture. In this literature, the process of mass detection is automated with the use of transfer learning techniques of Deep Convolutional Neural Networks (DCNN). A pre-trained VGG19 network is used to extract features, which are then passed to a bagged decision tree for feature selection, and a Support Vector Machine (SVM) classifier is then trained and used to classify between mass and non-mass. Area Under the ROC Curve (AUC) is chosen as the performance metric, which is then maximized during classifier selection and hyper-parameter tuning. The robustness of the two selected types of classifiers, C-SVM and ν-SVM, is investigated with extensive experiments before selecting the best-performing classifier. All experiments in this paper were conducted using the INbreast dataset. The best AUC obtained from the experimental results is 0.994 ± 0.003, i.e. [0.991, 0.997]. Our results conclude that, by using a pre-trained VGG19 network, high-level distinctive features can be extracted from mammograms which, when used with the proposed SVM classifier, are able to robustly distinguish between mass and non-mass present in the breast.
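For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen mass patch receives a higher score than a randomly chosen non-mass patch. The sketch below computes it in that pairwise form on made-up scores and then reproduces the interval arithmetic from the abstract. The function name and the toy data are illustrative assumptions, and the excerpt does not say how the ± 0.003 spread was estimated.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney U form of AUC).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = mass, 0 = non-mass (made-up scores, not INbreast results).
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc_from_scores(labels, scores))  # 0.75

# The interval quoted in the abstract is the mean plus or minus the half-width.
mean_auc, half_width = 0.994, 0.003
interval = (round(mean_auc - half_width, 3), round(mean_auc + half_width, 3))
print(interval)  # (0.991, 0.997)
```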

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 0 minor

Summary. The manuscript presents a mass vs. non-mass classification pipeline for mammograms that extracts features from a pre-trained VGG19 network, applies bagged decision-tree feature selection, and trains either C-SVM or ν-SVM classifiers. All experiments are performed on the INbreast dataset; the headline result is an AUC of 0.994 ± 0.003 obtained after hyper-parameter tuning and classifier selection.

Significance. If the reported AUC is obtained without data leakage and generalizes beyond INbreast, the work would illustrate a practical transfer-learning route for mammography CAD. The combination of off-the-shelf CNN features with a lightweight selector and SVM is straightforward and could be useful for resource-constrained settings; however, the single-dataset design currently prevents any claim of robustness from being considered established.

major comments (3)
  1. [Experimental results / Methods] The abstract and results description provide no information on the train/test partitioning strategy, the number of folds in cross-validation, or whether the bagged decision-tree feature selection step was executed inside or outside each CV fold. Feature selection performed on the full dataset before CV introduces optimistic bias that directly undermines the validity of the reported AUC and its error bars.
  2. [Abstract and conclusion] The claim that the classifier 'robustly distinguish[es] between the mass and non-mass' rests entirely on INbreast. No external test set, multi-center collection, or cross-dataset experiment is described; differences in vendor, compression, or lesion-size distribution could therefore render the separation non-reproducible.
  3. [Methods] No description is given of patch extraction (mass and non-mass ROI definition), class balancing, or any preprocessing/augmentation pipeline. These choices are load-bearing for the mass/non-mass separation task and must be specified before the numerical result can be interpreted.
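The leakage concern in the first comment can be made concrete. A minimal sketch, assuming scikit-learn and synthetic data: placing the feature selector inside a pipeline means it is refit on the training portion of every fold, so held-out samples never influence which features are kept. The selector here is a `RandomForestClassifier` with `max_features=None` (equivalent to bagged decision trees) wrapped in `SelectFromModel` with a mean-importance threshold. That threshold is a swapped-in rule for illustration, not the 0.95 cumulative-importance rule from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for CNN features (assumption: not the INbreast patches).
X, y = make_classification(n_samples=300, n_features=100, n_informative=10,
                           random_state=0)

# max_features=None makes every tree consider all features at each split,
# so the forest behaves as plain bagged decision trees.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=50, max_features=None, random_state=0),
    threshold="mean",  # keep features at or above the mean importance
)

# The selector lives inside the pipeline, so cross_val_score refits it on the
# training portion of each fold; the held-out fold never affects selection.
pipeline = make_pipeline(StandardScaler(), selector, SVC(C=10.0, kernel="rbf"))

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_aucs = cross_val_score(pipeline, X, y, cv=cv, scoring="roc_auc")
print(fold_aucs.mean(), fold_aucs.std())
```

Running the selector once on all of `X` and then cross-validating only the SVM would let test-fold labels shape the feature set, which is the optimistic bias the referee describes.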

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to improve clarity and address validity concerns where appropriate.

Point-by-point responses
  1. Referee: [Experimental results / Methods] The abstract and results description provide no information on the train/test partitioning strategy, the number of folds in cross-validation, or whether the bagged decision-tree feature selection step was executed inside or outside each CV fold. Feature selection performed on the full dataset before CV introduces optimistic bias that directly undermines the validity of the reported AUC and its error bars.

    Authors: We agree the manuscript omitted these critical experimental details. Our protocol used 5-fold cross-validation on the INbreast dataset with the bagged decision-tree feature selection performed strictly inside each training fold (using only training data) to avoid leakage; the reported AUC and ±0.003 were obtained from the held-out folds. We will add a full description of the partitioning, fold count, and placement of feature selection in the revised Methods section. revision: yes

  2. Referee: [Abstract and conclusion] The claim that the classifier 'robustly distinguish[es] between the mass and non-mass' rests entirely on INbreast. No external test set, multi-center collection, or cross-dataset experiment is described; differences in vendor, compression, or lesion-size distribution could therefore render the separation non-reproducible.

    Authors: We acknowledge that all results are confined to INbreast and that the term 'robustly' overstates generalizability. The work was scoped as a demonstration on this standard public benchmark. We will revise the abstract and conclusion to remove or qualify the robustness claim, explicitly note the single-dataset limitation, and suggest multi-center validation as future work. revision: yes

  3. Referee: [Methods] No description is given of patch extraction (mass and non-mass ROI definition), class balancing, or any preprocessing/augmentation pipeline. These choices are load-bearing for the mass/non-mass separation task and must be specified before the numerical result can be interpreted.

    Authors: We agree these implementation details are missing. Positive patches were 224×224 ROIs centered on annotated masses; negative patches were randomly sampled from normal tissue regions. The dataset was balanced by random undersampling of the non-mass class; images were normalized to [0,1] with no augmentation applied. We will insert a dedicated 'Data Preparation' subsection in Methods describing patch extraction, ROI definition, balancing, and preprocessing. revision: yes
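The balancing and scaling steps named in the third response can be sketched as below. This is a hedged illustration using NumPy on random toy arrays: the function names, the per-patch min-max scaling, and the class counts are assumptions, since the response only says that the non-mass class was randomly undersampled and that images were normalized to [0,1].

```python
import numpy as np


def undersample_majority(X, y, rng):
    """Randomly drop samples from the larger class until both classes match."""
    idx0, idx1 = np.flatnonzero(y == 0), np.flatnonzero(y == 1)
    n = min(len(idx0), len(idx1))
    keep = np.concatenate([rng.choice(idx0, n, replace=False),
                           rng.choice(idx1, n, replace=False)])
    keep.sort()
    return X[keep], y[keep]


def scale_unit_range(patch):
    """Min-max scale a patch to [0, 1]; a constant patch maps to all zeros."""
    patch = patch.astype(np.float64)
    span = patch.max() - patch.min()
    return (patch - patch.min()) / span if span > 0 else np.zeros_like(patch)


rng = np.random.default_rng(0)
# Hypothetical toy data: 10 mass patches and 40 non-mass patches of 224x224.
patches = rng.integers(0, 4096, size=(50, 224, 224))
labels = np.array([1] * 10 + [0] * 40)

X_bal, y_bal = undersample_majority(patches, labels, rng)
X_bal = np.stack([scale_unit_range(p) for p in X_bal])
print(X_bal.shape, np.bincount(y_bal))
```

Undersampling discards data from the majority class, so the random seed should be fixed and reported if results are to be reproduced.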

Circularity Check

0 steps flagged

No circularity: purely empirical pipeline on public dataset with standard components

full rationale

The paper reports an empirical AUC of 0.994 on the INbreast dataset using a pre-trained VGG19 feature extractor, bagged decision tree selection, and SVM classifier. No equations, derivations, predictions, or uniqueness theorems are present. The result is a direct measurement on external data using off-the-shelf models; it does not reduce any claimed output to a fitted quantity or self-citation defined by the authors. This matches the default case of a self-contained empirical study with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

The central claim rests on the transferability of ImageNet-trained features to mammograms and the representativeness of a single public dataset; no new physical entities are postulated.

free parameters (2)
  • SVM hyperparameters (C, kernel parameters)
    Tuned via experiments to maximize AUC on the chosen dataset
  • Feature subset size after bagged decision tree selection
    Determined during the feature selection stage
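One plausible reading of Equation (1) in the Figure 5 caption (selected feature importance = 0.95 × total importance) is a cumulative cut-off: rank features by importance and keep the smallest top-ranked set whose importances sum to at least 95% of the total. The sketch below implements that reading with NumPy. The function name and the example scores are hypothetical, and the paper's exact procedure may differ.

```python
import numpy as np


def select_by_cumulative_importance(importances, fraction=0.95):
    """Return indices of the top-ranked features whose importances sum to
    at least `fraction` of the total importance."""
    importances = np.asarray(importances, dtype=float)
    order = np.argsort(importances)[::-1]        # most important first
    cumulative = np.cumsum(importances[order])
    target = fraction * importances.sum()
    # First position where the running sum reaches the target.
    k = int(np.searchsorted(cumulative, target)) + 1
    return order[:min(k, len(order))]


# Hypothetical importance scores for five features.
scores = [10.0, 50.0, 4.0, 30.0, 6.0]
selected = select_by_cumulative_importance(scores)
print(sorted(selected.tolist()))  # [0, 1, 3, 4]
```

Under this rule the subset size is not chosen directly. It falls out of the importance distribution, which would explain why different experiments report different numbers of selected features.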
axioms (2)
  • domain assumption: Features from VGG19 pre-trained on ImageNet transfer usefully to distinguish mass versus non-mass in mammograms without network fine-tuning
    The method description relies on direct use of the pre-trained network for feature extraction
  • domain assumption: The INbreast dataset is sufficient to demonstrate general robustness of the classifier
    All reported experiments and the final AUC are confined to this dataset

pith-pipeline@v0.9.0 · 5787 in / 1628 out tokens · 36336 ms · 2026-05-25T00:44:58.721228+00:00 · methodology

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages · 1 internal anchor

  1. [1]

    Bray, F. et al. Global cancer statistics 2018: Globocan estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 68, 394–424, DOI: 10.3322/caac.21492 (2018)

  2. [2]

    World Health Organization: Cancer. Available at: http://www.who.int/news-room/fact-sheets/detail/cancer

  3. [3]

    Hou, L. et al. Cancer statistics. Wiley Online Libr. 68, 7–30, DOI: 10.3322/caac.21551 (2018)

  4. [4]

    Breast cancer. Available at: https://www.kpwomenshealth.org/breast_health_breast_cancer.asp

  5. [5]

    Sree, S. V., Ng, E. Y., Acharya, R. U. & Fauste, O. Breast imaging: A survey. World J. Clin. Oncol. 2, 171–178, DOI: 10.5306/wjco.v2.i4.171 (2011)

  6. [6]

    Uematsu, T. Non-mass-like lesions on breast ultrasonography: a systematic review. Breast Cancer 19, 295–301, DOI: 10.1007/s12282-012-0364-z (2012)

  7. [7]

    Nunes, A., Silva, A. & Paiva, A. Detection of masses in mammographic images using geometry, simpson’s diversity index and svm. Int. J. Signal Imaging Syst. Eng. (IJSISE) 3, DOI: 10.1504/IJSISE.2010.034631 (2010)

  8. [8]

    Newell, D. et al. Selection of diagnostic features on breast mri to differentiate between malignant and benign lesions using computer-aided diagnosis: differences in lesions presenting as mass and non-mass-like enhancement. Eur. Radiol. 20, 771–781, DOI: 10.1007/s00330-009-1616-y (2010)

  9. [9]

    John, E. R. S. et al. Rapid evaporative ionisation mass spectrometry of electrosurgical vapours for the identification of breast pathology: towards an intelligent knife for breast cancer surgery. Breast Cancer Res. 19, DOI: 10.1186/s13058-017-0845-2 (2017)

  10. [10]

    Suzuki, S. et al. Mass detection using deep convolutional neural network for mammographic computer-aided diagnosis. Annu. Conf. Soc. Instrum. Control. Eng. Jpn. (SICE) 1382–1386, DOI: 10.1109/SICE.2016.7749265 (2016)

  11. [11]

    Wang, Y. et al. Computer-aided classification of mammographic masses using visually sensitive image features. J. X-Ray Sci. Technol. 25, 1–16, DOI: 10.3233/XST-16212 (2016)

  12. [12]

    Oliveira, F. S. S. D. et al. Classification of breast regions as mass and non-mass based on digital mammograms using taxonomic indexes and svm. Breast Cancer 57, 42–53, DOI: 10.1016/j.compbiomed.2014.11.016 (2015)

  13. [13]

    Varela, C. et al. Non-mass-like lesions on breast ultrasonography: a systematic review. Phys. Medicine & Biol. 51, DOI: 10.1088/0031-9155/51/2/016 (2006)

  14. [14]

    Wei, J. et al. Computer aided detection of breast masses on full field digital mammograms. Med. Phys. 32, 2827–2837, DOI: 10.1118/1.1997327 (2005)

  15. [15]

    Wichakam, I. Combining deep convolutional networks and svms for mass detection on digital mammograms. Int. Conf. on Knowl. Smart Technol. (KST) 239–244, DOI: 10.1109/KST.2016.7440527 (2016)

  16. [16]

    Ragab, D. A. et al. Breast cancer detection using deep convolutional neural networks and support vector machines. PeerJ DOI: 10.7717/peerj.6201 (2016)

  17. [17]

    Sampaio, W. B. et al. Detection of masses in mammogram images using cnn, geostatistic functions and svm. Comput. Biol Med 653–664, DOI: 10.1016/j.compbiomed.2011.05.017 (2011)

  18. [18]

    Wael, E. F. et al. A deep learning approach for breast cancer mass detection. Int. J. Adv. Comput. Sci. Appl. 10 (2019)

  19. [19]

    Moreira, I. C. et al. Inbreast: toward a full-field digital mammographic database. Acad. Radiol. 19, 236–248, DOI: 10.1016/j.acra.2011.09.014 (2012)

  20. [20]

    Seigel, R. L., Miller, D., K. & Jemal, A. Patch-based convolutional neural network for whole slide tissue image classification. IEEE Conf. on Comput. Vis. Pattern Recognit. (CVPR) DOI: 10.1109/CVPR.2016.266 (2016)

  21. [21]

    Simonyan, K. et al. Very deep convolutional networks for large-scale image recognition. Comput. Vis. Pattern Recognit. (cs.CV) arXiv:1409.1556 (2014)

  22. [22]

    Deng, J. et al. Imagenet: A large-scale hierarchical image database. IEEE Conf. on Comput. Vis. Pattern Recognit. DOI: 10.1109/CVPR.2009.5206848 (2009)

  23. [23]

    Guan, D. et al. A review of ensemble learning based feature selection. IETE Tech. Rev. 31, 190–198, DOI: 10.1080/02564602.2014.906859 (2014)

  24. [24]

    Wong, T. et al. Dependency analysis of accuracy estimates in k-fold cross validation. IEEE Transactions on Knowl. Data Eng. 29, 2417–2427, DOI: 10.1109/TKDE.2017.2740926 (2017)

  25. [25]

    Marzban, C. et al. The roc curve and the area under it as performance measures. Weather. Forecast. 19, 1106, DOI: 10.1175/825.1 (2004)

  26. [26]

    Liu, S. et al. A new weighted support vector machine with ga-based parameter selection. Int. Conf. on Mach. Learn. Cybern. DOI: 10.1109/ICMLC.2005.1527703 (2005)

  27. [27]

    Karatzoglou, A. et al. Support vector machines in r. J. Stat. Softw. 15 (2006)

  28. [28]

    Diaz, G. et al. An effective algorithm for hyperparameter optimization of neural networks. Ibm J. Res. Dev. 61, DOI: 10.1147/JRD.2017.2709578 (2017)

  29. [29]

    Kuo, B. et al. A kernel-based feature selection method for svm with rbf kernel for hyperspectral image classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens. 7, 317–326, DOI: 10.1109/JSTARS.2013.2262926 (2014)