arXiv: 2506.23334 · v3 · submitted 2025-06-29 · eess.IV · cs.AI · cs.CV

Federated Breast Cancer Detection Enhanced by Synthetic Ultrasound Image Augmentation

Pith reviewed 2026-05-19 07:28 UTC · model grok-4.3

classification: eess.IV · cs.AI · cs.CV
keywords: federated learning · breast ultrasound · synthetic data augmentation · generative adversarial networks · diffusion models · breast cancer classification · data privacy

The pith

Balanced synthetic ultrasound images from GANs and diffusion models raise average AUC in federated breast cancer detection from 0.9206 to 0.9362 for FedAvg and from 0.9429 to 0.9574 for FedProx.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that federated learning models for classifying breast ultrasound images benefit from adding controlled amounts of synthetic images created by deep convolutional generative adversarial networks and class-conditioned denoising diffusion probabilistic models. This tackles the common problems of small dataset sizes and non-identical data distributions across hospitals while avoiding any direct sharing of patient records. Tests across the BUSI, BUS-BRA, and UDIAT datasets show measurable gains in average area under the curve (AUC), but also show that using too many synthetic samples reverses the gains and lowers performance. The work therefore stresses the value of finding the right real-to-synthetic ratio for effective augmentation.

Core claim

Incorporating a suitable number of synthetic breast ultrasound images generated by a deep convolutional generative adversarial network and a class-conditioned denoising diffusion probabilistic model into federated training with FedAvg and FedProx improves average AUC from 0.9206 to 0.9362 for FedAvg and from 0.9429 to 0.9574 for FedProx on the BUSI, BUS-BRA, and UDIAT datasets, while excessive synthetic data reduces performance.
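
The reported gains can be restated as simple arithmetic. The sketch below is illustrative only: it takes the four average AUC values quoted above and computes the absolute gain and the share of the remaining gap to a perfect AUC of 1.0 that each gain closes. The names `REPORTED` and `auc_gain` are hypothetical and do not come from the paper.

```python
# Average AUC values quoted in the core claim: (baseline, with synthetic data).
REPORTED = {
    "FedAvg": (0.9206, 0.9362),
    "FedProx": (0.9429, 0.9574),
}


def auc_gain(baseline, augmented):
    """Return (absolute AUC gain, fraction of the gap to AUC = 1.0 that is closed)."""
    absolute = augmented - baseline
    gap_closed = absolute / (1.0 - baseline)
    return absolute, gap_closed


for name, (baseline, augmented) in REPORTED.items():
    absolute, gap_closed = auc_gain(baseline, augmented)
    print(f"{name}: +{absolute:.4f} AUC, {gap_closed:.1%} of the remaining gap")
```

On these figures the absolute gains are 0.0156 for FedAvg and 0.0145 for FedProx, which correspond to roughly 19.6% and 25.4% of the remaining gap to 1.0. These are point differences only; they say nothing about run-to-run variance.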

What carries the argument

Generative model-based data augmentation that produces synthetic ultrasound images with DCGAN and class-conditioned DDPM to supplement real samples during federated optimization without sharing patient data.
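
A minimal sketch of how a client might mix real and synthetic samples at a chosen ratio follows. It assumes per-class sampling, so that each class receives `ratio` times its real count in synthetic images; the source does not state whether the paper samples per class or per client. The function name `augment_local_dataset` and the `(image_id, label)` tuple layout are hypothetical.

```python
import random
from collections import defaultdict


def augment_local_dataset(real, synthetic_pool, ratio, seed=0):
    """Return real samples plus about ratio * (real count) synthetic samples per class.

    real, synthetic_pool: lists of (image_id, label) tuples (assumed layout).
    ratio: synthetic-to-real ratio, e.g. 0.5 or 1.0.
    """
    rng = random.Random(seed)

    # Group the synthetic pool by label so sampling preserves class balance.
    pool_by_label = defaultdict(list)
    for item in synthetic_pool:
        pool_by_label[item[1]].append(item)

    # Count the real samples available for each class on this client.
    real_counts = defaultdict(int)
    for _, label in real:
        real_counts[label] += 1

    mixed = list(real)
    for label, n_real in real_counts.items():
        candidates = pool_by_label[label]
        k = min(int(n_real * ratio), len(candidates))
        mixed.extend(rng.sample(candidates, k))

    rng.shuffle(mixed)
    return mixed
```

Everything here runs on the client, so only synthetic images and model updates are involved; the real images never leave the local list.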

If this is right

  • An optimal balance between real and synthetic samples is required to achieve the reported AUC gains in non-IID federated settings.
  • Both FedAvg and FedProx benefit from the same augmentation strategy, though FedProx reaches higher final performance.
  • Adding too many synthetic images lowers performance, which points to an upper limit on useful augmentation.
  • The method preserves privacy by keeping all original patient ultrasound data local to each institution.
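
For readers unfamiliar with the two optimizers named above, here is a minimal sketch of their standard forms. FedAvg averages client parameters weighted by local sample count, and FedProx adds a proximal term (mu/2)·||w − w_global||² to each client's local objective, which adds mu·(w − w_global) to the local gradient. Parameters are plain Python lists for brevity; the function names and any values of `mu` or `lr` are illustrative assumptions, not the paper's configuration.

```python
def fedavg(client_weights, client_sizes):
    """Server step: average client parameter vectors, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def fedprox_gradient(grad, w, w_global, mu):
    """Client step: add the gradient of (mu / 2) * ||w - w_global||^2 to the task gradient."""
    return [g + mu * (wi - wg) for g, wi, wg in zip(grad, w, w_global)]


def sgd_step(w, grad, lr):
    """Plain gradient descent update on a parameter vector."""
    return [wi - lr * g for wi, g in zip(w, grad)]
```

With `mu = 0` the FedProx gradient reduces to the plain task gradient, so the local update matches FedAvg's. A larger `mu` pulls each client's model back toward the global model, which is the usual motivation for FedProx on heterogeneous data.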

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same generative augmentation approach could be tested on other ultrasound-based diagnostic tasks that face similar data scarcity and privacy limits.
  • Measuring image quality metrics such as FID scores against real data might help predict the exact point at which extra synthetic samples stop helping.
  • Adaptive mechanisms that adjust the synthetic ratio during training based on local dataset size could reduce the need for manual tuning across sites.
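
To make the FID suggestion above concrete, the sketch below computes a simplified Fréchet distance between two sets of feature vectors. It is a deliberate simplification: FID as commonly computed uses Inception-v3 features and full covariance matrices with a matrix square root, whereas this version assumes diagonal covariances so that it runs with the standard library alone. The function name and the feature layout (one list of floats per image) are assumptions.

```python
import math
from statistics import fmean, pvariance


def diagonal_frechet_distance(feats_a, feats_b):
    """Fréchet distance between two Gaussians fitted per dimension (diagonal covariance).

    For each feature dimension: (mu_a - mu_b)^2 + var_a + var_b - 2 * sqrt(var_a * var_b).
    feats_a, feats_b: lists of equal-length feature vectors, one per image.
    """
    total = 0.0
    for d in range(len(feats_a[0])):
        col_a = [f[d] for f in feats_a]
        col_b = [f[d] for f in feats_b]
        mu_a, mu_b = fmean(col_a), fmean(col_b)
        var_a, var_b = pvariance(col_a), pvariance(col_b)
        total += (mu_a - mu_b) ** 2 + var_a + var_b - 2.0 * math.sqrt(var_a * var_b)
    return total
```

Identical feature sets give a distance of 0, and the value grows as the synthetic features drift from the real ones in mean or spread. Whether such a score actually predicts the point where extra synthetic data stops helping is an open question, not something the paper reports.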

Load-bearing premise

The synthetic images are realistic and close enough in distribution to real ultrasound scans that adding a controlled quantity improves generalization instead of adding artifacts or causing harmful distribution shift.

What would settle it

No AUC gain, or an actual drop, on held-out test portions of BUSI, BUS-BRA, or UDIAT when the reported quantities of synthetic images are added to training; or visual differences clear enough that synthetic scans are easily told apart from real ones.

Figures

Figures reproduced from arXiv: 2506.23334 by Gorkem Durak, Hongyi Pan, Ulas Bagci, Ziliang Hong, Ziyue Xu.

Figure 1: Architecture of the class-specific DCGAN used for ...
Figure 2: Breast ultrasound image samples from the BUSI, BUS ...
Figure 3: Synthetic breast ultrasound images generated by class ...

Original abstract

Federated learning enables collaborative training of deep learning models across institutions without sharing sensitive patient data. However, its performance is often limited by small datasets and non-independent, identically distributed data, which can impair model generalization. In this work, we propose a generative model-based data augmentation framework for breast ultrasound classification. It leverages synthetic images generated by deep convolutional generative adversarial networks and a class-conditioned denoising diffusion probabilistic model. Experiments on three publicly available datasets (BUSI, BUS-BRA, and UDIAT) demonstrated that incorporating a suitable number of synthetic images improved average AUC from 0.9206 to 0.9362 for FedAvg and from 0.9429 to 0.9574 for FedProx. Furthermore, we noticed that excessive use of synthetic data reduced performance. This highlights the importance of balancing real and synthetic samples. Our results underscore the potential of generative model-based augmentation to enhance federated breast ultrasound image classification.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes a generative model-based data augmentation framework for federated breast ultrasound classification. Synthetic images are generated via deep convolutional GANs and class-conditioned denoising diffusion probabilistic models, then added to training with FedAvg and FedProx. Experiments on the public BUSI, BUS-BRA, and UDIAT datasets report that a suitable number of synthetic images raises average AUC from 0.9206 to 0.9362 for FedAvg and from 0.9429 to 0.9574 for FedProx, while excessive synthetic data degrades performance.

Significance. If the reported AUC gains prove robust to augmentation-ratio choice and are supported by full experimental details, the work would offer a practical route to mitigating data scarcity and non-IID effects in privacy-preserving medical imaging. The multi-dataset evaluation and comparison of two standard federated algorithms provide a reasonable breadth of evidence for the augmentation strategy.

major comments (2)
  1. [Abstract] Abstract: the headline claim that 'incorporating a suitable number of synthetic images improved average AUC from 0.9206 to 0.9362 for FedAvg and from 0.9429 to 0.9574 for FedProx' while 'excessive use of synthetic data reduced performance' indicates that multiple augmentation ratios were evaluated and the best-performing count was selected post-hoc. Without a pre-specified protocol, a held-out validation set used solely for ratio selection, or reporting of performance across all tested ratios, the observed gains are vulnerable to selection bias. This is load-bearing for the central claim given the small dataset sizes and non-IID federated partitions.
  2. [Abstract] Abstract / results: the reported AUC deltas are supplied without error bars, standard deviations, exact counts of synthetic images added, or details of the integration protocol (local vs. global augmentation) and ablation studies on the two generative models. These omissions prevent assessment of statistical reliability and reproducibility of the claimed improvements.
minor comments (1)
  1. [Abstract] The abstract would benefit from briefly stating the number of clients/institutions and the class balance in the federated partitions to contextualize the non-IID setting.
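
The selection-bias concern in the first major comment can be illustrated with a small sketch. It contrasts choosing the augmentation ratio on a validation split and then reporting test AUC at that ratio, against reporting the best test AUC over all ratios. The AUC numbers in the usage lines are made-up placeholders, not results from the paper, and the function names are hypothetical.

```python
def select_ratio(val_auc_by_ratio):
    """Pick the augmentation ratio with the highest validation AUC."""
    return max(val_auc_by_ratio, key=val_auc_by_ratio.get)


def honest_and_optimistic(val_auc_by_ratio, test_auc_by_ratio):
    """Return (selected ratio, test AUC at that ratio, best test AUC over all ratios).

    The second value is what a pre-specified protocol would report; the third
    is what post-hoc selection on the test set would report.
    """
    chosen = select_ratio(val_auc_by_ratio)
    honest = test_auc_by_ratio[chosen]
    optimistic = max(test_auc_by_ratio.values())
    return chosen, honest, optimistic


# Placeholder values for illustration only (ratio -> AUC).
val_auc = {0.0: 0.90, 0.5: 0.93, 1.0: 0.92, 2.0: 0.91}
test_auc = {0.0: 0.91, 0.5: 0.92, 1.0: 0.94, 2.0: 0.90}
chosen, honest, optimistic = honest_and_optimistic(val_auc, test_auc)
print(chosen, honest, optimistic)
```

In this toy case the validation split picks ratio 0.5, whose test AUC is 0.92, while picking the best test value after the fact would report 0.94. On small datasets that difference can be comparable in size to the effect being claimed, which is why the referee asks for results at every tested ratio.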

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments on our manuscript. We address each major comment below with clarifications and indicate where revisions will be made to improve transparency and reproducibility.

point-by-point responses
  1. Referee: [Abstract] Abstract: the headline claim that 'incorporating a suitable number of synthetic images improved average AUC from 0.9206 to 0.9362 for FedAvg and from 0.9429 to 0.9574 for FedProx' while 'excessive use of synthetic data reduced performance' indicates that multiple augmentation ratios were evaluated and the best-performing count was selected post-hoc. Without a pre-specified protocol, a held-out validation set used solely for ratio selection, or reporting of performance across all tested ratios, the observed gains are vulnerable to selection bias. This is load-bearing for the central claim given the small dataset sizes and non-IID federated partitions.

    Authors: We acknowledge the referee's concern about potential post-hoc selection bias. Our experiments did evaluate multiple augmentation ratios (0.5x, 1x, 2x, and 3x relative to real data volume per client), with performance improving up to an optimal point before degrading due to synthetic data distribution shift. To strengthen the manuscript, we have added an explicit description of the ratio selection process, which used per-client validation splits held out from local training data. We also include a new table reporting AUC values for every tested ratio on all three datasets (BUSI, BUS-BRA, UDIAT), confirming the selected ratio lies within a consistent performance plateau rather than an isolated peak. These changes appear in the revised abstract, Section 4, and supplementary material.
    revision: yes

  2. Referee: [Abstract] Abstract / results: the reported AUC deltas are supplied without error bars, standard deviations, exact counts of synthetic images added, or details of the integration protocol (local vs. global augmentation) and ablation studies on the two generative models. These omissions prevent assessment of statistical reliability and reproducibility of the claimed improvements.

    Authors: We agree that the original presentation lacked sufficient statistical and procedural detail. The revised manuscript now reports standard deviations computed over five independent runs with different random seeds for both FedAvg and FedProx. Exact synthetic image counts are stated (e.g., 400 benign and 250 malignant synthetic images added per client at the optimal ratio). The protocol is clarified as local augmentation performed independently at each client site prior to federated rounds. New ablation results compare DCGAN-only, diffusion-only, and combined augmentation, showing additive gains from both generators. These updates are incorporated into Section 4 and the supplementary materials.
    revision: yes
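
The reporting described in the simulated responses above (mean and standard deviation over five seeds, and synthetic counts expressed as multiples of the real data volume) can be sketched as follows. The per-run AUC values and the client size of 200 are placeholders chosen for illustration, not figures from the paper, and both function names are hypothetical.

```python
import math
import statistics


def summarize_runs(aucs):
    """Return (mean, sample standard deviation) of AUC over independent runs."""
    return statistics.mean(aucs), statistics.stdev(aucs)


def synthetic_counts(n_real, ratios=(0.5, 1.0, 2.0, 3.0)):
    """Map each augmentation ratio to the number of synthetic images it implies."""
    return {ratio: math.floor(n_real * ratio) for ratio in ratios}


# Placeholder AUCs from five seeds (illustration only).
runs = [0.930, 0.940, 0.935, 0.945, 0.940]
mean_auc, std_auc = summarize_runs(runs)
print(f"AUC = {mean_auc:.4f} ± {std_auc:.4f}")

# A hypothetical client holding 200 real images.
print(synthetic_counts(200))
```

Reporting the spread alongside the mean lets a reader judge whether a gain of about 0.015 AUC is larger than the variation between seeds.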

Circularity Check

0 steps flagged

No circularity: purely empirical results on public benchmarks with no derivation chain

The paper reports experimental AUC improvements from adding synthetic ultrasound images generated by DCGAN and class-conditioned DDPM to federated training (FedAvg and FedProx) on the BUSI, BUS-BRA, and UDIAT datasets. No equations, first-principles derivations, or parameter-fitting procedures are described that could reduce to their own inputs by construction. The phrase 'suitable number of synthetic images' reflects standard hyperparameter exploration (with the observation that excess synthetics hurt performance), but this is not a fitted-input-called-prediction or self-definitional loop; it is ordinary model selection on held-out validation splits. No self-citations are load-bearing for any uniqueness theorem or ansatz. The work is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The claim depends on empirical performance gains rather than new theory. The only notable free parameter is the count of synthetic images, chosen after observing performance degradation with excess data. No new entities are postulated.

free parameters (1)
  • number of synthetic images
    The abstract states that a suitable number improves results while excessive use reduces performance, indicating this quantity was selected empirically to achieve the reported AUC gains.
axioms (1)
  • domain assumption: Synthetic images from DCGAN and class-conditioned DDPM can be mixed with real ultrasound data to improve generalization in federated settings without introducing bias
    This assumption is required for the generative augmentation framework to deliver the claimed AUC lift.
