arxiv: 2606.08033 · v1 · pith:LGDDEOYKnew · submitted 2026-06-06 · 💻 cs.CV · cs.LG

Balancing Real and Synthetic Data for CNN-based Masonry Crack Detection

classification 💻 cs.CV cs.LG
keywords data, real, synthetic, crack, training, dataset, detection, images

Cracks are a critical indicator of building health, and early-stage identification is fundamental to preventing serious damage. Advances in deep learning (DL), particularly convolutional neural networks (CNNs), have enabled scalable solutions for automated crack detection. However, CNN performance strongly depends on the availability of large and diverse datasets, which is particularly challenging for complex surfaces such as masonry. Collecting sufficient real data is time-consuming, while publicly available datasets may not be adequate. To address this limitation, we explored generating synthetic crack data, which complements real data and improves training effectiveness. The real dataset consists of masonry crack images collected from buildings in Bologna and surrounding areas. In contrast, the synthetic dataset was generated using a crack overlay tool that adds cracks to background images with controlled orientation and placement. The real dataset was used to train several DL architectures and to identify the best-performing model (InceptionV4), which was then used for the experiments with generated data. Six training scenarios were tested with InceptionV4 by varying the ratio of real and synthetic data, with evaluation performed on a test set composed of real images using the F1-score and mean Intersection over Union (mIoU) metrics. Results show that training on synthetic data plus a modest addition of 20% real data achieves results comparable to training on real data only. Moreover, the 20/80 scenario (synthetic/real) achieved a 76% F1-score and 80% mIoU, outperforming the real-only case. These results demonstrate the potential of synthetic data to reduce collection efforts while enhancing crack detection accuracy.
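The experimental setup described in the abstract (mixing real and synthetic samples at a fixed ratio, then scoring with F1 and mIoU) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `mix_datasets`, `f1_score`, and `mean_iou`, the sampling strategy, and the assumption that predictions and labels are flat sequences of binary values (1 = crack, 0 = background) are all hypothetical choices made for illustration.

```python
import random


def mix_datasets(real, synthetic, synthetic_fraction, total, seed=0):
    """Draw `total` training samples, a given fraction of them synthetic.

    Hypothetical helper: the abstract only states that the real/synthetic
    ratio was varied, not how the samples were selected.
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must lie in [0, 1]")
    n_syn = round(total * synthetic_fraction)
    n_real = total - n_syn
    if n_syn > len(synthetic) or n_real > len(real):
        raise ValueError("not enough samples for the requested mix")
    rng = random.Random(seed)
    mixed = rng.sample(synthetic, n_syn) + rng.sample(real, n_real)
    rng.shuffle(mixed)
    return mixed


def f1_score(pred, target):
    """F1 for the positive (crack) class over flat binary labels."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # Convention assumed here: no positives anywhere counts as a perfect score.
    return 2 * tp / denom if denom else 1.0


def mean_iou(pred, target, classes=(0, 1)):
    """Mean Intersection over Union, averaged over classes that occur."""
    ious = []
    for c in classes:
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union:  # skip classes absent from both prediction and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 1.0


if __name__ == "__main__":
    # Placeholder sample identifiers, not real image data.
    real = [("real", i) for i in range(100)]
    synthetic = [("synthetic", i) for i in range(100)]
    # 20/80 synthetic/real, the split convention quoted in the abstract.
    train = mix_datasets(real, synthetic, synthetic_fraction=0.2, total=50)
    print(sum(1 for kind, _ in train if kind == "synthetic"), "synthetic samples")

    pred = [1, 1, 0, 0]
    target = [1, 0, 0, 0]
    print("F1:", f1_score(pred, target))
    print("mIoU:", mean_iou(pred, target))
```

Sampling without replacement and a fixed seed keep each scenario reproducible. Evaluating only on real images, as the abstract describes, would mean passing predictions from a real-only test set to `f1_score` and `mean_iou` regardless of the training mix.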
