Structure-Guided Mixed Masked Pretraining and Spatial Continuity Regularization for Printed Circuit Board Defect Detection
Pith reviewed 2026-06-28 10:17 UTC · model grok-4.3
The pith
A two-phase framework of structure-guided masked pretraining on unlabeled PCB images and spatial continuity regularization during fine-tuning improves detection of tiny defects.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The proposed framework combines structure-guided mixed masked pretraining with spatial continuity regularization. In pretraining, structure-guided mixed masking constructs informative masked inputs for sparse convolutional reconstruction that suppresses invalid responses from masked regions and enables inference of missing PCB structures from visible conductive patterns to learn structural priors. In the fine-tuning stage, the spatial continuity regularization term constrains dispersed positive predictions assigned to the same defect instance to promote compact localization on elongated defect regions. On the DsPCBSD+ dataset, it achieves 85.5% mAP0.5 and 52.3% mAP0.5:0.95, outperforming bas
What carries the argument
Structure-guided mixed masked pretraining scheme using sparse convolutional reconstruction to learn PCB structural priors from visible patterns.
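As a rough illustration of how a sparse convolutional pipeline can keep masked regions from producing responses, the sketch below emulates sparse convolution densely: masked inputs are ignored and outputs at masked positions are forced to zero. This is an assumption about the general technique, not the paper's implementation; the function name `masked_conv3x3` and the choice of a 3x3 averaging filter are hypothetical.

```python
def masked_conv3x3(image, mask):
    """Dense emulation of a sparse 3x3 averaging filter.

    image: 2D list of floats; mask: 2D list of 0/1 (1 = visible).
    Masked inputs contribute nothing, and outputs at masked positions
    are forced to zero so no response leaks into masked regions.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue  # suppress the response at masked positions
            total, count = 0.0, 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx]:
                        total += image[ny][nx]
                        count += 1
            # count >= 1 here because the centre pixel itself is visible
            out[y][x] = total / count
    return out
```

A real backbone would use learned kernels and a sparse tensor library; the point here is only the mask handling on both the input and output side.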
Load-bearing premise
The structure-guided mixed masking constructs informative masked inputs such that the sparse convolutional reconstruction pipeline suppresses invalid responses from masked regions and enables the detector backbone to infer missing PCB structures from visible conductive patterns, thereby learning PCB structural priors.
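One way to read "structure-guided mixed masking" is sketched below, purely as an assumption: part of the masked patches is chosen where local structure (here, summed intensity differences) is strongest, and the rest is drawn uniformly at random. The abstract does not specify the structure cue or the mixing rule, so `patch_structure_scores`, `mixed_mask`, `mask_ratio`, and `guided_frac` are hypothetical names and parameters.

```python
import random

def patch_structure_scores(image, patch):
    """Per-patch sum of absolute differences to right and lower neighbours."""
    h, w = len(image), len(image[0])
    scores = {}
    for py in range(0, h, patch):
        for px in range(0, w, patch):
            s = 0.0
            for y in range(py, min(py + patch, h)):
                for x in range(px, min(px + patch, w)):
                    if x + 1 < w:
                        s += abs(image[y][x + 1] - image[y][x])
                    if y + 1 < h:
                        s += abs(image[y + 1][x] - image[y][x])
            scores[(py // patch, px // patch)] = s
    return scores

def mixed_mask(image, patch=2, mask_ratio=0.5, guided_frac=0.5, seed=0):
    """Return the set of masked patch indices (row, col).

    A guided_frac share of the masked patches comes from the patches with
    the strongest structure; the remainder is sampled uniformly at random
    from the other patches. Assumed mixing rule, for illustration only.
    """
    rng = random.Random(seed)
    scores = patch_structure_scores(image, patch)
    n_mask = round(mask_ratio * len(scores))
    n_guided = round(guided_frac * n_mask)
    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    guided = ranked[:n_guided]
    random_part = rng.sample(ranked[n_guided:], n_mask - n_guided)
    return set(guided) | set(random_part)
```

Whether the authors mix guided and random masks in this way, or mix masking granularities instead, cannot be determined from the abstract alone.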
What would settle it
An experiment showing that removing the structure-guided masking or the spatial continuity regularization does not decrease the mAP scores on DsPCBSD+ would falsify the central claim.
Original abstract
Printed circuit board (PCB) defect detection is an essential part of automated optical inspection (AOI); yet it remains challenging in practice because many defects are tiny, low-contrast, and embedded in dense circuit backgrounds. To address these issues, this paper presents a two-phase PCB defect detection framework that combines structure-guided mixed masked pretraining with spatial continuity regularization. In the pretraining stage, we design a sparse convolutional masked pretraining scheme to exploit unlabeled PCB images, where structure-guided mixed masking is used to construct informative masked inputs. The sparse convolutional reconstruction pipeline suppresses invalid responses from masked regions and enables the detector backbone to infer missing PCB structures from visible conductive patterns, thereby learning PCB structural priors. In the fine-tuning stage, the pretrained backbone is transferred to the downstream defect detection task. For the task, a spatial continuity regularization term is introduced during fine-tuning. This term constrains dispersed positive predictions assigned to the same defect instance and promotes more compact localization on elongated defect regions. Experiments on the DsPCBSD+ dataset show that the proposed method achieves 85.5% mAP0.5 and 52.3% mAP0.5:0.95, outperforming several strong baseline detectors. Ablation studies and qualitative results further confirm the effectiveness of the proposed framework for robust PCB defect detection in industrial AOI scenarios.
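The abstract does not give the form of the spatial continuity regularization term. A minimal sketch of one plausible form, stated as an assumption, is a score-weighted spread penalty: for each defect instance, measure how far its positive prediction locations lie from their weighted centroid. The function name and the input layout `(instance_id, x, y, score)` are hypothetical.

```python
def spatial_continuity_penalty(positives):
    """positives: list of (instance_id, x, y, score) for positive predictions.

    For each instance, computes the score-weighted mean squared distance of
    its positive locations from their weighted centroid, then averages over
    instances. Tightly clustered positives give a small penalty.
    """
    groups = {}
    for inst, x, y, s in positives:
        groups.setdefault(inst, []).append((x, y, s))
    if not groups:
        return 0.0
    total = 0.0
    for pts in groups.values():
        wsum = sum(s for _, _, s in pts)
        if wsum == 0:
            continue  # no confident positives for this instance
        cx = sum(x * s for x, _, s in pts) / wsum
        cy = sum(y * s for _, y, s in pts) / wsum
        total += sum(s * ((x - cx) ** 2 + (y - cy) ** 2) for x, y, s in pts) / wsum
    return total / len(groups)
```

A centroid-based spread penalizes elongation as well as dispersion, so a term meant for elongated defects would likely need to account for the instance's shape; this sketch only shows the grouping-by-instance idea.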
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a two-phase framework for printed circuit board (PCB) defect detection. It combines structure-guided mixed masked pretraining, which uses sparse convolutional reconstruction on unlabeled PCB images to learn structural priors, with a spatial continuity regularization term applied during fine-tuning to constrain dispersed predictions and improve localization of elongated defects. On the DsPCBSD+ dataset, the authors report 85.5% mAP0.5 and 52.3% mAP0.5:0.95, outperforming several strong baseline detectors, with ablation studies and qualitative results offered in support of the framework's effectiveness.
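For readers unfamiliar with the two reported metrics, the sketch below shows the intersection-over-union (IoU) test that underlies them. It assumes the common COCO-style convention in which mAP0.5 uses a single IoU threshold of 0.5 and mAP0.5:0.95 averages over thresholds from 0.50 to 0.95 in steps of 0.05; the source does not state the evaluation protocol, and `matched_thresholds` is a hypothetical helper.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [0.5 + 0.05 * i for i in range(10)]

def matched_thresholds(pred, gt):
    """Thresholds at which pred would count as a match for gt."""
    v = iou(pred, gt)
    # small tolerance guards against floating-point error in the threshold list
    return [round(t, 2) for t in IOU_THRESHOLDS if v >= t - 1e-9]
```

A prediction with IoU 0.8 passes the 0.5 threshold but fails the three strictest ones, which is why the averaged metric is typically much lower than mAP0.5, especially for tiny boxes where a shift of a few pixels changes IoU sharply.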
Significance. If the empirical results hold, this work could contribute to improving automated optical inspection (AOI) systems by better handling tiny, low-contrast defects in dense circuit backgrounds through self-supervised pretraining and task-specific regularization. The use of unlabeled data and the regularization for compact localization are potentially valuable for industrial applications.
major comments (1)
- [Abstract] Abstract: The abstract states performance numbers and claims outperformance but supplies no experimental details, baseline definitions, ablation tables, or error analysis; without these it is impossible to verify whether the reported gains are attributable to the proposed components (structure-guided mixed masking and spatial continuity regularization).
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address the single major comment below and will revise the manuscript accordingly.
- Referee: [Abstract] Abstract: The abstract states performance numbers and claims outperformance but supplies no experimental details, baseline definitions, ablation tables, or error analysis; without these it is impossible to verify whether the reported gains are attributable to the proposed components (structure-guided mixed masking and spatial continuity regularization).
  Authors: We agree that the abstract would benefit from additional context to substantiate the reported gains. In the revised manuscript we will expand the abstract to briefly reference the DsPCBSD+ dataset, note that comparisons are performed against multiple strong detectors (including YOLO variants and two-stage detectors), and state that ablation studies isolate the contributions of structure-guided mixed masked pretraining and spatial continuity regularization. Full tables, baseline definitions, and error analysis will continue to appear in the main body due to length constraints, but the added abstract phrasing will allow readers to better attribute the improvements to the proposed components.
  Revision: yes
Circularity Check
No significant circularity identified
The paper describes an empirical two-stage method: structure-guided mixed masked pretraining on unlabeled PCB images to learn structural priors via sparse convolutional reconstruction, followed by fine-tuning on the defect detection task with an added spatial continuity regularization term. The reported results (85.5% mAP0.5 and 52.3% mAP0.5:0.95 on DsPCBSD+) are measured empirical outcomes on a held-out dataset, not quantities obtained by fitting parameters to a subset and then renaming the fit as a prediction, nor by self-definitional equations, nor by load-bearing self-citations that reduce the central claim to prior author work. No derivation chain, uniqueness theorem, or ansatz is presented that collapses to the inputs by construction. The framework is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Pretraining on unlabeled PCB images via structure-guided masking transfers useful structural priors to the downstream supervised defect detection task.