FPCNet: Fast Pavement Crack Detection Network Based on Encoder-Decoder Architecture
Pith reviewed 2026-05-25 09:25 UTC · model grok-4.3
The pith
FPCNet uses an encoder-decoder structure with multi-dilation and SE-upsampling modules to learn multi-context crack features for fast pixel-level pavement crack detection.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that two modules, integrated in an encoder-decoder network, yield fast pixel-level crack detection that outperforms prior methods on the CFD and G45 datasets under varied crack types and shooting conditions. The Multi-Dilation module synthesizes crack features at multiple context sizes via dilated convolution with multiple rates, so that cracks of different widths and topologies can be described. The SE-Upsampling module optimizes those features using Squeeze-and-Excitation learning. FPCNet combines the two and refines the features step by step.
What carries the argument
The Multi-Dilation module that synthesizes multi-context crack features via dilated convolutions at multiple rates, paired with the SE-Upsampling module that refines them through squeeze-and-excitation, inside an encoder-decoder network.
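To make the multi-rate idea concrete, here is a minimal pure-Python sketch, not the authors' implementation. A k x k kernel with dilation rate d samples inputs d pixels apart, so it spans k + (k - 1)(d - 1) pixels per side. The rates (1, 2, 4), the function names, and the choice to merge branches by summation are illustrative assumptions; this page does not state the rates or the fusion used in FPCNet.

```python
def effective_kernel_size(k, d):
    """Side length, in pixels, spanned by a k x k kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)


def dilated_conv2d(image, kernel, d):
    """Same-size 2D dilated cross-correlation with zero padding."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    c = k // 2  # index of the kernel centre
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy = y + (i - c) * d  # taps are spaced d pixels apart
                    xx = x + (j - c) * d
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[i][j] * image[yy][xx]
            out[y][x] = acc
    return out


def multi_dilation(image, kernel, rates=(1, 2, 4)):
    """Sum the responses of one kernel applied at several rates (assumed fusion)."""
    h, w = len(image), len(image[0])
    fused = [[0.0] * w for _ in range(h)]
    for d in rates:
        branch = dilated_conv2d(image, kernel, d)
        for y in range(h):
            for x in range(w):
                fused[y][x] += branch[y][x]
    return fused
```

With a 3 x 3 kernel, rates 1, 2, and 4 span 3, 5, and 9 pixels per side while using the same nine weights, which is why dilation is a cheap way to widen context for thick and thin cracks alike.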
If this is right
- Crack features are learned automatically with multiple contexts instead of being manually designed.
- The network handles complicated topological structures and contextual information around cracks.
- End-to-end pixel-level detection runs faster than previous state-of-the-art methods.
- Performance gains appear across multiple public datasets with varied crack types and conditions.
Where Pith is reading between the lines
- The same encoder-decoder pattern with dilation and channel attention could apply to detecting linear defects in other materials such as concrete structures or manufactured surfaces.
- Because the method emphasizes speed, it may support real-time processing on vehicle-mounted cameras during routine road surveys.
- Combining the network output with GPS data could produce automated maintenance priority maps without additional manual annotation.
Load-bearing premise
The Multi-Dilation and SE-Upsampling modules will continue to synthesize and optimize useful crack features across unseen crack topologies, widths, and imaging conditions without requiring dataset-specific retraining.
What would settle it
Running FPCNet on a fresh dataset of pavement images with crack patterns or lighting conditions absent from the CFD and G45 sets, and finding that it no longer exceeds the accuracy or speed of conventional feature-based detectors, would falsify the claim.
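Any such test needs an agreed pixel-level score. The sketch below computes precision, recall, and F1 from a predicted binary mask and a ground-truth mask. It assumes exact pixel matching; this page does not say which metric or matching tolerance the paper uses, and some crack benchmarks accept predictions within a few pixels of an annotated crack. The function name `pixel_prf` is hypothetical.

```python
def pixel_prf(pred, truth):
    """Precision, recall, and F1 for two binary masks of equal shape.

    Assumes exact pixel matching (no distance tolerance).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1  # crack pixel correctly detected
            elif p and not t:
                fp += 1  # background marked as crack
            elif t and not p:
                fn += 1  # crack pixel missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```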
Original abstract
Timely, accurate and automatic detection of pavement cracks is necessary for making cost-effective decisions concerning road maintenance. Conventional crack detection algorithms focus on the design of single or multiple crack features and classifiers. However, complicated topological structures, varying degrees of damage and oil stains make the design of crack features difficult. In addition, the contextual information around a crack is not investigated extensively in the design process. Accordingly, these design features have limited discriminative adaptability and cannot fuse effectively with the classifiers. To solve these problems, this paper proposes a deep learning network for pavement crack detection. Using the Encoder-Decoder structure, crack characteristics with multiple contexts are automatically learned, and end-to-end crack detection is achieved. Specifically, we first propose the Multi-Dilation (MD) module, which can synthesize the crack features of multiple context sizes via dilated convolution with multiple rates. The crack MD features obtained in this module can describe cracks of different widths and topologies. Next, we propose the SE-Upsampling (SEU) module, which uses the Squeeze-and-Excitation learning operation to optimize the MD features. Finally, the above two modules are integrated to develop the fast crack detection network, namely, FPCNet. This network continuously optimizes the MD features step-by-step to realize fast pixel-level crack detection. Experiments are conducted on challenging public CFD datasets and G45 crack datasets involving various crack types under different shooting conditions. The distinct performance and speed improvements over all the datasets demonstrate that the proposed method outperforms other state-of-the-art crack detection methods.
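The Squeeze-and-Excitation operation named in the abstract can be sketched in a few lines of plain Python. It follows the general SE recipe: average each channel to a scalar (squeeze), pass the resulting vector through two small dense layers ending in a sigmoid (excitation), then scale each channel by its gate. The weights `w1`, `b1`, `w2`, `b2` and all function names are hypothetical placeholders, and how the SEU module combines this step with upsampling is not described on this page, so that part is left out.

```python
import math


def squeeze(features):
    """Global average pool: one scalar per channel (each channel is H x W)."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]


def dense(vec, weights, bias):
    """Fully connected layer: weights has one row per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def excite(z, w1, b1, w2, b2):
    """Dense -> ReLU -> dense -> sigmoid, giving one gate in (0, 1) per channel."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]


def se_recalibrate(features, w1, b1, w2, b2):
    """Scale every channel by its learned gate."""
    gates = excite(squeeze(features), w1, b1, w2, b2)
    return [[[g * v for v in row] for row in ch] for ch, g in zip(features, gates)]
```

In a trained network the gates differ per channel, so informative channels are kept near full strength and uninformative ones are suppressed.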
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes FPCNet, an encoder-decoder network for pixel-level pavement crack detection. It introduces a Multi-Dilation (MD) module using dilated convolutions at multiple rates to capture crack features across varying widths and topologies, and an SE-Upsampling (SEU) module that applies Squeeze-and-Excitation to re-weight and optimize those features. The integrated network is evaluated on the CFD and G45 datasets, with the central claim being that FPCNet outperforms prior state-of-the-art methods in both detection performance and inference speed.
Significance. If the empirical results are reproducible with proper controls, the work could contribute a practical, relatively lightweight architecture for automated road inspection. The combination of multi-rate dilation and channel attention is a standard but well-motivated way to address multi-scale context in segmentation; explicit credit is due for targeting both accuracy and speed, which is relevant for deployment on resource-constrained hardware.
major comments (2)
- [§4] §4 (Experiments): The abstract and reported results claim clear outperformance on CFD and G45, yet no information is supplied on training protocols, random seeds, data splits, number of runs, or statistical significance testing. Without these, it is impossible to determine whether the reported gains are robust or could be explained by post-hoc hyper-parameter choices or favorable splits.
- [§4] §4, Table 2/3 (presumed results tables): No ablation study isolating the contribution of the MD module versus the SEU module is presented. Because the central claim attributes gains to these two modules, the absence of controlled ablations leaves the load-bearing architectural novelty unverified.
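The reporting requested in the first major comment could look like the following sketch: a data split that is fully determined by a seed, and a mean and sample standard deviation over repeated runs. The split fraction, seed values, and function names are hypothetical; nothing here reflects the protocol actually used in the paper.

```python
import random
import statistics


def split_indices(n_items, train_fraction, seed):
    """Shuffle item indices with a fixed seed and cut them into train and test."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    indices = list(range(n_items))
    rng.shuffle(indices)
    cut = int(n_items * train_fraction)
    return indices[:cut], indices[cut:]


def summarize(scores):
    """Mean and sample standard deviation of a metric over repeated runs."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std
```

Reporting the seed with each split lets a reader regenerate it exactly, and reporting the spread across runs shows whether a gain exceeds run-to-run variation.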
minor comments (2)
- [§3] Notation for dilation rates and channel dimensions in the MD and SEU modules should be defined explicitly in §3 before the equations are used.
- [Figure 5] Figure captions should state the exact input resolution and output stride used for the reported FPS numbers.
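Speed figures are only comparable when the measurement procedure is stated alongside them. The sketch below shows one common way to time inference: discard a few warm-up calls, then divide the number of timed calls by the elapsed wall-clock time. `infer` and `frame` are hypothetical stand-ins for a model's forward pass and a fixed-resolution input, and the warm-up and run counts are arbitrary defaults.

```python
import time


def measure_fps(infer, frame, warmup=5, runs=50):
    """Frames per second of infer(frame), excluding warm-up calls."""
    for _ in range(warmup):
        infer(frame)  # warm caches and lazy initialisation; not timed
    start = time.perf_counter()
    for _ in range(runs):
        infer(frame)
    elapsed = time.perf_counter() - start
    return runs / elapsed if elapsed > 0 else float("inf")
```

On a GPU, where kernels typically run asynchronously, `infer` would also need to wait for the device to finish before returning, otherwise the clock stops too early.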
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to improve reproducibility and verification of the proposed modules.
- Referee: [§4] §4 (Experiments): The abstract and reported results claim clear outperformance on CFD and G45, yet no information is supplied on training protocols, random seeds, data splits, number of runs, or statistical significance testing. Without these, it is impossible to determine whether the reported gains are robust or could be explained by post-hoc hyper-parameter choices or favorable splits.
  Authors: We agree that the manuscript does not provide these experimental details, which limits assessment of result robustness. In the revised version, we will add a dedicated subsection in §4 describing the training protocols, data splits, random seeds, number of runs, and any statistical significance tests.
  Revision: yes
- Referee: [§4] §4, Table 2/3 (presumed results tables): No ablation study isolating the contribution of the MD module versus the SEU module is presented. Because the central claim attributes gains to these two modules, the absence of controlled ablations leaves the load-bearing architectural novelty unverified.
  Authors: We recognize that the absence of an ablation study leaves the individual contributions of the MD and SEU modules unverified. We will add an ablation study to the revised manuscript that isolates the effect of each module.
  Revision: yes
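An ablation that isolates two modules needs four variants: neither module, each one alone, and both together. The sketch below enumerates that grid. The module labels come from the paper; the function name and the naming of variants are hypothetical, and training and evaluating each variant is left to the caller.

```python
from itertools import product


def ablation_variants(modules=("MD", "SEU")):
    """Every on/off combination of the given modules, as (name, config) pairs."""
    variants = []
    for flags in product((False, True), repeat=len(modules)):
        config = dict(zip(modules, flags))
        enabled = [m for m in modules if config[m]]
        name = "+".join(enabled) if enabled else "baseline"
        variants.append((name, config))
    return variants
```

For the default arguments this yields the variants baseline, SEU, MD, and MD+SEU, each paired with a dictionary of switches that a training script could read.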
Circularity Check
No significant circularity detected
The paper presents an empirical architecture for crack detection using an encoder-decoder network with proposed MD and SEU modules. The central claim is that FPCNet outperforms prior methods on CFD and G45 datasets in accuracy and speed, supported by experimental results on held-out test sets. No derivation chain, equations, or self-citations reduce any claimed result to fitted parameters or prior self-work by construction. The modules are described as learned feature synthesizers, and performance is evaluated externally rather than forced by internal definitions or renamings. The work is self-contained against external benchmarks.