Pixel-Level Pavement Distress Assessment Using Instance Segmentation

Bibesh Pyakurel (University of Wisconsin - Green Bay); Kong Pheng Yang (University of Wisconsin - Green Bay); Logan Dewick (University of Wisconsin - Green Bay); M. G. Sarwar Murshed (University of Wisconsin - Green Bay); Nazim Choudhury (University of Wisconsin - Green Bay)

arxiv: 2605.26095 · v1 · pith:QUO3ME4Cnew · submitted 2026-05-25 · 💻 cs.CV

Pith reviewed 2026-06-29 22:34 UTC · model grok-4.3

classification 💻 cs.CV

keywords pavement distress · instance segmentation · Mask R-CNN · crack detection · pothole detection · road maintenance · computer vision · UWGB-StreetCrack

The pith

Mask R-CNN instance segmentation produces pixel-level masks that let aggregate crack area on pavement images match manual ground truth to within 0.006 percentage points.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether instance segmentation can move pavement assessment from image-level labels or bounding boxes to exact per-pixel localization of thin, branching cracks and potholes. It trains five Mask R-CNN variants on a custom set of smartphone images collected from roadways and manually outlined with polygons for four distress classes. The strongest model reaches 84.23 percent precision and 90.04 percent recall under a bounding-box matching rule, then delivers a total predicted crack-area fraction of 2.164 percent, which is 0.006 percentage points below the labeled fraction of 2.170 percent. This level of agreement matters because maintenance crews need quantified extent, not just presence, to schedule repairs. The same setup also shows clear gains over a YOLO detector retrained on the same data.

Core claim

A Mask R-CNN model with a ResNet-101 FPN backbone, fine-tuned on the UWGB-StreetCrack dataset of polygon-annotated roadway images, localizes longitudinal, transverse, and alligator cracks as well as potholes at the pixel level. It yields an aggregate crack-area fraction of 2.164 percent that closely matches the 2.170 percent ground-truth fraction, while attaining 84.23 percent precision, 90.04 percent recall and 87.04 percent F1 under the project's bounding-box protocol.
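As a quick consistency check on the reported metrics, the F1 score can be recomputed as the harmonic mean of precision and recall. The sketch below is only arithmetic on the figures quoted above; the function name `f1_score` is illustrative and not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the text, in percent.
precision, recall = 84.23, 90.04
f1 = f1_score(precision, recall)

# Rounded to two decimals this reproduces the reported 87.04 percent.
print(round(f1, 2))
```

The recomputed value agrees with the reported F1, so the three headline numbers are at least internally consistent.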

What carries the argument

Mask R-CNN instance segmentation that outputs per-object binary masks rather than bounding boxes, applied to a vehicle-mounted smartphone image dataset with manual polygon labels for four distress types.

If this is right

Pixel-level masks enable direct computation of distress area instead of coarse bounding-box approximations.
Instance segmentation supplies a workable route to aggregate quantification on ordinary field photographs.
The same model family outperforms a CSPDarknet53 YOLO detector retrained on the same dataset, which reaches only 27.5 percent precision and 20.7 percent recall on the validation protocol.
Close numerical agreement on total crack fraction demonstrates that segmentation outputs can support maintenance-relevant measurements.
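To make the first and last points concrete, here is a minimal sketch of how an aggregate crack-area fraction might be computed from per-instance binary masks. It assumes the fraction is pooled over all pixels of all images and that overlapping instances are counted once; the paper's exact aggregation rule is not given here, so both choices are assumptions, as are the function names and the toy 10x10 masks.

```python
import numpy as np


def union_mask(instance_masks, image_shape):
    """Pixel-wise OR of all instance masks for one image."""
    union = np.zeros(image_shape, dtype=bool)
    for mask in instance_masks:
        union |= np.asarray(mask, dtype=bool)
    return union


def aggregate_area_fraction(images):
    """images: iterable of (instance_masks, image_shape); returns percent."""
    covered = 0
    total = 0
    for instance_masks, image_shape in images:
        union = union_mask(instance_masks, image_shape)
        covered += int(union.sum())
        total += union.size
    return 100.0 * covered / total


# Toy data (hypothetical): two thin cracks that cross at one pixel.
shape = (10, 10)
horizontal = np.zeros(shape, dtype=bool)
horizontal[2, 0:5] = True          # 5 pixels
vertical = np.zeros(shape, dtype=bool)
vertical[0:4, 2] = True            # 4 pixels, 1 shared with the other crack

# One image with distress and one clean image.
fraction = aggregate_area_fraction([([horizontal, vertical], shape), ([], shape)])

# Bounding box around the same distress, for contrast with the mask area.
rows, cols = np.nonzero(union_mask([horizontal, vertical], shape))
box_area = int((rows.max() - rows.min() + 1) * (cols.max() - cols.min() + 1))
mask_area = int(union_mask([horizontal, vertical], shape).sum())
print(fraction, mask_area, box_area)
```

In this toy case the masks cover 8 pixels while the enclosing box covers 20, which illustrates why a box-based estimate overstates the area of thin, branching distress.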

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The resulting area totals could be fed directly into pavement-management databases to rank road segments by total distress load.
Processing sequential frames from a moving vehicle would allow the same masks to track how individual cracks grow between inspections.
Reducing reliance on single-annotator polygons would require multi-expert consensus labels or synthetic data augmentation to improve robustness.

Load-bearing premise

The human-drawn polygon annotations on the images accurately trace the true boundaries of each distress without systematic bias or inconsistency that would affect both training and the area-fraction comparison.

What would settle it

Acquire a fresh set of pavement images, obtain independent polygon annotations from multiple experts, and test whether the model's predicted aggregate crack-area fraction still lies within 0.01 percentage points of the averaged ground-truth fraction.
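The proposed test reduces to a tolerance check against an averaged reference. The sketch below assumes each annotator's labels are summarized as one aggregate area fraction in percent; the helper name `within_tolerance` and the second annotator's value are hypothetical and serve only to show how the outcome could change.

```python
def within_tolerance(predicted_pct, annotator_pcts, tol_points=0.01):
    """Compare a predicted fraction with the mean of annotator fractions.

    Returns (passes, gap, reference), all in percentage points.
    """
    reference = sum(annotator_pcts) / len(annotator_pcts)
    gap = abs(predicted_pct - reference)
    return gap <= tol_points, gap, reference


# Reported case: one annotator, fractions as quoted in the text.
ok_single, gap_single, _ = within_tolerance(2.164, [2.170])

# Hypothetical case: a second annotator who labels slightly more area.
# The 2.190 value is invented for illustration, not a result from the paper.
ok_multi, gap_multi, ref_multi = within_tolerance(2.164, [2.170, 2.190])

print(ok_single, round(gap_single, 3))
print(ok_multi, round(gap_multi, 3), round(ref_multi, 3))
```

With a single reference the check passes trivially at a 0.006-point gap; a modest shift in the averaged reference is enough to fail it, which is why independent annotations matter for this test.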

Figures

Figures reproduced from arXiv: 2605.26095 by Bibesh Pyakurel (University of Wisconsin - Green Bay), Kong Pheng Yang (University of Wisconsin - Green Bay), Logan Dewick (University of Wisconsin - Green Bay), M. G. Sarwar Murshed (University of Wisconsin - Green Bay), Nazim Choudhury (University of Wisconsin - Green Bay).

**Figure 2.** Polygon-based pavement distress annotation in Label Studio. The …

**Figure 3.** Representative failure cases used for qualitative analysis. Red boxes …

Original abstract

Automated pavement distress assessment requires more than image-level classification or coarse bounding box detection, demanding precise localization of thin, branching, and irregular cracks to achieve the geometric precision necessary for maintenance-relevant quantification. This paper presents a vision-based pavement distress analysis system based on Mask R-CNN instance segmentation and evaluates it on UWGB-StreetCrack, a custom field-collected roadway image dataset acquired with a vehicle-mounted smartphone and manually annotated with polygon labels for longitudinal cracks, transverse cracks, alligator cracks, and potholes. Five Detectron2-based Mask R-CNN backbone variants were considered under a consistent fine-tuning protocol. The best-performing model, Mask R-CNN with a ResNet-101 FPN backbone, achieved 84.23% precision, 90.04% recall, and an F1 score of 87.04% under the project-specific bounding-box matching protocol. The same model produced an aggregate predicted crack-area fraction of 2.164%, closely matching the 2.170% ground-truth crack-area fraction. To contextualize the segmentation system against a detector-oriented alternative, a CSPDarknet53-based YOLO detector was also adapted and retrained on the dataset, reaching 27.5% precision and 20.7% recall on the validation protocol. The results show that instance segmentation is a practical direction for field pavement imagery and aggregate crack-area estimation, while also exposing open challenges in annotation consistency, class imbalance, confounder rejection, and mask-level benchmarking.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Routine Mask R-CNN application to a new smartphone-collected crack dataset with usable numbers but thin validation details and a potentially circular area-fraction claim.

The main takeaway is that this is a competent but incremental engineering paper: they collected a new UWGB-StreetCrack dataset with smartphone images, annotated polygons for four distress types, and fine-tuned several Mask R-CNN backbones in Detectron2. The ResNet-101 version reaches 84% precision and 90% recall on their bounding-box protocol and produces an aggregate crack-area fraction within 0.006 percentage points of the labeled total. They also retrain a YOLO detector on the same data and show it falls far behind on precision and recall.

What works is the practical framing. Instance segmentation is the right tool when you need pixel counts rather than just boxes, and the paper demonstrates that on real roadway imagery. The YOLO comparison is useful context even if the protocols are not perfectly matched.

The soft spots are in the evaluation. The abstract gives no train/validation split sizes, no error bars, and no inter-rater agreement numbers even though the authors themselves call annotation consistency an open issue. The headline result, the near-perfect area-fraction match, rests on the same manual polygons used for training, so it could simply reproduce whatever systematic tendencies the annotators had in drawing crack widths or boundaries. Without an external reference or repeated labeling, that 0.006 percentage-point difference does not tell us much about geometric accuracy.

This is the kind of paper that belongs in an applied computer-vision or infrastructure-monitoring venue rather than a methods journal. Readers working on road-maintenance datasets will find the numbers and the dataset itself worth looking at; people chasing algorithmic novelty will not. It deserves peer review once the authors add split details, agreement metrics, and a clearer statement on whether the area comparison is meant to be a validation or just a sanity check.

Referee Report

3 major / 1 minor

Summary. The paper presents a Mask R-CNN instance segmentation system for pixel-level pavement distress assessment (longitudinal, transverse, alligator cracks, and potholes) on the custom UWGB-StreetCrack smartphone-collected dataset with manual polygon annotations. Five Detectron2 backbones are fine-tuned; the ResNet-101 FPN variant reports 84.23% precision, 90.04% recall, and 87.04% F1 under a project-specific bounding-box protocol, plus an aggregate predicted crack-area fraction of 2.164% that closely matches the 2.170% ground-truth fraction. A CSPDarknet53 YOLO detector is retrained for comparison, reaching 27.5% precision and 20.7% recall.

Significance. If the quantitative claims hold under proper validation, the work demonstrates that instance segmentation can deliver maintenance-relevant geometric metrics (pixel-area fractions) on field imagery where bounding-box detectors fall short, addressing a practical gap in automated pavement assessment.

major comments (3)

[Abstract] The central claim that the 2.164% vs. 2.170% crack-area fraction match demonstrates 'geometric precision necessary for maintenance-relevant quantification' is undermined because the same manual polygon annotations supply both the training targets and the reference area fraction; any consistent annotator bias in crack width, branch inclusion, or boundary placement is learned and reproduced, making the 0.006 percentage-point difference uninformative without inter-annotator agreement, repeated labeling, or external reference (e.g., laser scan).
[Abstract] No validation-split details, error bars, post-hoc exclusions, or annotation inter-rater reliability are reported, so the precision/recall/F1 numbers for the ResNet-101 model cannot be assessed for robustness or generalizability.
[Abstract] The YOLO comparison (27.5% precision, 20.7% recall) is presented without confirming that the bounding-box matching protocol, image splits, and evaluation conditions were identical to those used for Mask R-CNN, preventing a controlled assessment of segmentation versus detection.

minor comments (1)

The abstract refers to a 'project-specific bounding-box matching protocol' without definition or reference to a methods section; explicit description is needed for reproducibility.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the constructive comments. We address each major comment point by point below and indicate the revisions that will be incorporated.

Referee: [Abstract] The central claim that the 2.164% vs. 2.170% crack-area fraction match demonstrates 'geometric precision necessary for maintenance-relevant quantification' is undermined because the same manual polygon annotations supply both the training targets and the reference area fraction; any consistent annotator bias in crack width, branch inclusion, or boundary placement is learned and reproduced, making the 0.006 percentage-point difference uninformative without inter-annotator agreement, repeated labeling, or external reference (e.g., laser scan).

Authors: We agree that the reported crack-area fraction comparison is derived from the same annotations used in training and therefore cannot independently demonstrate geometric precision free of annotator bias. We will revise the abstract to remove or substantially qualify the claim regarding 'geometric precision necessary for maintenance-relevant quantification' and add a limitations paragraph discussing reliance on single-annotator polygons.

Revision: yes

Referee: [Abstract] No validation-split details, error bars, post-hoc exclusions, or annotation inter-rater reliability are reported, so the precision/recall/F1 numbers for the ResNet-101 model cannot be assessed for robustness or generalizability.

Authors: We will add explicit details on the validation split (image counts and class distribution) and state that the metrics come from a single fixed split without cross-validation or error bars. Inter-rater reliability was not computed. These clarifications will be inserted in the methods and results sections.

Revision: yes

Referee: [Abstract] The YOLO comparison (27.5% precision, 20.7% recall) is presented without confirming that the bounding-box matching protocol, image splits, and evaluation conditions were identical to those used for Mask R-CNN, preventing a controlled assessment of segmentation versus detection.

Authors: The same image splits were used for both models. YOLO evaluation follows a bounding-box IoU protocol while Mask R-CNN metrics are obtained by converting predicted masks to boxes; we will explicitly document these protocol differences and confirm the shared splits in the revised methods and results sections.

Revision: partial
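The protocol difference described in this response can be sketched as follows. This is an assumed reconstruction, not the authors' evaluation code: masks are reduced to tight boxes, boxes are matched greedily one-to-one, and a match requires IoU of at least 0.5. The threshold, the greedy rule, and all function names are assumptions.

```python
import numpy as np


def mask_to_box(mask):
    """Tight (x1, y1, x2, y2) box around a binary mask; None if the mask is empty."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return (int(xs.min()), int(ys.min()), int(xs.max()) + 1, int(ys.max()) + 1)


def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def match_and_score(pred_boxes, gt_boxes, iou_thresh=0.5):
    """Greedy one-to-one matching; returns (precision, recall)."""
    unmatched = list(range(len(gt_boxes)))
    true_pos = 0
    for pred in pred_boxes:
        best, best_iou = None, iou_thresh
        for g in unmatched:
            iou = box_iou(pred, gt_boxes[g])
            if iou >= best_iou:
                best, best_iou = g, iou
        if best is not None:
            unmatched.remove(best)
            true_pos += 1
    precision = true_pos / len(pred_boxes) if pred_boxes else 0.0
    recall = true_pos / len(gt_boxes) if gt_boxes else 0.0
    return precision, recall


# Toy example (hypothetical): a 5-pixel diagonal crack predicted as a mask.
mask = np.zeros((8, 8), dtype=bool)
for i in range(1, 6):
    mask[i, i] = True
crack_box = mask_to_box(mask)                  # box derived from the mask
iou = box_iou(crack_box, (0, 0, 6, 6))         # against a slightly larger label box
precision, recall = match_and_score([crack_box, (6, 6, 8, 8)], [(0, 0, 6, 6)])
print(crack_box, round(iou, 3), precision, recall)
```

Because the mask-derived box inherits the mask's extent, a segmentation model and a box detector can be scored under the same matching rule, but only if the threshold and matching order are also shared, which is what the referee asks the authors to confirm.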

standing simulated objections not resolved

We cannot supply inter-annotator agreement statistics, repeated labeling, or external ground truth (e.g., laser scans) because the dataset was annotated by a single individual without such measures.
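For context, the kind of inter-annotator agreement statistic the referee asks for could be as simple as mask IoU and Dice between two annotators' masks for the same image. The sketch below uses toy masks in which one hypothetical annotator traces a crack one pixel wide and another traces it two pixels wide; none of this data comes from the paper.

```python
import numpy as np


def mask_iou(a, b):
    """Intersection over union of two binary masks (1.0 if both are empty)."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(a, b).sum() / union)


def dice(a, b):
    """Dice coefficient of two binary masks (1.0 if both are empty)."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0
    return float(2 * np.logical_and(a, b).sum() / total)


# Hypothetical annotators labeling the same horizontal crack on an 8x8 image.
annotator_a = np.zeros((8, 8), dtype=bool)
annotator_a[3, :] = True           # 1 pixel wide, 8 pixels total
annotator_b = np.zeros((8, 8), dtype=bool)
annotator_b[3:5, :] = True         # 2 pixels wide, 16 pixels total

iou_ab = mask_iou(annotator_a, annotator_b)
dice_ab = dice(annotator_a, annotator_b)
area_a = 100.0 * annotator_a.sum() / annotator_a.size
area_b = 100.0 * annotator_b.sum() / annotator_b.size
print(iou_ab, round(dice_ab, 3), area_a, area_b)
```

A one-pixel difference in traced width halves the IoU and doubles the area fraction in this toy case, which is the sort of width bias the referee argues a single-annotator reference cannot reveal.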

Circularity Check

0 steps flagged

No significant circularity; standard supervised evaluation on held-out data.

The paper reports empirical performance metrics (precision, recall, F1) and an aggregate crack-area fraction comparison from Mask R-CNN trained on manually annotated UWGB-StreetCrack images and evaluated on held-out validation images. No mathematical derivation, functional form, or parameter fitting is present that reduces to its own inputs by construction. The area-fraction match is a direct comparison of model output to ground-truth labels on the held-out images, which is the standard non-circular outcome of supervised learning. No self-citations, uniqueness theorems, or ansatzes are invoked as load-bearing steps for the central claims. The work is self-contained against external benchmarks (held-out image evaluation) and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical computer-vision application paper. No mathematical derivation, free parameters in an analytic sense, axioms, or invented physical entities are present.
