pith. machine review for the scientific record.

arxiv: 2605.08781 · v1 · submitted 2026-05-09 · 💻 cs.CV

Contour-Native Bridge Defect Detection and Compact Digital Archiving with Frequency-Supervised Fourier Contours

Pith reviewed 2026-05-12 01:37 UTC · model grok-4.3

classification 💻 cs.CV
keywords bridge defect detection · Fourier series contours · contour regression · UAV imagery · polygon evaluation · compact vector archiving · defect boundary geometry

The pith

Fourier contour descriptors enable more accurate and compact representation of bridge defects than bounding boxes or raster masks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a detection method that outputs defects as Fourier series contours instead of boxes or pixel masks. It demonstrates on thousands of UAV bridge images that these contours give better polygon-level accuracy and geometric fidelity. A reader would care because the resulting records are smaller to store and transmit while remaining recoverable for detailed engineering analysis.

Core claim

Frequency-Supervised Fourier Series Detection (FS-FSD) directly regresses Fourier contour descriptors for bridge defects and, on a dataset of 3,767 UAV images containing 42,346 defect instances, produces higher polygon-space accuracy and improved matched-TP geometric quality than standard detection, segmentation, and contour methods, showing that Fourier contours preserve boundary geometry in a compact and shareable vector form.

What carries the argument

Frequency-Supervised Fourier Series Detection (FS-FSD) that regresses and supervises Fourier contour descriptors to represent defect boundaries as finite series.
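
As a rough illustration of what a finite Fourier contour record can look like, the Python sketch below fits a truncated Fourier series to uniformly indexed samples of a closed contour and then rebuilds the boundary from the coefficients. The four-coefficients-per-harmonic layout, the function names `fit_fourier_contour` and `reconstruct_contour`, and the uniform-index parameterization are assumptions made for illustration; the text above does not specify the paper's exact descriptor or regression head.

```python
import math


def fit_fourier_contour(points, order):
    """Project N uniformly indexed contour samples onto harmonics 1..order.

    Returns (center, coeffs) with coeffs[k-1] = (a_k, b_k, c_k, d_k) for
    x(t) = x0 + sum_k a_k cos(kt) + b_k sin(kt),
    y(t) = y0 + sum_k c_k cos(kt) + d_k sin(kt).
    Assumes order < N / 2 so the discrete harmonics stay orthogonal.
    """
    n = len(points)
    x0 = sum(x for x, _ in points) / n
    y0 = sum(y for _, y in points) / n
    coeffs = []
    for k in range(1, order + 1):
        a = b = c = d = 0.0
        for i, (x, y) in enumerate(points):
            t = 2.0 * math.pi * i / n
            a += x * math.cos(k * t)
            b += x * math.sin(k * t)
            c += y * math.cos(k * t)
            d += y * math.sin(k * t)
        coeffs.append((2 * a / n, 2 * b / n, 2 * c / n, 2 * d / n))
    return (x0, y0), coeffs


def reconstruct_contour(center, coeffs, num_points):
    """Evaluate the truncated series at num_points uniform parameter values."""
    x0, y0 = center
    contour = []
    for i in range(num_points):
        t = 2.0 * math.pi * i / num_points
        x, y = x0, y0
        for k, (a, b, c, d) in enumerate(coeffs, start=1):
            x += a * math.cos(k * t) + b * math.sin(k * t)
            y += c * math.cos(k * t) + d * math.sin(k * t)
        contour.append((x, y))
    return contour


# Hypothetical example: an ellipse is captured by the first harmonic alone.
N = 64
ellipse = [
    (5 + 3 * math.cos(2 * math.pi * i / N), 2 + 1.5 * math.sin(2 * math.pi * i / N))
    for i in range(N)
]
center, coeffs = fit_fourier_contour(ellipse, order=4)
recon = reconstruct_contour(center, coeffs, N)
max_err = max(math.dist(p, q) for p, q in zip(ellipse, recon))
```

In this toy case the record is the center plus `4 * order` numbers, and `max_err` measures how far the rebuilt contour drifts from the samples. Real defect boundaries would need higher orders, which is the trade-off the load-bearing premise is about.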

If this is right

  • Fourier contour records require less storage and bandwidth than raster masks while remaining recoverable.
  • Evaluation in unified polygon space reveals better boundary geometry than box or mask outputs.
  • These vector records support downstream information workflows and engineering review.
  • Direct regression of contours avoids post-processing steps common in segmentation pipelines.
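
To make the storage point concrete, the following sketch packs one contour record into bytes and compares its size with an uncompressed one-bit-per-pixel mask. The record layout (class id, order, score, center, then four float32 values per harmonic), the names `pack_fourier_record`, `unpack_fourier_record`, and `packed_mask_bytes`, and the 256 x 256 crop size are all hypothetical. The paper's own comparison uses RLE and polygon routes, which can be far smaller than a raw bitmask, so this is only an order-of-magnitude illustration.

```python
import struct


def pack_fourier_record(class_id, score, center, coeffs):
    """Hypothetical layout: uint8 class, uint8 order, float32 score,
    float32 center (x0, y0), then 4 float32 values per harmonic."""
    flat = [v for harmonic in coeffs for v in harmonic]
    fmt = f"<BBf2f{len(flat)}f"  # little-endian, no padding
    return struct.pack(fmt, class_id, len(coeffs), score, *center, *flat)


def unpack_fourier_record(blob):
    """Inverse of pack_fourier_record."""
    class_id, order = struct.unpack_from("<BB", blob, 0)
    score, x0, y0 = struct.unpack_from("<3f", blob, 2)
    flat = struct.unpack_from(f"<{4 * order}f", blob, 14)
    coeffs = [tuple(flat[i:i + 4]) for i in range(0, len(flat), 4)]
    return class_id, score, (x0, y0), coeffs


def packed_mask_bytes(width, height):
    """Size of an uncompressed binary mask at one bit per pixel."""
    return (width * height + 7) // 8


# Hypothetical order-16 record with placeholder coefficient values.
coeffs = [(1.0, 0.0, 0.0, 1.0)] * 16
blob = pack_fourier_record(3, 0.5, (128.0, 64.0), coeffs)
record_size = len(blob)
mask_size = packed_mask_bytes(256, 256)
```

Under these assumptions the record size depends only on the Fourier order, not on the defect's pixel extent, whereas the mask size grows with the crop area.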

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach may scale to other linear infrastructure inspections like roads or pipelines where boundary precision matters.
  • Long-term tracking of defect evolution could become feasible if contours are matched across time series of images.
  • Adopting vector contours could standardize data exchange in bridge management systems beyond current image or annotation formats.

Load-bearing premise

Complex or fragmented bridge defect boundaries can be represented accurately by a finite Fourier series with limited geometric loss.

What would settle it

A standard mask-based segmentation model that matches or exceeds the polygon IoU and boundary F1 scores of FS-FSD on the same 3,767-image dataset while using comparable or less storage per defect.
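
The polygon IoU named above, together with the confidence-ordered greedy matching described in the paper's Figure 6 caption, can be sketched as follows. This is a minimal pure-Python approximation, not the paper's evaluation code: IoU is estimated by sampling grid cell centers, and the names `point_in_polygon`, `polygon_iou`, `greedy_match`, and the `step` parameter are assumptions. An exact polygon clipping routine would normally replace the sampling.

```python
def point_in_polygon(x, y, poly):
    """Even-odd ray casting test for a simple polygon [(x, y), ...]."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_iou(poly_a, poly_b, step=0.25):
    """Approximate IoU by testing cell centers on a regular grid."""
    pts = list(poly_a) + list(poly_b)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    inter = union = 0
    y = min(ys) + step / 2
    while y < max(ys):
        x = min(xs) + step / 2
        while x < max(xs):
            in_a = point_in_polygon(x, y, poly_a)
            in_b = point_in_polygon(x, y, poly_b)
            inter += in_a and in_b
            union += in_a or in_b
            x += step
        y += step
    return inter / union if union else 0.0


def greedy_match(preds, gts, iou_thr=0.5):
    """preds: [(score, polygon)], gts: [polygon], all of one class.
    Returns [(pred_index, gt_index or None)] in descending score order."""
    order = sorted(range(len(preds)), key=lambda i: -preds[i][0])
    used = set()
    matches = []
    for i in order:
        best_j, best_iou = None, iou_thr
        for j, gt in enumerate(gts):
            if j in used:
                continue
            iou = polygon_iou(preds[i][1], gt)
            if iou >= best_iou:
                best_j, best_iou = j, iou
        if best_j is not None:
            used.add(best_j)
        matches.append((i, best_j))
    return matches


# Hypothetical example: two overlapping 4 x 4 squares.
square_a = [(0, 0), (4, 0), (4, 4), (0, 4)]
square_b = [(2, 0), (6, 0), (6, 4), (2, 4)]
preds = [(0.9, square_b), (0.8, square_a)]
gts = [square_a]
matches = greedy_match(preds, gts, iou_thr=0.5)
```

Because every output type is first converted to a polygon, the same two functions would apply to boxes, mask outlines, and Fourier contours alike, which is the point of a unified polygon-space protocol.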

Figures

Figures reproduced from arXiv: 2605.08781 by Hongxu Pu, Hu Wang, Jin Liu, Kunming Luo, Wang Wang, Yasong Wang, Zhen Cao.

Figure 1. Architecture of FS-FSD.
Figure 2. Fourier prediction head. For detection level $l$ and grid location $(u, v)$, the prediction tensor is written as $\mathbf{o}_{l,u,v} = [\hat{\mathbf{s}}_{l,u,v},\ \hat{\mathbf{f}}^{grid}_{l,u,v}]$ (7). [J. Liu & W. Wang et al.: Preprint submitted to Elsevier, page 7 of 46]
Figure 3. Analytical principle of CHPA. The procedure resolves starting-point ambiguity by solving the first-harmonic phase, propagating the phase to higher harmonic orders, and rotating the ground truth coefficients into an aligned supervision target. … where $\mathbf{T}(C_k, S_k) = \begin{bmatrix} C_k & S_k & 0 & 0 \\ -S_k & C_k & 0 & 0 \\ 0 & 0 & C_k & S_k \\ 0 & 0 & -S_k & C_k \end{bmatrix}$ (22). The base phase is solved analytically from the first harmonic. Let $\mathbf{q}_{t,1} = [a_t$…
Figure 4. Qualitative visualization of CHPA in bridge defect images. Red indicates the original ground truth parameterization, green indicates the prediction-implied parameterization, and yellow indicates the phase-aligned ground truth reference used to stabilize Fourier coefficient supervision. The analytical workflow of CHPA is illustrated in…
Figure 5. Representative bridge defect samples from the field dataset. To support bridge defect detection, segmentation, and geometric representation learning, a field dataset was constructed from real bridge inspection scenarios. The acquisition campaign covered seven bridges and four representative inspection contexts, namely urban viaducts, mountainous bridges, railway bridges, and long-span highway bridges. Thi…
Figure 6. Polygon-space IoU computation. The predicted output, regardless of its native representation, is first converted into polygon space and then compared with the polygon ground truth through intersection and union areas. For each class $c$, predictions are sorted in descending order of confidence and greedily matched to unmatched ground-truth polygons of the same class under a polygon IoU threshold $\tau$. The corre…
Figure 7. mAP across Fourier orders under coefficient truncation. A trained $n_f = 16$ FS-FSD model is evaluated by retaining only the first $k$ Fourier orders, while the $n_f = 16$ ground-truth contour is kept as the fixed reference. … section therefore presents a proof-of-concept workflow validation that focuses on post-prediction data handling and component-level deployability, rather than treating the evaluation as a full…
Figure 8. Metric and category-level AP summaries across Fourier orders under the fixed $n_f = 16$ contour reference. The left panel reports overall $mAP_{50}$, $mAP_{50:95}$, and F1-score, while the middle and right panels show category-level $mAP_{50}$ and $mAP_{50:95}$ heatmaps across Fourier orders.
Figure 9. Qualitative contour reconstruction under different Fourier orders. Increasing the order progressively restores elongated structures, local protrusions, curvature changes, and small boundary irregularities. … record. To keep this perspective structurally coherent, the validation is organized into database-oriented archival and recovery benchmarking, followed by browser-side visualization in the subsequent wor…
Figure 10. Method-centric grouped-bar comparison of FS-FSD and YOLO-side storage routes under route-comparable metrics. The x-axis lists FS-FSD, RLE-full, RLE-crop, Poly-256, and Fourier-fit. Panels (a)–(d) show payload per defect, record size per defect, archive route overhead per defect excluding inference, and route-to-usable latency per defect excluding inference, respectively. The route-cost results further sho…
Figure 11. Storage–latency trade-offs and FS-FSD advantage map across YOLO-side storage routes under route-comparable metrics. In panels (a) and (b), the lower-left region is preferable because it indicates both lower archive footprint and lower route cost. In panel (c), the ratio direction is chosen so that values greater than 1 favor FS-FSD. The storage-latency trade-off is summarized in…
Figure 12. Workflow from archived FS-FSD image-space damage records to browser-side review. … the visible image list is driven by a user-selected local image repository, and the archived defect records stored in the database are attached to those images through record-level matching. The project is available at: https://github.com/wangzai822/Frequency-supervised-Fourier-Series-Detection. The key point of the prototyp…
Figure 13. WebGL prototype interface for reviewing archived FS-FSD image-space damage records. Within the limited scope of this proof-of-concept prototype, three observations can be made. First, archived FS-FSD records can be decoded and visualized directly as image-space contours in a browser environment. Second, the compact descriptor format avoids the need to transmit dense masks or maintain separate raster overl…
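
The Figure 3 caption describes CHPA as solving a base phase from the first harmonic, propagating it to higher orders, and rotating the ground truth coefficients into an aligned target. The sketch below illustrates that idea with complex Fourier coefficients rather than the paper's real-valued four-coefficient blocks, which is a simplification on our part. It relies on the standard shift property: moving the starting point by a phase θ multiplies harmonic k by exp(i·k·θ). The function names and the synthetic contour are hypothetical, and the "prediction" here is just the same contour with a shifted starting point.

```python
import cmath
import math


def complex_descriptors(points, order):
    """Complex Fourier coefficients c_k for k = -order..order."""
    n = len(points)
    zs = [complex(x, y) for x, y in points]
    return {
        k: sum(z * cmath.exp(-2j * math.pi * k * i / n) for i, z in enumerate(zs)) / n
        for k in range(-order, order + 1)
    }


def align_to_prediction(gt, pred):
    """Rotate ground-truth coefficients into the prediction's parameterization.

    The base phase comes from the first harmonic; harmonic k is rotated
    by k times that phase. Assumes both first harmonics are non-zero.
    """
    theta = cmath.phase(pred[1]) - cmath.phase(gt[1])
    return {k: c * cmath.exp(1j * k * theta) for k, c in gt.items()}


def radius(t):
    # Synthetic star-like boundary, chosen only for illustration.
    return 10 + 2 * math.cos(3 * t) + math.sin(5 * t)


N = 64
contour = [
    (radius(2 * math.pi * i / N) * math.cos(2 * math.pi * i / N),
     radius(2 * math.pi * i / N) * math.sin(2 * math.pi * i / N))
    for i in range(N)
]
shifted = contour[11:] + contour[:11]  # same shape, different starting point

gt = complex_descriptors(contour, order=8)
pred = complex_descriptors(shifted, order=8)
aligned = align_to_prediction(gt, pred)

gap_before = max(abs(gt[k] - pred[k]) for k in gt)
gap_after = max(abs(aligned[k] - pred[k]) for k in gt)
```

Before alignment the two coefficient sets differ substantially even though they trace the same boundary; after alignment the gap is at floating-point level. This is why a coefficient-space loss needs some form of phase alignment to be stable.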
Original abstract

AI-assisted bridge defect inspection often produces bounding boxes with crude geometry or raster masks that are costly to store, transmit, and reuse. This study investigates how detected defects can be represented as compact, recoverable contour-level vector records in image space. We propose Frequency-Supervised Fourier Series Detection (FS-FSD), which directly regresses Fourier contour descriptors and evaluates boxes, masks, and contours under a unified polygon-space protocol. On 3,767 UAV-collected bridge images with 42,346 defect instances, FS-FSD achieves higher polygon-space accuracy and better matched-TP geometric quality than representative detection, segmentation, and contour baselines. These results show that, compared with bounding boxes and raster masks, Fourier contour records preserve defect-boundary geometry in a more compact, recoverable, and shareable form for engineering review and downstream information workflows. Future work will study the modeling of multi-region, fragmented, and adjacent bridge-defect boundaries and extend the framework toward long-term bridge-defect tracking and lifecycle-oriented management.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript proposes Frequency-Supervised Fourier Series Detection (FS-FSD), a method that directly regresses Fourier contour descriptors for bridge defects detected in UAV imagery. It evaluates boxes, masks, and contours under a unified polygon-space protocol and reports that FS-FSD attains higher polygon-space accuracy and superior matched-TP geometric quality than representative detection, segmentation, and contour baselines on 3,767 images containing 42,346 defect instances. The work positions Fourier contour records as a compact, recoverable, and shareable alternative to bounding boxes or raster masks for engineering review and digital archiving, while deferring multi-region, fragmented, and adjacent boundary cases to future work.

Significance. If the empirical superiority holds under rigorous verification, the approach would demonstrate a practical contour-native representation that improves geometric fidelity and storage efficiency over conventional outputs in civil infrastructure inspection. The unified polygon-space evaluation protocol is a constructive contribution that enables direct comparison across output types.

major comments (3)
  1. [Results section] Results section (and associated tables/figures): the central claim of higher polygon-space accuracy and better matched-TP geometric quality on 42,346 instances is stated without accompanying details on the precise metrics (e.g., IoU thresholds, polygon distance measures), statistical significance tests, baseline hyperparameter search protocols, or error analysis stratified by defect complexity; this absence prevents independent assessment of whether the reported gains are robust or sensitive to post-hoc choices.
  2. [Method section] Method section (Fourier series regression): the frequency-supervised formulation treats the truncation order and supervision weights as free parameters, yet no ablation study quantifies sensitivity of the reported gains to these choices; combined with the abstract's explicit deferral of multi-region, fragmented, and adjacent boundaries, this leaves open the possibility that aggregate improvements are driven primarily by simpler defects and may not generalize to the full dataset composition.
  3. [Dataset and evaluation protocol] Dataset and evaluation protocol: the manuscript provides no information on annotation protocol for the 42,346 instances, train/validation/test splits, or how polygon ground truth was derived from UAV imagery; without these, the unified protocol's fairness and the reproducibility of the headline comparison cannot be verified.
minor comments (3)
  1. [Method section] Notation for the Fourier descriptors (e.g., coefficients, frequency supervision loss) should be defined once in a dedicated subsection and used consistently thereafter to improve readability.
  2. [Results section] Include quantitative measures of geometric approximation error (e.g., Hausdorff distance between original and reconstructed contours) alongside qualitative examples to substantiate the compactness claim.
  3. [Introduction] Add citations to prior literature on Fourier contour descriptors in object detection and medical imaging to better situate the frequency-supervision contribution.
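
Minor comment 2 asks for a quantitative approximation error such as the Hausdorff distance. A minimal sketch of the symmetric Hausdorff distance between two sampled contours follows. It works on vertex samples only, so it assumes both contours are sampled densely enough for vertex-to-vertex distances to stand in for curve-to-curve distances; the function names are illustrative.

```python
import math


def directed_hausdorff(src, dst):
    """Largest distance from any point in src to its nearest point in dst."""
    return max(min(math.dist(p, q) for q in dst) for p in src)


def hausdorff(contour_a, contour_b):
    """Symmetric Hausdorff distance between two point-sampled contours."""
    return max(directed_hausdorff(contour_a, contour_b),
               directed_hausdorff(contour_b, contour_a))


# Hypothetical example: the second contour has one extra outlying vertex.
original = [(0, 0), (1, 0)]
reconstructed = [(0, 0), (1, 0), (1, 3)]
distance = hausdorff(original, reconstructed)
```

Reporting this value per Fourier order, alongside the qualitative reconstructions, would show how much boundary detail each truncation level gives up.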

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive feedback. We address each major comment below and will revise the manuscript accordingly to improve clarity and reproducibility.

  1. Referee: [Results section] Results section (and associated tables/figures): the central claim of higher polygon-space accuracy and better matched-TP geometric quality on 42,346 instances is stated without accompanying details on the precise metrics (e.g., IoU thresholds, polygon distance measures), statistical significance tests, baseline hyperparameter search protocols, or error analysis stratified by defect complexity; this absence prevents independent assessment of whether the reported gains are robust or sensitive to post-hoc choices.

    Authors: We agree that additional details are necessary for rigorous assessment. In the revised manuscript, we will expand the Results section to specify the exact metrics used, including IoU thresholds (0.5 and 0.75) and polygon-specific measures such as average symmetric surface distance and Hausdorff distance. We will report statistical significance using appropriate tests (e.g., McNemar's test for accuracy comparisons). Baseline hyperparameters were optimized via grid search over standard ranges, and we will include this protocol. Additionally, we will provide an error analysis stratified by defect complexity, categorizing instances based on boundary intricacy. revision: yes

  2. Referee: [Method section] Method section (Fourier series regression): the frequency-supervised formulation treats the truncation order and supervision weights as free parameters, yet no ablation study quantifies sensitivity of the reported gains to these choices; combined with the abstract's explicit deferral of multi-region, fragmented, and adjacent boundaries, this leaves open the possibility that aggregate improvements are driven primarily by simpler defects and may not generalize to the full dataset composition.

    Authors: We acknowledge the need for an ablation study. The revised version will include experiments varying the truncation order (e.g., 8, 12, 16) and supervision weights, demonstrating that the gains are stable within reasonable ranges. Regarding generalization, while we defer complex cases to future work as noted, the current dataset includes a range of defect types, and we will add a breakdown of performance by defect category to address concerns about simpler defects driving results. We do not claim generalization to all cases. revision: partial

  3. Referee: [Dataset and evaluation protocol] Dataset and evaluation protocol: the manuscript provides no information on annotation protocol for the 42,346 instances, train/validation/test splits, or how polygon ground truth was derived from UAV imagery; without these, the unified protocol's fairness and the reproducibility of the headline comparison cannot be verified.

    Authors: We will include a dedicated subsection on the dataset in the revised manuscript. This will detail the annotation protocol (performed by civil engineering experts using polygon annotation tools on UAV imagery), the train/validation/test splits (e.g., 70%/15%/15% with no image overlap), and the derivation of polygon ground truth (manual contour tracing verified for accuracy). These additions will support reproducibility and fairness of the evaluation. revision: yes
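
The first response mentions McNemar's test for accuracy comparisons. As a hedged sketch, an exact two-sided McNemar test on paired outcomes could look like the following, where each ground-truth instance is scored as matched or missed by two methods. How the authors would pair outcomes is not stated, so the pairing, the function name `mcnemar_exact`, and the counts in the example are assumptions.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: instances matched by method A but missed by method B.
    c: instances matched by method B but missed by method A.
    Concordant pairs do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts.
p_balanced = mcnemar_exact(5, 5)
p_one_sided = mcnemar_exact(0, 10)
```

Only the discordant pairs carry information here, so a large dataset with few disagreements between methods can still yield an inconclusive result.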

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The paper presents an empirical ML proposal (FS-FSD) that regresses Fourier contour descriptors from images and reports polygon-space accuracy gains versus baselines on held-out UAV data. No equations, self-citations, or derivation steps appear in the provided text that reduce any claimed result to a fitted input or self-definition by construction. Performance claims rest on standard train/eval splits and unified metrics rather than tautological renaming or load-bearing self-reference. The method is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The approach rests on the assumption that Fourier series can compactly encode defect boundaries and that frequency supervision improves regression without introducing bias; several hyperparameters for series order and loss weighting are expected but not enumerated in the abstract.

free parameters (2)
  • Fourier series order
    Number of coefficients chosen to balance compactness and fidelity for defect shapes.
  • Frequency supervision weights
    Hyperparameters controlling emphasis on different frequency components during training.
axioms (1)
  • domain assumption: Defect boundaries admit accurate low-order Fourier approximation.
    Invoked to justify compact vector representation over raster masks.
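
The Fourier series order listed above trades compactness against fidelity, and the axiom is easiest to probe on elongated shapes. The sketch below sweeps the truncation order on a thin rectangle, a stand-in for a crack-like defect, and reports the RMS reconstruction error at each order. The shape, the orders, and the names `rectangle_contour` and `truncation_rms` are illustrative assumptions, and the complex discrete Fourier form is used for brevity rather than because the paper uses it.

```python
import cmath
import math


def rectangle_contour(width, height):
    """Counter-clockwise perimeter samples at unit spacing."""
    pts = [(x, 0) for x in range(width)]
    pts += [(width, y) for y in range(height)]
    pts += [(x, height) for x in range(width, 0, -1)]
    pts += [(0, y) for y in range(height, 0, -1)]
    return pts


def truncation_rms(points, order):
    """RMS distance between samples and their reconstruction
    from harmonics |k| <= order. Assumes order < len(points) / 2."""
    n = len(points)
    zs = [complex(x, y) for x, y in points]
    coeffs = {
        k: sum(z * cmath.exp(-2j * math.pi * k * i / n) for i, z in enumerate(zs)) / n
        for k in range(-order, order + 1)
    }
    err2 = 0.0
    for i, z in enumerate(zs):
        approx = sum(c * cmath.exp(2j * math.pi * k * i / n) for k, c in coeffs.items())
        err2 += abs(z - approx) ** 2
    return math.sqrt(err2 / n)


# Hypothetical 40 x 4 crack-like rectangle, 88 perimeter samples.
contour = rectangle_contour(40, 4)
orders = (2, 4, 8, 16)
errors = [truncation_rms(contour, k) for k in orders]
```

The error cannot increase as harmonics are added, but sharp corners and thin shapes keep it from vanishing at low orders, which is where the low-order approximation assumption is most likely to strain.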

pith-pipeline@v0.9.0 · 5489 in / 1211 out tokens · 58662 ms · 2026-05-12T01:37:40.552354+00:00 · methodology
