From Pixels to Semantics: A Multi-Stage AI Framework for Structural Damage Detection in Satellite Imagery

Bijay Shakya; Catherine Hoier; Khandaker Mamun Ahmed

arxiv: 2603.22768 · v1 · submitted 2026-03-24 · 💻 cs.CV

Pith reviewed 2026-05-15 01:14 UTC · model grok-4.3

classification 💻 cs.CV

keywords structural damage detection · satellite imagery · super-resolution · vision-language models · disaster assessment · building localization · reference-free evaluation · xBD dataset

The pith

A hybrid AI pipeline that upsamples satellite images, detects buildings, and uses vision-language models to classify structural damage levels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out to show that low-resolution satellite imagery can be turned into usable semantic damage assessments by chaining super-resolution enhancement, object detection, and vision-language model reasoning. A sympathetic reader would care because faster, more interpretable damage maps after disasters could guide emergency crews toward the most affected structures without waiting for high-resolution ground surveys. The work demonstrates the approach on real post-disaster events from the xBD dataset and introduces reference-free scoring methods to evaluate the outputs when no caption ground truth exists.

Core claim

The central claim is that a three-stage pipeline produces more interpretable results than prior pipelines. A Video Restoration Transformer first increases image resolution from 1024x1024 to 4096x4096, a YOLOv11 detector then locates buildings in the pre-disaster frames, and vision-language models finally classify the cropped regions into four semantic damage levels. The authors support this by reporting improved semantic alignment via CLIPScore and reduced bias through a multi-model jury voting procedure on the Moore Tornado and Hurricane Matthew subsets of the xBD dataset.

What carries the argument

The three-stage pipeline that performs super-resolution upscaling, YOLOv11-based building localization, and vision-language model semantic scoring with CLIPScore reference-free evaluation plus jury voting.
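
To make the resolution step concrete: going from 1024x1024 to 4096x4096 is a 4x scale per side, so a building box expressed in source pixels has to be multiplied by 4 before it can crop the upscaled tile. The sketch below illustrates only that bookkeeping. The helper names (`scale_box`, `pad_and_clip`, `crop`), the context padding, and the example box are assumptions made for illustration, not code or values from the paper, and the paper may instead run the detector directly on the upscaled image.

```python
# Illustrative sketch only; helper names and the example box are hypothetical.
SRC_SIDE = 1024  # original tile side, as stated in the paper
DST_SIDE = 4096  # super-resolved tile side, as stated in the paper
SCALE = DST_SIDE // SRC_SIDE  # 4x per side, i.e. 16x as many pixels


def scale_box(box, scale=SCALE):
    """Map an (x_min, y_min, x_max, y_max) box from source to upscaled pixels."""
    x_min, y_min, x_max, y_max = box
    return (x_min * scale, y_min * scale, x_max * scale, y_max * scale)


def pad_and_clip(box, pad, side=DST_SIDE):
    """Grow a box by `pad` pixels of context and clip it to the image bounds."""
    x_min, y_min, x_max, y_max = box
    return (
        max(0, x_min - pad),
        max(0, y_min - pad),
        min(side, x_max + pad),
        min(side, y_max + pad),
    )


def crop(image, box):
    """Crop a row-major image (a list of rows) to a half-open pixel box."""
    x_min, y_min, x_max, y_max = box
    return [row[x_min:x_max] for row in image[y_min:y_max]]


# A made-up building box in the 1024 px pre-disaster tile.
box_1024 = (100, 200, 150, 260)
box_4096 = pad_and_clip(scale_box(box_1024), pad=16)
print(box_4096)  # (384, 784, 616, 1056)
```

Since xBD pairs pre- and post-disaster tiles over the same area, the same box could in principle be reused to crop the post-disaster tile for the vision-language stage; that reuse is an assumption of this sketch rather than a detail confirmed by the text above.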

If this is right

Damage assessments become available from standard-resolution satellite passes without requiring new high-resolution captures.
First responders receive explicit severity rankings and recovery recommendations derived from the semantic analysis.
The jury-voting step reduces the impact of any single vision-language model's biases in safety-critical outputs.
Reference-free metrics such as CLIPScore allow evaluation on new disaster events where labeled captions do not exist.
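
The jury-voting idea above can be sketched as a plain majority vote over the four damage levels. This is a minimal illustration, not the paper's procedure: the function name `jury_vote`, the agreement ratio, and the rule of breaking ties toward the more severe level are all assumptions chosen here because a conservative tie-break seems reasonable in a safety-critical setting.

```python
from collections import Counter

# The four damage levels, ordered from least to most severe.
LEVELS = ["no", "minor", "major", "destroyed"]


def jury_vote(votes):
    """Return (label, agreement) for a list of per-model damage labels.

    Ties are broken toward the more severe level. That rule is an
    assumption of this sketch; the paper may resolve ties differently.
    """
    if not votes:
        raise ValueError("at least one vote is required")
    unknown = set(votes) - set(LEVELS)
    if unknown:
        raise ValueError(f"unknown damage labels: {sorted(unknown)}")
    counts = Counter(votes)
    top = max(counts.values())
    # LEVELS is ordered by severity, so the last tied entry is the most severe.
    tied = [level for level in LEVELS if counts[level] == top]
    return tied[-1], top / len(votes)


label, agreement = jury_vote(["major", "major", "destroyed"])
print(label)  # major

# A two-way tie falls back to the more severe level.
print(jury_vote(["minor", "destroyed"])[0])  # destroyed
```

The agreement ratio is returned alongside the label so that low-consensus buildings could be flagged for manual review instead of being reported with the same confidence as unanimous ones.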

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same staged approach could be tested on additional disaster types such as earthquakes or floods to check whether the resolution boost and semantic step generalize.
Connecting the pipeline to streaming satellite feeds might allow near-real-time damage mapping rather than post-event batch processing.
The jury mechanism could be extended to include human-in-the-loop overrides for the highest-severity cases.

Load-bearing premise

That vision-language model outputs scored by CLIPScore and jury voting reliably reflect actual structural damage severity when no ground-truth damage labels are available for validation.
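
It helps to see what CLIPScore actually computes. As defined by Hessel et al., the score is a rescaled, clipped cosine similarity between a CLIP image embedding and a CLIP text embedding, with weight w = 2.5. The sketch below implements only that formula on plain Python lists; the short vectors are placeholders standing in for real CLIP embeddings, and the function names are hypothetical.

```python
import math

W = 2.5  # rescaling weight from the CLIPScore definition (Hessel et al.)


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same length")
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("embeddings must be non-zero")
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def clipscore(image_emb, text_emb, w=W):
    """CLIPScore = w * max(cos(image, text), 0)."""
    return w * max(cosine(image_emb, text_emb), 0.0)


# Placeholder vectors; real use would take embeddings from a CLIP model.
print(clipscore([1.0, 0.0], [1.0, 0.0]))   # 2.5
print(clipscore([1.0, 0.0], [-1.0, 0.0]))  # 0.0
```

The formula rewards a description for matching the image content in CLIP's embedding space. Nothing in it refers to a damage label, which is why the premise above needs a separate check against ground truth.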

What would settle it

A side-by-side comparison of the framework's four-level damage predictions against independent expert visual annotations or on-site inspection records for the same buildings in the Moore Tornado or Hurricane Matthew image sets.

Figures

Figures reproduced from arXiv: 2603.22768 by Bijay Shakya, Catherine Hoier, Khandaker Mamun Ahmed.

**Figure 2.** Workflow of the proposed multi-stage structural dam…

**Figure 3.** Overview of the Multi-VLM framework for disas…

**Figure 4.** Working mechanism of the VLM-As-A-Jury metric.

**Figure 5.** Aggregated word clouds of VLM-generated damage de…

Original abstract

Rapid and accurate structural damage assessment following natural disasters is critical for effective emergency response and recovery. However, remote sensing imagery often suffers from low spatial resolution, contextual ambiguity, and limited semantic interpretability, reducing the reliability of traditional detection pipelines. In this work, we propose a novel hybrid framework that integrates AI-based super-resolution, deep learning object detection, and Vision-Language Models (VLMs) for comprehensive post-disaster building damage assessment. First, we enhance pre- and post-disaster satellite imagery using a Video Restoration Transformer (VRT) to upscale images from 1024x1024 to 4096x4096 resolution, improving structural detail visibility. Next, a YOLOv11-based detector localizes buildings in pre-disaster imagery, and cropped building regions are analyzed using VLMs to semantically assess structural damage across four severity levels. To ensure robust evaluation in the absence of ground-truth captions, we employ CLIPScore for reference-free semantic alignment and introduce a multi-model VLM-as-a-Jury strategy to reduce individual model bias in safety-critical decision making. Experiments on subsets of the xBD dataset, including the Moore Tornado and Hurricane Matthew events, demonstrate that the proposed framework enhances the semantic interpretation of damaged buildings. In addition, our framework provides helpful recommendations to first responders for recovery based on damage analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper chains VRT upsampling, YOLOv11 detection, and VLM semantics on xBD subsets but evaluates with unvalidated reference-free metrics instead of the dataset's own damage labels.

The main contribution here is a sequential pipeline: VRT super-resolves the satellite images from 1024x1024 to 4096x4096, YOLOv11 finds buildings in the pre-event imagery, and VLMs then assign one of four damage levels to the cropped regions. They run this on xBD subsets from the Moore Tornado and Hurricane Matthew events and use CLIPScore plus a multi-VLM jury to score outputs when no caption ground truth exists. The jury step is a sensible practical choice for reducing single-model bias in a safety-critical setting, and the focus on real disaster events gives the work a clear applied angle. The writing is straightforward about the stages and the motivation for reference-free metrics.

The soft spot is the evaluation. xBD supplies explicit per-building labels (no/minor/major/destroyed), yet the paper reports no accuracy, kappa, or confusion numbers showing how well the VLM jury or CLIPScore tracks those labels. Without that check or any baseline comparison, the claim that the framework enhances semantic interpretation rests on an untested assumption. The abstract states the experiments demonstrate improvement, but the missing alignment data leaves the strength of that demonstration unclear.

This is the kind of applied CV work that might interest groups building tools for emergency responders who need interpretable outputs. It does not introduce new techniques or shift broader understanding of the components. I would not bring it to the next reading group. I would not cite it in the next twelve months. It does not look ready for peer review until the authors add the direct comparison to the existing ground-truth labels.

Referee Report

1 major / 1 minor

Summary. The manuscript proposes a multi-stage framework for post-disaster structural damage assessment from satellite imagery. It combines VRT-based super-resolution (1024x1024 to 4096x4096), YOLOv11 object detection on pre-disaster images to localize buildings, and VLM-based semantic analysis of cropped regions to classify damage into four severity levels. In the absence of ground-truth captions, evaluation relies on reference-free CLIPScore for semantic alignment and a multi-VLM 'jury' voting strategy to mitigate bias. Experiments on xBD subsets (Moore Tornado, Hurricane Matthew) claim that the pipeline enhances semantic interpretation of damaged buildings and can provide recommendations to first responders.

Significance. If the reference-free metrics were shown to align with actual damage severity, the work would offer a practical advance in automated disaster response by improving interpretability of low-resolution imagery through combined super-resolution, detection, and language-based reasoning. The jury strategy is a reasonable approach for safety-critical applications. However, the current lack of validation against available ground-truth labels substantially reduces the demonstrated significance.

major comments (1)

[Experiments] Experiments section: xBD provides explicit per-building ground-truth damage labels (no/minor/major/destroyed), yet the evaluation reports no quantitative alignment (accuracy, Cohen's kappa, or confusion matrix) between VLM outputs / jury votes and these labels. The claim that the framework 'enhances semantic interpretation' therefore rests on the untested assumption that CLIPScore and VLM consensus track factual damage severity; this is load-bearing for the central contribution.
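
The alignment check requested here is cheap to compute once jury labels and xBD labels are paired per building. The sketch below shows one way to derive the confusion matrix, accuracy, and Cohen's kappa in plain Python. The function names are hypothetical and the six label pairs are invented toy data, not results from the paper or from xBD.

```python
# The four damage levels used as xBD-style ground-truth classes.
LEVELS = ["no", "minor", "major", "destroyed"]


def confusion_matrix(truth, pred, levels=LEVELS):
    """Rows index the ground-truth level, columns the predicted level."""
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    index = {level: i for i, level in enumerate(levels)}
    matrix = [[0] * len(levels) for _ in levels]
    for t, p in zip(truth, pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal (observed agreement)."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def cohens_kappa(matrix):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    total = sum(sum(row) for row in matrix)
    p_o = accuracy(matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    p_e = sum(r * c for r, c in zip(row_sums, col_sums)) / total**2
    if p_e == 1.0:
        return float("nan")  # undefined when every label falls in one class
    return (p_o - p_e) / (1.0 - p_e)


# Invented toy labels for six buildings, for illustration only.
truth = ["no", "no", "minor", "major", "destroyed", "destroyed"]
jury = ["no", "minor", "minor", "major", "major", "destroyed"]

matrix = confusion_matrix(truth, jury)
print(matrix)  # [[1, 1, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 1, 1]]
print(round(cohens_kappa(matrix), 3))  # 0.571
```

Kappa is worth reporting next to accuracy because damage datasets are usually dominated by undamaged buildings, so a jury that mostly answers "no" can score high accuracy while agreeing with the labels little better than chance.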

minor comments (1)

[Abstract] Abstract and §4: The statement that the framework 'provides helpful recommendations to first responders' is asserted without any description of how recommendations are generated from the damage scores or any example outputs.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address the major comment point-by-point below and will incorporate revisions to strengthen the validation.

Referee: [Experiments] Experiments section: xBD provides explicit per-building ground-truth damage labels (no/minor/major/destroyed), yet the evaluation reports no quantitative alignment (accuracy, Cohen's kappa, or confusion matrix) between VLM outputs / jury votes and these labels. The claim that the framework 'enhances semantic interpretation' therefore rests on the untested assumption that CLIPScore and VLM consensus track factual damage severity; this is load-bearing for the central contribution.

Authors: We agree that direct quantitative comparison to the xBD ground-truth damage labels would provide stronger validation of the framework's semantic outputs. Our original evaluation emphasized reference-free metrics (CLIPScore and jury consensus) due to the absence of ground-truth captions for the VLM-generated descriptions, but we acknowledge that alignment with the available per-building labels (no/minor/major/destroyed) is feasible and important. In the revised manuscript, we will add accuracy, Cohen's kappa, and confusion matrix results comparing the multi-VLM jury classifications to the xBD ground-truth labels on the Moore Tornado and Hurricane Matthew subsets. This will empirically test whether the semantic interpretations track factual damage severity.

revision: yes

Circularity Check

0 steps flagged

No circularity: pipeline assembles independent components and reference-free metrics without self-referential reduction

The paper describes a sequential pipeline (VRT super-resolution on 1024x1024 imagery to 4096x4096, YOLOv11 detection on pre-disaster images, VLM semantic labeling into four damage levels) evaluated via CLIPScore and multi-VLM jury voting. No equations, fitted parameters, or derivations are presented that reduce a claimed output to the input by construction. The justification for reference-free metrics is the explicit absence of ground-truth captions, not a loop back to the framework's own outputs. No self-citations appear in the provided text as load-bearing premises. The central claim of enhanced semantic interpretation therefore rests on external model capabilities and consensus scoring rather than any enumerated circular pattern.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The approach depends on standard assumptions in AI about model generalization to damage assessment tasks and the utility of reference-free metrics.

axioms (2)

domain assumption: Vision-Language Models can reliably assess structural damage severity from image crops. Invoked in the semantic assessment stage without ground truth.
domain assumption: CLIPScore provides a valid proxy for semantic alignment in damage classification. Used for reference-free evaluation.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: hybrid framework that integrates AI-based super-resolution, deep learning object detection, and Vision-Language Models (VLMs) ... CLIPScore ... VLM-as-a-Jury

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: CLIPScore for reference-free semantic alignment ... four severity levels

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

Computer vision framework for crack detec- tion of civil infrastructure—a review.Engineering Applica- tions of Artificial Intelligence, 117:105478, 2023

Dihao Ai, Guiyuan Jiang, Siew-Kei Lam, Peilan He, and Chengwu Li. Computer vision framework for crack detec- tion of civil infrastructure—a review.Engineering Applica- tions of Artificial Intelligence, 117:105478, 2023. 2

work page 2023

[2] [2]

Integrating machine learn- ing and remote sensing in disaster management: A decadal review of post-disaster building damage assessment.Build- ings, 14(8):2344, 2024

Sultan Al Shafian and Da Hu. Integrating machine learn- ing and remote sensing in disaster management: A decadal review of post-disaster building damage assessment.Build- ings, 14(8):2344, 2024. 2

work page 2024

[3] [3]

Shuai Bai, Yuxuan Cai, and Keming et. al. Zhu. Qwen3-vl technical report.arXiv preprint arXiv:2511.21631, 2025. 5

work page internal anchor Pith review Pith/arXiv arXiv 2025

[4] [4]

Mask-to-height: A yolov11-based archi- tecture for joint building instance segmentation and height classification from satellite imagery

Mahmoud El Hussieni, Bahadır K G ¨unt¨urk, Hasan F Ates ¸, and O˘guz Hano˘glu. Mask-to-height: A yolov11-based archi- tecture for joint building instance segmentation and height classification from satellite imagery. In2025 Innovations in Intelligent Systems and Applications Conference (ASYU), pages 1–6. IEEE, 2025. 4

work page 2025

[5] [5]

Toward faster and accu- rate post-disaster damage assessment: Development of end- to-end building damage detection framework with super- resolution architecture

Xuanchao Fu, Toru Kouyama, Hang Yang, Ryosuke Naka- mura, and Ichiro Yoshikawa. Toward faster and accu- rate post-disaster damage assessment: Development of end- to-end building damage detection framework with super- resolution architecture. InIGARSS 2022-2022 IEEE Inter- national Geoscience and Remote Sensing Symposium, pages 1588–1591. IEEE, 2022. 2

work page 2022

[6] [6]

xBD: A Dataset for Assessing Building Damage from Satellite Imagery, November 2019

Ritwik Gupta, Richard Hosfelt, Sandra Sajeev, Nirav Patel, Bryce Goodman, Jigar Doshi, Eric Heim, Howie Choset, and Matthew Gaston. xbd: A dataset for assessing building dam- age from satellite imagery.arXiv preprint arXiv:1911.09296,

work page arXiv 1911

[7] [7]

xbd: A dataset for assessing building dam- age from satellite imagery, 2019

Ritwik Gupta, Richard Hosfelt, Sandra Sajeev, Nirav Patel, Bryce Goodman, Jigar Doshi, Eric Heim, Howie Choset, and Matthew Gaston. xbd: A dataset for assessing building dam- age from satellite imagery, 2019. 3

work page 2019

[8] [8]

Clipscore: A reference-free evaluation met- ric for image captioning, 2022

Jack Hessel, Ari Holtzman, Maxwell Forbes, Ronan Le Bras, and Yejin Choi. Clipscore: A reference-free evaluation met- ric for image captioning, 2022. 5

work page 2022

[9] [9]

Super-resolution images methodology applied to uav datasets to road pavement mon- itoring.Drones, 6(7):171, 2022

Laura Inzerillo, Francesco Acuto, Gaetano Di Mino, and Mohammed Zeeshan Uddin. Super-resolution images methodology applied to uav datasets to road pavement mon- itoring.Drones, 6(7):171, 2022. 2

work page 2022

[10] [10]

Building damage detection via superpixel-based belief fu- sion of space-borne sar and optical images.IEEE Sensors Journal, 20(4):2008–2022, 2019

Xiao Jiang, You He, Gang Li, Yu Liu, and Xiao-Ping Zhang. Building damage detection via superpixel-based belief fu- sion of space-borne sar and optical images.IEEE Sensors Journal, 20(4):2008–2022, 2019. 2

work page 2008

[11] [11]

Zeshot-vqa: Zero-shot visual question answering framework with an- swer mapping for natural disaster damage assessment.arXiv preprint arXiv:2506.00238, 2025

Ehsan Karimi and Maryam Rahnemoonfar. Zeshot-vqa: Zero-shot visual question answering framework with an- swer mapping for natural disaster damage assessment.arXiv preprint arXiv:2506.00238, 2025. 2, 3

work page arXiv 2025

[12] [12]

Jin Kim, Seungbo Shim, Seok-Jun Kang, and Gye-Chun Cho. Learning structure for concrete crack detection us- ing robust super-resolution with generative adversarial net- work.Structural Control and Health Monitoring, 2023(1): 8850290, 2023. 2

work page 2023

[13]

Umut Lagap and Saman Ghaffarian. Enhancing post-disaster damage detection and recovery monitoring by addressing class imbalance in satellite imagery using enhanced super-resolution GANs (ESRGAN). The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 48:853–860, 2025.

[14]

Yundong Li, Wei Hu, Han Dong, and Xueyan Zhang. Building damage detection from post-event aerial imagery using single shot multibox detector. Applied Sciences, 9(6):1128,

[15]

Jingyun Liang, Jiezhang Cao, Yuchen Fan, Kai Zhang, Rakesh Ranjan, Yawei Li, Radu Timofte, and Luc Van Gool. VRT: A video restoration transformer. arXiv preprint arXiv:2201.12288, 2022.

[16]

David Malmgren-Hansen, Thomas Sohnesen, Peter Fisker, and Javier Baez. Sentinel-1 change detection analysis for cyclone damage assessment in urban environments. Remote Sensing, 12(15):2409, 2020.

[17]

Fabio Martinelli, Francesco Mercaldo, and Antonella Santone. Damage detection and localisation using UAV/drone with object detection. Procedia Computer Science, 225:118–127, 2023.

[18]

National Oceanic and Atmospheric Administration. 2024 tornado activity reached near-historic levels across the U.S. https://www.weather.gov/news/250703_tornado_activity, 2024. Accessed: Feb. 4, 2026.

[19]

National Oceanic and Atmospheric Administration and US National Weather Service. Number of lives lost due to tornadoes in the United States from 1995 to 2023. https://www.statista.com/statistics/203694/number-of-fatalities-caused-by-tornadoes-in-the-us/, 2024. Statista (release date: May 2024). Accessed: 2026-02-28.

[20]

National Weather Service, Indianapolis, IN. February 19 tornadoes and severe storms. https://www.weather.gov/ind/feb192026severe, 2026. Accessed: 2026-02-28.

[21]

Nikhil M Pawar, Jorge A Prozzi, Feng Hong, and Surya Sarat Chandra Congress. Deep learning framework for infrastructure maintenance: Crack detection and high-resolution imaging of infrastructure surfaces. arXiv preprint arXiv:2505.03974, 2025.

[22]

Karen H Perce. Disaster recovery lessons learned from an occupational health and human resources perspective. AAOHN Journal, 55(6):235–240, 2007.

[23]

Van Vung Pham. Improving road damage detection accuracy using deep learning image enhancement models. Technical report, Institute for Homeland Security, 2024.

[24]

Kumari Pratibha, Mayank Mishra, GV Ramana, and Paulo B Lourenço. Deep learning-based YOLO network model for detecting surface cracks during structural health monitoring. In International Conference on Structural Analysis of Historical Constructions, pages 179–187. Springer, 2023.

[25]

QuakePulse. 55 km NNW of Kota Belud, Malaysia (event id: us6000sasz). https://www.quakepulse.com/earthquake/us6000sasz/55-km-nnw-of-kota-belud-malaysia, 2026. Accessed: 2026-02-28.

[26]

Md Mahfuzur Rahman, Kishor Datta Gupta, Marufa Kamal, Fahad Rahman, Sunzida Siddique, Ahmed Rafi Hasan, Mohd Ariful Haque, and Roy George. VLCE: A knowledge-enhanced framework for image description in disaster assessment. arXiv preprint arXiv:2509.21609, 2025.

[27]

Rakesh Raushan, Vaibhav Singhal, and Rajib Kumar Jha. Damage detection in concrete structures with multi-feature backgrounds using the YOLO network family. Automation in Construction, 170:105887, 2025.

[28]

Argho Sarkar and Maryam Rahnemoonfar. VQA-Aid: Visual question answering for post-disaster damage assessment and analysis. In 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, pages 8660–8663. IEEE,

[29]

Gemma Team and Google DeepMind. Gemma 3 technical report. arXiv preprint arXiv:2503.19786, 2025.

[30]

Robinson Umeike, Thang Dao, and Shane Crawford. Accelerating post-tornado disaster assessment using advanced deep learning models. In 2024 IEEE MetroCon, pages 1–3. IEEE, 2024.

[31]

UNHCR. Myanmar earthquake: One-month impact report (March–April 2025). Impact report, United Nations High Commissioner for Refugees (UNHCR), United Nations in Myanmar, 2025. Accessed: 2026-02-28.

[32]

United Nations Office for Disaster Risk Reduction. GAR 2025 hazard explorations: Earthquakes. https://www.undrr.org/gar/gar2025/hazard-exploration/earthquakes, 2025. Accessed: Feb. 4,

[33]

Junjue Wang, Weihao Xuan, Heli Qi, Zhihao Liu, Kunyi Liu, Yuhan Wu, Hongruixuan Chen, Jian Song, Junshi Xia, Zhuo Zheng, et al. DisasterM3: A remote sensing vision-language dataset for disaster damage assessment and response. arXiv preprint arXiv:2505.21089, 2025.

[34]

World Meteorological Organization (WMO). Tropical cyclone Gezani hits Madagascar and threatens Mozambique. https://wmo.int/media/news/tropical-cyclone-gezani-hits-madagascar-and-threatens-mozambique, 2026. Accessed: 2026-02-

[35]

Bo Yuan, Zhaoyun Sun, Lili Pei, Wei Li, Minghang Ding, and Xueli Hao. Super-resolution reconstruction method of pavement crack images based on an improved generative adversarial network. Sensors, 22(23):9092, 2022.

[36]

Yuzhe Zhao and Haizhong Qian. MAR-YOLO: Multi-scale feature adaptive selection and asymptotic pyramid for oriented building detection in remote sensing images. Scientific Reports, 2025.

[37]

Jianjun Zhou, Jianbo Zhang, Jiangang Jia, and Jie Liu. SRGAN based super-resolution reconstruction of power inspection images. Discover Applied Sciences, 6(12):639, 2024.