
arxiv: 2606.21819 · v1 · pith:J2376PDInew · submitted 2026-06-20 · 💻 cs.CV

RAPID: A Reproducible Multi-Agent Pipeline for Interpretable Disaster Damage Assessment from Satellite and Street-View Imagery

Pith reviewed 2026-06-26 12:43 UTC · model grok-4.3

classification 💻 cs.CV
keywords multi-agent systems · disaster damage assessment · zero-shot learning · satellite imagery · street-view imagery · cross-view analysis · interpretable AI · emergency response

The pith

A multi-agent pipeline performs zero-shot disaster damage assessment by coordinating agents across satellite and street-view images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces RAPID as a pipeline that uses multiple specialized agents to analyze pre- and post-disaster imagery without any task-specific training or fine-tuning. It integrates cross-view data to classify disaster types such as hurricanes and floods, predict damage severity levels, and generate interpretable reports with actionable suggestions. The approach addresses limitations of supervised methods that fail under domain shifts and heterogeneous data sources. A reader would care if the method allows scalable, autonomous assessment in new disaster scenarios where labeled data is scarce.

Core claim

RAPID coordinates specialized agents to perform cross-view understanding, image restoration, structured damage recognition, and geographical reasoning across heterogeneous modalities including pre- and post-disaster street-view images and post-disaster remote sensing imagery. Without task-specific fine-tuning, the system supports zero-shot damage assessment for multiple disaster types and produces fine-grained, interpretable outputs along with location-specific reports.

What carries the argument

A multi-agent pipeline that coordinates agents for cross-view understanding, image restoration, structured damage recognition, and geographical reasoning to fuse satellite and street-view data.
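
The abstract does not describe the coordination protocol, so the following is only a minimal sketch of one way four such agents could be chained, assuming a simple sequential hand-off through shared state. Every name here (`Case`, `run_pipeline`, the agent functions, the modality keys) is hypothetical and the agent bodies are placeholders, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Case:
    """Inputs and accumulated findings for one location (hypothetical structure)."""
    images: Dict[str, str]  # modality name -> image path or identifier
    findings: Dict[str, object] = field(default_factory=dict)


Agent = Callable[[Case], Dict[str, object]]


def restoration_agent(case: Case) -> Dict[str, object]:
    # Placeholder: a real agent would restore degraded images here.
    return {"restored": sorted(case.images)}


def cross_view_agent(case: Case) -> Dict[str, object]:
    # Records whether both overhead and ground-level views are present.
    has_satellite = "satellite_post" in case.images
    has_street = any(name.startswith("street") for name in case.images)
    return {"cross_view": has_satellite and has_street}


def damage_agent(case: Case) -> Dict[str, object]:
    # Placeholder for a vision-language model call; returns no real prediction.
    return {"disaster_type": "unknown", "severity": None}


def geo_agent(case: Case) -> Dict[str, object]:
    # Summarizes earlier findings into a short location-level report string.
    report = f"views={len(case.images)}, cross_view={case.findings.get('cross_view')}"
    return {"report": report}


DEFAULT_AGENTS: List[Agent] = [
    restoration_agent, cross_view_agent, damage_agent, geo_agent,
]


def run_pipeline(case: Case, agents: List[Agent]) -> Case:
    """Run agents in order; each one reads the shared findings and adds its own."""
    for agent in agents:
        case.findings.update(agent(case))
    return case


example = run_pipeline(
    Case(images={"satellite_post": "s.png", "street_pre": "a.png", "street_post": "b.png"}),
    DEFAULT_AGENTS,
)
print(example.findings["report"])
```

A sequential hand-off is the simplest choice for illustration; the paper may well use a planner or a different ordering.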

If this is right

  • The system produces assessments for hurricanes, floods, wildfires, and earthquakes using mixed pre- and post-event imagery.
  • It generates location-specific reports that combine damage type, severity, and response suggestions.
  • Overall accuracy reaches 0.92 for multi-disaster type classification and up to 0.627 for cross-view damage severity prediction across tested scenarios.
  • The pipeline operates on heterogeneous sources including remote sensing and ground-level views without retraining.
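
For readers who want to check figures of this kind, here is a small sketch of the usual computation, assuming "overall accuracy" means the fraction of samples predicted correctly. The labels below are toy values made up for illustration, not data from the paper, and the function names are hypothetical.

```python
from collections import Counter


def overall_accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs, i.e. a sparse confusion matrix."""
    return Counter(zip(y_true, y_pred))


# Toy labels (illustrative only).
toy_true = ["hurricane", "flood", "wildfire", "earthquake"]
toy_pred = ["hurricane", "flood", "wildfire", "flood"]

print(overall_accuracy(toy_true, toy_pred))  # 0.75
print(confusion_counts(toy_true, toy_pred)[("earthquake", "flood")])  # 1
```

Overall accuracy alone can hide weak classes when the class distribution is skewed, which is why the per-pair counts are shown alongside it.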

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the coordination mechanism holds, similar agent structures could be tested on other multimodal geospatial tasks such as change detection in urban areas.
  • The zero-shot property suggests the approach may lower data collection costs for new regions, though this requires separate validation on unseen geographies.
  • Extending the pipeline to include temporal sequences of images could be examined to track damage progression over time.

Load-bearing premise

The specialized agents can coordinate effectively to integrate information from different image types and reason about damage without any additional training on disaster-specific data.

What would settle it

The zero-shot claim would be falsified if, on a held-out disaster event with new image pairs, multi-disaster classification accuracy fell to near-random levels and damage severity predictions showed no correlation with ground truth.
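
The test described above can be sketched in code. This is an illustration under stated assumptions: chance level is taken as 1/k for k balanced classes, "no correlation" is measured with Spearman rank correlation, and the margin and threshold values are arbitrary placeholders rather than anything the paper specifies. All function names are hypothetical.

```python
def _average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = _average_ranks(x), _average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # one side is constant, so there is no rank signal
    return cov / (vx * vy) ** 0.5


def zero_shot_claim_falsified(accuracy, n_classes, severity_rho,
                              acc_margin=0.05, rho_threshold=0.1):
    """True only if accuracy is near chance AND severity shows no rank correlation.

    acc_margin and rho_threshold are assumed cut-offs chosen for illustration.
    """
    near_random = accuracy <= 1.0 / n_classes + acc_margin
    uncorrelated = abs(severity_rho) < rho_threshold
    return near_random and uncorrelated


# Toy severity levels (illustrative only): predictions track ground truth exactly.
rho = spearman([0, 1, 2, 3], [0, 1, 2, 3])
print(zero_shot_claim_falsified(accuracy=0.92, n_classes=4, severity_rho=rho))  # False
```

Both conditions are required because the source states them jointly; a drop in only one of the two metrics would weaken the claim without meeting this stricter test.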

Figures

Figures reproduced from arXiv: 2606.21819 by Hao Li, Kaili Zhang, Lei Zou, Wenjing Gong, Xinyue Ye, Yifan Yang, Zhengzhong Tu, Zongrong Li.

Figure 1: RAPID: an autonomous multi-agent framework for disaster damage assessment.
Figure 2: Geographic locations of disasters used in the evalu...
Figure 3: Key statistics of the dataset.
Figure 4: Visual comparison of restoration outputs produced...
Figure 5: Confusion matrices of the best-performing model...
Figure 6: Object detection results of Gemini-3-Pro on a post...
Figure 7: Comparative cross-view disaster assessment and reasoning results across Gemini-3-Pro, Gemini-2.5-Pro, and ChatGPT...
Figure 8: LLM and human evaluation of disaster reasoning.
Original abstract

Due to the increasing frequency and intensity of extreme climate events, there is a clear demand for intelligent, scalable, and autonomous approaches to disaster damage assessment. Existing methods, largely based on supervised learning and task-specific fine-tuning, struggle to generalize under domain shifts, long-tailed data distributions, and heterogeneous geospatial data sources, especially in disaster scenarios. They also often lack the ability to integrate and reason across multimodal geospatial information, such as satellite images and street-view images. In this paper, we introduce RAPID, a reproducible multi-agent pipeline for interpretable disaster damage assessment, including damage-level assessment, damage-type interpretation, and actionable suggestions for response, remediation, and recovery. RAPID coordinates specialized agents to perform cross-view understanding, image restoration, structured damage recognition, and geographical reasoning across heterogeneous data modalities. Without task-specific fine-tuning, RAPID supports zero-shot damage assessment by jointly using complementary information from remote sensing and ground-level perspectives. The system produces fine-grained, interpretable assessments and automatically generates location-specific, decision-relevant disaster reports to support early-stage emergency response. We evaluate RAPID across hurricanes, floods, wildfires, and earthquakes using multiple cross-view imagery inputs, including pre- and post-disaster street-view images, post-disaster remote sensing imagery, and street-view image pairs. Experiments show that RAPID achieves 0.92 overall accuracy for multi-disaster type classification and up to 0.627 for cross-view damage severity prediction, highlighting its potential as a foundational framework for autonomous disaster intelligence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces RAPID, a reproducible multi-agent pipeline for interpretable zero-shot disaster damage assessment from satellite and street-view imagery. Specialized agents handle cross-view understanding, image restoration, structured damage recognition, and geographical reasoning across heterogeneous modalities without task-specific fine-tuning. The system generates fine-grained assessments and location-specific reports. Experiments on hurricanes, floods, wildfires, and earthquakes report 0.92 overall accuracy for multi-disaster type classification and up to 0.627 for cross-view damage severity prediction.

Significance. If the zero-shot results are shown to arise from genuine agent-based cross-view reasoning rather than base-model leakage, the work could offer a useful framework for scalable, interpretable autonomous disaster intelligence that addresses domain-shift issues in supervised geospatial methods. The explicit emphasis on reproducibility and the multi-agent design for integrating remote-sensing and ground-level data are strengths that could support follow-on research.

major comments (1)
  1. [Abstract] The central zero-shot generalization claim (0.92 classification accuracy and 0.627 severity prediction) is load-bearing for the contribution, yet the abstract supplies no details on the specific base VLMs employed, their training cutoffs, or any verification that the evaluation imagery (pre/post street-view and satellite pairs) does not overlap with publicly scraped disaster datasets commonly used in VLM pre-training. Without this information the reported numbers cannot be confirmed to demonstrate the claimed pipeline-driven generalization.
minor comments (2)
  1. A diagram or pseudocode listing the agent coordination protocol and information flow would clarify how the specialized agents jointly perform cross-view reasoning.
  2. The evaluation description should explicitly state the number of samples, class balance, and train/test split protocol for each disaster type to allow reproduction.

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for the constructive feedback on the abstract and the zero-shot claims. We address the point below and will revise the manuscript to improve transparency.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The central zero-shot generalization claim (0.92 classification accuracy and 0.627 severity prediction) is load-bearing for the contribution, yet the abstract supplies no details on the specific base VLMs employed, their training cutoffs, or any verification that the evaluation imagery (pre/post street-view and satellite pairs) does not overlap with publicly scraped disaster datasets commonly used in VLM pre-training. Without this information the reported numbers cannot be confirmed to demonstrate the claimed pipeline-driven generalization.

    Authors: We agree the abstract should be more explicit to support the zero-shot claims. In revision we will name the base VLMs (e.g., the specific models powering each agent), note their publicly reported training cutoffs, and briefly state that evaluation imagery is drawn from post-cutoff disaster events. We will also add a short methods paragraph clarifying that performance derives from the multi-agent cross-view reasoning pipeline rather than single-model memorization. These additions preserve abstract length while addressing the concern directly.

    revision: yes

standing simulated objections not resolved
  • Complete verification that no evaluation images appear in the proprietary training sets of the base VLMs, as full training corpora are not disclosed by model providers.
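
The partial check the simulated authors propose, restricting evaluation to events after a model's publicly reported training cutoff, can be sketched as below. The event names, dates, and cutoff are placeholders invented for illustration, and `post_cutoff_events` is a hypothetical helper. As the unresolved objection notes, a date filter of this kind cannot rule out overlap with undisclosed training corpora.

```python
from datetime import date


def post_cutoff_events(events, cutoff):
    """Return names of events dated strictly after the given training cutoff."""
    return [name for name, event_date in events.items() if event_date > cutoff]


# Placeholder events and cutoff (not taken from the paper or any model card).
toy_events = {
    "event_a": date(2024, 1, 15),
    "event_b": date(2025, 6, 1),
}
toy_cutoff = date(2024, 12, 31)

print(post_cutoff_events(toy_events, toy_cutoff))  # ['event_b']
```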

Circularity Check

0 steps flagged

No derivation chain or equations; purely empirical pipeline evaluation

full rationale

The paper presents RAPID as an empirical multi-agent system for zero-shot disaster assessment using existing VLMs, with reported accuracies (0.92 classification, 0.627 severity) obtained via evaluation on cross-view imagery datasets. No mathematical derivations, equations, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The central claims rest on experimental results rather than any chain that reduces to its own inputs by construction. This is a standard non-circular empirical systems paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities are identifiable from the provided text.

pith-pipeline@v0.9.1-grok · 5832 in / 1095 out tokens · 34151 ms · 2026-06-26T12:43:20.013464+00:00 · methodology

