Rawat, DisasterQA: A benchmark for assessing the performance of LLMs in disaster response, arXiv preprint arXiv:2410.20707 (2024)
3 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Verdicts: UNVERDICTED 3
Roles: background 1
Polarities: background 1
citing papers explorer
- DisasterBench: A Multimodal Benchmark for UAV-Based Disaster Response in Complex Environments
  DisasterBench is a new multi-stage multimodal reasoning benchmark for UAV disaster response with 14 scenes and 9 tasks; the accompanying 2B DisasterVL model outperforms open-source MLLMs and approaches GPT-4o efficiency.
- RAPID: A Reproducible Multi-Agent Pipeline for Interpretable Disaster Damage Assessment from Satellite and Street-View Imagery
  RAPID is a multi-agent pipeline for zero-shot interpretable damage assessment and reporting from cross-view satellite and street-view imagery across multiple disaster types.
- A Survey of Scaling in Large Language Model Reasoning
  A survey categorizing scaling in LLM reasoning across input size, steps, rounds, training, and future directions, noting that scaling can negatively affect performance.