pith. sign in

arxiv: 1906.10607 · v1 · pith:X6JASD2Fnew · submitted 2019-06-25 · 💻 cs.IR · cs.CL· cs.SI

Newswire versus Social Media for Disaster Response and Recovery

Pith reviewed 2026-05-25 16:00 UTC · model grok-4.3

classification 💻 cs.IR cs.CLcs.SI
keywords disaster responsesocial medianewswiresituational awarenessNepal earthquakestext summarizationtimelinesscomplementary sources
0
0 comments X

The pith

Tweets and newswire articles provide complementary perspectives that together form a holistic view of disaster situations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper asks whether first responders can obtain adequate situational awareness from Twitter alone, newswire alone, or both during disasters, using the 2015 Nepal Earthquakes as the test case. It introduces a linking method to match public tweets with news articles, evaluates several unsupervised summarization techniques on the longer articles, and compares the two sources on timeliness and content. Tweets are also treated as candidate summaries of their linked articles. The main result is that the sources differ in speed and focus, so their combination supplies a more complete picture than either provides separately.

Core claim

In the 2015 Nepal Earthquakes, tweets written by the public and newswire articles supply complementary perspectives that together form a holistic view of the disaster situation. The linking method permits direct comparison showing differences in when information appears and what aspects it covers. Treating matching tweets as summaries of the corresponding news articles further demonstrates how each medium captures distinct elements of events, needs, and impacts.

What carries the argument

A method for linking tweets to newswire articles that enables comparison of timeliness and content, with relevant tweets viewed as summaries of the matched articles.

Load-bearing premise

The method for linking tweets to newswire articles correctly identifies corresponding content without introducing substantial matching errors that would distort the timeliness and content comparisons.

What would settle it

A manual audit of the linked tweet-article pairs that finds a high rate of incorrect matches, or data from another disaster event in which the two sources show largely overlapping rather than complementary information.

Figures

Figures reproduced from arXiv: 1906.10607 by Azadeh Shakery, Daniel Lee, Omprakash Gnawali, Rakesh Verma, Samaneh Karimi.

Figure 1
Figure 1. Figure 1: The number of tweets written about the Nepal earthquake over time. [PITH_FULL_IMAGE:figures/full_fig_p007_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Task Prompt for Human Annotators. 3.2.3. Preprocessing For consistency all data analyzed in conjunction with the News Dataset used the following preprocessing steps: • News article content was parsed into sentences, and then each sentence into word tokens. • Stopwords (words with low information value) were removed. 3Annotator H extractive summaries missing for 3 documents due to unforeseen issues. 10 [PI… view at source ↗
Figure 3
Figure 3. Figure 3: The word clouds representing summaries generated by PKUSUMSUM-Centroid [PITH_FULL_IMAGE:figures/full_fig_p018_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: The word clouds representing summaries generated by PKUSUMSUM-Lead method [PITH_FULL_IMAGE:figures/full_fig_p019_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: The histogram of the temporal distances of the matched pairs annotated as relevant. [PITH_FULL_IMAGE:figures/full_fig_p020_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: The histogram of the temporal distances of the matched pairs annotated as partially [PITH_FULL_IMAGE:figures/full_fig_p020_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: The events obtained by GSDMM method using the tweets that appeared before [PITH_FULL_IMAGE:figures/full_fig_p022_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: The word cloud representation of the news articles’ human annotated summaries. [PITH_FULL_IMAGE:figures/full_fig_p023_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: The word cloud representation of the news articles’ content. [PITH_FULL_IMAGE:figures/full_fig_p024_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: The word cloud representation of the tweets’ content that were annotated as [PITH_FULL_IMAGE:figures/full_fig_p025_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: The word cloud representation of the tweets’ content that were annotated as [PITH_FULL_IMAGE:figures/full_fig_p026_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: The word cloud representation of the news set of words subtracted by the tweets [PITH_FULL_IMAGE:figures/full_fig_p027_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: The word cloud representation of the tweets set of words subtracted by the news [PITH_FULL_IMAGE:figures/full_fig_p028_13.png] view at source ↗
read the original abstract

In a disaster situation, first responders need to quickly acquire situational awareness and prioritize response based on the need, resources available and impact. Can they do this based on digital media such as Twitter alone, or newswire alone, or some combination of the two? We examine this question in the context of the 2015 Nepal Earthquakes. Because newswire articles are longer, effective summaries can be helpful in saving time yet giving key content. We evaluate the effectiveness of several unsupervised summarization techniques in capturing key content. We propose a method to link tweets written by the public and newswire articles, so that we can compare their key characteristics: timeliness, whether tweets appear earlier than their corresponding news articles, and content. A novel idea is to view relevant tweets as a summary of the matching news article and evaluate these summaries. Whenever possible, we present both quantitative and qualitative evaluations. One of our main findings is that tweets and newswire articles provide complementary perspectives that form a holistic view of the disaster situation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper studies the 2015 Nepal Earthquakes to compare newswire articles and tweets for disaster situational awareness. It evaluates several unsupervised summarization methods on newswire, proposes an (unspecified) linking procedure between tweets and articles, compares the two sources on timeliness and content overlap, treats relevant tweets as summaries of matched articles, and concludes that the sources supply complementary perspectives that together yield a holistic view.

Significance. If the linking procedure is shown to be accurate, the work supplies a concrete, data-driven argument for combining social media and traditional news in crisis informatics, together with an empirical test of tweet-as-summary that could be reused in other event-monitoring settings. The absence of any reported link-validation statistics, however, leaves the central complementarity claim unsupported.

major comments (1)
  1. [linking procedure (abstract, §4)] The linking procedure is the load-bearing component for all timeliness and content-comparison results (abstract and §4). No precision, recall, or manual-validation figures are supplied for the generated tweet–article pairs; without them the reported differences in timeliness and the tweet-as-summary evaluation cannot be interpreted.
minor comments (2)
  1. [abstract] The abstract states that both quantitative and qualitative evaluations are presented “whenever possible,” yet the reader is given no indication of which results fall into each category.
  2. [summarization evaluation] The unsupervised summarization baselines are introduced without reference to the specific implementations or parameter settings used.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive review and for highlighting the importance of validating the linking procedure. We agree that this is a critical point and will strengthen the manuscript accordingly.

read point-by-point responses
  1. Referee: [linking procedure (abstract, §4)] The linking procedure is the load-bearing component for all timeliness and content-comparison results (abstract and §4). No precision, recall, or manual-validation figures are supplied for the generated tweet–article pairs; without them the reported differences in timeliness and the tweet-as-summary evaluation cannot be interpreted.

    Authors: We acknowledge that the manuscript does not report precision, recall, or manual validation statistics for the tweet–article linking procedure described in Section 4. This omission limits the interpretability of the timeliness and content-overlap results. In the revised version we will (1) provide a more explicit description of the linking criteria, (2) report the results of a manual validation study on a random sample of 200 candidate pairs (including precision and recall), and (3) discuss any limitations of the linking method. These additions will directly support the complementarity claims. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical comparison with independent data analysis

full rationale

The paper conducts an empirical study on the 2015 Nepal Earthquakes, proposing a linking method between tweets and newswire articles, evaluating unsupervised summarization techniques, and comparing timeliness and content characteristics. No equations, derivations, or first-principles results are claimed that reduce to inputs by construction, fitted parameters renamed as predictions, or load-bearing self-citations. The complementarity finding emerges from direct data observations rather than self-referential definitions or ansatzes smuggled via prior work. The analysis is self-contained against external benchmarks with no reduction of outputs to the linking procedure itself.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are described in the abstract; the work rests on standard assumptions of information-retrieval evaluation (relevance judgments, summary quality metrics) that are not detailed here.

pith-pipeline@v0.9.0 · 5716 in / 994 out tokens · 29948 ms · 2026-05-25T16:00:44.235169+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

37 extracted references · 37 canonical work pages · 2 internal anchors

  1. [1]

    Petrovic, M

    S. Petrovic, M. Osborne, R. McCreadie, C. Macdonald, I. Ounis, L. Shrimp- ton, Can twitter replace newswire for breaking news?, in: Seventh interna- tional AAAI conference on weblogs and social media, 2013

  2. [2]

    Mills, R

    A. Mills, R. Chen, J. Lee, H. Raghav Rao, Web 2.0 emergency applications: How useful can twitter be for emergency response?, Journal of Information Privacy and Security 5 (3) (2009) 3–26

  3. [3]

    C. A. Cassa, R. Chunara, K. Mandl, J. S. Brownstein, Twitter as a sentinel 29 in emergency situations: lessons from the boston marathon explosions, PLoS currents 5

  4. [4]

    P. R. Spence, K. A. Lachlan, X. Lin, M. del Greco, Variability in twit- ter content across the stages of a natural disaster: Implications for crisis communication, Communication Quarterly 63 (2) (2015) 171–186

  5. [5]

    Doan, B.-K

    S. Doan, B.-K. H. Vo, N. Collier, An analysis of twitter messages in the 2011 tohoku earthquake, in: International conference on electronic healthcare, Springer, 2011, pp. 58–66

  6. [6]

    Martinez-Rojas, M

    M. Martinez-Rojas, M. del Carmen Pardo-Ferreira, J. C. Rubio-Romero, Twitter as a tool for the management and analysis of emergency situa- tions: A systematic literature review, International Journal of Information Management 43 (2018) 196–208

  7. [7]

    F. Alam, S. Joty, M. Imran, Graph based semi-supervised learning with convolution neural networks to classify crisis related tweets, in: Twelfth International AAAI Conference on Web and Social Media, 2018

  8. [8]

    H¨ urriyetoglu, C

    A. H¨ urriyetoglu, C. Gudehus, N. Oostdijk, Using relevancer to detect rele- vant tweets: The nepal earthquake case, in: booktitle = Working notes of FIRE 2016 - Forum for Information Retrieval Evaluation, Kolkata, India, December 7-10, 2016., 2016, pp. 76–78

  9. [9]

    Radianti, S

    J. Radianti, S. R. Hiltz, L. Labaka, An overview of public concerns during the recovery period after a major earthquake: Nepal twitter analysis, in: 49th Hawaii International Conference on System Sciences, HICSS 2016, Koloa, HI, USA, January 5-8, 2016, 2016, pp. 136–145

  10. [10]

    Subba, T

    R. Subba, T. Bui, Online convergence behavior, social media communica- tions and crisis response: An empirical study of the 2015 nepal earthquake police twitter project, in: 50th Hawaii International Conference on System Sciences, HICSS 2017, Hilton Waikoloa Village, Hawaii, USA, January 4-7, 2017, 2017, pp. 1–10. 30

  11. [11]

    Y. Su, Z. Lan, Y. Lin, L. K. Comfort, J. Joshi, Tracking disaster response and relief following the 2015 nepal earthquake, in: 2nd IEEE International Conference on Collaboration and Internet Computing, CIC 2016, Pitts- burgh, PA, USA, November 1-3, 2016, 2016, pp. 495–499

  12. [12]

    Tsagkias, M

    M. Tsagkias, M. De Rijke, W. Weerkamp, Linking online news and social media, in: Proceedings of the fourth ACM international conference on Web search and data mining, ACM, 2011, pp. 565–574

  13. [13]

    Linking Tweets with Monolingual and Cross-Lingual News using Transformed Word Embeddings

    A. Mogadala, D. Jung, A. Rettinger, Linking tweets with monolingual and cross-lingual news using transformed word embeddings, arXiv preprint arXiv:1710.09137

  14. [14]

    J. Wang, W. Tong, H. Yu, M. Li, X. Ma, H. Cai, T. Hanratty, J. Han, Mining multi-aspect reflection of news events in twitter: Discovery, link- ing and presentation, in: Data Mining (ICDM), 2015 IEEE International Conference on, IEEE, 2015, pp. 429–438

  15. [15]

    Ahmad, A

    T. Ahmad, A. Ramsay, Linking tweets to news: Is all news of interest?, in: International Conference on Artificial Intelligence: Methodology, Systems, and Applications, Springer, 2016, pp. 151–161

  16. [16]

    Mazoyer, J

    B. Mazoyer, J. Cag´ e, C. Hudelot, M.-L. Viaud, Real-time collection of reliable and representative tweets datasets related to news events, in: Pro- ceedings of the first International Workshop on Analysis of Broad Dynamic Topics over Social Media: BroDyn, Vol. 18, 2018

  17. [17]

    X. Lin, Y. Gu, R. Zhang, J. Fan, Linking news and tweets, in: Australasian Database Conference, Springer, 2016, pp. 467–470

  18. [18]

    W. Guo, H. Li, H. Ji, M. Diab, Linking tweets to news: A framework to enrich short text data in social media, in: Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Vol. 1, 2013, pp. 239–249. 31

  19. [19]

    B. Shi, G. Ifrim, N. Hurley, Be in the know: Connecting news articles to relevant twitter conversations, arXiv preprint arXiv:1405.3117

  20. [20]

    F. Abel, Q. Gao, G.-J. Houben, K. Tao, Semantic enrichment of twitter posts for user profile construction on the social web, in: Extended semantic web conference, Springer, 2011, pp. 375–389

  21. [21]

    McCreadie, C

    R. McCreadie, C. Macdonald, I. Ounis, News vertical search: when and what to display to users, in: Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval, ACM, 2013, pp. 253–262

  22. [22]

    Z. Wei, W. Gao, Gibberish, assistant, or master?: Using tweets linking to news for extractive single-document summarization, in: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM, 2015, pp. 1003–1006

  23. [23]

    Kulcu, E

    S. Kulcu, E. Dogdu, A scalable approach for sentiment analysis of turkish tweets and linking tweets to news, in: Semantic Computing (ICSC), 2016 IEEE Tenth International Conference on, IEEE, 2016, pp. 471–476

  24. [24]

    H. Li, H. Ji, Cross-genre event extraction with knowledge enrichment, in: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2016, pp. 1158–1162

  25. [25]

    Z. Wei, W. Gao, Gibberish, assistant, or master?: Using tweets linking to news for extractive single-document summarization, in: Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, SIGIR ’15, ACM, New York, NY, USA, 2015, pp. 1003–1006. doi:10.1145/2766462.2767835. URL http://doi.acm.org/10.1145...

  26. [26]

    C. Lin, E. H. Hovy, Automatic evaluation of summaries using n-gram co- occurrence statistics, in: Human Language Technology Conference of the 32 North American Chapter of the Association for Computational Linguistics, HLT-NAACL 2003, Edmonton, Canada, May 27 - June 1, 2003, 2003. URL http://aclweb.org/anthology/N/N03/N03-1020.pdf

  27. [27]

    Rudra, S

    K. Rudra, S. Banerjee, N. Ganguly, P. Goyal, M. Imran, P. Mitra, Summarizing situational tweets in crisis scenario, in: Proceedings of the 27th ACM Conference on Hypertext and Social Media, HT ’16, ACM, New York, NY, USA, 2016, pp. 137–147. doi:10.1145/2914586.2914600. URL http://doi.acm.org.ezproxy.lib.uh.edu/10.1145/2914586. 2914600

  28. [28]

    F. Alam, S. Joty, M. Imran, Domain adaptation with adversarial training and graph embeddings, 2018

  29. [29]

    S. Bird, E. Klein, E. Loper, Natural language processing with Python, ” O’Reilly Media, Inc.”, 2009

  30. [30]

    G. A. Miller, Wordnet: A lexical database for english, Communications of the ACM 38 (11) (1995) 39–41

  31. [31]

    W. Guo, H. Li, H. Ji, M. Diab, Linking tweets to news: A framework to enrich short text data in social media, in: Proceedings of the 51th Annual Meeting of the Association for Computational Linguistics, 2013

  32. [32]

    R. M. Verma, D. Lee, Extractive summarization: Limits, compression, generalized model and heuristics, Computaci´ on y Sistemas 21 (4)

  33. [33]

    Zhang, T

    J. Zhang, T. Wang, X. Wan, PKUSUMSUM : A java platform for mul- tilingual document summarization, in: COLING 2016, 26th International Conference on Computational Linguistics, Proceedings of the Conference System Demonstrations, December 11-16, 2016, Osaka, Japan, 2016, pp. 287–291. URL http://aclweb.org/anthology/C/C16/C16-2060.pdf 33

  34. [34]

    Lin, ROUGE: A Package for Automatic Evaluation of Summaries, in: Proceedings of Workshop on Text Summarization Post-Conference Work- shop, (ACL 2004) Barcelona, Spain, 2004

    C. Lin, ROUGE: A Package for Automatic Evaluation of Summaries, in: Proceedings of Workshop on Text Summarization Post-Conference Work- shop, (ACL 2004) Barcelona, Spain, 2004

  35. [35]

    Owczarzak, J

    K. Owczarzak, J. M. Conroy, H. T. Dang, A. Nenkova, An assessment of the accuracy of automatic evaluation in summarization, in: Proceedings of Workshop on Evaluation Metrics and System Comparison for Automatic Summarization@NACCL-HLT 2012, Montr` eal, Canada, June 2012, 2012, 2012, pp. 1–9. URL https://aclanthology.info/papers/W12-2601/w12-2601

  36. [36]

    Y. Graham, Re-evaluating automatic summarization with BLEU and 192 shades of ROUGE, in: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015, Lisbon, Portugal, September 17-21, 2015, 2015, pp. 128–137. URL http://aclweb.org/anthology/D/D15/D15-1013.pdf

  37. [37]

    J. Yin, J. Wang, A dirichlet multinomial mixture model-based approach for short text clustering, in: Proceedings of the 20th ACM SIGKDD inter- national conference on Knowledge discovery and data mining, ACM, 2014, pp. 233–242. 34