Geolocating News about Extreme Climate Events: A Comparative Analysis of Off-the-Shelf Tools for Toponym Identification in German
Pith reviewed 2026-05-07 16:48 UTC · model grok-4.3
The pith
Different NER tools extract different place names from German climate news, producing inconsistent pictures of affected countries.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the NER tools Flair, spaCy, and Stanza identify different toponyms in German news, and that these differences lead to distinct outcomes in downstream country-assignment tasks, which can alter conclusions about countries' prominence in media reports on extreme climate events.
What carries the argument
The pipeline from NER toponym extraction to country-level geolocation decisions using three extrinsic assignment methods.
Load-bearing premise
That the extrinsic country assignment methods accurately reflect the true event locations and that observed differences stem primarily from the NER tools rather than other factors.
What would settle it
Manual verification of event locations in a set of articles and comparison to the automated country assignments from each tool to see if they agree or diverge systematically.
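A minimal sketch of how such a check could be scored, assuming a small set of manually verified labels is available. The article ids, country codes, and per-tool assignments below are invented for illustration and are not taken from the paper.

```python
from collections import Counter

# Hypothetical manually verified event countries, keyed by article id.
gold = {"a1": "DE", "a2": "FR", "a3": "IT", "a4": "DE"}

# Hypothetical automated country assignments derived from each tool's toponyms.
predicted = {
    "flair":  {"a1": "DE", "a2": "FR", "a3": "IT", "a4": "AT"},
    "spacy":  {"a1": "DE", "a2": "DE", "a3": "IT", "a4": "AT"},
    "stanza": {"a1": "DE", "a2": "FR", "a3": "ES", "a4": "DE"},
}

def accuracy(pred, gold):
    """Share of articles whose predicted country matches the manual label."""
    return sum(pred.get(article) == country for article, country in gold.items()) / len(gold)

def confusions(pred, gold):
    """Count (gold, predicted) pairs for mismatches to expose systematic errors."""
    return Counter(
        (country, pred.get(article))
        for article, country in gold.items()
        if pred.get(article) != country
    )

scores = {tool: accuracy(p, gold) for tool, p in predicted.items()}
errors = {tool: confusions(p, gold) for tool, p in predicted.items()}
```

Per-tool accuracy shows whether the tools agree with the manual labels; the mismatch counts show whether a tool's errors concentrate on particular country pairs, which would indicate systematic rather than random divergence.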
Original abstract
Determining the geolocation of extreme climate events and disasters in texts is a common problem in climate impact and adaptation research. Named-entity recognition (NER) tools are typically used to identify a pool of toponyms that serve as candidate event locations. In this study, we conduct a comparative analysis of three off-the-shelf NER tools, namely Flair, Spacy and Stanza. We describe and quantify differences between their outputs for German news articles and evaluate them extrinsically based on three methods to determine the country where events took place. We show how their contrasts are propagated into downstream tasks and can yield distinct decisions about a document's geographical focus, which, in turn, can impact conclusions about countries' prominence in German media.
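One simple way to quantify differences between toponym pools is pairwise set overlap. The sketch below uses Jaccard similarity on invented toponym sets for a single article; the abstract does not state which measure the authors used, so this choice is an assumption.

```python
from itertools import combinations

def jaccard(a, b):
    """Jaccard similarity of two toponym collections (1.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical toponyms extracted from one article by each tool.
toponyms = {
    "flair": {"Ahrtal", "Köln", "Belgien"},
    "spacy": {"Köln", "Belgien", "Rhein"},
    "stanza": {"Ahrtal", "Köln", "Belgien", "Rhein"},
}

# Pairwise overlap between tools, keyed by (tool_a, tool_b).
overlap = {
    (x, y): jaccard(toponyms[x], toponyms[y])
    for x, y in combinations(toponyms, 2)
}
```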
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript conducts a comparative analysis of three off-the-shelf NER tools (Flair, spaCy, and Stanza) for toponym identification in German news articles about extreme climate events. It quantifies differences in their toponym outputs and evaluates them extrinsically via three country-assignment methods, showing that tool variance propagates to alter document-level geographical focus and downstream statistics on countries' prominence in media coverage.
Significance. If the reported differences hold under scrutiny, the work is significant for climate informatics and applied NLP, as it illustrates the sensitivity of geolocation pipelines to upstream NER choices in a domain-specific setting. The direct side-by-side comparison on a shared corpus and the explicit tracing of effects into downstream country-prominence metrics constitute a practical strength, providing evidence that is independent of any claim about the absolute accuracy of the assignment heuristics.
major comments (1)
- [extrinsic evaluation] The extrinsic evaluation section does not report the total number of articles in the corpus, the distribution of toponyms per tool, or any statistical tests (e.g., McNemar or chi-squared) on the frequency of differing country assignments. These omissions make it impossible to assess whether the observed propagation to distinct geographical-focus decisions is frequent enough to materially affect conclusions about country prominence.
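As an illustration of the kind of test requested here, the following sketch computes an exact two-sided McNemar p-value for whether two tools assign a given country at different rates on the same articles. The helper names and the sample assignments are hypothetical, and the paper may call for a different pairing of outcomes.

```python
from math import comb

def discordant_counts(assign_a, assign_b, country):
    """Count articles where exactly one of two tools assigns `country`."""
    b = c = 0
    for article in assign_a.keys() & assign_b.keys():
        in_a = assign_a[article] == country
        in_b = assign_b[article] == country
        if in_a and not in_b:
            b += 1
        elif in_b and not in_a:
            c += 1
    return b, c

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant counts b and c."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Binomial tail P(X <= k) with X ~ Binomial(n, 0.5), doubled and capped at 1.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical country assignments for the same four articles.
assign_a = {"a1": "DE", "a2": "DE", "a3": "FR", "a4": "DE"}
assign_b = {"a1": "DE", "a2": "FR", "a3": "FR", "a4": "AT"}

b, c = discordant_counts(assign_a, assign_b, "DE")
p_value = mcnemar_exact(b, c)
```

Only the discordant articles enter the test, so reporting b and c alongside the p-value also answers the question of how often the tools actually disagree.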
minor comments (3)
- [methods] The three country-assignment methods are referenced but not fully formalized (e.g., tie-breaking rules when multiple toponyms map to different countries); adding pseudocode or explicit decision trees would improve reproducibility.
- [results] No error analysis or example articles are provided showing concrete cases where the three tools produce divergent country labels; including 2-3 such cases would strengthen the propagation claim.
- [abstract] The abstract states that differences 'can yield distinct decisions' but does not preview any quantitative measure of divergence; a single sentence with effect size would better orient readers.
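To make the first minor comment concrete, here is a sketch of what a formalized assignment rule with explicit tie-breaking could look like. It is not one of the paper's three methods: the majority-vote rule, the first-mention tie-break, and the toy gazetteer are all assumptions chosen for illustration.

```python
from collections import Counter

# Hypothetical toy gazetteer; a real pipeline would resolve toponyms against a full gazetteer.
GAZETTEER = {"Berlin": "DE", "Ahrtal": "DE", "Paris": "FR", "Lyon": "FR", "Wien": "AT"}

def assign_country(toponyms):
    """Majority vote over resolved toponyms; ties go to the country mentioned first."""
    countries = [GAZETTEER[t] for t in toponyms if t in GAZETTEER]
    if not countries:
        return None  # no toponym could be resolved to a country
    counts = Counter(countries)
    top = max(counts.values())
    tied = {country for country, n in counts.items() if n == top}
    # Scan in document order so the earliest-mentioned tied country wins.
    return next(country for country in countries if country in tied)

# Hypothetical outputs of two tools for the same article: the second misses "Ahrtal".
label_full = assign_country(["Paris", "Berlin", "Ahrtal"])
label_partial = assign_country(["Paris", "Berlin"])
```

In this toy case a single missed toponym flips the document label from Germany to France, which is the kind of propagation effect the second minor comment asks the authors to document with real examples.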
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. We address the single major comment below and will incorporate the suggested additions into a revised manuscript.
Point-by-point responses
- Referee: [extrinsic evaluation] The extrinsic evaluation section does not report the total number of articles in the corpus, the distribution of toponyms per tool, or any statistical tests (e.g., McNemar or chi-squared) on the frequency of differing country assignments. These omissions make it impossible to assess whether the observed propagation to distinct geographical-focus decisions is frequent enough to materially affect conclusions about country prominence.
  Authors: We agree that these details are necessary for a complete assessment of the practical impact of tool variance. In the revised manuscript we will (1) state the total number of articles in the corpus, (2) add a table or figure reporting the number and distribution of toponyms returned by each tool, and (3) include the results of appropriate statistical tests (chi-squared or McNemar) comparing the frequency of differing country assignments. These additions will allow readers to judge how often the observed propagation materially affects downstream country-prominence statistics.
  Revision: yes
Circularity Check
No significant circularity; purely empirical comparison
Full rationale
The paper conducts a direct empirical comparison of three off-the-shelf NER tools (Flair, spaCy, Stanza) on German news articles, quantifying output differences in toponym pools and propagating them through three fixed extrinsic country-assignment methods. No derivations, equations, fitted parameters, predictions, or self-citations appear in the load-bearing steps; all contrasts are measured against the same corpus and independent assignment rules, so divergences are attributable to tool variance by construction. The central claim requires no internal reduction to inputs and is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Off-the-shelf NER models trained on general German text can reliably identify toponyms in climate-related news articles.
- domain assumption: The three methods for mapping a set of toponyms to a single country label produce meaningful proxies for event location.