Topic models can act as binary classifiers for retrieving news on extreme climate events in German media, with performance varying by hazard type and boosted by keyword probabilities.
Retrieving Floods without Floodlights: Topic Models as Binary Classifiers for Extreme Climate Events in German News
2 Pith papers cite this work.
abstract
In studies of media coverage of extreme climate events, NLP methods have become indispensable for identifying relevant texts in large news databases. Still, enough annotated data to train accurate deep learning-based classifiers from scratch is often not available. Topic Models have the advantage of being both unsupervised and interpretable, but are typically used only for exploratory analysis or data characterisation. In this study, we investigate how to employ Topic Models as binary classifiers for refining the retrieval of relevant news about seven types of extreme climate events in the German media. Our method relies on the posterior distributions estimated by Topic Models to select relevant documents, without modifying their training procedure. Using an annotated sample to guide the evaluation, we show that the probabilities assigned to the keywords used to query news databases can also be informative for selecting relevant topics and can improve sample precision. We compare our results to a fine-tuned text embedding classifier and an open-weight LLM and discuss the observed trade-offs, e.g. the LLM having the lowest precision. Moreover, we show that results are hazard-dependent, which speaks against treating climate events as a single category in NLP tasks.
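The abstract gives no formulas, so the following is only a minimal sketch of one possible reading of the approach: topics are selected by the probability mass they assign to the query keywords, and a document is labelled relevant when its posterior mass on the selected topics is high enough. The function names, both thresholds, and the toy distributions are assumptions made for illustration and are not taken from the paper.

```python
# Hypothetical sketch: names, thresholds, and data are illustrative assumptions,
# not the paper's actual procedure.

def select_topics(topic_word, vocab, keywords, min_keyword_mass=0.05):
    """Return indices of topics whose summed probability on the query
    keywords reaches min_keyword_mass (an assumed cut-off)."""
    keyword_ids = [vocab.index(w) for w in keywords if w in vocab]
    selected = []
    for k, word_probs in enumerate(topic_word):
        mass = sum(word_probs[i] for i in keyword_ids)
        if mass >= min_keyword_mass:
            selected.append(k)
    return selected


def classify(doc_topic, relevant_topics, min_doc_mass=0.5):
    """Label each document True when the posterior mass it places on the
    selected topics reaches min_doc_mass (an assumed cut-off)."""
    return [
        sum(theta[k] for k in relevant_topics) >= min_doc_mass
        for theta in doc_topic
    ]


# Made-up distributions: 2 topics over a 4-word vocabulary, 3 documents.
vocab = ["hochwasser", "flut", "fussball", "tor"]
topic_word = [
    [0.50, 0.40, 0.05, 0.05],  # topic 0: flood-like
    [0.02, 0.03, 0.50, 0.45],  # topic 1: sports-like
]
doc_topic = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.5, 0.5],
]

relevant_topics = select_topics(
    topic_word, vocab, ["hochwasser", "flut"], min_keyword_mass=0.10
)
labels = classify(doc_topic, relevant_topics)
print(relevant_topics, labels)
```

In this reading the topic model itself is trained as usual; only its estimated distributions are thresholded afterwards, which matches the abstract's statement that the training procedure is not modified. In practice both cut-offs would presumably be tuned per hazard type on the annotated sample.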
fields
cs.CL 2
years
2026 2
verdicts
UNVERDICTED 2
representative citing papers
The study applies time series peak detection to German news on Brazilian disasters to assess temporal alignment with actual disaster events in national and global databases.
citing papers explorer
-
The Newsworthiness of Brazilian Distress: A Peak Analysis on Time Series of International Media Attention to Disasters in Brazil
The study applies time series peak detection to German news on Brazilian disasters to assess temporal alignment with actual disaster events in national and global databases.