
arxiv: 2606.26489 · v1 · pith:6Y44BTSKnew · submitted 2026-06-25 · 💻 cs.CL

Comparing BERT Sentence-Pair Classification and Few-Shot LLM Prompting for Detecting Threat and Solution Framing in German Climate News

Pith reviewed 2026-06-26 05:30 UTC · model grok-4.3

classification 💻 cs.CL
keywords BERT · LLM prompting · climate news · framing detection · German language · sentence classification · threat framing · solution framing

The pith

Fine-tuned BERT classifiers reach 0.83 F1 on threat and solution framing in German climate news, beating few-shot LLM prompting at 0.78 F1.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper compares two ways to automatically label sentences in German climate news as threat-oriented, solution-oriented, both, or neither. One method fine-tunes a German BERT model on sentence pairs that include the preceding sentence for context. The other uses few-shot prompting with chain-of-thought reasoning on an open-weights LLM. On a set of 440 manually labeled Austrian articles, the BERT models score higher, showing that task-specific fine-tuning can still outperform general prompting for this framing task in computational social science.

Core claim

The fine-tuned BERT classifiers achieve an F1 score of 0.83 for both the threat and solution tasks, while the LLM-based classifiers reach an F1 of 0.78. An ablation study shows that providing the preceding sentence as context improves BERT performance substantially compared to single-sentence input.
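For readers less familiar with the metric, F1 is the harmonic mean of precision and recall. The sketch below computes it for the positive class of one binary task. The labels are invented for illustration, `binary_f1` is a hypothetical helper name, and the summary does not state which averaging scheme the paper uses, so positive-class F1 is an assumption here.

```python
def binary_f1(gold, pred):
    """F1 for the positive class (label 1) of a binary task."""
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (1 = threat framing present).
gold = [1, 1, 1, 0, 0, 1, 0, 0]
pred = [1, 1, 0, 0, 1, 1, 0, 0]
score = binary_f1(gold, pred)
print(round(score, 2))  # 3 TP, 1 FP, 1 FN -> precision 0.75, recall 0.75, F1 0.75
```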

What carries the argument

Sentence-pair classification in a fine-tuned German BERT model, where the preceding sentence supplies context for the target sentence, run as two independent binary classifiers for threat framing and solution framing.
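A minimal sketch of how such sentence pairs might be built from an article. The function names, the example sentences, and the handling of the first sentence (no context available, so it falls back to single-sentence input) are assumptions for illustration, not details taken from the paper. The bracketed string only visualizes the usual BERT pair layout; a real tokenizer inserts the special tokens itself.

```python
from typing import List, Optional, Tuple


def build_sentence_pairs(
    sentences: List[str], use_context: bool = True
) -> List[Tuple[Optional[str], str]]:
    """Pair each target sentence with the sentence that precedes it.

    With use_context=False every pair has no context, which mimics the
    single-sentence setting of the ablation.
    """
    pairs = []
    for i, target in enumerate(sentences):
        context = sentences[i - 1] if (use_context and i > 0) else None
        pairs.append((context, target))
    return pairs


def to_bert_layout(context: Optional[str], target: str) -> str:
    """Show the conventional [CLS] A [SEP] B [SEP] layout as plain text."""
    if context is None:
        return f"[CLS] {target} [SEP]"
    return f"[CLS] {context} [SEP] {target} [SEP]"


# Invented example sentences, not taken from the corpus.
article = [
    "Die Gletscher schmelzen schneller als erwartet.",
    "Neue Solaranlagen sollen den Ausstoß senken.",
]
pairs = build_sentence_pairs(article)
for context, target in pairs:
    print(to_bert_layout(context, target))
```

Under this setup, the same pairs would be fed to two separately fine-tuned models, one per framing type.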

If this is right

  • Providing the preceding sentence as context improves BERT classification performance substantially compared to single-sentence input.
  • Both methods classify sentences into threat-oriented, solution-oriented, both, or neither categories.
  • The approaches make automated analysis of large German-language climate news corpora feasible where manual coding is impractical.
  • The comparison adds to research on fine-tuned encoder models versus prompted generative models for text classification in computational social science.
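Because threat and solution framing are predicted by two independent binary classifiers, the four categories follow from combining the two outputs. A small sketch of that mapping, with a hypothetical function name and label strings:

```python
def combine_labels(threat: bool, solution: bool) -> str:
    """Map two independent binary decisions onto the four framing categories."""
    if threat and solution:
        return "both"
    if threat:
        return "threat"
    if solution:
        return "solution"
    return "neither"


for t in (False, True):
    for s in (False, True):
        print(t, s, combine_labels(t, s))
```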

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same sentence-pair fine-tuning approach could be tested on framing detection tasks in other languages where modest amounts of labeled data exist.
  • If the performance advantage of BERT persists, it may favor fine-tuning over prompting when the goal is stable classification on domain-specific text.
  • Applying the classifiers to track how threat versus solution framing changes across time periods or news outlets would allow measurement of shifts in climate coverage.

Load-bearing premise

The manually coded corpus of 440 articles, developed with domain experts following a detailed coding scheme, provides reliable ground-truth labels for evaluating classifier performance.

What would settle it

A new collection of German climate news sentences labeled independently by different coders using the same scheme, on which both models achieve F1 scores below 0.70, would indicate the reported performance does not hold.

Original abstract

News media play a central role in shaping public perceptions of climate change, and whether coverage emphasizes threats or solutions has measurable effects on audience engagement and policy support. Automated detection of these framing patterns at the sentence level would allow researchers to analyze large corpora that are infeasible to code manually. We present a systematic comparison of two approaches for classifying sentences from German-language climate news articles as threat-oriented, solution-oriented, both, or neither. The first approach uses few-shot prompting with an open-weights large language model (Llama 4 Maverick), employing chain-of-thought reasoning and structured output with confidence scoring. The second approach fine-tunes a German BERT model (deepset/gbert-large) for sentence-pair classification, where the preceding sentence provides contextual information for the target sentence. Both approaches implement two independent binary classifiers, one for threat framing and one for solution framing. We evaluate both methods on a corpus of 440 Austrian newspaper articles that were manually coded following a detailed coding scheme developed with domain experts. The fine-tuned BERT classifiers achieve an F1 score of 0.83 for both the threat and solution tasks, while the LLM-based classifiers reach an F1 of 0.78. An ablation study confirms that providing the preceding sentence as context improves BERT classification performance substantially compared to single-sentence input. These results contribute to the growing body of work comparing fine-tuned encoder models with prompted generative models for text classification in computational social science.
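To make the prompting approach concrete, here is a rough sketch of few-shot prompting with a reasoning field, structured JSON output, and a confidence score for the threat task. Everything in it is an assumption for illustration: the prompt wording, the example sentences, the JSON keys, and the helper names are invented, and no model is actually called. The paper's real prompt is not reproduced in this summary.

```python
import json

# Invented few-shot examples; not the paper's prompt.
FEW_SHOT = [
    {
        "sentence": "Rising sea levels endanger coastal towns.",
        "reasoning": "The sentence describes a danger caused by climate change.",
        "threat": True,
        "confidence": 0.9,
    },
    {
        "sentence": "The council met on Tuesday.",
        "reasoning": "No danger or harm is mentioned.",
        "threat": False,
        "confidence": 0.8,
    },
]


def build_prompt(target, examples):
    """Assemble a few-shot prompt that requests a JSON answer."""
    lines = [
        "Decide whether the sentence frames climate change as a threat.",
        'Answer with JSON containing the keys "reasoning", "threat" (true/false) '
        'and "confidence" (a number between 0 and 1).',
        "",
    ]
    for ex in examples:
        # The reasoning key comes first so the worked example reasons before deciding.
        answer = {k: ex[k] for k in ("reasoning", "threat", "confidence")}
        lines.append(f"Sentence: {ex['sentence']}")
        lines.append(f"Answer: {json.dumps(answer)}")
        lines.append("")
    lines.append(f"Sentence: {target}")
    lines.append("Answer:")
    return "\n".join(lines)


def parse_reply(reply):
    """Validate a JSON reply and return (label, confidence)."""
    data = json.loads(reply)
    label, confidence = data["threat"], data["confidence"]
    if not isinstance(label, bool):
        raise ValueError("'threat' must be true or false")
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("'confidence' must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("'confidence' must lie between 0 and 1")
    return label, float(confidence)


prompt = build_prompt("Heat waves threaten harvests across Europe.", FEW_SHOT)
# A hand-written stand-in for a model reply.
label, confidence = parse_reply(
    '{"reasoning": "Harvest losses are a harm.", "threat": true, "confidence": 0.85}'
)
print(label, confidence)
```

A second prompt of the same shape would cover solution framing, mirroring the two independent binary classifiers described in the abstract.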

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The paper compares two approaches for sentence-level detection of threat and solution framing in German climate news: (1) few-shot chain-of-thought prompting with Llama 4 Maverick using structured output and confidence scoring, and (2) fine-tuning a German BERT (gbert-large) for sentence-pair classification that incorporates the preceding sentence as context. Both are implemented as independent binary classifiers. Evaluation on a manually coded corpus of 440 Austrian newspaper articles yields F1=0.83 for the BERT models on both tasks versus F1=0.78 for the LLM, with an ablation confirming the value of context for BERT.

Significance. If the labels are reliable, the work supplies concrete, reproducible evidence that fine-tuned encoder models outperform few-shot prompting for this framing task in computational social science, while the ablation isolates the contribution of sentence context. The use of an open-weights model and a domain-expert coding scheme are additional strengths that support replicability.

major comments (1)
  1. [Corpus and annotation description] No inter-annotator agreement statistic (Cohen’s kappa, percentage agreement, or equivalent) is reported for the binary threat and solution labels on the 440 articles. Because framing annotations can involve subjective boundary cases, the absence of reliability metrics renders the absolute F1 values (0.83 vs. 0.78) and their comparison difficult to interpret, which is load-bearing for the central empirical claim.
minor comments (1)
  1. [Abstract and Evaluation] The abstract and evaluation section would benefit from explicit mention of the train/validation/test split sizes and any statistical significance testing on the F1 differences.
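
The agreement statistic named in the major comment, Cohen's kappa, corrects raw agreement between two coders for the agreement expected by chance. A minimal sketch, assuming two coders labeled the same sentences; the labels below are invented and `cohens_kappa` is a hypothetical helper, not code from the paper.

```python
from collections import Counter


def cohens_kappa(a, b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(a) != len(b) or not a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(a)
    # Observed agreement: share of items with identical labels.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the coders' marginal label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in set(count_a) | set(count_b)) / (n * n)
    if expected == 1.0:
        # Both coders used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Toy labels for illustration only (1 = threat framing present).
coder_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
coder_b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]
kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))  # observed 0.80, expected 0.52 -> kappa about 0.58
```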

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for highlighting the importance of annotation reliability metrics. We address the single major comment below.

Point-by-point responses
  1. Referee: Corpus and annotation description: no inter-annotator agreement statistic (Cohen’s kappa, percentage agreement, or equivalent) is reported for the binary threat and solution labels on the 440 articles. Because framing annotations can involve subjective boundary cases, the absence of reliability metrics renders the absolute F1 values (0.83 vs. 0.78) and their comparison difficult to interpret, which is load-bearing for the central empirical claim.

    Authors: We agree that reporting inter-annotator agreement is essential for subjective tasks such as framing detection, as it directly affects interpretability of the F1 scores. The 440 articles were annotated by a single domain expert in climate communication following a detailed coding scheme developed in consultation with additional experts. Because only one annotator performed the final coding, standard IAA statistics (Cohen’s kappa or equivalent) were not applicable and therefore not reported. In the revised manuscript we will expand the corpus and annotation description section with: (1) the annotator’s qualifications and prior experience, (2) the iterative process used to develop and refine the coding scheme (including pilot rounds), and (3) explicit discussion of why IAA could not be computed. This added context should allow readers to assess the reliability of the labels without overstating the evidence.

    revision: yes

Circularity Check

0 steps flagged

No circularity: purely empirical evaluation on external labels

Full rationale

The paper conducts a direct empirical comparison of BERT fine-tuning versus few-shot LLM prompting for binary threat/solution framing classification. It reports F1 scores (0.83 vs 0.78) on a fixed, externally manually-coded corpus of 440 articles. No mathematical derivations, parameter-fitting steps presented as predictions, self-citation chains, or ansatzes are present. The evaluation uses standard train/test splits against human-provided ground truth; results are falsifiable by re-labeling or re-running on the same corpus. No load-bearing step reduces to the paper's own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Standard supervised ML evaluation assumptions; no free parameters, invented entities, or ad-hoc axioms beyond reliance on human annotations as ground truth.

axioms (1)
  • domain assumption: Human annotations following the expert-developed coding scheme constitute accurate ground-truth labels.
    F1 scores are computed directly against these labels on the 440-article corpus.


