Cohesion-6K: An Arabic Dataset for Analyzing Social Cohesion and Conflict in Online Discourse
Pith reviewed 2026-05-22 06:57 UTC · model grok-4.3
The pith
Conflict-oriented Arabic Facebook posts receive two to four times more engagement than resolution-oriented ones.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper presents Cohesion-6K as a new annotated dataset for studying social cohesion and conflict, with posts categorized along a spectrum including Conflict, Resolution, Community Engagement, Supportive Interactions, and Shared Values. It reports that conflict-oriented posts receive between two and four times more user interaction than resolution-oriented ones, a pattern that illustrates the disproportionate visibility of divisive discourse in Arabic social media.
What carries the argument
The five-category discourse annotation scheme applied to the 6,000-post Cohesion-6K dataset, which maps posts to a continuum from conflict to cohesion and allows measurement of engagement differences.
If this is right
- Divisive content tends to attract disproportionate attention on Arabic social media platforms.
- Resolution-focused posts receive less visibility, potentially limiting their reach.
- The dataset enables systematic study of cohesion-building narratives in online discussions.
- Similar engagement patterns may exist in other languages or topics beyond the Israeli Occupation of Palestine.
Where Pith is reading between the lines
- Platform algorithms might be adjusted to balance visibility between conflict and resolution content.
- Future studies could apply the annotation approach to measure cohesion in real-time discussions.
- Training data from this dataset could improve models for detecting subtle polarization in Arabic text.
Load-bearing premise
The five discourse categories accurately capture a meaningful continuum from conflict to cohesion for these posts and that the annotation process yields reliable labels.
What would settle it
A replication on an independent sample of similar posts showing no significant difference in user engagement between conflict and resolution categories.
Original abstract
The study of online discourse has become central to understanding societal polarization. While much research has focused on detecting overt toxicity, the subtle dynamics of social cohesion, meaning the interaction between divisive and unifying narratives, remain computationally underexplored (Bail, 2021; Gonzalez-Bailon and Lelkes, 2023). This paper presents Cohesion-6K, a manually and ChatGPT-assisted annotated dataset of six thousand Arabic public Facebook posts related to the Israeli Occupation of Palestine. Each post is assigned to one of five discourse categories that represent a continuum from conflict to cohesion: Conflict, Resolution, Community Engagement, Supportive Interactions, and Shared Values. The annotation process combines expert human judgment with model-assisted pre-labeling verified by trained annotators, achieving substantial inter-annotator agreement (Cohen's kappa = 0.85). Quantitative analysis reveals a consistent engagement gap, where conflict-oriented posts receive between two and four times more user interaction than resolution-oriented ones (p < 0.01). This pattern illustrates how divisive discourse tends to attract disproportionate visibility in Arabic social media spaces. Cohesion-6K provides a transparent and reproducible resource for the study of online cohesion and polarization. The dataset, annotation guidelines, and preprocessing code will be released for research use under an open license, supporting future work in computational social science, digital communication, and Arabic natural language processing.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Cohesion-6K, a dataset of 6,000 Arabic Facebook posts concerning the Israeli Occupation of Palestine. Posts are labeled into one of five discourse categories forming a continuum from conflict to cohesion (Conflict, Resolution, Community Engagement, Supportive Interactions, Shared Values) via a hybrid annotation process combining ChatGPT pre-labeling with human verification. The work reports Cohen's kappa of 0.85 for inter-annotator agreement and presents a quantitative finding that conflict-oriented posts receive 2–4 times more user engagement than resolution-oriented posts (p < 0.01). The dataset, guidelines, and code are to be released openly.
Significance. If the category labels prove reliable, Cohesion-6K would constitute a useful, openly available resource for computational social science and Arabic NLP research on polarization and cohesion in online discourse, especially for a geopolitically sensitive topic. The reported engagement gap, if robustly supported, would provide empirical evidence of visibility advantages for divisive content. The explicit commitment to releasing data, annotation guidelines, and preprocessing code is a clear strength for reproducibility.
major comments (2)
- [§3.2] §3.2 (Annotation Methodology): The overall Cohen's kappa of 0.85 is presented as evidence of reliable labeling, yet no per-class agreement scores, confusion matrix, or breakdown by category (particularly Conflict vs. Resolution) is provided. Because the central engagement-gap result is computed directly from these five-category labels on a politically contested topic, the lack of these diagnostics leaves open the possibility that systematic differences in annotator or model handling of nuanced Arabic framing could artifactually widen the observed 2–4× gap.
- [§5] §5 (Quantitative Results): The engagement-gap claim (2–4× higher interaction for conflict posts, p < 0.01) is load-bearing for the paper's empirical contribution, but the manuscript does not specify the exact statistical test, whether engagement metrics were normalized for post length or follower count, or the number of posts per category. Without these details it is impossible to evaluate whether the reported significance holds after standard controls.
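The diagnostics requested in the §3.2 comment can be sketched as follows. This is a minimal illustration and not the authors' code: it computes overall Cohen's kappa as (p_o - p_e) / (1 - p_e), a confusion matrix between two annotators, and a one-vs-rest kappa per category. The two label lists are invented toy data, the function names are hypothetical, and the one-vs-rest binarization is an assumption about how "per-class agreement" might be reported.

```python
import math
from collections import Counter

# The five category names come from the paper; everything else is illustrative.
CATEGORIES = ["Conflict", "Resolution", "Community Engagement",
              "Supportive Interactions", "Shared Values"]

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        return math.nan  # kappa is undefined when chance agreement is total
    return (p_o - p_e) / (1 - p_e)

def confusion_matrix(labels_a, labels_b, categories=CATEGORIES):
    """Rows index annotator A's label, columns index annotator B's label."""
    index = {c: i for i, c in enumerate(categories)}
    matrix = [[0] * len(categories) for _ in categories]
    for a, b in zip(labels_a, labels_b):
        matrix[index[a]][index[b]] += 1
    return matrix

def per_class_kappa(labels_a, labels_b, categories=CATEGORIES):
    """One-vs-rest kappa: each category is scored as 'this label' vs. 'any other'."""
    return {c: cohen_kappa([x == c for x in labels_a],
                           [x == c for x in labels_b])
            for c in categories}

# Invented toy labels (NOT from Cohesion-6K).
annotator_a = ["Conflict", "Conflict", "Resolution", "Resolution",
               "Shared Values", "Conflict"]
annotator_b = ["Conflict", "Resolution", "Resolution", "Resolution",
               "Shared Values", "Conflict"]

overall = cohen_kappa(annotator_a, annotator_b)
matrix = confusion_matrix(annotator_a, annotator_b)
by_class = per_class_kappa(annotator_a, annotator_b)
```

In this toy data the single disagreement falls in the Conflict row and Resolution column of the matrix, which is the cell the referee singles out. Categories that neither annotator used yield an undefined (NaN) per-class kappa rather than a misleading perfect score.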
minor comments (2)
- [Abstract] The abstract cites Bail (2021) and Gonzalez-Bailon & Lelkes (2023) without full bibliographic details; these should appear in the references section.
- [§4] Table 1 or the category definitions section would benefit from one or two concrete post examples per label to clarify the boundary between Community Engagement and Supportive Interactions.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed feedback. We address each major comment below and have revised the manuscript to incorporate additional diagnostics and clarifications where these strengthen the work.
read point-by-point responses
- Referee: [§3.2] The overall Cohen's kappa of 0.85 is presented as evidence of reliable labeling, yet no per-class agreement scores, confusion matrix, or breakdown by category (particularly Conflict vs. Resolution) is provided. Because the central engagement-gap result is computed directly from these five-category labels on a politically contested topic, the lack of these diagnostics leaves open the possibility that systematic differences in annotator or model handling of nuanced Arabic framing could artifactually widen the observed 2–4× gap.
Authors: We agree that per-class agreement metrics and a confusion matrix would enhance transparency and allow readers to evaluate potential systematic differences in handling nuanced categories on this sensitive topic. In the revised manuscript we will add a new table reporting per-class Cohen's kappa values together with the inter-annotator confusion matrix. We note that the overall kappa of 0.85 already indicates substantial agreement, but the additional breakdown will directly address the concern about possible artifacts in the Conflict–Resolution distinction. revision: yes
- Referee: [§5] The engagement-gap claim (2–4× higher interaction for conflict posts, p < 0.01) is load-bearing for the paper's empirical contribution, but the manuscript does not specify the exact statistical test, whether engagement metrics were normalized for post length or follower count, or the number of posts per category. Without these details it is impossible to evaluate whether the reported significance holds after standard controls.
Authors: We appreciate the request for greater statistical transparency. The revised §5 will explicitly state that a Mann–Whitney U test was applied to the raw engagement counts (likes + comments + shares) after confirming non-normality via Shapiro–Wilk tests. Engagement metrics were not normalized for post length or follower count, as the analysis intentionally examines platform visibility as observed; we will add a brief discussion of this choice and its implications. A table reporting the exact number of posts per discourse category will also be included. These additions clarify the analysis without altering the reported 2–4× gap or significance level. revision: yes
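The test named in the rebuttal can be sketched as follows, assuming raw engagement is the sum of likes, comments, and shares per post. This is an illustration only: the engagement counts are invented, the function names are hypothetical, and the p-value uses the large-sample normal approximation with tie correction and no continuity correction, so it may differ slightly from a statistics package. The Shapiro–Wilk normality check mentioned by the authors is not reproduced here.

```python
import math
from collections import Counter
from statistics import median

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test; returns (U for x, approximate p-value)."""
    n1, n2 = len(x), len(y)
    combined = list(x) + list(y)
    ranks = average_ranks(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    # Tie correction shrinks the variance when values repeat.
    tie_term = sum(t ** 3 - t for t in Counter(combined).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u1, p_value

# Invented raw engagement counts (likes + comments + shares), NOT paper data.
conflict = [120, 340, 95, 410, 230, 180, 510, 275]
resolution = [40, 85, 60, 110, 55, 90, 70, 100]

u_stat, p_value = mann_whitney_u(conflict, resolution)
median_ratio = median(conflict) / median(resolution)
```

A ratio of medians is one plausible way to express an "N times more engagement" gap for skewed counts; the source does not say whether the reported 2–4× figure is based on means, medians, or another summary. Because the counts are unnormalized, the test compares observed visibility only and does not separate content effects from audience size.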
Circularity Check
No circularity: the empirical dataset analysis is a self-contained observational result
Full rationale
The paper creates and annotates the Cohesion-6K dataset using a hybrid human-ChatGPT process, reports Cohen's kappa of 0.85, and then directly computes engagement statistics across the five discourse categories to observe the 2–4× gap (p < 0.01). This is a straightforward empirical measurement from the labeled posts rather than any derivation, prediction, or first-principles result that reduces to its own inputs by construction. No equations, fitted parameters renamed as predictions, self-citation load-bearing uniqueness claims, or ansatzes appear in the provided text. The analysis stands on the dataset itself and external benchmarks for agreement, with no reduction of the central claim to a self-referential loop.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The five categories (Conflict, Resolution, Community Engagement, Supportive Interactions, Shared Values) form a valid continuum from divisive to unifying discourse.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "Quantitative analysis reveals a consistent engagement gap, where conflict-oriented posts receive between two and four times more user interaction than resolution-oriented ones (p < 0.01)."
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "The annotation process combines expert human judgment with model-assisted pre-labeling verified by trained annotators, achieving substantial inter-annotator agreement (Cohen’s κ = 0.85)."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] MARASTA: A multi-dialectal Arabic cross-domain stance corpus. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 11060–11069. John W. Creswell and J. David Creswell. 2018. Research design: Qualitative, quantitative, and mixed methods approaches, 5 editi...
- [2] Do social media undermine social cohesion? A critical review. Social Issues and Policy Review, 17(1):155–180. Ahmad Hamad Kareem and Yaseen Mahmood Najm. 2024. A critical discourse analysis of the biased role of western media in the Israeli-Palestinian conflict. Journal of Language Studies, 8(6):200–215. T. Khaund, B. Kirdemir, N. Agarwal, H. Liu, an...
- [3] Changing personal values through value-manipulation tasks: A systematic literature review based on Schwartz’s theory of basic human values. European Journal of Investigation in Health, Psychology and Education, 12(7):692–715. Alexey Shestakov and Wajdi Zaghouani. 2024. Analyzing conflict through data: A dataset on the digital framing of Sheikh Jar...