Cohesion-6K: An Arabic Dataset for Analyzing Social Cohesion and Conflict in Online Discourse

Aisha Ali Al-Athba; Wajdi Zaghouani

arxiv: 2605.22447 · v1 · pith:FS7BAMRYnew · submitted 2026-05-21 · 💻 cs.CL

Pith reviewed 2026-05-22 06:57 UTC · model grok-4.3

classification 💻 cs.CL

keywords Arabic dataset · social cohesion · online discourse · conflict and resolution · Facebook posts · engagement analysis · annotation process · polarization

The pith

Conflict-oriented Arabic Facebook posts receive two to four times more engagement than resolution-oriented ones.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates and releases Cohesion-6K, a dataset of six thousand Arabic public Facebook posts about the Israeli Occupation of Palestine. Posts are annotated into five categories forming a continuum from conflict to cohesion using expert humans and ChatGPT-assisted pre-labeling, with high inter-annotator agreement. The analysis finds that conflict posts attract substantially more user interaction, such as likes and comments, than posts focused on resolution. This engagement gap highlights how divisive narratives may gain more visibility in these online spaces. The open release of the dataset, guidelines, and code aims to enable more research on social cohesion and polarization in Arabic discourse.

Core claim

The paper presents Cohesion-6K as a new annotated dataset for studying social cohesion and conflict, with posts categorized along a spectrum including Conflict, Resolution, Community Engagement, Supportive Interactions, and Shared Values. It reports that conflict-oriented posts receive between two and four times more user interaction than resolution-oriented ones, a pattern that illustrates the disproportionate visibility of divisive discourse in Arabic social media.

What carries the argument

The five-category discourse annotation scheme applied to the 6,000-post Cohesion-6K dataset, which maps posts to a continuum from conflict to cohesion and allows measurement of engagement differences.

If this is right

Divisive content tends to attract disproportionate attention on Arabic social media platforms.
Resolution-focused posts receive less visibility, potentially limiting their reach.
The dataset enables systematic study of cohesion-building narratives in online discussions.
Similar engagement patterns may exist in other languages or topics beyond the Israeli Occupation of Palestine.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Platform algorithms might be adjusted to balance visibility between conflict and resolution content.
Future studies could apply the annotation approach to measure cohesion in real-time discussions.
Training data from this dataset could improve models for detecting subtle polarization in Arabic text.

Load-bearing premise

That the five discourse categories accurately capture a meaningful continuum from conflict to cohesion for these posts, and that the annotation process yields reliable labels.

What would settle it

A replication on an independent sample of similar posts showing no significant difference in user engagement between conflict and resolution categories.

Original abstract

The study of online discourse has become central to understanding societal polarization. While much research has focused on detecting overt toxicity, the subtle dynamics of social cohesion, meaning the interaction between divisive and unifying narratives, remain computationally underexplored (Bail, 2021; Gonzalez-Bailon and Lelkes, 2023). This paper presents Cohesion-6K, a manually and ChatGPT-assisted annotated dataset of six thousand Arabic public Facebook posts related to the Israeli Occupation of Palestine. Each post is assigned to one of five discourse categories that represent a continuum from conflict to cohesion: Conflict, Resolution, Community Engagement, Supportive Interactions, and Shared Values. The annotation process combines expert human judgment with model-assisted pre-labeling verified by trained annotators, achieving substantial inter-annotator agreement (Cohen's kappa = 0.85). Quantitative analysis reveals a consistent engagement gap, where conflict-oriented posts receive between two and four times more user interaction than resolution-oriented ones (p < 0.01). This pattern illustrates how divisive discourse tends to attract disproportionate visibility in Arabic social media spaces. Cohesion-6K provides a transparent and reproducible resource for the study of online cohesion and polarization. The dataset, annotation guidelines, and preprocessing code will be released for research use under an open license, supporting future work in computational social science, digital communication, and Arabic natural language processing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

This paper gives us a new 6K Arabic Facebook dataset on discourse about the Israeli Occupation of Palestine, labeled into five cohesion-to-conflict categories, plus a basic finding that conflict posts draw more engagement.

The main thing here is a new annotated Arabic dataset of 6,000 public Facebook posts on the Israeli Occupation of Palestine. The authors define five categories along a continuum from Conflict to Shared Values, use ChatGPT for pre-labeling followed by human verification, report a Cohen's kappa of 0.85, and note that conflict-oriented posts get two to four times the engagement of resolution-oriented ones (p < 0.01). They plan to release the data, guidelines, and code openly.

Referee Report

2 major / 2 minor

Summary. The paper introduces Cohesion-6K, a dataset of 6,000 Arabic Facebook posts concerning the Israeli Occupation of Palestine. Posts are labeled into one of five discourse categories forming a continuum from conflict to cohesion (Conflict, Resolution, Community Engagement, Supportive Interactions, Shared Values) via a hybrid annotation process combining ChatGPT pre-labeling with human verification. The work reports Cohen's kappa of 0.85 for inter-annotator agreement and presents a quantitative finding that conflict-oriented posts receive 2–4 times more user engagement than resolution-oriented posts (p < 0.01). The dataset, guidelines, and code are to be released openly.

Significance. If the category labels prove reliable, Cohesion-6K would constitute a useful, openly available resource for computational social science and Arabic NLP research on polarization and cohesion in online discourse, especially for a geopolitically sensitive topic. The reported engagement gap, if robustly supported, would provide empirical evidence of visibility advantages for divisive content. The explicit commitment to releasing data, annotation guidelines, and preprocessing code is a clear strength for reproducibility.

major comments (2)

[§3.2] (Annotation Methodology): The overall Cohen's kappa of 0.85 is presented as evidence of reliable labeling, yet no per-class agreement scores, confusion matrix, or breakdown by category (particularly Conflict vs. Resolution) is provided. Because the central engagement-gap result is computed directly from these five-category labels on a politically contested topic, the lack of these diagnostics leaves open the possibility that systematic differences in annotator or model handling of nuanced Arabic framing could artifactually widen the observed 2–4× gap.
[§5] (Quantitative Results): The engagement-gap claim (2–4× higher interaction for conflict posts, p < 0.01) is load-bearing for the paper's empirical contribution, but the manuscript does not specify the exact statistical test, whether engagement metrics were normalized for post length or follower count, or the number of posts per category. Without these details it is impossible to evaluate whether the reported significance holds after standard controls.

minor comments (2)

[Abstract] The abstract cites Bail (2021) and Gonzalez-Bailon & Lelkes (2023) without full bibliographic details; these should appear in the references section.
[§4] Table 1 or the category definitions section would benefit from one or two concrete post examples per label to clarify the boundary between Community Engagement and Supportive Interactions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. We address each major comment below and have revised the manuscript to incorporate additional diagnostics and clarifications where these strengthen the work.

Referee: [§3.2] The overall Cohen's kappa of 0.85 is presented as evidence of reliable labeling, yet no per-class agreement scores, confusion matrix, or breakdown by category (particularly Conflict vs. Resolution) is provided. Because the central engagement-gap result is computed directly from these five-category labels on a politically contested topic, the lack of these diagnostics leaves open the possibility that systematic differences in annotator or model handling of nuanced Arabic framing could artifactually widen the observed 2–4× gap.

Authors: We agree that per-class agreement metrics and a confusion matrix would enhance transparency and allow readers to evaluate potential systematic differences in handling nuanced categories on this sensitive topic. In the revised manuscript we will add a new table reporting per-class Cohen's kappa values together with the inter-annotator confusion matrix. We note that the overall kappa of 0.85 already indicates substantial agreement, but the additional breakdown will directly address the concern about possible artifacts in the Conflict–Resolution distinction.

Revision: yes

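The diagnostics discussed in this exchange (overall Cohen's kappa, an inter-annotator confusion matrix, and per-class kappa) can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the authors' code: the function names, the one-vs-rest definition of per-class kappa, and the toy labels are assumptions made here, and none of the numbers come from Cohesion-6K.

```python
from collections import Counter

# The five category names are taken from the paper; everything else is illustrative.
CATEGORIES = [
    "Conflict",
    "Resolution",
    "Community Engagement",
    "Supportive Interactions",
    "Shared Values",
]


def confusion_matrix(labels_a, labels_b, categories=CATEGORIES):
    """Count label pairs. Rows: annotator A's label; columns: annotator B's label."""
    index = {c: i for i, c in enumerate(categories)}
    matrix = [[0] * len(categories) for _ in categories]
    for a, b in zip(labels_a, labels_b):
        matrix[index[a]][index[b]] += 1
    return matrix


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    n = len(labels_a)
    if n == 0 or n != len(labels_b):
        raise ValueError("both annotators must label the same non-empty item set")
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both pick the same label independently.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # degenerate case: both annotators used one identical label
    return (observed - expected) / (1.0 - expected)


def per_class_kappa(labels_a, labels_b, categories=CATEGORIES):
    """One-vs-rest kappa per category (an assumed definition, not the paper's)."""
    return {
        c: cohens_kappa([x == c for x in labels_a], [x == c for x in labels_b])
        for c in categories
    }


# Toy labels, invented for illustration; they are NOT drawn from the dataset.
toy_a = ["Conflict", "Conflict", "Resolution", "Resolution"]
toy_b = ["Conflict", "Conflict", "Resolution", "Conflict"]

if __name__ == "__main__":
    print("overall kappa:", cohens_kappa(toy_a, toy_b))
    print("Conflict vs. rest:", per_class_kappa(toy_a, toy_b)["Conflict"])
    for row in confusion_matrix(toy_a, toy_b):
        print(row)
```

A one-vs-rest breakdown is only one way to report per-class agreement; the off-diagonal Conflict/Resolution cells of the confusion matrix speak most directly to the referee's concern.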
Referee: [§5] The engagement-gap claim (2–4× higher interaction for conflict posts, p < 0.01) is load-bearing for the paper's empirical contribution, but the manuscript does not specify the exact statistical test, whether engagement metrics were normalized for post length or follower count, or the number of posts per category. Without these details it is impossible to evaluate whether the reported significance holds after standard controls.

Authors: We appreciate the request for greater statistical transparency. The revised §5 will explicitly state that a Mann–Whitney U test was applied to the raw engagement counts (likes + comments + shares) after confirming non-normality via Shapiro–Wilk tests. Engagement metrics were not normalized for post length or follower count, as the analysis intentionally examines platform visibility as observed; we will add a brief discussion of this choice and its implications. A table reporting the exact number of posts per discourse category will also be included. These additions clarify the analysis without altering the reported 2–4× gap or significance level.

Revision: yes
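The test named in this simulated response can be illustrated with a small, dependency-free sketch of a two-sided Mann–Whitney U test on raw engagement counts. Several things here are assumptions rather than details from the paper: the normal approximation with tie correction and no continuity correction, the use of a median ratio to express the gap, the function names, and the toy counts, which are invented and not taken from Cohesion-6K.

```python
import math
from statistics import median


def average_ranks(values):
    """Return 1-based ranks (ties get the average rank) and the tie-group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for x, approximate p-value). Tie-corrected, with no
    continuity correction; for small samples an exact test is preferable.
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    ranks, tie_sizes = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    tie_term = sum(t**3 - t for t in tie_sizes) / (n * (n - 1))
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term)
    if var_u == 0:
        return u1, 1.0  # all values identical: no evidence of a difference
    z = (u1 - n1 * n2 / 2) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u1, p_value


def median_ratio(x, y):
    """Ratio of median engagement, one possible way to express an 'N times' gap."""
    return median(x) / median(y)


# Toy engagement counts (likes + comments + shares), invented for illustration.
toy_conflict = [10, 20, 30, 40]
toy_resolution = [1, 2, 3, 4]

if __name__ == "__main__":
    u_stat, p = mann_whitney_u(toy_conflict, toy_resolution)
    print("U =", u_stat, "p ~", round(p, 4))
    print("median ratio =", median_ratio(toy_conflict, toy_resolution))
```

Because the counts are not normalized for follower count or post length, a result from a test like this describes visibility as observed on the platform, which is the framing the response itself adopts.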

Circularity Check

0 steps flagged

No circularity: the empirical dataset analysis is a self-contained observational result.

The paper creates and annotates the Cohesion-6K dataset using a hybrid human-ChatGPT process, reports Cohen's kappa of 0.85, and then directly computes engagement statistics across the five discourse categories to observe the 2–4× gap (p < 0.01). This is a straightforward empirical measurement from the labeled posts rather than any derivation, prediction, or first-principles result that reduces to its own inputs by construction. No equations, fitted parameters renamed as predictions, self-citation load-bearing uniqueness claims, or ansatzes appear in the provided text. The analysis stands on the dataset itself and external benchmarks for agreement, with no reduction of the central claim to a self-referential loop.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central contribution rests on the validity of the chosen discourse categories and the reliability of the annotation process; no free parameters or invented entities are introduced.

axioms (1)

Domain assumption: The five categories (Conflict, Resolution, Community Engagement, Supportive Interactions, Shared Values) form a valid continuum from divisive to unifying discourse.
This modeling choice underpins the entire annotation and analysis but is presented without independent validation beyond the annotation agreement score.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "Quantitative analysis reveals a consistent engagement gap, where conflict-oriented posts receive between two and four times more user interaction than resolution-oriented ones (p < 0.01)."

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "The annotation process combines expert human judgment with model-assisted pre-labeling verified by trained annotators, achieving substantial inter-annotator agreement (Cohen’s κ = 0.85)."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

3 extracted references · 3 canonical work pages

[1]

In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 11060–11069

MARASTA: A multi-dialectal Arabic cross-domain stance corpus. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 11060–11069. John W. Creswell and J. David Creswell. 2018. Research design: Qualitative, quantitative, and mixed methods approaches, 5 editi...

work page 2024

[2]

Social Issues and Policy Review, 17(1):155–180

Do social media undermine social cohesion? A critical review. Social Issues and Policy Review, 17(1):155–180. Ahmad Hamad Kareem and Yaseen Mahmood Najm. 2024. A critical discourse analysis of the biased role of Western media in the Israeli-Palestinian conflict. Journal of Language Studies, 8(6):200–215. T. Khaund, B. Kirdemir, N. Agarwal, H. Liu, an...

work page 2024

[3]

European Journal of Investigation in Health, Psychology and Education, 12(7):692–715

Changing personal values through value-manipulation tasks: A systematic literature review based on Schwartz’s theory of basic human values. European Journal of Investigation in Health, Psychology and Education, 12(7):692–715. Alexey Shestakov and Wajdi Zaghouani. 2024. Analyzing conflict through data: A dataset on the digital framing of Sheikh Jar...

work page 2024