
arxiv: 2604.21034 · v2 · submitted 2026-04-22 · 💻 cs.HC

White Paper: Human-AI Collaboration in Conflict Analysis: Text Classifier Development with Peacebuilders

Pith reviewed 2026-05-09 23:02 UTC · model grok-4.3

classification 💻 cs.HC
keywords participatory AI · text classification · conflict monitoring · hate speech detection · polarization analysis · human-AI collaboration · humanitarian technology

The pith

Peacebuilders and data scientists in Kenya and Sudan jointly built text classifiers that reduce cultural misclassifications in polarization and hate speech monitoring.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes a process in which peacebuilders and data scientists collaborated on defining problems, annotating data, and evaluating models for detecting online polarization in Kenya and hate speech in Sudan. Fine-tuned BERT classifiers trained on this jointly labeled data showed stronger alignment with local contexts, fewer errors from overlooked cultural details, and greater acceptance among the practitioners who would use the tools. A sympathetic reader would see this as evidence that including domain experts throughout AI development can address multiple practical shortcomings at once in high-stakes humanitarian settings.

Core claim

The authors claim that a participatory process, in which peacebuilders contribute to problem definition, annotation design, iterative validation, and model evaluation, produces BERT-based classifiers with better contextual alignment, fewer misclassifications caused by cultural nuance, and greater practitioner ownership than standard development approaches, as measured on held-out test sets from the Kenya and Sudan cases.

What carries the argument

The participatory annotation process in which practitioners and domain experts jointly label data, refine guidelines, and validate model outputs across iterative rounds.

Load-bearing premise

That the gains in contextual accuracy and reduced errors come from the participatory steps themselves rather than from larger datasets, different model choices, or how the test sets were built.

What would settle it

A side-by-side trial on the same raw data where one team follows the participatory annotation steps and another does not, then measuring whether the participatory version still shows lower cultural misclassification rates and higher user acceptance.
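As a minimal sketch of how such a side-by-side trial might be scored, assume both classifiers are run on the same held-out items and each prediction is marked right or wrong against practitioner-validated labels. An exact McNemar test then compares the two classifiers using only the items on which they disagree. The function name and the toy correctness flags below are illustrative assumptions, not taken from the paper.

```python
from math import comb


def mcnemar_exact(correct_a, correct_b):
    """Exact two-sided McNemar test on paired correctness flags.

    correct_a[i] and correct_b[i] say whether classifier A and classifier B
    labeled test item i correctly. Returns (b, c, p_value), where
    b = items A got right and B got wrong, c = items A got wrong and B got right.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both classifiers must be scored on the same items")
    b = sum(1 for a, other in zip(correct_a, correct_b) if a and not other)
    c = sum(1 for a, other in zip(correct_a, correct_b) if not a and other)
    n = b + c
    if n == 0:
        return b, c, 1.0  # no disagreements, so no evidence either way
    k = min(b, c)
    # Under H0 the discordant items split 50/50 between the two classifiers.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Toy flags for ten hypothetical test items (not real results).
participatory_correct = [True] * 8 + [False, False]
baseline_correct = [True] * 4 + [False] * 4 + [True, False]

b, c, p_value = mcnemar_exact(participatory_correct, baseline_correct)
print(f"participatory-only correct: {b}, baseline-only correct: {c}, p = {p_value}")
```

A paired test is used here because both arms see the same items, so per-item difficulty cancels out. It says nothing about user acceptance, which would need a separate practitioner study.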

Original abstract

This paper documents a collaborative research process involving peacebuilders and data scientists in Kenya and Sudan to develop AI-based text classifiers for monitoring online polarization and hate speech. The method describes a participatory annotation process in which practitioners and domain experts contributed to problem definition, annotation design, iterative validation, and model evaluation. Fine-tuned BERT-based classifiers were trained on collaboratively annotated datasets and evaluated against held-out test sets. In each case, the models produced enhanced contextual alignment, reduced misclassification driven by cultural nuance, and increased practitioner ownership of AI tools. The resulting models (Kenya-polarization and Sudan-hate speech) are open-source and accessible via HuggingFace. The study contributes empirical evidence that participatory AI development can simultaneously improve technical robustness, contextual validity, and normative alignment in sensitive humanitarian domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. This white paper documents a participatory collaboration between peacebuilders in Kenya and Sudan and data scientists to develop BERT-based text classifiers for monitoring online polarization and hate speech. Practitioners contributed to problem definition, annotation design, iterative validation, and model evaluation. The resulting fine-tuned models are claimed to show enhanced contextual alignment, reduced misclassifications due to cultural nuance, and increased practitioner ownership, with the Kenya-polarization and Sudan-hate speech models released as open-source on Hugging Face. The work positions this as empirical evidence that participatory AI can improve technical robustness, contextual validity, and normative alignment in humanitarian domains.

Significance. If the participatory annotation process can be shown to produce causally superior models in these sensitive domains, the paper would provide a useful case study for ethical AI development in conflict monitoring, highlighting benefits for contextual validity and stakeholder ownership. The open-sourcing of the models is a clear strength that enables reproducibility and extension by others.

major comments (3)
  1. [Abstract] The central claims of 'enhanced contextual alignment' and 'reduced misclassification driven by cultural nuance' are presented without any quantitative support such as F1 scores, precision/recall, error rates, or direct comparisons to non-participatory baselines on the same held-out sets.
  2. [Evaluation] No details are given on test-set construction, controls for data leakage from the participatory loop, ablation on annotation volume or iteration count, or how cultural nuance was operationalized and measured, making it impossible to attribute improvements to the participatory mechanism rather than confounders like dataset size or model choice.
  3. [Methods] The participatory process is described at a high level but lacks specifics on inter-annotator agreement, number of validation iterations, how practitioner feedback altered guidelines or labels, or any statistical tests for the reported improvements.
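The quantitative support requested in major comment 1 could be as simple as per-class precision, recall, and F1 on the held-out set. The sketch below assumes single-label predictions; the label names "hate" and "none" and the toy data are hypothetical, since the paper's label scheme is not given here.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as the positive label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy gold labels and predictions for eight hypothetical posts.
gold = ["hate", "hate", "hate", "none", "none", "none", "none", "hate"]
pred = ["hate", "hate", "none", "none", "hate", "none", "none", "hate"]

precision, recall, f1 = precision_recall_f1(gold, pred, positive="hate")
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Reporting these numbers for both the participatory model and a baseline on the same held-out items would address the missing comparison the comment points to.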
minor comments (2)
  1. [Overall] The manuscript would benefit from an explicit limitations section discussing generalizability beyond the two case studies and potential biases in practitioner selection.
  2. [Abstract] Ensure consistent terminology for 'contextual alignment' and 'normative alignment' across the text, with operational definitions.

Simulated Authors' Rebuttal

3 responses · 1 unresolved

We thank the referee for their constructive comments on our white paper. We address each major point below and commit to revisions that improve clarity, reproducibility, and appropriate qualification of claims without overstating the study's scope as a process-oriented white paper rather than a controlled benchmark experiment.

Point-by-point responses
  1. Referee: [Abstract] The central claims of 'enhanced contextual alignment' and 'reduced misclassification driven by cultural nuance' are presented without any quantitative support such as F1 scores, precision/recall, error rates, or direct comparisons to non-participatory baselines on the same held-out sets.

    Authors: We agree the abstract overstates the evidential basis. The claims derive from practitioner-identified error patterns and qualitative validation rather than formal metrics or baseline comparisons, which were outside the white paper's scope. We will revise the abstract to remove or qualify these phrases, explicitly noting that improvements are supported by participatory feedback and observed contextual errors rather than quantitative superiority. No new experiments will be added.

    Revision: yes

  2. Referee: [Evaluation] No details are given on test-set construction, controls for data leakage from the participatory loop, ablation on annotation volume or iteration count, or how cultural nuance was operationalized and measured, making it impossible to attribute improvements to the participatory mechanism rather than confounders like dataset size or model choice.

    Authors: We accept this critique. The revised manuscript will expand the evaluation section with specifics on held-out test set construction (random split after the final annotation round, with practitioner review for independence), examples of how cultural nuance was identified through practitioner disagreement cases, and explicit discussion of potential leakage risks from the iterative loop. We did not conduct ablations or formal controls for confounders, as the study prioritized documenting the collaborative workflow over isolating causal factors; a limitations paragraph will be added to acknowledge this.

    Revision: partial

  3. Referee: [Methods] The participatory process is described at a high level but lacks specifics on inter-annotator agreement, number of validation iterations, how practitioner feedback altered guidelines or labels, or any statistical tests for the reported improvements.

    Authors: We will revise the methods section to include available details: inter-annotator agreement metrics (e.g., percentage agreement and any kappa values computed during annotation), the number of validation rounds (typically 2-3 per dataset), and concrete examples of how peacebuilder feedback revised label guidelines and corrected initial annotations. No statistical tests were performed given the qualitative emphasis; we will clarify this and avoid implying statistical significance.

    Revision: yes
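For reference, the two agreement measures named in response 3 can be computed as follows. This is a minimal sketch for two annotators labeling the same items; the label names "pol" and "not" and the toy annotations are hypothetical and do not come from the Kenya or Sudan datasets.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Share of items on which the two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    p_observed = percent_agreement(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each annotator's own label frequencies.
    p_expected = sum(
        counts_a[label] * counts_b[label] for label in set(counts_a) | set(counts_b)
    ) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Toy annotations for ten hypothetical posts.
annotator_1 = ["pol", "pol", "not", "not", "pol", "not", "not", "pol", "not", "not"]
annotator_2 = ["pol", "not", "not", "not", "pol", "not", "pol", "pol", "not", "not"]

agreement = percent_agreement(annotator_1, annotator_2)
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"percent agreement={agreement:.2f} kappa={kappa:.3f}")
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance, which is why reporting both, per annotation round, is more informative than percentage agreement alone.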

standing simulated objections not resolved
  • Direct quantitative comparisons to non-participatory baselines and ablations on annotation volume/iterations were never conducted and cannot be retroactively added without new data collection and experiments, which exceed the scope of this white paper.
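Response 2 describes the held-out set as a random split taken after the final annotation round. One basic leakage control for such a split, sketched below under stated assumptions, is to collapse duplicate posts before splitting and then verify that no test text also appears in training. The helper names, the normalization rule, and the placeholder records are all hypothetical. This check only catches repeated text; it does not address leakage through guidelines that were refined while looking at items that later landed in the test set.

```python
import random


def normalize(text):
    """Case-fold and collapse whitespace so trivial variants compare equal."""
    return " ".join(text.lower().split())


def split_without_duplicates(records, test_fraction=0.2, seed=0):
    """Deduplicate (text, label) records by normalized text, then split randomly.

    When duplicates carry different labels, the first occurrence is kept.
    Returns (train, test).
    """
    unique = {}
    for text, label in records:
        unique.setdefault(normalize(text), (text, label))
    items = list(unique.values())
    random.Random(seed).shuffle(items)
    n_test = max(1, round(len(items) * test_fraction))
    return items[n_test:], items[:n_test]


def leaked_texts(train, test):
    """Test texts whose normalized form also occurs in the training set."""
    train_keys = {normalize(text) for text, _ in train}
    return [text for text, _ in test if normalize(text) in train_keys]


# Placeholder records; the second one duplicates the first up to case and spacing.
records = [
    ("Example post one", "polarizing"),
    ("example  post ONE", "polarizing"),
    ("Example post two", "neutral"),
    ("Example post three", "neutral"),
    ("Example post four", "polarizing"),
    ("Example post five", "neutral"),
]

train, test = split_without_duplicates(records)
print(len(train), len(test), leaked_texts(train, test))
```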

Circularity Check

0 steps flagged

No circularity; empirical process description with no derivations

Full rationale

The paper documents a participatory annotation workflow with peacebuilders, describes training fine-tuned BERT classifiers on the resulting datasets, and reports held-out evaluation outcomes such as contextual alignment. No equations, parameter fits, or derivation chains exist that could reduce claims to self-defined quantities. The central contribution is framed as empirical evidence from a described process rather than a mathematical prediction or uniqueness theorem. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results appear in the provided text. This is self-contained empirical reporting against external benchmarks (open-source models on HuggingFace), consistent with the default non-circular finding for such work.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the assumption that participatory annotation produces measurable gains in contextual validity without introducing new biases; no free parameters, axioms, or invented entities are introduced in the abstract.


