White Paper: Human-AI Collaboration in Conflict Analysis: Text Classifier Development with Peacebuilders

Allan Kipyator Kipkemboi Cheboi; Andrew Sutjahjo; Daniel Burkhardt Cerigo; Hussam Abualfatah; Julie Hawke; Rachael Olpengs; William OBrien

arxiv: 2604.21034 · v2 · submitted 2026-04-22 · 💻 cs.HC


Pith reviewed 2026-05-09 23:02 UTC · model grok-4.3

classification 💻 cs.HC

keywords participatory AI, text classification, conflict monitoring, hate speech detection, polarization analysis, human-AI collaboration, humanitarian technology


The pith

Peacebuilders and data scientists in Kenya and Sudan jointly built text classifiers that reduce cultural misclassifications in polarization and hate speech monitoring.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper describes a process in which peacebuilders and data scientists collaborated on defining problems, annotating data, and evaluating models for detecting online polarization in Kenya and hate speech in Sudan. Fine-tuned BERT classifiers trained on this jointly labeled data showed stronger alignment with local contexts, fewer errors from overlooked cultural details, and greater acceptance among the practitioners who would use the tools. A sympathetic reader would see this as evidence that including domain experts throughout AI development can address multiple practical shortcomings at once in high-stakes humanitarian settings.

Core claim

The authors establish that a participatory process—where peacebuilders contribute to problem definition, annotation design, iterative validation, and model evaluation—produces BERT-based classifiers that achieve enhanced contextual alignment, reduced misclassification from cultural nuance, and increased practitioner ownership compared with standard development approaches, as measured on held-out test sets from the Kenya and Sudan cases.

What carries the argument

The participatory annotation process in which practitioners and domain experts jointly label data, refine guidelines, and validate model outputs across iterative rounds.

Load-bearing premise

That the gains in contextual accuracy and reduced errors come from the participatory steps themselves rather than from larger datasets, different model choices, or how the test sets were built.

What would settle it

A side-by-side trial on the same raw data where one team follows the participatory annotation steps and another does not, then measuring whether the participatory version still shows lower cultural misclassification rates and higher user acceptance.
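One way such a side-by-side trial could be scored, assuming both classifiers are run on the same held-out items with agreed gold labels, is an exact McNemar test on the items where only one of the two models is correct. This is a minimal sketch and not something the paper reports: the function name, the toy labels, and the choice of McNemar's test are all assumptions made for illustration.

```python
from math import comb


def mcnemar_exact(gold, preds_a, preds_b):
    """Exact two-sided McNemar test for two classifiers scored on the same items.

    Returns (a_only_correct, b_only_correct, p_value).
    """
    if not (len(gold) == len(preds_a) == len(preds_b)):
        raise ValueError("all three label lists must have the same length")

    # Discordant pairs: items where exactly one of the two models is right.
    a_only = sum(1 for g, a, b in zip(gold, preds_a, preds_b) if a == g and b != g)
    b_only = sum(1 for g, a, b in zip(gold, preds_a, preds_b) if a != g and b == g)

    n = a_only + b_only
    if n == 0:
        return a_only, b_only, 1.0  # the models never differ in correctness

    # Under the null hypothesis each discordant item favours either model with
    # probability 0.5, so the smaller count follows Binomial(n, 0.5).
    k = min(a_only, b_only)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return a_only, b_only, min(1.0, 2 * tail)


# Toy data only (hypothetical): 1 = polarizing, 0 = not polarizing.
gold          = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
participatory = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
baseline      = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0]

a_only, b_only, p = mcnemar_exact(gold, participatory, baseline)
print(f"participatory-only correct: {a_only}, baseline-only correct: {b_only}, p = {p:.4f}")
```

A paired test is used here because both models see the same items, so only the disagreements carry information about which one is better. A real trial would also need the two annotation teams to start from the same raw data, as the paragraph above requires.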

Original abstract

This paper documents a collaborative research process involving peacebuilders and data scientists in Kenya and Sudan to develop AI-based text classifiers for monitoring online polarization and hate speech. The method describes a participatory annotation process in which practitioners and domain experts contributed to problem definition, annotation design, iterative validation, and model evaluation. Fine-tuned BERT-based classifiers were trained on collaboratively annotated datasets and evaluated against held-out test sets. In each case, the models produced enhanced contextual alignment, reduced misclassification driven by cultural nuance, and increased practitioner ownership of AI tools. The resulting models (Kenya-polarization and Sudan-hate speech) are open-source and accessible via HuggingFace. The study contributes empirical evidence that participatory AI development can simultaneously improve technical robustness, contextual validity, and normative alignment in sensitive humanitarian domains.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

This white paper describes a participatory process for building open BERT classifiers on Kenya polarization and Sudan hate speech but offers no numbers or controls to back its claims of improvement.


This paper is a white paper on building text classifiers for online polarization in Kenya and hate speech in Sudan through collaboration between peacebuilders and data scientists. The key takeaway is that they used a participatory annotation process and released the fine-tuned BERT models openly on HuggingFace, but the work provides no quantitative results to support its claims of better performance.

What is new is the specific application to these two conflict settings with local practitioners involved at every stage, from problem definition to model evaluation. The open-source release is a concrete step that others can build on or adapt.

The paper does well in describing the process in detail and highlighting how local input can help with contextual validity and normative alignment. Involving domain experts in annotation design and iterative validation is a reasonable approach for sensitive topics where cultural nuance matters.

The soft spots are in the evidence. The abstract claims enhanced contextual alignment and reduced misclassification due to cultural nuance, yet there are no numbers, no baseline comparisons to non-participatory annotation, and no details on how improvements were measured. Without F1 scores, error rates, or controls for dataset size and model choice, it's impossible to attribute gains to the participatory method rather than other factors. The held-out test sets are referenced but lack information on their construction or independence from the training process.

This kind of work is useful for humanitarian practitioners and organizations looking for examples of AI development in conflict monitoring. Readers focused on practical deployment and ethics in AI for social good will get value from the process description. Core machine learning researchers seeking novel techniques or rigorous benchmarks will find less here. The paper shows clear thinking about the importance of local involvement, so it qualifies as serious engagement with the issues.

I would recommend sending it to peer review, but only after the authors add the missing quantitative evaluations and comparisons to make the claims verifiable.

Referee Report

3 major / 2 minor

Summary. This white paper documents a participatory collaboration between peacebuilders in Kenya and Sudan and data scientists to develop BERT-based text classifiers for monitoring online polarization and hate speech. Practitioners contributed to problem definition, annotation design, iterative validation, and model evaluation. The resulting fine-tuned models are claimed to show enhanced contextual alignment, reduced misclassifications due to cultural nuance, and increased practitioner ownership, with the Kenya-polarization and Sudan-hate speech models released as open-source on Hugging Face. The work positions this as empirical evidence that participatory AI can improve technical robustness, contextual validity, and normative alignment in humanitarian domains.

Significance. If the participatory annotation process can be shown to produce causally superior models in these sensitive domains, the paper would provide a useful case study for ethical AI development in conflict monitoring, highlighting benefits for contextual validity and stakeholder ownership. The open-sourcing of the models is a clear strength that enables reproducibility and extension by others.

major comments (3)

[Abstract] The central claims of 'enhanced contextual alignment' and 'reduced misclassification driven by cultural nuance' are presented without any quantitative support such as F1 scores, precision/recall, error rates, or direct comparisons to non-participatory baselines on the same held-out sets.
[Evaluation] No details are given on test-set construction, controls for data leakage from the participatory loop, ablation on annotation volume or iteration count, or how cultural nuance was operationalized and measured, making it impossible to attribute improvements to the participatory mechanism rather than confounders like dataset size or model choice.
[Methods] The participatory process is described at a high level but lacks specifics on inter-annotator agreement, number of validation iterations, how practitioner feedback altered guidelines or labels, or any statistical tests for the reported improvements.
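
The quantitative support requested in the comments above could look roughly like the following. This is a sketch under stated assumptions, not the paper's evaluation: the labels are toy data, and `nuance_flags` is a hypothetical per-item marker (for example, items that practitioners flagged as depending on local context), used here as one possible way to operationalize "cultural nuance".

```python
def binary_prf(gold, pred, positive=1):
    """Precision, recall and F1 for the `positive` class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def error_rate(gold, pred, mask):
    """Share of misclassified items among those where `mask` is true."""
    selected = [(g, p) for g, p, m in zip(gold, pred, mask) if m]
    if not selected:
        return None  # nothing selected, so the rate is undefined
    return sum(1 for g, p in selected if g != p) / len(selected)


# Toy held-out set (hypothetical): 1 = hate speech, 0 = not hate speech.
gold         = [1, 1, 1, 0, 0, 0, 1, 0]
pred         = [1, 1, 0, 0, 1, 0, 1, 0]
nuance_flags = [False, False, True, False, True, False, False, False]

precision, recall, f1 = binary_prf(gold, pred)
flagged_err = error_rate(gold, pred, nuance_flags)
unflagged_err = error_rate(gold, pred, [not m for m in nuance_flags])

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
print(f"error rate on flagged items={flagged_err:.2f}, on other items={unflagged_err:.2f}")
```

Reporting the error rate separately for flagged and unflagged items, for both a participatory and a non-participatory model on the same test set, is one way the claim of "reduced misclassification driven by cultural nuance" could be made checkable.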

minor comments (2)

[Overall] The manuscript would benefit from an explicit limitations section discussing generalizability beyond the two case studies and potential biases in practitioner selection.
[Abstract] Ensure consistent terminology for 'contextual alignment' and 'normative alignment' across the text, with operational definitions.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for their constructive comments on our white paper. We address each major point below and commit to revisions that improve clarity, reproducibility, and appropriate qualification of claims without overstating the study's scope as a process-oriented white paper rather than a controlled benchmark experiment.


Referee: [Abstract] The central claims of 'enhanced contextual alignment' and 'reduced misclassification driven by cultural nuance' are presented without any quantitative support such as F1 scores, precision/recall, error rates, or direct comparisons to non-participatory baselines on the same held-out sets.

Authors: We agree the abstract overstates the evidential basis. The claims derive from practitioner-identified error patterns and qualitative validation rather than formal metrics or baseline comparisons, which were outside the white paper's scope. We will revise the abstract to remove or qualify these phrases, explicitly noting that improvements are supported by participatory feedback and observed contextual errors rather than quantitative superiority. No new experiments will be added.

revision: yes

Referee: [Evaluation] No details are given on test-set construction, controls for data leakage from the participatory loop, ablation on annotation volume or iteration count, or how cultural nuance was operationalized and measured, making it impossible to attribute improvements to the participatory mechanism rather than confounders like dataset size or model choice.

Authors: We accept this critique. The revised manuscript will expand the evaluation section with specifics on held-out test set construction (random split after final annotation round, with practitioner review for independence), examples of how cultural nuance was identified through practitioner disagreement cases, and explicit discussion of potential leakage risks from the iterative loop. We did not conduct ablations or formal controls for confounders, as the study prioritized documenting the collaborative workflow over isolating causal factors; a limitations paragraph will be added to acknowledge this.

revision: partial

Referee: [Methods] The participatory process is described at a high level but lacks specifics on inter-annotator agreement, number of validation iterations, how practitioner feedback altered guidelines or labels, or any statistical tests for the reported improvements.

Authors: We will revise the methods section to include available details: inter-annotator agreement metrics (e.g., percentage agreement and any kappa values computed during annotation), the number of validation rounds (typically 2-3 per dataset), and concrete examples of how peacebuilder feedback revised label guidelines and corrected initial annotations. No statistical tests were performed given the qualitative emphasis; we will clarify this and avoid implying statistical significance.

revision: yes
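
For readers unfamiliar with the agreement metrics named in the last response, the sketch below computes percentage agreement and Cohen's kappa for two annotators. It is illustrative only: the label names and annotations are invented, and the source does not say which agreement statistic, if any, the authors actually computed.

```python
from collections import Counter


def percent_agreement(labels_a, labels_b):
    """Fraction of items on which the two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    return sum(1 for a, b in zip(labels_a, labels_b) if a == b) / len(labels_a)


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(labels_a)
    p_observed = percent_agreement(labels_a, labels_b)

    # Chance agreement: probability that both annotators pick the same label
    # if each labelled independently according to their own label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Toy annotations (hypothetical labels and values).
annotator_1 = ["hate", "hate", "none", "none", "hate", "none", "none", "hate", "none", "none"]
annotator_2 = ["hate", "none", "none", "none", "hate", "none", "hate", "hate", "none", "none"]

agreement = percent_agreement(annotator_1, annotator_2)
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"percentage agreement = {agreement:.2f}, Cohen's kappa = {kappa:.3f}")
```

Kappa is lower than raw agreement whenever some matches would be expected by chance alone, which is why the two figures are usually reported together.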

standing simulated objections not resolved

Direct quantitative comparisons to non-participatory baselines and ablations on annotation volume/iterations were never conducted and cannot be retroactively added without new data collection and experiments, which exceed the scope of this white paper.

Circularity Check

0 steps flagged

No circularity; empirical process description with no derivations


The paper documents a participatory annotation workflow with peacebuilders, describes training fine-tuned BERT classifiers on the resulting datasets, and reports held-out evaluation outcomes such as contextual alignment. No equations, parameter fits, or derivation chains exist that could reduce claims to self-defined quantities. The central contribution is framed as empirical evidence from a described process rather than a mathematical prediction or uniqueness theorem. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results appear in the provided text. This is self-contained empirical reporting against external benchmarks (open-source models on HuggingFace), consistent with the default non-circular finding for such work.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central claim rests on the assumption that participatory annotation produces measurable gains in contextual validity without introducing new biases; no free parameters, axioms, or invented entities are introduced in the abstract.


