arxiv: 2512.04269 · v2 · submitted 2025-12-03 · 💻 cs.CY · cs.HC

Mapping Data Labour Supply Chain in Africa in an Era of Digital Apartheid: a Struggle for Recognition

Pith reviewed 2026-05-17 01:36 UTC · model grok-4.3

classification 💻 cs.CY cs.HC
keywords data labour · content moderation · Africa · supply chain · digital apartheid · struggle for recognition · working conditions · BPO

The pith

Data labour in Africa spans 43 countries and 17 major firms, whose workers face short-term contracts, psychological stress and economic instability while serving Global North clients.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper maps the data labour supply chain across Africa, revealing that content moderation and data annotation work has moved to business process outsourcing firms in the Global South. A participatory study with an NGO and union used desk research plus a questionnaire of 81 workers to show operations in 43 of 55 African countries by 17 major firms, mostly for North American and European clients. Workers report short-term contracts, psychological stress and economic instability that hide the adaptability and resilience their jobs actually require. The work frames these conditions through the lens of a struggle for recognition to make visible both the industry scale and workers' demands for professional and social acknowledgement.

Core claim

Data labour in Africa constitutes an extensive supply chain operating in 43 countries through 17 firms that serve primarily Global North clients, with employees held to short-term contracts under conditions of psychological stress and economic instability that render their required competences of adaptability and resilience invisible; a worker-centered methodology developed in collaboration with unions can document these realities and advance demands for recognition.

What carries the argument

The participatory mapping methodology that centers union organising goals in the questionnaire design and draws on Honneth's struggle for recognition to interpret workers' demands for acknowledgement of their professional competences.

If this is right

  • The data labour industry reaches nearly the entire continent and depends on short-term contracts that create economic instability for workers.
  • Psychological stress and obscured competences form standard features of the work performed for Global North clients.
  • Centering workers' collective actions produces documentation that can support demands for professional and social recognition.
  • The supply chain structure links African labour directly to content moderation and annotation needs in North America and Europe.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Wider adoption of union-attuned mapping methods could surface similar hidden labour chains in other Global South regions.
  • Public recognition of these workers' adaptability might pressure firms to shift from short-term contracts toward more stable arrangements.
  • The documented scale suggests that digital service delivery for the Global North rests on a geographically concentrated but under-acknowledged workforce.
  • Addressing the invisibility could connect to broader questions of how digital platforms distribute labour burdens across borders.

Load-bearing premise

Responses from 81 workers to a questionnaire attuned to union organising goals accurately represent conditions across 43 countries and 17 firms, without significant selection or response bias.

What would settle it

An independent survey of several hundred data workers, drawn from multiple countries and firms, that finds markedly different contract lengths, stress levels, or rates of skill recognition would contradict the mapped conditions.
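
As a rough illustration of why a sample of several hundred would be more decisive than n=81, the sketch below computes the worst-case 95% margin of error for a reported proportion. It assumes simple random sampling, which the questionnaire's union-attuned recruitment does not satisfy, so the figures are an optimistic lower bound on the uncertainty rather than an estimate of it. The comparison size of 400 and the function name are illustrative choices, not values from the paper.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation 95% interval for a proportion.

    p=0.5 is the worst case (widest interval); z=1.96 is the 95% critical value.
    Assumes simple random sampling, which is NOT how the paper's sample was drawn.
    """
    return z * math.sqrt(p * (1 - p) / n)


# 81 is the paper's questionnaire size; 400 is a hypothetical "several hundred".
for n in (81, 400):
    print(f"n={n}: +/- {100 * margin_of_error(n):.1f} percentage points")
# n=81: +/- 10.9 percentage points
# n=400: +/- 4.9 percentage points
```

Under these idealised assumptions, a difference of a few percentage points between the two surveys would not be informative, while a gap well beyond ten points would be hard to attribute to sampling noise alone.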

Figures

Figures reproduced from arXiv: 2512.04269 by James Oyange, Jessica Pidoux, Kauna Ibrahim Malgwi, Mariame Tighanimine, Mophat Okinyi, Richard Mwaura Mathenge, Sofia Kypraiou, Sonia Kgomo.

Figure 1
Figure 1: Map of the content moderation industry. Asia (n=1, 5.9%). This distribution reflects the global nature of the content moderation industry, with a notable concentration of headquarters within African markets themselves, suggesting international corporations establishing regional presence alongside the emergence of locally-based firms. Operational presence patterns, i.e. where work is being performed, diffe…
Figure 2
Figure 2: The principal occupation of survey participants.
Figure 3
Figure 3: Participants’ work locations.
Figure 4
Figure 4: Participants’ work locations. …frameworks exemplify the precarious employment conditions characteristic of the content moderation sector with short-term contracts, wherein workers face recurring uncertainty regarding contract renewal.
Figure 5
Figure 5: Contractual working hours.
Figure 6
Figure 6: Actual working hours. …operational flexibility and intensive labour availability, confirming the trend identified in questionnaire responses regarding extended working hours and overtime practices. Both Teleperformance and Sama contracts mandate continuous 24/7 availability with compulsory shift assignments determined by supervisors. The Teleperformance agreement stipulates 45-48 hours weekly across five …
Figure 7
Figure 7: Monthly salary rate. Among content moderators, 28 (65.1%) earned monthly 251–500 USD, with smaller shares earning more than 500 USD (n=6, 13.9%) or less than 250 USD (n=7, 16.2%). Hybrid workers had a more variable income profile: 6 (42.8%) earned 251–500 USD, but 4 (28.5%) earned 100–250 USD and 2 (14.3%) earned under 100 USD. Only 2 (14.3%) earned more than 500 USD. This variability reflects the less sta…
Figure 8
Figure 8: Reported challenges among all workers. Content moderators were particularly affected by emotional stress (n=40), low pay (n=39), fear of speaking out (n=38), and insufficient psychological support (n=37), pointing to both emotional and structural strain. Hybrid workers, while also reporting low pay (n=16) and stress (n=14), gave more emphasis to inconsistent hours (n=13), unclear roles (n=13), and limited a…
Figure 9
Figure 9: Reported challenges among all workers. Hybrid workers citing low pay were more likely to earn below 250 USD: 3 (23%) earned 100–250 USD, and 2 (15.4%) earned under 100 USD. Only 2 (15.4%) of hybrid respondents citing low pay earned above 500 USD. These trends reinforce the association between hybrid roles and greater income precarity, supporting broader concerns around financial sustainability in outsourced…
Original abstract

Content moderation and data annotation work has shifted to the Global South, particularly Africa, where workers at business process outsourcing (BPO) companies operate under precarity to serve Global North needs. We address the invisibility of this data labour supply chain and the underdocumented working conditions of its workforce. Drawing on a participatory collaboration between academics, an NGO, and a union, we conducted desk research and deployed a questionnaire (n=81) attuned to unions' organising goals. Our findings show that data labour spans 43 out of 55 African countries, involving 17 major firms serving predominantly North-American and European clients, with workers employed on short-term contracts, under psychological stress and economic instability - conditions that obscure the competences, i.e. adaptability and resilience, that their work demands. We contribute the first comprehensive map of Africa's data labour industry and demonstrate a methodology that centers workers' collective actions in documenting their conditions, drawing on Honneth's "struggle for recognition" to capture workers' demands for professional and social acknowledgement.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to provide the first comprehensive map of Africa's data labour supply chain, showing that content moderation and data annotation work spans 43 out of 55 African countries and involves 17 major BPO firms serving predominantly North American and European clients. Drawing on desk research and a participatory questionnaire (n=81) developed in collaboration with an NGO and union and attuned to organising goals, it reports precarious conditions including short-term contracts, psychological stress, and economic instability while highlighting workers' competences such as adaptability and resilience. The work applies Honneth's struggle for recognition to frame workers' demands for professional and social acknowledgement and positions the methodology as centering collective action.

Significance. If the mapping and descriptive findings hold after methodological clarification, the paper would make a meaningful contribution to the literature on digital labour in the Global South by rendering visible an underdocumented supply chain and by demonstrating a worker-centered, participatory approach that integrates union perspectives. The explicit linkage to recognition theory and the enumeration of specific countries and firms could support future advocacy, policy work, and comparative studies on platform and BPO precarity.

major comments (2)
  1. [Abstract / Methods] Abstract and Methods section: the headline claim of a 'first comprehensive map' of 43 countries and 17 firms is load-bearing for the paper's contribution, yet the desk-research component provides no search strategy, source list, inclusion/exclusion criteria, or verification protocol, rendering the enumeration un-auditable and its completeness impossible to assess.
  2. [Questionnaire / Findings] Questionnaire and Findings sections: with n=81 responses explicitly 'attuned to unions' organising goals,' the reported conditions (short-term contracts, psychological stress, economic instability) across 43 countries rest on a sample whose recruitment channels, response rate, stratification by country/firm, and potential selection bias are not described; given the thin per-country coverage this necessarily entails, the generalisability of the descriptive claims is not secured.
minor comments (2)
  1. [Abstract] The abstract is dense; splitting the contribution statement from the empirical findings would improve readability.
  2. [Discussion or Conclusion] Consider adding a dedicated limitations subsection that explicitly discusses sample size, union-attuned recruitment, and the scope of the desk research.
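
The thin per-country coverage raised in major comment 2 can be made concrete with the paper's own headline counts. The sketch below is back-of-envelope arithmetic only: it assumes, purely for illustration, that the 81 respondents were spread evenly across all 43 countries and all 17 firms, which the paper does not claim. The variable names are hypothetical.

```python
# Headline counts taken from the abstract.
respondents = 81
countries_mapped = 43
african_countries = 55
firms = 17

# Share of African countries covered by the desk-research map.
coverage_share = countries_mapped / african_countries

# Illustrative averages under an even-spread assumption (NOT a claim of the paper).
mean_per_country = respondents / countries_mapped
mean_per_firm = respondents / firms

print(f"countries covered: {100 * coverage_share:.1f}%")       # countries covered: 78.2%
print(f"respondents per country: {mean_per_country:.2f}")      # respondents per country: 1.88
print(f"respondents per firm: {mean_per_firm:.2f}")            # respondents per firm: 4.76
```

Even under this generous even-spread assumption, fewer than two respondents per country would be available, which is why the referee treats the questionnaire as unable to support country-level generalisation.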

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback, which highlights important opportunities to strengthen the transparency of our methods. We address each major comment below and commit to revisions that clarify our approach without altering the core claims or participatory orientation of the work.

Point-by-point responses
  1. Referee: [Abstract / Methods] Abstract and Methods section: the headline claim of a 'first comprehensive map' of 43 countries and 17 firms is load-bearing for the paper's contribution, yet the desk-research component provides no search strategy, source list, inclusion/exclusion criteria, or verification protocol, rendering the enumeration un-auditable and its completeness impossible to assess.

    Authors: We acknowledge that the current Methods section lacks sufficient detail on the desk-research process. The mapping was constructed through searches of publicly available industry reports, company websites, news articles, and union documents to identify the 17 firms and their documented operations across African countries. We will revise the Methods section to include an explicit search strategy (keywords and databases used), a list of primary sources, inclusion criteria (e.g., firms with verifiable contracts for data annotation or content moderation serving North American or European clients), and a verification protocol based on source triangulation. These additions will make the enumeration auditable while leaving the reported scope of 43 countries and 17 firms unchanged. revision: yes

  2. Referee: [Questionnaire / Findings] Questionnaire and Findings sections: with n=81 responses explicitly 'attuned to unions' organising goals,' the reported conditions (short-term contracts, psychological stress, economic instability) across 43 countries rest on a sample whose recruitment channels, response rate, stratification by country/firm, and potential selection bias are not described; given the thin per-country coverage this necessarily entails, the generalisability of the descriptive claims is not secured.

    Authors: The questionnaire was developed collaboratively with an NGO and union specifically to support organising goals and to surface workers' own accounts of precarity and competence, rather than to produce a statistically representative sample. The supply-chain mapping of 43 countries and 17 firms rests on the separate desk-research component; the questionnaire provides complementary, worker-centered evidence of conditions in the locations reached through union networks. We will revise the Methods and Findings sections to describe recruitment channels, note the purposive sampling strategy, discuss response limitations and potential selection bias, and explicitly state that the descriptive findings are indicative rather than generalisable across all 43 countries. This clarification will better align the presentation with the participatory intent of the study. revision: yes

Circularity Check

0 steps flagged

Empirical mapping via desk research and survey shows no circularity

The paper's core contribution is a descriptive map of data labour across African countries and firms, derived from desk research plus a questionnaire (n=81) whose responses are presented as primary data. No equations, fitted parameters, or predictions appear that reduce outputs to inputs by construction. The invocation of Honneth's recognition theory functions as an interpretive lens rather than a load-bearing self-citation or ansatz that forces the empirical findings. The derivation chain therefore remains self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on domain assumptions about data labour precarity and the value of participatory, union-centered methods rather than on new mathematical entities or fitted parameters.

axioms (2)
  • domain assumption: Data annotation and moderation work in Africa is systematically invisible and under-documented.
    Invoked in the opening framing of the abstract to justify the mapping effort.
  • domain assumption: Participatory collaboration with unions produces more accurate documentation of working conditions than conventional academic surveys.
    Stated as the methodological contribution.

pith-pipeline@v0.9.0 · 5518 in / 1172 out tokens · 31818 ms · 2026-05-17T01:36:28.275802+00:00 · methodology
