Understanding the (In)Effectiveness of Content Moderation: A Case Study of Facebook in the Context of the U.S. Capitol Riot

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

fields

cs.CY 3

years

2026 3

roles

background 1

polarities

background 1

representative citing papers

Algorithmic Constitutionalism

cs.CY · 2026-05-16 · unverdicted · novelty 6.0

Authors introduce algorithmic constitutionalism, a three-pillar framework for AI governance with layered architecture, algorithmic meta-reasoning, and deliberative correction, applied to Facebook's content moderation and implications for the EU Digital Services Act.

An Evaluation of Chat Safety Moderations in Roblox

cs.CY · 2026-05-06 · unverdicted · novelty 5.0 · 2 refs

Roblox's automated chat moderation fails to catch numerous unsafe messages involving grooming, sexualization of minors, bullying, violence, self-harm, and sensitive information sharing, with users evading detection through various techniques.

citing papers explorer

Showing 3 of 3 citing papers.

  • Algorithmic Constitutionalism · cs.CY · 2026-05-16 · unverdicted · none · ref 8

    Authors introduce algorithmic constitutionalism, a three-pillar framework for AI governance with layered architecture, algorithmic meta-reasoning, and deliberative correction, applied to Facebook's content moderation and implications for the EU Digital Services Act.

  • The Enforcement and Feasibility of Hate Speech Moderation on Twitter · cs.CY · 2026-04-14 · conditional · none · ref 25

    80% of hateful tweets remain online after five months with no higher removal rate than non-hateful content, while human-AI moderation pipelines can feasibly cut user exposure below regulatory penalty costs.

  • An Evaluation of Chat Safety Moderations in Roblox · cs.CY · 2026-05-06 · unverdicted · none · ref 57 · 2 links

    Roblox's automated chat moderation fails to catch numerous unsafe messages involving grooming, sexualization of minors, bullying, violence, self-harm, and sensitive information sharing, with users evading detection through various techniques.