Detecting Cyberbullying and Cyberaggression in Social Media
Pith reviewed 2026-05-24 18:24 UTC · model grok-4.3
The pith
Text, user, and network attributes allow machine learning to separate bullies and aggressors from ordinary Twitter users with over 90 percent accuracy and AUC.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors show that a methodology combining text-based, user-based, and network-based attributes, processed by several machine learning algorithms, can classify Twitter accounts as bullies, aggressors, or normal users at over 90 percent accuracy and AUC when the training data come from participants in hate-related discussions.
What carries the argument
The classification pipeline that fuses tweet text, account metadata, and social graph attributes to train supervised models distinguishing abusive from non-abusive accounts.
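The fusion step described above can be sketched as a simple concatenation of the three attribute families into one vector per account. This is a minimal illustration, not the authors' implementation: the feature names in `TEXT_KEYS`, `USER_KEYS`, and `NETWORK_KEYS` and the helper `fuse_features` are hypothetical and chosen only to show the shape of the input a supervised model would receive.

```python
from typing import Dict, List

# Hypothetical feature names, one short list per attribute family.
TEXT_KEYS = ["swear_word_ratio", "mean_sentiment"]
USER_KEYS = ["account_age_days", "tweets_per_day"]
NETWORK_KEYS = ["followers", "friends", "clustering_coefficient"]


def fuse_features(text: Dict[str, float],
                  user: Dict[str, float],
                  network: Dict[str, float]) -> List[float]:
    """Concatenate the three attribute families into one fixed-order vector.

    Missing attributes default to 0.0 so that every account yields a vector
    of the same length.
    """
    vector: List[float] = []
    for keys, group in ((TEXT_KEYS, text), (USER_KEYS, user), (NETWORK_KEYS, network)):
        vector.extend(float(group.get(key, 0.0)) for key in keys)
    return vector


# Invented values for a single account; clustering_coefficient is left out on purpose.
account = fuse_features(
    text={"swear_word_ratio": 0.12, "mean_sentiment": -0.4},
    user={"account_age_days": 820, "tweets_per_day": 35.5},
    network={"followers": 140, "friends": 310},
)
print(account)
```

Keeping the key order fixed means each position in the vector always refers to the same attribute, which is what lets a model learn a weight per attribute.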
If this is right
- Twitter could apply the same feature set to flag accounts for manual review before suspension.
- The method separates cyberbullying from cyberaggression within the same community.
- Performance of different suspension policies can be simulated on the labeled set.
- Normal-topic users provide a baseline that highlights what changes when abuse appears.
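Simulating a suspension policy on a labeled set, as the third point above suggests, can be as simple as thresholding a per-account abuse score and counting the outcomes. The sketch below assumes such a score already exists; `simulate_policy`, the scores, and the labels are all invented for illustration and do not come from the paper.

```python
from typing import List


def simulate_policy(scores: List[float], labels: List[int], threshold: float) -> dict:
    """Flag every account whose abuse score is >= threshold and score the outcome.

    labels use 1 for accounts labeled abusive and 0 for normal accounts.
    """
    flagged = [score >= threshold for score in scores]
    true_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 1)
    false_pos = sum(1 for f, y in zip(flagged, labels) if f and y == 0)
    false_neg = sum(1 for f, y in zip(flagged, labels) if not f and y == 1)
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return {"flagged": sum(flagged), "precision": precision, "recall": recall}


# Toy data: five accounts with invented scores and labels.
scores = [0.95, 0.80, 0.60, 0.30, 0.10]
labels = [1, 1, 0, 1, 0]

for threshold in (0.7, 0.5, 0.2):
    print(threshold, simulate_policy(scores, labels, threshold))
```

Lowering the threshold flags more accounts: recall can only stay the same or rise, while precision may fall. A platform would have to weigh that trade-off when choosing a policy.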
Where Pith is reading between the lines
- The same attribute combination might transfer to other platforms if their data allow extraction of comparable text, profile, and link features.
- Future work could test whether adding victim-reported incidents improves label quality over topic-based proxies.
- If network features prove dominant, early detection could occur before many abusive tweets are posted.
Load-bearing premise
Participation in discussions around hate-related topics serves as a reliable proxy for labeling users as bullies or aggressors without independent verification of their actual behavior.
What would settle it
Have independent human raters label a held-out sample of the same accounts using only the visible tweets, then measure how well their labels agree with the model's output.
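Agreement between human labels and model output is commonly summarized with Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The sketch below is one way to compute it; the function name and the sample labels are illustrative assumptions, and the page does not say which agreement statistic would be used.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same class independently.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical class throughout
    return (observed - expected) / (1.0 - expected)


# Invented labels for six accounts.
human = ["bully", "normal", "normal", "aggressor", "normal", "bully"]
model = ["bully", "normal", "aggressor", "aggressor", "normal", "normal"]
print(cohens_kappa(human, model))
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance, so a value near 0 would suggest that the topic-based labels and human judgment diverge.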
Original abstract
Cyberbullying and cyberaggression are increasingly worrisome phenomena affecting people across all demographics. More than half of young social media users worldwide have been exposed to such prolonged and/or coordinated digital harassment. Victims can experience a wide range of emotions, with negative consequences such as embarrassment, depression, isolation from other community members, which embed the risk to lead to even more critical consequences, such as suicide attempts. In this work, we take the first concrete steps to understand the characteristics of abusive behavior in Twitter, one of today's largest social media platforms. We analyze 1.2 million users and 2.1 million tweets, comparing users participating in discussions around seemingly normal topics like the NBA, to those more likely to be hate-related, such as the Gamergate controversy, or the gender pay inequality at the BBC station. We also explore specific manifestations of abusive behavior, i.e., cyberbullying and cyberaggression, in one of the hate-related communities (Gamergate). We present a robust methodology to distinguish bullies and aggressors from normal Twitter users by considering text, user, and network-based attributes. Using various state-of-the-art machine learning algorithms, we classify these accounts with over 90% accuracy and AUC. Finally, we discuss the current status of Twitter user accounts marked as abusive by our methodology, and study the performance of potential mechanisms that can be used by Twitter to suspend users in the future.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to analyze 1.2 million Twitter users and 2.1 million tweets from normal topics (NBA) versus hate-related topics (Gamergate, BBC gender pay), labeling the latter as proxies for bullies/aggressors. It presents a supervised ML methodology using text, user, and network features to classify these accounts with >90% accuracy and AUC, explores manifestations of abuse in Gamergate, and discusses potential Twitter suspension mechanisms.
Significance. If the proxy labeling reliably identifies abusive behavior rather than topic-specific patterns, the multi-feature classifier at this scale could support practical platform moderation tools. The combination of feature types and the focus on both bullying and aggression are potential strengths, but the lack of validation for the labeling limits the result's immediate significance and generalizability.
major comments (2)
- [Abstract] The central claim of >90% accuracy and AUC for distinguishing bullies/aggressors is unsupported because the abstract (and visible evidence) supplies no information on the labeling procedure, feature definitions, cross-validation strategy, class imbalance handling, or baseline comparisons.
- [Abstract and methods] Data collection and labeling (implied in abstract and methods): assigning positive labels to users participating in Gamergate/BBC discussions as a proxy for cyberbullying/cyberaggression without independent ground-truth verification, manual annotation, or external validation is load-bearing for the supervised classification result; this risks the model learning topic-specific signals instead of abuse signals.
minor comments (2)
- [Abstract] Dataset sizes are stated without a breakdown by topic or class distribution, and without explaining how the 2.1M tweets relate to the 1.2M users.
- [Discussion] Twitter suspension mechanisms are discussed without quantitative evaluation or comparison to existing platform policies.
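One way to probe the topic-confound concern raised in the major comments is to hold out an entire topic during evaluation, so the model is tested on vocabulary it never saw in training. The sketch below only builds the index splits; `leave_one_topic_out` and the topic strings are hypothetical, and nothing in the report indicates the authors ran such a test.

```python
from typing import Iterator, List, Tuple


def leave_one_topic_out(topics: List[str]) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_topic, train_indices, test_indices) for each distinct topic."""
    for held_out in sorted(set(topics)):
        train = [i for i, topic in enumerate(topics) if topic != held_out]
        test = [i for i, topic in enumerate(topics) if topic == held_out]
        yield held_out, train, test


# One topic tag per account; the tags are illustrative.
topics = ["nba", "gamergate", "bbc", "nba", "gamergate"]
for held_out, train, test in leave_one_topic_out(topics):
    print(held_out, train, test)
```

Under a purely topic-based proxy, a held-out topic contains only one class, so accuracy and AUC are not meaningful on that fold. Only one-sided measures apply, such as recall on a held-out hate-related topic or the false-positive rate on a held-out normal topic, and the test is only feasible when each class is represented by more than one topic.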
Simulated Author's Rebuttal
We thank the referee for their constructive comments. We address each major point below and propose targeted revisions where appropriate.
- Referee: [Abstract] The central claim of >90% accuracy and AUC for distinguishing bullies/aggressors is unsupported because the abstract (and visible evidence) supplies no information on the labeling procedure, feature definitions, cross-validation strategy, class imbalance handling, or baseline comparisons.
  Authors: The abstract is intentionally concise per journal guidelines and summarizes the key result; full details on labeling (topic-based proxy), feature sets (text/user/network), 10-fold cross-validation, class weighting for imbalance, and baseline comparisons (e.g., against text-only models) appear in Sections 3 and 4. We will revise the abstract to include one additional sentence outlining the multi-feature supervised approach and evaluation protocol.
  Revision: partial
- Referee: [Abstract and methods] Data collection and labeling (implied in abstract and methods): assigning positive labels to users participating in Gamergate/BBC discussions as a proxy for cyberbullying/cyberaggression without independent ground-truth verification, manual annotation, or external validation is load-bearing for the supervised classification result; this risks the model learning topic-specific signals instead of abuse signals.
  Authors: We selected Gamergate and BBC gender-pay topics precisely because they are documented in prior literature as containing elevated rates of abusive behavior, providing a scalable proxy when manual annotation of 1.2M accounts is infeasible. Network and user features were included alongside text to reduce reliance on topic vocabulary alone; we further validate the proxy by manually examining abuse manifestations within the Gamergate subset. We will add an explicit limitations paragraph discussing the proxy assumption and outlining how future work could obtain platform-provided labels for external validation.
  Revision: partial
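The evaluation protocol mentioned in the first response (stratified k-fold cross-validation with class weighting) can be sketched in plain Python as follows. Both helpers are illustrative assumptions, not the authors' code: `stratified_fold_ids` assigns folds round-robin within each class without shuffling, and `balanced_class_weights` uses the common "inverse frequency" heuristic `n_samples / (n_classes * class_count)`.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List


def stratified_fold_ids(labels: List[Hashable], k: int) -> List[int]:
    """Assign each sample a fold id in [0, k) so class ratios stay similar per fold.

    Samples are dealt round-robin within each class; no shuffling is done here.
    """
    per_class = defaultdict(list)
    for index, label in enumerate(labels):
        per_class[label].append(index)
    fold_of = [0] * len(labels)
    for indices in per_class.values():
        for position, index in enumerate(indices):
            fold_of[index] = position % k
    return fold_of


def balanced_class_weights(labels: List[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


# Toy imbalanced label set: 8 normal accounts, 2 bullies.
labels = ["normal"] * 8 + ["bully"] * 2
folds = stratified_fold_ids(labels, k=2)
weights = balanced_class_weights(labels)
print(folds)
print(weights)
```

With these weights, errors on the rare class cost more during training, which keeps a classifier from reaching high accuracy simply by predicting the majority class.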
Circularity Check
No significant circularity; standard empirical ML pipeline
The paper collects tweets from topic-based cohorts (Gamergate/BBC as positive labels, NBA as negative), extracts text/user/network features, and trains standard supervised classifiers to report accuracy/AUC. No equations, parameter fits, or self-citations are shown that reduce the classification performance to a definition, a renamed input, or a load-bearing prior result by the same authors. The derivation chain consists of independent data collection, feature engineering, and off-the-shelf ML, so the reported results do not reduce to the paper's own inputs.
Axiom & Free-Parameter Ledger
free parameters (1)
- ML model hyperparameters
axioms (2)
- Domain assumption: Participation in Gamergate or BBC gender-pay discussions serves as a valid proxy label for cyberbullying or cyberaggression.
- Domain assumption: Text, user, and network attributes are sufficiently discriminative for the classification task.