pith. sign in

arxiv: 2104.00124 · v2 · pith:A3ZX7R37new · submitted 2021-03-31 · 💻 cs.CL · cs.SI

Misinformation detection in Luganda-English code-mixed social media text

classification 💻 cs.CL cs.SI
keywords misinformationdetectionclassificationcode-mixedluganda-englishmachinemediamodels
0
0 comments X
read the original abstract

The increasing occurrence, forms, and negative effects of misinformation on social media platforms has necessitated more misinformation detection tools. Currently, work is being done addressing COVID-19 misinformation however, there are no misinformation detection tools for any of the 40 distinct indigenous Ugandan languages. This paper addresses this gap by presenting basic language resources and a misinformation detection data set based on code-mixed Luganda-English messages sourced from the Facebook and Twitter social media platforms. Several machine learning methods are applied on the misinformation detection data set to develop classification models for detecting whether a code-mixed Luganda-English message contains misinformation or not. A 10-fold cross validation evaluation of the classification methods in an experimental misinformation detection task shows that a Discriminative Multinomial Naive Bayes (DMNB) method achieves the highest accuracy and F-measure of 78.19% and 77.90% respectively. Also, Support Vector Machine and Bagging ensemble classification models achieve comparable results. These results are promising since the machine learning models are based on n-gram features from only the misinformation detection dataset.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.