Comparison of Classical Machine Learning Approaches on Bangla Textual Emotion Analysis

Md. Ataur Rahman; Md. Hanif Seddiqui

arxiv: 1907.07826 · v1 · pith:HVX4NE3Xnew · submitted 2019-07-18 · 💻 cs.CL · cs.LG

Pith reviewed 2026-05-24 20:12 UTC · model grok-4.3

keywords: Bangla, emotion analysis, machine learning, support vector machine, text classification, Facebook comments, sentiment analysis, classical ML

The pith

Support vector machines with radial basis function kernels classify six emotions in Bangla Facebook comments with 52.98 percent accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper experiments with classical machine learning to detect fine-grained emotions in Bangla text from social media. Authors collected and labeled comments from Facebook groups on political and economic topics for the six basic emotions of sadness, happiness, disgust, surprise, fear, and anger. They tested Naive Bayes, Decision Tree, k-Nearest Neighbor, Support Vector Machine, and K-Means Clustering using various feature sets. The best performing approach was SVM using a non-linear RBF kernel, which achieved an average accuracy of 52.98 percent and a macro F1 score of 0.3324. This work demonstrates that standard classifiers can be applied directly to emotion analysis in Bangla without deep learning.

Core claim

The authors gathered a corpus of Bangla Facebook comments and annotated it for six emotions. They compared five classical machine learning techniques and found that SVM with a non-linear radial-basis function kernel gave the highest performance with 52.98% average accuracy and 0.3324 macro F1 score.

What carries the argument

Support vector machine classifier using a radial basis function kernel on combinations of features derived from the annotated Bangla text corpus
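As a rough illustration of what the radial basis function kernel computes, the sketch below evaluates the standard form exp(-gamma * ||x - y||^2) on toy vectors. The `gamma` value and the three count vectors are assumptions made for the example; the paper's actual features and hyperparameters are not given on this page.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Return exp(-gamma * ||x - y||^2) for two equal-length vectors.

    gamma=0.5 is an arbitrary illustrative value, not the paper's setting.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical bag-of-words count vectors for three comments.
doc_a = [1.0, 0.0, 2.0]
doc_b = [1.0, 0.0, 2.0]
doc_c = [0.0, 3.0, 0.0]

same = rbf_kernel(doc_a, doc_b)       # identical vectors give similarity 1.0
different = rbf_kernel(doc_a, doc_c)  # distant vectors give a value near 0
```

Because the similarity decays with squared distance rather than growing with a dot product, an SVM using this kernel can draw non-linear decision boundaries in the original feature space.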

Load-bearing premise

The manual annotations of the Facebook comments correctly capture the emotions expressed and the collected corpus is representative of Bangla emotional language use.

What would settle it

A new experiment that re-annotates the same or similar Bangla comments with different annotators and retrains the models to check if SVM with RBF still achieves the top accuracy and F1 scores.
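One common way to quantify how closely a second annotation pass agrees with the first is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is an assumption about how such a check could be run; the page does not say which agreement statistic, if any, the authors used, and both label lists are invented.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labelled independently
    # according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


# Invented annotations for six comments (not data from the paper).
annotator_1 = ["anger", "anger", "sadness", "fear", "happiness", "anger"]
annotator_2 = ["anger", "disgust", "sadness", "fear", "happiness", "sadness"]

kappa = cohen_kappa(annotator_1, annotator_2)
```

A kappa well below 1 on the re-annotated comments would suggest that label noise, rather than the choice of classifier, limits the reported scores.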

Original abstract

Detecting emotions from text is an extension of simple sentiment polarity detection. Instead of considering only positive or negative sentiments, emotions are conveyed using more tangible manner; thus, they can be expressed as many shades of gray. This paper manifests the results of our experimentation for fine-grained emotion analysis on Bangla text. We gathered and annotated a text corpus consisting of user comments from several Facebook groups regarding socio-economic and political issues, and we made efforts to extract the basic emotions (sadness, happiness, disgust, surprise, fear, anger) conveyed through these comments. Finally, we compared the results of the five most popular classical machine learning techniques namely Naive Bayes, Decision Tree, k-Nearest Neighbor (k-NN), Support Vector Machine (SVM) and K-Means Clustering with several combinations of features. Our best model (SVM with a non-linear radial-basis function (RBF) kernel) achieved an overall average accuracy score of 52.98% and an F1 score (macro) of 0.3324.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Straightforward baseline on a new Bangla emotion corpus using classical ML, with modest results and sparse methodological detail.

This paper gathers Facebook comments in Bangla on socio-political topics, annotates them for six basic emotions, and compares five standard classifiers with various features. The main deliverable is the new annotated corpus plus the observation that SVM with RBF kernel reaches 52.98% accuracy and 0.3324 macro F1. That is the useful part: it supplies an empirical starting point for a language that still has very little labeled emotion data. The numbers are reported directly from held-out evaluation, so there is no circularity. The work stays within classical methods and does not claim new algorithms or theoretical advances. The corpus itself is the clearest addition to the literature.

The soft spots are the missing specifics. The abstract gives no corpus size, class balance, inter-annotator agreement, or train-test split details, which makes it difficult to interpret whether 53% accuracy is a reasonable result or simply reflects noisy labels and imbalance. Macro F1 at 0.33 also signals that performance is uneven across emotions. Without those numbers or a majority-class baseline, the comparison among the five methods is harder to evaluate. The annotation process is described at a high level only.

This is the kind of paper that matters to researchers building tools for Bangla social media or low-resource emotion detection. It is not going to change how anyone does NLP in high-resource languages. A serious editor should send it to review because the data contribution is real and the experiments are reproducible in principle; referees can ask for the missing dataset statistics and a clearer validation protocol. I would not bring it to a general reading group, and I would not cite it unless I needed the specific Bangla numbers.
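To make the point about imbalance concrete, the sketch below compares accuracy and macro F1 for a majority-class baseline on a deliberately skewed six-class label set. The class counts are invented for illustration and are not the paper's distribution, which this page does not report.

```python
from collections import Counter


def accuracy(gold, pred):
    """Fraction of items whose predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1; a class with no hits scores 0."""
    f1_scores = []
    for label in labels:
        tp = sum(g == label and p == label for g, p in zip(gold, pred))
        fp = sum(g != label and p == label for g, p in zip(gold, pred))
        fn = sum(g == label and p != label for g, p in zip(gold, pred))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(labels)


LABELS = ["sadness", "happiness", "disgust", "surprise", "fear", "anger"]

# Invented, imbalanced gold labels: 100 comments, half of them "anger".
gold = (["anger"] * 50 + ["sadness"] * 20 + ["happiness"] * 15
        + ["disgust"] * 10 + ["surprise"] * 3 + ["fear"] * 2)

# Baseline that always predicts the most frequent class.
majority_label = Counter(gold).most_common(1)[0][0]
baseline_pred = [majority_label] * len(gold)

baseline_accuracy = accuracy(gold, baseline_pred)          # 0.5
baseline_macro_f1 = macro_f1(gold, baseline_pred, LABELS)  # about 0.111
uniform_chance = 1 / len(LABELS)                           # about 0.167
```

On this toy distribution a classifier that learns nothing already reaches 50% accuracy while its macro F1 stays near 0.11, which is why the class balance is needed to judge the reported 52.98% and 0.3324.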

Referee Report

2 major / 2 minor

Summary. The paper collects and annotates a corpus of Bangla Facebook comments for six basic emotions (sadness, happiness, disgust, surprise, fear, anger) and compares five classical ML approaches—Naive Bayes, Decision Tree, k-NN, SVM, and K-Means—across feature combinations. The best reported result is SVM with RBF kernel at 52.98% accuracy and 0.3324 macro-F1.

Significance. If the experimental details hold, the work supplies a new annotated resource and baseline numbers for Bangla emotion detection, a low-resource setting where such comparisons remain useful. The inclusion of both supervised classifiers and unsupervised clustering broadens the scope, though the modest absolute performance underscores the inherent difficulty of fine-grained emotion classification.

major comments (2)

[Abstract] The reported accuracy (52.98%) and macro-F1 (0.3324) are presented without any mention of corpus size, class distribution, number of annotators, inter-annotator agreement, train/test split, or validation procedure. These omissions are load-bearing for the central empirical claim, as it is impossible to determine whether the scores exceed chance level for a 6-class problem or are reproducible.
[Methodology/Results] (inferred from the listed methods) K-Means is an unsupervised clustering algorithm, yet the paper evaluates it on labeled emotion data. The manuscript must specify the cluster-to-label mapping procedure (e.g., majority vote, Hungarian assignment) used to compute accuracy and F1; without this, the direct comparison to the supervised models is not interpretable.

minor comments (2)

[Abstract] 'This paper manifests the results' is nonstandard; 'reports' or 'presents' is clearer.
[Abstract] The abstract states that 'several combinations of features' were tested but does not enumerate them (e.g., unigrams, TF-IDF, n-grams, or lexical resources). Adding this list would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major point below and will incorporate the suggested clarifications in the revised version.

Referee: [Abstract] The reported accuracy (52.98%) and macro-F1 (0.3324) are presented without any mention of corpus size, class distribution, number of annotators, inter-annotator agreement, train/test split, or validation procedure. These omissions are load-bearing for the central empirical claim, as it is impossible to determine whether the scores exceed chance level for a 6-class problem or are reproducible.

Authors: We agree that the abstract would benefit from these contextual details to help readers immediately interpret the results. The full manuscript describes the Facebook comment corpus, the annotation process for the six emotions, and the train/test splits with validation. We will revise the abstract to briefly note the corpus size, the 6-class setup, and the evaluation procedure so that the reported accuracy and macro-F1 can be assessed against chance performance. revision: yes

Referee: [Methodology/Results] (inferred from the listed methods) K-Means is an unsupervised clustering algorithm, yet the paper evaluates it on labeled emotion data. The manuscript must specify the cluster-to-label mapping procedure (e.g., majority vote, Hungarian assignment) used to compute accuracy and F1; without this, the direct comparison to the supervised models is not interpretable.

Authors: We acknowledge that the cluster-to-label mapping must be stated explicitly. After running K-Means on the feature vectors, each cluster was assigned the emotion label with the highest frequency among its members according to the ground-truth annotations (majority vote). We will add a clear description of this procedure in the methodology section of the revised manuscript to make the unsupervised results directly comparable to the supervised classifiers. revision: yes
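The majority-vote mapping described in this response can be sketched as follows. The cluster assignments and gold labels are invented, and the function name `majority_vote_mapping` is hypothetical; the sketch assumes the K-Means step has already produced one cluster id per comment.

```python
from collections import Counter, defaultdict


def majority_vote_mapping(cluster_ids, gold_labels):
    """Map each cluster id to the most frequent gold label among its members."""
    members = defaultdict(list)
    for cluster_id, label in zip(cluster_ids, gold_labels):
        members[cluster_id].append(label)
    return {
        cluster_id: Counter(labels).most_common(1)[0][0]
        for cluster_id, labels in members.items()
    }


# Invented K-Means output for eight comments (not data from the paper).
cluster_ids = [0, 0, 0, 1, 1, 2, 2, 2]
gold_labels = ["anger", "anger", "fear", "sadness", "sadness",
               "happiness", "anger", "happiness"]

mapping = majority_vote_mapping(cluster_ids, gold_labels)

# Every comment inherits the label assigned to its cluster.
predicted = [mapping[cluster_id] for cluster_id in cluster_ids]
accuracy = sum(p == g for p, g in zip(predicted, gold_labels)) / len(gold_labels)
```

When the mapping is built from the same gold labels used for scoring, as here, the resulting accuracy is effectively cluster purity and tends to be optimistic. Deriving the mapping on training data and scoring on held-out data would make the comparison with the supervised classifiers fairer.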

Circularity Check

0 steps flagged

No significant circularity

The paper is a standard empirical ML comparison: a new Bangla emotion corpus is collected and annotated, then five classical classifiers (NB, DT, kNN, SVM, K-Means) are trained with feature combinations and evaluated via accuracy/F1 on held-out data. The reported 52.98% accuracy and 0.3324 macro-F1 for SVM-RBF are direct experimental measurements, not quantities that reduce to fitted inputs by construction. No equations, derivations, self-citation chains, uniqueness theorems, or ansatzes appear; the annotation premise is an explicit modeling assumption rather than a hidden circular step. This is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central empirical claim rests on the quality and representativeness of the manual emotion annotations plus the assumption that standard text features suffice to separate the six emotion classes in Bangla.

axioms (1)

domain assumption: The six basic emotions (sadness, happiness, disgust, surprise, fear, anger) are the appropriate and exhaustive categories for the emotional content in the collected comments.
The abstract invokes these categories directly without validation or discussion of alternatives for Bangla text.

pith-pipeline@v0.9.0 · 5710 in / 1287 out tokens · 25786 ms · 2026-05-24T20:12:09.340440+00:00 · methodology
