Sentiment Analysis of AI Adoption in Indonesian Higher Education Using Machine Learning and Transformer-Based Models
Pith reviewed 2026-05-07 09:47 UTC · model grok-4.3
The pith
DistilBERT outperforms SVM and other models in classifying Indonesian student sentiments on AI adoption in higher education.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By combining 1,154 student opinions with lexical sentiment data to form 2,295 labeled samples, the study evaluates LightGBM, Random Forest, SVM, and DistilBERT for binary sentiment classification. DistilBERT reaches 84.78% accuracy and 84.75% F1-score, surpassing SVM's 82.14% test accuracy and F1-score, which the authors take to indicate that transformer-based models better handle contextual information in this domain.
What carries the argument
Fine-tuned DistilBERT for binary sentiment classification, compared against TF-IDF vectorized machine learning models like SVM.
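To make the classical baseline more concrete, here is a minimal sketch of TF-IDF weighting on a toy corpus. The paper's actual TF-IDF configuration is not reported, so the variant used here (length-normalized term frequency times a smoothed log inverse document frequency) is an assumption, and the two example sentences are invented rather than taken from the dataset.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / doc length,
    idf = ln((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


# Toy, invented corpus (not from the paper's dataset).
toy_docs = [
    "ai sangat membantu belajar".split(),
    "ai membuat mahasiswa malas".split(),
]
toy_vectors = tfidf(toy_docs)
```

In this sketch a term that appears in every document, such as "ai", receives a lower weight than a term unique to one document, which is the property a downstream classifier like an SVM relies on.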
Load-bearing premise
The 2,295 labeled samples accurately reflect true student sentiments without significant labeling errors or bias.
What would settle it
Retraining and testing the models on a fresh, independently verified set of student opinions from Indonesian universities and finding that DistilBERT no longer leads in accuracy.
Original abstract
This study analyzes Indonesian student opinions on the adoption of artificial intelligence in higher education using two approaches: TF-IDF-based machine learning and Transformer-based deep learning. The dataset consists of 2,295 labeled samples, combining 1,154 student opinions with additional lexical sentiment data. LightGBM, Random Forest, and Support Vector Machine (SVM) are evaluated as machine learning models, while DistilBERT is fine-tuned for binary sentiment classification. The results show that SVM achieves the best performance among the machine learning models with 82.14% test accuracy and F1-score, while DistilBERT performs best overall with 84.78% accuracy and 84.75% F1-score. These findings indicate that Transformer-based models better capture contextual information, although SVM remains a competitive and efficient alternative for sentiment classification.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper evaluates TF-IDF-based machine learning classifiers (LightGBM, Random Forest, SVM) against a fine-tuned DistilBERT model for binary sentiment classification of Indonesian-language opinions on AI adoption in higher education. It uses a combined dataset of 2,295 labeled samples (1,154 student opinions plus lexical sentiment data), reports SVM as the strongest ML model at 82.14% test accuracy and F1, and DistilBERT as the overall best at 84.78% accuracy and 84.75% F1.
Significance. If the performance numbers hold under verified labeling and evaluation protocols, the work supplies a useful empirical baseline for transformer versus classical ML trade-offs on a low-resource language task in an education domain. The direct comparison of three ML models with one transformer is a practical strength.
major comments (1)
- [Abstract and Methods] The central performance claims (DistilBERT 84.78% accuracy / 84.75% F1; SVM 82.14% accuracy / F1) rest on the 2,295-sample dataset, but its labeling process, lexicon source, inter-annotator agreement for the 1,154 student opinions, and distributional match between the lexical and student-opinion components are not described. Without these details the reported metrics and model ranking cannot be interpreted or reproduced.
minor comments (2)
- [Results] The abstract and results give point estimates for accuracy and F1 but omit the train-test split ratio, whether stratified sampling or cross-validation was used, and any statistical significance test for the observed differences between models.
- [Methods] The hyperparameter search procedure, the learning-rate schedule for DistilBERT fine-tuning, and the exact TF-IDF configuration (n-gram range, vocabulary size) are not reported, which limits reproducibility.
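The missing significance test can be sketched with a two-proportion z-test on the two reported accuracies. The test-set size is not reported, so `n_test = 459` (roughly 20% of 2,295) is a hypothetical value, and the sketch assumes both models were scored on the same number of independent samples. With per-sample predictions available, a paired test such as McNemar's would be the more appropriate choice.

```python
import math


def two_proportion_z(acc_a, acc_b, n_a, n_b):
    """z statistic and two-sided p-value for a difference of two accuracies."""
    pooled = (acc_a * n_a + acc_b * n_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (acc_a - acc_b) / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


n_test = 459  # hypothetical test-set size, not stated in the paper
z, p_value = two_proportion_z(0.8478, 0.8214, n_test, n_test)
```

Under these assumed inputs the z statistic comes out near 1.1, below the 1.96 needed at the 5% level, so the ranking would not be statistically established on a test set of that size. A different test size would change this conclusion.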
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address the single major comment below and will revise the paper accordingly to improve clarity and reproducibility.
Referee: [Abstract and Methods] The central performance claims (DistilBERT 84.78% accuracy / 84.75% F1; SVM 82.14% accuracy / F1) rest on the 2,295-sample dataset, but its labeling process, lexicon source, inter-annotator agreement for the 1,154 student opinions, and distributional match between the lexical and student-opinion components are not described. Without these details the reported metrics and model ranking cannot be interpreted or reproduced.
Authors: We agree that the current manuscript provides insufficient detail on dataset construction. In the revised version we will expand the Methods section with a new subsection that explicitly describes: (1) the source and curation of the lexical sentiment data, (2) the collection and labeling protocol for the 1,154 student opinions (including annotation guidelines), (3) inter-annotator agreement statistics for the student-opinion subset, and (4) a quantitative comparison of sentiment label distributions between the lexical and student-opinion components to justify their combination. These additions will directly support interpretation and reproducibility of the reported accuracy and F1 scores.
Revision: yes
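One common inter-annotator agreement statistic of the kind promised in item (3) is Cohen's kappa. The rebuttal does not say which statistic the authors will report, so the choice of kappa is an assumption, and the label lists below are invented for illustration only.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the annotators' marginal label rates.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels, not from the paper's 1,154 student opinions.
annotator_1 = ["pos", "pos", "neg", "neg", "pos", "neg"]
annotator_2 = ["pos", "neg", "neg", "neg", "pos", "pos"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which matters for a binary task where two annotators guessing at random would already agree about half the time.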
Circularity Check
No circularity: empirical model evaluation on held-out data
The paper reports direct empirical results from training ML models (LightGBM, Random Forest, SVM) and fine-tuning DistilBERT on a 2,295-sample dataset, then measuring accuracy and F1 on a test split. No equations, derivations, or self-citations are invoked to reduce the reported metrics to quantities defined by the same fitted parameters. The performance numbers (e.g., DistilBERT 84.78% accuracy) are computed outputs from standard train/test evaluation, not predictions forced by construction from the input labels or model choices. Label quality concerns are a separate validity issue, not a circularity reduction.
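The "computed outputs" referred to above can be illustrated with a minimal sketch of how accuracy and F1 follow from held-out predictions. The paper does not state which F1 averaging it uses, so the positive-class F1 computed here is an assumption, and the label vectors are invented.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and positive-class F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Invented held-out labels and predictions for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
acc, f1 = accuracy_and_f1(y_true, y_pred)
```

Both numbers depend only on the test labels and the model's predictions, which is why label quality, rather than circularity, is the relevant validity concern.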
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The 2,295 samples are correctly labeled and representative of the target population of Indonesian student opinions.
- standard math: TF-IDF features plus standard classifiers and DistilBERT fine-tuning constitute appropriate methods for the binary sentiment task.