Sentiment Analysis and Customer Satisfaction Prediction on E-Commerce Platforms Based on YouTube Comments Using the XGBoost Algorithm
Pith reviewed 2026-05-08 16:52 UTC · model grok-4.3
The pith
XGBoost with TF-IDF on YouTube comments predicts e-commerce customer satisfaction while exposing heavy socio-political influence on polarity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Using a secondary dataset of YouTube comments from e-commerce review videos, the study applies TF-IDF vectorization followed by PyCaret-optimized XGBoost classification and finds both strong predictive resilience and the infiltration of socio-political terminology that alters sentiment polarity.
What carries the argument
PyCaret-optimized XGBoost classifier operating on TF-IDF features extracted from preprocessed YouTube comment text.
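To make the feature side of this pipeline concrete, the following is a minimal, standard-library sketch of TF-IDF weighting. It is not the paper's code: the toy comments are invented, whitespace tokenization is an assumption, and the smoothed IDF with L2 row normalization is the variant scikit-learn documents as its `TfidfVectorizer` default, which the paper may or may not have used.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocab, rows); rows[i][j] is the TF-IDF weight of vocab[j] in docs[i]."""
    tokenized = [doc.lower().split() for doc in docs]  # assumption: whitespace tokens
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF: ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        raw = [counts[t] * idf[t] for t in vocab]
        # L2-normalize each row so comment length does not dominate.
        norm = math.sqrt(sum(w * w for w in raw)) or 1.0
        rows.append([w / norm for w in raw])
    return vocab, rows


# Invented toy comments, for illustration only.
comments = [
    "fast delivery great seller",
    "late delivery bad seller",
    "great price great service",
]
vocab, X = tfidf(comments)
```

In the first comment, "fast" receives a higher weight than "delivery" because it occurs in fewer comments. Rows of this form are what a tree-based classifier such as XGBoost would take as input.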
If this is right
- Large volumes of unstructured comments can be tracked automatically instead of manually.
- Feature-importance maps can flag when external terminology such as political language begins to dominate satisfaction signals.
- Polarity predictions for audience satisfaction become conditional on the surrounding socio-political context captured in the comments.
- Preprocessing and PyCaret tuning steps can be reused as a template for similar comment-based prediction tasks.
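One way to read the feature-importance point above is as a share-of-importance check. The sketch below is illustrative only: the importance scores, the lexicon, and the alert threshold are all invented, since the paper reports no top features or example terms.

```python
def lexicon_importance_share(importances, lexicon):
    """Fraction of total feature importance carried by terms in `lexicon`."""
    total = sum(importances.values())
    if total == 0:
        return 0.0
    flagged = sum(score for term, score in importances.items() if term in lexicon)
    return flagged / total


# Hypothetical importance scores (made-up numbers, not taken from the paper).
importances = {
    "delivery": 0.30,
    "refund": 0.20,
    "election": 0.25,
    "minister": 0.15,
    "price": 0.10,
}
# Hypothetical lexicon of socio-political terms.
political_lexicon = {"election", "minister"}

share = lexicon_importance_share(importances, political_lexicon)
# Hypothetical threshold: flag when over a third of importance is off-topic.
alert = share > 0.33
```

With these invented numbers the share is 0.40, so the flag would be raised. In practice the scores would come from the trained model's importance output and the lexicon would need manual curation.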
Where Pith is reading between the lines
- Satisfaction models built on social video comments may need explicit context filters to separate product opinion from political overlay.
- The same pipeline could be tested on comments from other platforms to check whether the socio-political infiltration is YouTube-specific.
- If political terms reliably shift polarity, e-commerce platforms might monitor comment streams for early signals of external events affecting brand perception.
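The context-filter and monitoring ideas above could be prototyped with a rolling rate of comments that touch a lexicon. This is a sketch under stated assumptions: the lexicon, the window size, and the sample stream are hypothetical, and exact token matching is a crude stand-in for real topic detection.

```python
from collections import deque


def rolling_overlay_rate(comments, lexicon, window=3):
    """For each comment, the share of the last `window` comments containing a lexicon term."""
    recent = deque(maxlen=window)  # sliding window of 0/1 flags
    rates = []
    for text in comments:
        tokens = set(text.lower().split())
        recent.append(1 if tokens & lexicon else 0)
        rates.append(sum(recent) / len(recent))
    return rates


# Hypothetical lexicon and comment stream, in arrival order.
lexicon = {"election", "minister"}
stream = [
    "great price",
    "election scam",
    "fast delivery",
    "minister corrupt",
    "election again",
]
rates = rolling_overlay_rate(stream, lexicon)
```

A rising rate would suggest that off-topic language is entering the stream. Comments that trip the filter could then be scored separately from plain product opinions.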
Load-bearing premise
That the chosen secondary collection of YouTube comments from e-commerce videos is representative of customer satisfaction, and that the PyCaret-tuned XGBoost truly outperforms alternatives, even though no detailed baseline comparisons or error analysis are presented.
What would settle it
Retrain and test the identical pipeline on a fresh, direct e-commerce review dataset that lacks socio-political terms. Materially lower classification accuracy on that dataset would falsify both the performance claim and the infiltration claim.
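"Materially lower" needs a yardstick. One common option, sketched here as an assumption rather than anything the paper proposes, is a two-proportion z-test on the counts of correct predictions from the two test sets. The counts below are invented, and the test assumes the two test sets are independent and large enough for the normal approximation.

```python
import math


def two_proportion_z(correct_a, n_a, correct_b, n_b):
    """z statistic for the difference between two accuracies (pooled standard error)."""
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se


# Invented counts: 850/1000 correct on YouTube comments, 780/1000 on direct reviews.
z = two_proportion_z(850, 1000, 780, 1000)
```

A |z| above roughly 1.96 corresponds to a difference unlikely to be chance at the conventional 5% level. Whether a statistically detectable gap is also material in practice remains a judgment call.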
Original abstract
The exponential expansion of digital commerce in Indonesia has significantly shifted consumer interactions toward video-centric social networks, particularly YouTube. Consequently, the sheer volume of unstructured, multi-contextual comments poses a tremendous challenge for manual sentiment tracking. This study investigates and constructs a predictive model for customer satisfaction leveraging the Extreme Gradient Boosting (XGBoost) architecture coupled with Term Frequency-Inverse Document Frequency (TF-IDF) vectorization. By utilizing a secondary dataset of YouTube comments retrieved from e-commerce review videos, the raw text underwent rigorous preprocessing to generate normalized numerical features. The experimental results demonstrate that the PyCaret-optimized machine learning framework delivers superior classification resilience. Beyond standard performance metrics, lexical evaluations and feature-importance mapping uncover a notable phenomenon: e-commerce discourse is heavily infiltrated by socio-political terminologies, which ultimately influence the polarity of audience satisfaction.
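The abstract does not say what the preprocessing consisted of. As a rough illustration only, a typical normalization pass for informal comments might look like the sketch below. Every step here (lowercasing, URL removal, punctuation stripping, collapsing stretched letters, stopword filtering) is an assumption, and the `stopwords` argument is a hypothetical placeholder for a real Indonesian stopword list.

```python
import re

URL_RE = re.compile(r"https?://\S+")
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")
# Three or more repeats of the same letter, e.g. "bagusss".
REPEAT_RE = re.compile(r"([a-z])\1{2,}")


def preprocess(comment, stopwords=frozenset()):
    """Normalize one raw comment into a list of tokens."""
    text = comment.lower()
    text = URL_RE.sub(" ", text)       # drop links
    text = NON_WORD_RE.sub(" ", text)  # drop punctuation, emoji, non-ASCII
    text = REPEAT_RE.sub(r"\1", text)  # "bagusss" -> "bagus" (lossy heuristic)
    return [tok for tok in text.split() if tok not in stopwords]


tokens = preprocess(
    "Barangnya BAGUSSS!!! cek https://example.com/x",
    stopwords={"cek"},
)
```

The repeat-collapsing rule is deliberately crude and would also shorten legitimate words that contain a tripled letter. A production pipeline would more likely use a slang dictionary or a stemmer built for Indonesian.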
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript applies the XGBoost algorithm with TF-IDF vectorization to a secondary dataset of YouTube comments from Indonesian e-commerce review videos. It uses PyCaret for hyperparameter optimization to build a sentiment classifier for predicting customer satisfaction, claiming superior classification resilience, and uses feature-importance analysis to argue that socio-political terminology infiltrates e-commerce discourse and influences sentiment polarity.
Significance. If the performance claims were supported by concrete metrics, baselines, and validation details, the work could contribute modestly to applied NLP for consumer sentiment in social media, particularly by highlighting lexical overlaps between commercial and political language in emerging markets. The socio-political infiltration observation, if rigorously evidenced, might interest interdisciplinary researchers, but the current lack of empirical grounding limits any broader impact.
major comments (3)
- [Abstract] Abstract: The central claim that 'the PyCaret-optimized machine learning framework delivers superior classification resilience' is unsupported by any reported accuracy, F1, precision/recall, confusion matrix, cross-validation scores, or statistical tests. No baseline comparisons (e.g., logistic regression, SVM, or BERT) are mentioned, rendering the superiority assertion untestable.
- [Results] Experimental results / feature-importance section: The conclusion that 'e-commerce discourse is heavily infiltrated by socio-political terminologies, which ultimately influence the polarity of audience satisfaction' relies on lexical evaluations and feature-importance mapping, yet no top features, example terms, quantitative influence scores, or validation of this lexical effect are provided.
- [Methodology] Methodology / Data section: No dataset statistics (size, number of comments, class balance), labeling procedure for sentiment or satisfaction labels, collection method for the secondary YouTube dataset, or justification of its representativeness for customer satisfaction are given, undermining reproducibility and generalizability.
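For reference, the metrics the first major comment asks for can all be derived from confusion counts, and even a trivial majority-class baseline gives the superiority claim something to be tested against. The labels and predictions below are invented for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented labels (1 = satisfied, 0 = not satisfied) and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 1, 0]
y_pred = [1, 1, 0, 1, 0, 1, 1, 0]
model = binary_metrics(y_true, y_pred)

# Majority-class baseline: always predict the most frequent label.
majority = max(set(y_true), key=y_true.count)
baseline = binary_metrics(y_true, [majority] * len(y_true))
```

On this toy data the model scores 0.75 accuracy against 0.625 for the baseline. A claim of "superior" performance is only meaningful relative to comparisons of this kind, ideally against stronger baselines as well.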
minor comments (1)
- [Abstract] The abstract and introduction could more explicitly define 'customer satisfaction' as operationalized from YouTube comments versus traditional review ratings.
Simulated Author's Rebuttal
We sincerely thank the referee for the detailed and constructive feedback on our manuscript. We have carefully reviewed each major comment and will revise the paper to address the concerns regarding empirical support, reproducibility, and clarity. Our responses to the points are provided below.
- Referee: [Abstract] Abstract: The central claim that 'the PyCaret-optimized machine learning framework delivers superior classification resilience' is unsupported by any reported accuracy, F1, precision/recall, confusion matrix, cross-validation scores, or statistical tests. No baseline comparisons (e.g., logistic regression, SVM, or BERT) are mentioned, rendering the superiority assertion untestable.
Authors: We agree that the abstract's claim of superior classification resilience requires explicit numerical support to be verifiable. While the results section presents performance from the PyCaret-optimized XGBoost model, these metrics were not summarized in the abstract. In the revised manuscript, we will update the abstract to include concrete metrics such as accuracy, F1-score, precision, recall, and cross-validation scores. We will also add baseline comparisons against logistic regression and SVM (and note any limitations with BERT due to computational constraints) to substantiate the performance claims. revision: yes
- Referee: [Results] Experimental results / feature-importance section: The conclusion that 'e-commerce discourse is heavily infiltrated by socio-political terminologies, which ultimately influence the polarity of audience satisfaction' relies on lexical evaluations and feature-importance mapping, yet no top features, example terms, quantitative influence scores, or validation of this lexical effect are provided.
Authors: The referee is correct that the socio-political infiltration claim needs more granular evidence. The manuscript performs feature-importance analysis via XGBoost and lexical evaluation, but specific top features and examples were not listed. In the revision, we will include a table of the highest-ranked features with their importance scores, provide concrete examples of socio-political terms (e.g., political or social-issue vocabulary appearing in comments), and discuss their quantitative influence on sentiment polarity with supporting data excerpts. revision: yes
- Referee: [Methodology] Methodology / Data section: No dataset statistics (size, number of comments, class balance), labeling procedure for sentiment or satisfaction labels, collection method for the secondary YouTube dataset, or justification of its representativeness for customer satisfaction are given, undermining reproducibility and generalizability.
Authors: We acknowledge the omission of these essential details, which limits reproducibility. The study uses a secondary YouTube comments dataset from Indonesian e-commerce videos, but specifics were not elaborated. The revised manuscript will add dataset statistics (total comments, class balance), a full description of the labeling procedure for sentiment and satisfaction, the collection approach, and a justification of representativeness for e-commerce customer satisfaction in the Indonesian context. revision: yes
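The dataset statistics and cross-validation scores promised in these responses are cheap to produce. The sketch below reports class balance and builds stratified folds so each fold keeps roughly the same label mix. The labels are invented, and the round-robin assignment is a simplification that omits the shuffling a real experiment would normally add.

```python
from collections import Counter, defaultdict


def class_balance(labels):
    """Share of each label in the dataset."""
    counts = Counter(labels)
    n = len(labels)
    return {label: count / n for label, count in counts.items()}


def stratified_folds(labels, k):
    """Assign example indices to k folds, round-robin within each class."""
    folds = [[] for _ in range(k)]
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    for indices in by_class.values():
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


# Invented labels: 6 positive and 3 negative comments.
labels = ["pos"] * 6 + ["neg"] * 3
balance = class_balance(labels)
folds = stratified_folds(labels, k=3)
```

Each of the three folds here holds two positive comments and one negative comment. Training on two folds and testing on the third, rotated three times, yields the per-fold scores the referee asks to see.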
Circularity Check
No circularity: empirical ML pipeline with standard tuning and no derivations
full rationale
The manuscript applies XGBoost classification to a secondary YouTube comments dataset after TF-IDF vectorization and PyCaret hyperparameter search. No equations, uniqueness theorems, or first-principles derivations appear; performance claims rest on experimental outputs rather than any reduction of predictions to fitted inputs by construction. Feature-importance observations about socio-political terms are post-hoc interpretations of model results, not self-definitional or self-cited premises. Standard library usage (XGBoost, PyCaret) and external dataset sourcing introduce no load-bearing self-citation chains. The work is therefore self-contained against external benchmarks with zero circular steps.
Axiom & Free-Parameter Ledger
free parameters (1)
- XGBoost hyperparameters
axioms (2)
- domain assumption: TF-IDF vectorization produces features that reliably encode sentiment polarity in YouTube comments
- domain assumption: The secondary YouTube comment dataset is representative of general e-commerce customer satisfaction