
arXiv: 1605.05362 · v1 · submitted 2016-05-17 · cs.CL · cs.IR · cs.LG

Recognition: unknown

Yelp Dataset Challenge: Review Rating Prediction

keywords: review, rating, prediction, classification, models, problem, yelp, dataset

Review websites, such as TripAdvisor and Yelp, allow users to post online reviews for various businesses, products and services, and have been recently shown to have a significant influence on consumer shopping behaviour. An online review typically consists of free-form text and a star rating out of 5. The problem of predicting a user's star rating for a product, given the user's text review for that product, is called Review Rating Prediction and has lately become a popular, albeit hard, problem in machine learning. In this paper, we treat Review Rating Prediction as a multi-class classification problem, and build sixteen different prediction models by combining four feature extraction methods, (i) unigrams, (ii) bigrams, (iii) trigrams and (iv) Latent Semantic Indexing, with four machine learning algorithms, (i) logistic regression, (ii) Naive Bayes classification, (iii) perceptrons, and (iv) linear Support Vector Classification. We analyse the performance of each of these sixteen models to come up with the best model for predicting the ratings from reviews. We use the dataset provided by Yelp for training and testing the models.
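The sixteen models described in the abstract form a 4 × 4 grid of feature extractors and classifiers. The following is a minimal sketch of that grid and of word n-gram counting, not the authors' implementation: the identifiers `FEATURES`, `CLASSIFIERS`, `MODEL_GRID` and `ngram_counts` are hypothetical, the whitespace tokenisation is an assumption, and whether the paper's "bigrams" and "trigrams" also include shorter n-grams is not stated in the abstract.

```python
from collections import Counter
from itertools import product

# Labels taken from the abstract; the identifiers themselves are hypothetical.
FEATURES = ["unigrams", "bigrams", "trigrams", "lsi"]
CLASSIFIERS = ["logistic_regression", "naive_bayes", "perceptron", "linear_svc"]

# Every (feature extractor, classifier) pairing: 4 x 4 = 16 models.
MODEL_GRID = list(product(FEATURES, CLASSIFIERS))


def ngram_counts(text, n):
    """Count word n-grams of exactly length n.

    Assumes lowercasing and whitespace tokenisation; the paper's actual
    preprocessing is not described in the abstract.
    """
    tokens = text.lower().split()
    # Zip n staggered views of the token list to form consecutive n-grams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(g) for g in grams)


review = "Great food and great service"
unigrams = ngram_counts(review, 1)
bigrams = ngram_counts(review, 2)
trigrams = ngram_counts(review, 3)
```

In a full pipeline, each count vector would be fed to one of the four classifiers to predict a star rating from 1 to 5; Latent Semantic Indexing would additionally project the term counts onto a lower-dimensional space before classification.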



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach

    cs.LG 2026-04 unverdicted novelty 7.0

    ProjRes achieves near-100% accuracy in membership inference on FedLLMs by measuring projection residuals of hidden embeddings on gradient subspaces, outperforming prior methods by up to 75.75% even under differential privacy.

  2. FLAME: Condensing Ensemble Diversity into a Single Network for Efficient Sequential Recommendation

    cs.IR 2026-04 conditional novelty 6.0

    FLAME condenses ensemble diversity into a single network via modular ensemble simulation and guided mutual learning during training, delivering ensemble-level performance with single-network inference speed on sequent...

  3. LLMs-as-Judges: A Comprehensive Survey on LLM-based Evaluation Methods

    cs.CL 2024-12 accept novelty 3.0

    A survey that organizes LLMs-as-judges research into functionality, methodology, applications, meta-evaluation, and limitations.