
minas26902 / improving_yelp_ratings_with_ml


Our goal in this group project is to feed NLP-derived and other features from Yelp reviews into a model that outputs a new 5-star rating, so that there is less discrepancy between review text and star ratings. To make the model more robust, we will also incorporate new star ratings from readers (people who did not write the review and who rate it based on the review text alone), so that the model better reflects review sentiment. We used multiple ML models and techniques, including Naive Bayes, k-NN, K-Means, LSTM, N-Gram, TF-IDF, and Linear Regression.


improving_yelp_ratings_with_ml's Introduction

Improving the Yelp Review Experience by Standardizing Reviewer Sentiment

Team:

  • Angela Detweiler
  • Hee Kang
  • Alexander Lam
  • Behesteh Mostaghni

Dataset link: Yelp Dataset on Kaggle, with a focus on restaurants: https://www.kaggle.com/yelp-dataset/yelp-dataset

Problem: When you are researching restaurants on Yelp, do you look at the star rating or do you read the review? Do you look at both? Given that reviews are highly subjective, and star ratings can be influenced by various aspects of business performance, can we use machine learning to standardize the interpretation of reviews?

Goal: Our goal is to feed Natural Language Processing (NLP) features and other features from the Yelp reviews into a model that outputs a new 5-star rating, so that there is less discrepancy between review text and star ratings. To make the model more robust, we will also incorporate new star ratings from readers (people who did not write the review and who rate it based on the review text alone), so that the model better reflects review sentiment.

Hypothesis: We hypothesize that automating star ratings based on NLP of restaurant reviews will improve the Yelp review experience by normalizing reviewer sentiment.

ML algorithms:

  1. Naive Bayes
  2. k-NN
  3. K-Means
  4. LSTM
  5. N-Gram
  6. TF-IDF
  7. Linear Regression
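As a rough illustration of the N-Gram and TF-IDF entries in the list above, the sketch below extracts n-grams and computes unsmoothed TF-IDF weights in plain Python. It is not the project's code: the function names `ngrams` and `tfidf` and the toy reviews are assumptions made for this example, and a library vectorizer (for instance in scikit-learn) would normally be used instead, typically with a smoothed IDF variant.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of contiguous n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs):
    """Compute unsmoothed TF-IDF weights for a list of tokenized documents.

    tf  = term count / number of tokens in the document
    idf = log(N / df), where df is the number of documents containing the term
    """
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term once per document
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy reviews, not taken from the Yelp dataset.
reviews = [
    "the food was great".split(),
    "the service was slow".split(),
    "great food great service".split(),
]

bigrams = ngrams(reviews[0], 2)
weights = tfidf(reviews)
```

Terms that appear in few reviews (such as "slow" here) receive a higher weight than terms spread across many reviews, which is why TF-IDF vectors are a common input to classifiers such as Naive Bayes or k-NN.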

Libraries:

  1. NumPy
  2. SciPy
  3. scikit-learn
  4. Pandas
  5. Matplotlib
  6. NLTK
  7. PySpark
  8. Keras
  9. HTML/CSS/Bootstrap
  10. Tableau

Sentiment Analysis Lexicon:

  1. AFINN
  2. VADER
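Lexicon-based sentiment analysis of this kind assigns each word a valence and aggregates the valences over a review. The sketch below shows the idea with a tiny made-up lexicon in the AFINN style (integer scores from -5 to +5). The words, scores, function names, and the linear mapping to stars are all assumptions for illustration; they are not the real AFINN or VADER entries, and not necessarily how the project converted scores to ratings.

```python
import re

# Hypothetical mini-lexicon in the AFINN style (integer valence from -5 to +5).
# These words and scores are illustrative only, not the real AFINN entries.
TOY_LEXICON = {
    "amazing": 4,
    "great": 3,
    "good": 2,
    "slow": -1,
    "bad": -3,
    "terrible": -4,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the valence of every lexicon word found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def score_to_stars(score, max_abs=5):
    """Map a raw score to 1-5 stars.

    The score is clamped to [-max_abs, max_abs] and rescaled linearly.
    This mapping is an assumption chosen for illustration.
    """
    clamped = max(-max_abs, min(max_abs, score))
    return round(3 + 2 * clamped / max_abs)


review = "Great food, but terrible service!"
score = sentiment_score(review)
stars = score_to_stars(score)
```

A review that mixes praise and complaints lands near the middle of the scale, which is the kind of signal that can then be compared against the star rating the reviewer actually gave.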

Project components, steps, analyses, and final products:

  1. Components and final products

    • ML algorithms
    • Game (user rates reviews)/HTML page
    • Database with game data to be reincorporated into model
    • Model output/visualizations in Jupyter Notebook
  2. Steps and analyses

    • Select and clean restaurant/food category data from Yelp
    • Cluster reviews into 5 categories (one per star rating)
    • Use NLP to train model
    • Test Yelp rating/review data (user inputs both)
    • Incorporate new user star-rating from game into the model
    • Other...
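The first step above (selecting restaurant/food data) can be sketched as follows. This is a minimal example, assuming the dataset is stored as one JSON object per line, with `business_id` and `categories` fields on business records and `business_id`, `stars`, and `text` fields on review records. The `categories` field is handled both as a comma-separated string and as a list, since its format is an assumption here. The helper names and the sample records are hypothetical.

```python
import json


def is_restaurant(business):
    """Return True if the business record lists a restaurant or food category.

    Assumes `categories` is a comma-separated string or a list of strings;
    it may also be missing or None.
    """
    categories = business.get("categories") or []
    if isinstance(categories, str):
        categories = [c.strip() for c in categories.split(",")]
    wanted = {"restaurants", "food"}
    return any(c.lower() in wanted for c in categories)


def restaurant_reviews(business_lines, review_lines):
    """Yield (stars, text) for reviews whose business is a restaurant."""
    restaurant_ids = set()
    for line in business_lines:
        business = json.loads(line)
        if is_restaurant(business):
            restaurant_ids.add(business["business_id"])
    for line in review_lines:
        review = json.loads(line)
        if review["business_id"] in restaurant_ids:
            yield review["stars"], review["text"].strip()


# Hypothetical sample records, made up for illustration.
sample_businesses = [
    '{"business_id": "b1", "categories": "Restaurants, Italian"}',
    '{"business_id": "b2", "categories": "Auto Repair"}',
    '{"business_id": "b3", "categories": null}',
]
sample_reviews = [
    '{"business_id": "b1", "stars": 4, "text": "Lovely pasta. "}',
    '{"business_id": "b2", "stars": 2, "text": "Slow oil change."}',
]

selected = list(restaurant_reviews(sample_businesses, sample_reviews))
```

With the real dataset, open file handles can be passed in place of the sample lists, since both arguments only need to be iterables of JSON lines.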

Questions/Topics of Interest:

  1. (ML) Are Yelp reviews highly correlated with restaurant quality (based on star rating)? In other words, are the reviews useful?
  2. What percentage of reviews talk about the quality of the food versus the quality of the service?
  3. Correlate photo captions to reviews.
  4. (ML) Is there consistency in review style for a particular user?
  5. Distribution of ratings (stars): Is it a bell curve, or does it peak at the extremes (1- and/or 5-star ratings)?
  6. (ML) Is there a pattern to Yelp Elite status? Elite vs non-elite.
  7. Patterns in ratings/review sentiment correlated to business attributes? (Outdoor seating, live music, etc.)
  8. Patterns in 'useful' reviews?
  9. Use NLP to train the model, test it, then have humans rate the reviews as well and compare the difference.
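Two of the questions above, the shape of the star distribution and the difference between model and human ratings, reduce to simple summary statistics. The sketch below is one possible way to compute them. The function names and all rating values are made up for illustration and are not project results.

```python
from collections import Counter


def star_distribution(stars):
    """Return the share of ratings at each star level from 1 to 5."""
    counts = Counter(stars)
    total = len(stars)
    return {s: counts.get(s, 0) / total for s in range(1, 6)}


def mean_absolute_gap(ratings_a, ratings_b):
    """Average absolute difference between two aligned lists of star ratings."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("rating lists must have the same length")
    return sum(abs(a - b) for a, b in zip(ratings_a, ratings_b)) / len(ratings_a)


# Hypothetical ratings, made up for illustration.
original_stars = [5, 1, 5, 4, 1, 5, 3, 1]
model_stars = [4, 2, 5, 4, 1, 4, 3, 2]
human_stars = [4, 1, 5, 3, 1, 4, 3, 2]

dist = star_distribution(original_stars)
gap_model_vs_human = mean_absolute_gap(model_stars, human_stars)
```

A distribution with most of its mass at 1 and 5 stars would indicate peaks at the extremes rather than a bell curve, and a small gap between model and human ratings would suggest the model reads reviews much as a human reader does.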
