AMZNREVS

This is a small program that uses Doc2Vec to vectorize user reviews of products (for example, on Amazon) and then applies sentiment analysis to classify each review as positive or negative. Several classifiers are available, including a neural network implemented with TensorFlow.

How to use

So far, the program relies on the provided review data to train the classification model. The training set consists of 60,000 reviews, each labeled positive or negative, and the test set consists of an additional 20,000 reviews. On the first run, the program should detect that no model exists and automatically create the Doc2Vec model from the provided training and test data. On later runs, pass the -l True flag to force creation of a new Doc2Vec model.
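The build-or-load behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the program's actual code: `str2bool`, `build_rebuild_parser`, `get_doc2vec_model`, and the `build_model`/`load_model` callables are hypothetical names, and the sketch assumes the builder is the place where Doc2Vec training and saving would happen.

```python
import argparse
import os


def str2bool(value):
    """Parse the literal text after -l (e.g. 'True') into a bool."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_rebuild_parser():
    """Hypothetical parser exposing only the -l flag mentioned in the text."""
    parser = argparse.ArgumentParser(description="Doc2Vec review classifier")
    parser.add_argument("-l", dest="rebuild", type=str2bool, default=False,
                        help="pass True to force a new Doc2Vec model")
    return parser


def get_doc2vec_model(model_path, build_model, load_model, force_rebuild=False):
    """Return (model, was_built).

    build_model(model_path) is assumed to train a model and save it to
    model_path; load_model(model_path) is assumed to read it back.
    """
    if force_rebuild or not os.path.exists(model_path):
        # First run (no saved model) or an explicit -l True: train from scratch.
        return build_model(model_path), True
    # A saved model exists and no rebuild was requested: reuse it.
    return load_model(model_path), False
```

With this shape, the training step only runs when the model file is missing or the parsed `rebuild` value is true.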

In addition, the training and test files are vectorized only on the first run. The resulting vectors are saved in two text files and reused afterwards, which saves computation time at the cost of disk space.
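As a rough sketch of this caching step, the helper below computes the vectors once, writes them to a text file, and reads them back on later runs. The on-disk layout (one vector per line, space-separated floats) and the names `save_vectors`, `load_vectors`, and `cached_vectors` are assumptions made for illustration; the program's real file format may differ.

```python
import os


def save_vectors(path, vectors):
    """Write one vector per line as space-separated floats (assumed format)."""
    with open(path, "w") as handle:
        for vector in vectors:
            handle.write(" ".join(repr(float(x)) for x in vector) + "\n")


def load_vectors(path):
    """Read vectors back from the text file written by save_vectors."""
    with open(path) as handle:
        return [[float(x) for x in line.split()] for line in handle if line.strip()]


def cached_vectors(path, compute):
    """Return vectors from the cache file, computing and saving them if absent.

    compute is a zero-argument callable standing in for the (expensive)
    Doc2Vec vectorization of one dataset.
    """
    if os.path.exists(path):
        return load_vectors(path)
    vectors = compute()
    save_vectors(path, vectors)
    return vectors
```

Calling `cached_vectors` once for the training file and once for the test file would produce the two text files mentioned above.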

The -m flag selects the machine learning algorithm. The options are:

  • lr: Logistic regression (default)

  • rf: Random forest

  • gd: Gradient boosting

  • nn: Neural network with one hidden layer with dropout
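One way the -m options above could map to concrete models is sketched below, using scikit-learn for the first three options and tf.keras for the neural network. The choice of scikit-learn, all hyperparameters (`max_iter`, `n_estimators`, `hidden_units`, `dropout_rate`), and the function names are assumptions for illustration; only the flag values and the "one hidden layer with dropout" shape come from the list.

```python
import argparse


def build_model_parser():
    """Hypothetical parser exposing only the -m flag from the list above."""
    parser = argparse.ArgumentParser(description="Doc2Vec review classifier")
    parser.add_argument("-m", dest="model", choices=["lr", "rf", "gd", "nn"],
                        default="lr", help="classification algorithm")
    return parser


def make_classifier(name, input_dim=None):
    """Return an untrained classifier for the given -m value.

    Libraries are imported lazily so that argument parsing works even
    when scikit-learn or TensorFlow is not installed.
    """
    if name == "lr":
        from sklearn.linear_model import LogisticRegression
        return LogisticRegression(max_iter=1000)  # max_iter is an assumption
    if name == "rf":
        from sklearn.ensemble import RandomForestClassifier
        return RandomForestClassifier(n_estimators=100)
    if name == "gd":
        from sklearn.ensemble import GradientBoostingClassifier
        return GradientBoostingClassifier()
    if name == "nn":
        if input_dim is None:
            raise ValueError("input_dim (the Doc2Vec vector size) is required for nn")
        return make_nn(input_dim)
    raise ValueError("unknown model: %r" % name)


def make_nn(input_dim, hidden_units=64, dropout_rate=0.5):
    """One hidden layer with dropout; layer size and rate are assumptions."""
    import tensorflow as tf
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(hidden_units, activation="relu"),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # positive vs. negative
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

A call such as `make_classifier(build_model_parser().parse_args().model)` would then pick logistic regression unless another option is given.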

Most of the classification algorithms reach an accuracy of about 89-90%.

Future additions

  • implement a scraping mechanism that collects new reviews to classify with the trained model

  • add a summary method that condenses new reviews and extracts their key points
