Movie Reviews - Sentiment Analysis

Python 3.5 classification of movie reviews (positive or negative) using NLTK-3 and sklearn.

An analysis of the movie_reviews corpus available through NLTK.


What is in this repo

  • An implementation of nltk.NaiveBayesClassifier trained against 5000 movie reviews. Implemented in NLTK_Naive_Bayes.py.
  • Using sklearn (implemented in Scikit_Learn_Classifiers.py):
    • Naive Bayes
      • MultinomialNB
      • BernoulliNB
    • Linear Model
      • LogisticRegression
      • SGDClassifier
    • SVM
      • SVC
      • LinearSVC
      • NuSVC

  • A voting system to choose the best out of all the learning methods above. Implemented in Voting_Algos.py.
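
The contents of Voting_Algos.py are not reproduced in this README, so the following is only a minimal sketch of one common approach, majority voting, under the assumption that every classifier returns a "pos" or "neg" label for a review. The VoteClassifier class and the three toy rule-based classifiers are hypothetical stand-ins for the trained models, not the repo's actual code.

```python
from collections import Counter


class VoteClassifier:
    """Combine several classifiers by majority vote (hypothetical helper)."""

    def __init__(self, *classifiers):
        if not classifiers:
            raise ValueError("at least one classifier is required")
        # Each classifier is assumed to be a callable: features -> label.
        self._classifiers = classifiers

    def _votes(self, features):
        return [classify(features) for classify in self._classifiers]

    def classify(self, features):
        # most_common(1) returns the label with the highest vote count.
        # On a tie, the label that was voted for first wins, so an odd
        # number of classifiers is the safer choice for two labels.
        label, _ = Counter(self._votes(features)).most_common(1)[0]
        return label

    def confidence(self, features):
        # Fraction of classifiers that agreed with the winning label.
        votes = self._votes(features)
        _, count = Counter(votes).most_common(1)[0]
        return count / len(votes)


# Toy stand-ins for trained models (assumptions for illustration only).
def likes_great(words):
    return "pos" if "great" in words else "neg"


def dislikes_boring(words):
    return "neg" if "boring" in words else "pos"


def dislikes_bad(words):
    return "neg" if "bad" in words else "pos"


voter = VoteClassifier(likes_great, dislikes_boring, dislikes_bad)
review = {"great", "but", "boring"}
print(voter.classify(review), voter.confidence(review))
```

Here two of the three toy classifiers vote "pos", so the combined label is "pos" with a confidence of two thirds. A low confidence value can be used to flag reviews on which the individual models disagree.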

Accuracy achieved

Classifier                     Accuracy achieved
nltk.NaiveBayesClassifier      73.0%

ScikitLearn implementations
BernoulliNB                    72.0%
MultinomialNB                  75.0%
LogisticRegression             71.0%
SGDClassifier                  69.0%
SVC                            48.0%
LinearSVC                      74.0%
NuSVC                          75.0%
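
This README does not say how the accuracy figures were measured. A reasonable assumption is that accuracy means the percentage of held-out reviews whose predicted label matches the true label, which is what nltk.classify.accuracy computes as a fraction. The sketch below shows that calculation in plain Python; the accuracy function, the toy classifier, and the four sample reviews are all hypothetical.

```python
def accuracy(classify, labeled_examples):
    """Return the percentage of examples that classify() labels correctly."""
    if not labeled_examples:
        raise ValueError("need at least one labeled example")
    correct = sum(
        1 for features, label in labeled_examples if classify(features) == label
    )
    return 100.0 * correct / len(labeled_examples)


# Toy stand-in for a trained model (assumption for illustration only).
def toy_classifier(words):
    return "pos" if "good" in words else "neg"


# Hypothetical held-out reviews as (set of words, true label) pairs.
held_out = [
    ({"good", "fun"}, "pos"),
    ({"dull"}, "neg"),
    ({"good", "but", "long"}, "neg"),  # misclassified as "pos"
    ({"awful"}, "neg"),
]

score = accuracy(toy_classifier, held_out)
print(f"{score:.1f}%")  # prints "75.0%"
```

With only two classes, a score near 50% (such as the SVC row above) is roughly what random guessing would produce.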

Requirements

The simplest (and suggested) way is to install the required packages and their dependencies using either Anaconda or Miniconda.

After that, run:

$ conda update conda
$ conda install scikit-learn nltk

Downloading the dataset

The dataset used in this package is distributed through NLTK's data downloader.

Start your Python interpreter and run:

>>> import nltk
>>> nltk.download('stopwords')
>>> nltk.download('movie_reviews') 

NOTE: You can find system-specific installation instructions on the official NLTK website.

Check that everything works so far by starting your interpreter again and running these imports:

>>> import nltk
>>> from nltk.corpus import stopwords, movie_reviews
>>> import sklearn

If these imports work, you are good to go!


Running it

  1. Clone the repo
$ git clone https://github.com/aalind0/Movie_Reviews-Sentiment_Analysis
$ cd Movie_Reviews-Sentiment_Analysis
  2. Run the scripts in this order:
     1. NLTK_Naive_Bayes.py
     2. Scikit_Learn_Classifiers.py
     3. Voting_Algos.py
  3. Hack away!


So

"So what? Well, this is pretty basic!"

Yes, it is, but hey, we all start somewhere, right?

Coming up: I am working on a Twitter Sentiment Analysis project that first trains on a given dataset, then takes in live Twitter feeds, analyses them, and plots them for data visualization.

You can follow me on Twitter @singh_aalind to keep tabs on it.


End

Hacked together by Aalind Singh.
