Giter VIP home page Giter VIP logo

moviegenreprediction's Introduction

Predicting movie genre from plot summaries using Machine Learning models

Our goal was to build movie genre classification models based on the movie plot summary dataset.We build movie genre classification models based on the movie plot summary dataset.

We have used CMU Movie Summary Corpus data set. This data was collected by David Bamman, Brendan O’Connor, and Noah Smith at the Language Technologies Institute and Machine Learning Department at Carnegie Mellon University. It contains 42,306 movie plot summaries extracted from Wikipedia + aligned metadata extracted from Freebase.

Algorithms

As there have been projects done on this dataset before, we decided to use those results as our baseline and try to get a better prediction. We decided to focus on both shallow learning models (SVM, Logistic Regression, Random Forest) and deep learning models (DNN, CNN, LSTM).

So, to finalize, our attempt is to develop a movie genre prediction system where the main challenge is to deal with the multiple number of tags possible for any movies. For the data preprocessing, we have applied advanced approaches like stemming, tf-idf vectorizer, multilabel binarizer etc. Both statistical and deep learning-based approaches are adopted for the analysis. The significant thing is that, the available evaluating metrics aren’t that effective in judging the performance for multilabel classification. So, we had to develop task specific metrics. We have plan to explore more sophisticated models to improve the applicability of our model.

moviegenreprediction's People

Contributors

redwanwalid avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.