sebastianrokholt / hybrid-recommender-system

A repository for a machine learning project about developing a hybrid movie recommender system.

Language: Jupyter Notebook 100.00%
Topics: recommender-system, hybrid-recommender-system, collaborative-filtering, content-based-filtering, python, matplotlib, data-science, machine-learning, regression-models, k-nearest-neighbours

Movie Recommender System

This repository contains the files for a Data Science project about recommender systems and machine learning. The project revolves around building a hybrid recommender system with collaborative and content-based filtering.

The Dataset

All data was sourced from the MovieLens 1M Dataset, which contains about 1 million movie ratings given by roughly 6,000 users to roughly 4,000 movies in the years 2000-2003.
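For orientation, here is a minimal sketch of reading ratings, assuming the standard MovieLens 1M `ratings.dat` layout of one `UserID::MovieID::Rating::Timestamp` record per line. The `Rating` tuple and the `parse_rating` helper are illustrative names and are not taken from the project's notebooks.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    user_id: int
    movie_id: int
    rating: int      # whole stars, 1 to 5
    timestamp: int   # seconds since the Unix epoch


def parse_rating(line: str) -> Rating:
    """Parse one 'UserID::MovieID::Rating::Timestamp' record."""
    user_id, movie_id, rating, timestamp = line.strip().split("::")
    return Rating(int(user_id), int(movie_id), int(rating), int(timestamp))


# A sample record in the assumed layout.
sample = "1::1193::5::978300760"
parsed = parse_rating(sample)
print(parsed)
```

A whole file could be read the same way by applying `parse_rating` to each line, or with a dataframe library if one is already in use.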

The Process

In order to learn as much as possible about recommender systems and data science in general, I decided early on to follow the Data Science Process for this project:

[Figure: The Data Science Process]

This project has largely been motivated by wanting to learn how to implement a recommender system in Python. I started by wondering whether I could lower the root mean squared error (RMSE) of an already strong recommender system by combining it with a different recommendation approach. I had read about the "cold start problem", and although it intuitively made sense to combine content-based and collaborative filtering models to improve recommendations, I was curious about how one would apply this in practice to create a hybrid recommender system. After taking a first look at the dataset, I also wrote down a string of questions that could only be answered through data wrangling and visualization.
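To make the RMSE question concrete, here is a minimal sketch of one possible way to combine two models: a weighted average of their predictions. This is an assumption for illustration, not necessarily how the notebooks build the hybrid, and the `blend` helper, the `weight` parameter, and all numbers are made up.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def blend(first_preds, second_preds, weight=0.5):
    """Weighted average of two prediction lists; weight applies to the first."""
    return [weight * f + (1 - weight) * s
            for f, s in zip(first_preds, second_preds)]


# Made-up numbers, for illustration only.
actual = [4, 3, 5, 2]
collab = [5, 3, 4, 2]            # hypothetical collaborative-filtering output
content = [3.5, 3.5, 5.5, 2.5]   # hypothetical content-based output
hybrid = blend(collab, content, weight=0.5)

print(rmse(actual, collab), rmse(actual, content), rmse(actual, hybrid))
```

In this toy case the two models err in different directions, so the blend scores a lower RMSE than either alone. On real data the weight would need to be tuned on held-out ratings, and an improvement is not guaranteed.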

Cleaning the data was more difficult than I originally thought, as a lot of the data was missing or incorrect. I quickly came to learn the power of value imputation, especially using KNN. I also learned a surprising amount about the US postal code system...
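The idea behind KNN imputation can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the code from the notebooks: the `knn_impute` helper is hypothetical, missing values are represented as `None`, and the distance is the root mean squared difference over the columns two rows both have observed.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows."""

    def distance(a, b):
        # Compare only the columns that both rows have observed.
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows where column j is observed
            # and that share at least one observed column with this row.
            donors = []
            for n, other in enumerate(rows):
                if n == i or other[j] is None:
                    continue
                d = distance(row, other)
                if d != math.inf:
                    donors.append((d, other[j]))
            donors.sort(key=lambda pair: pair[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Made-up table: the third row is missing its second value.
table = [[1.0, 10.0], [2.0, 20.0], [3.0, None], [10.0, 100.0]]
completed = knn_impute(table, k=2)
print(completed[2])
```

The missing value is filled from the two rows closest to it on the observed column, while the distant fourth row is ignored. Libraries such as scikit-learn provide a `KNNImputer` for the same purpose on larger tables.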

Before I began modelling, I had to answer the questions I had about possible patterns in the data. After figuring out that I could potentially improve the training dataset by engineering three new features, I also wanted to know how much the RMSE would improve from adding them. Therefore, I first created different content-based and collaborative filtering models using a variety of machine learning algorithms. I then re-built and re-trained the most promising models using the new dataset with the engineered features.

Finally, I ran the improved hybrid recommender one last time and predicted ratings for every combination of user and movie. Going forward, I'm considering whether I should deploy the hybrid recommender in a simple website. I am hoping that I'll soon find the time to do so.
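Predicting for every user and movie amounts to scoring the full cross product of the two ID sets. A minimal sketch, assuming a hypothetical `predict(user_id, movie_id)` callable standing in for the trained hybrid model, with outputs clipped to the 1 to 5 star scale used by MovieLens:

```python
from itertools import product


def predict_all(user_ids, movie_ids, predict, low=1.0, high=5.0):
    """Return {(user_id, movie_id): rating} for every pair, clipped to [low, high]."""
    return {
        (u, m): min(high, max(low, predict(u, m)))
        for u, m in product(user_ids, movie_ids)
    }


# Stand-in model for illustration only; not a real recommender.
def toy_predict(user_id, movie_id):
    return float(user_id + movie_id)


predictions = predict_all([1, 2], [1, 2, 5], toy_predict)
print(len(predictions), predictions[(1, 1)], predictions[(1, 5)])
```

With roughly 6,000 users and 4,000 movies this is on the order of 24 million pairs, so a vectorized or batched approach would likely be preferable to a Python dictionary in practice.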

Explanation of files
