khanhnamle1994 / movielens

4 different recommendation engines for the MovieLens dataset.

Home Page: https://grouplens.org/datasets/movielens/

License: MIT License

Python 0.14% Jupyter Notebook 99.86%

movielens collaborative-filtering notebooks deep-learning jupyter-notebook recommender-systems content-based-recommendation

movielens's Introduction

MovieLens Recommendation Systems

This repo contains a set of Jupyter Notebooks demonstrating a variety of movie recommendation systems for the MovieLens 1M dataset. The dataset contains 1,000,209 anonymous ratings of approximately 3,900 movies made by 6,040 MovieLens users who joined MovieLens in 2000.

Here are the different notebooks:

Data Processing: Loading and processing the users, movies, and ratings data to prepare them for input into my models.
Content-Based and Collaborative Filtering: Using the Content-Based and Collaborative Filtering approach
SVD Model: Using the SVD approach
Deep Learning Model: Using the Deep Learning approach

An accompanying Medium blog post can be viewed here: The 4 Recommendation Engines That Can Predict Your Movie Tastes

Requirements

Dependencies

Choose the latest versions of any of the dependencies below:

License

MIT. See the LICENSE file for the copyright notice.


movielens's Issues

Content based filtering

I am new to recommendation systems as well as machine learning. Could you please tell me how to calculate the accuracy of the content-based filtering recommendation model that you created using TF-IDF and the vector space model?

I want to use PCC, WPCC, and other similarity metrics. How can I do that?

Can you update the code and let me know how it evaluates?
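
As a rough illustration of the PCC (Pearson correlation coefficient) part of this request, here is a minimal NumPy sketch. It is not code from the notebooks: the rating matrix and the helper name pearson_user_similarity are hypothetical, and every entry is treated as an observed rating. With real MovieLens data, unrated entries would need separate handling before computing correlations.

```python
import numpy as np

def pearson_user_similarity(ratings):
    """Pearson correlation between every pair of users (rows of `ratings`)."""
    # np.corrcoef treats each row as one variable, so the result is users x users.
    return np.corrcoef(ratings)

# Hypothetical 3-user x 4-movie rating matrix (all entries observed).
ratings = np.array([
    [5.0, 3.0, 4.0, 2.0],
    [4.0, 2.0, 3.0, 1.0],   # same taste as user 0, shifted down by one star
    [1.0, 4.0, 2.0, 5.0],   # roughly opposite taste
])

similarity = pearson_user_similarity(ratings)
print(similarity.round(3))
```

The resulting users x users matrix could, in principle, be used wherever a cosine similarity matrix is used in a memory-based collaborative filtering model. Unlike cosine similarity, PCC centers each user's ratings first, so a user who consistently rates one star lower is still considered perfectly similar.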

How to load weights back into TensorFlow?

why use svds instead of svd since Ratings_demeaned is not sparse?

Hi,
In the SVD recommender, you compute the Ratings_demeaned matrix as R - user_ratings_mean.reshape(-1, 1). That matrix is not actually sparse (only the original R is). Is there any reason you use scipy.sparse.linalg.svds for the decomposition?

Thanks,
Terry
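
To make the question concrete, the sketch below runs a full dense SVD with numpy.linalg.svd on a small demeaned matrix and keeps only the top k factors. The toy matrix R and the value of k are assumptions for illustration, not values from the notebook. One plausible reason for choosing scipy.sparse.linalg.svds is that it computes only the k largest singular triplets directly and also accepts dense arrays. The rank-k reconstruction should match the one below, although the ordering and signs of the returned factors may differ.

```python
import numpy as np

# Hypothetical dense user-item matrix (rows: users, columns: movies, 0 = unrated).
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])

# Subtract each user's mean rating, as described in the question.
user_ratings_mean = R.mean(axis=1)
Ratings_demeaned = R - user_ratings_mean.reshape(-1, 1)

k = 2  # number of latent factors to keep (assumed value)

# Full dense SVD; singular values come back in descending order.
U, sigma, Vt = np.linalg.svd(Ratings_demeaned, full_matrices=False)

# Truncate to the top-k factors.
U_k, sigma_k, Vt_k = U[:, :k], sigma[:k], Vt[:k, :]

# Rank-k reconstruction, with the user means added back.
predictions = U_k @ np.diag(sigma_k) @ Vt_k + user_ratings_mean.reshape(-1, 1)
print(predictions.round(2))
```

For a matrix as small as MovieLens 1M (roughly 6,040 x 3,900), either route is feasible; the difference is mainly whether all singular values are computed before truncation.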

What is the difference between your deep learning model and the FunkSVD model?

I think the Embedding layer is the same as the random initialization in FunkSVD.

Possible errors in your code?

Hi @khanhnamle1994
I have read your code in Content_Based_and_Collaborative_Filtering_Models.ipynb, and I think it contains some errors.
(1) When computing user_correlation, you use train_data directly. If you check the shape of train_data, it has 3 columns, but the number of columns should equal the number of items. The same problem occurs when computing item_correlation.
(2) In the predict function, you wrote mean_user_rating = ratings.mean(axis=1), but the ratings variable holds all the ratings, which have not been grouped by user_id. Therefore mean_user_rating may be wrong. You can also check the shape of mean_user_rating or ratings.
(3) I do not know where the following two formulas come from.

pred = mean_user_rating[:, np.newaxis] + similarity.dot(ratings_diff) / np.array([np.abs(similarity).sum(axis=1)]).T

pred = ratings.dot(similarity) / np.array([np.abs(similarity).sum(axis=1)])

Could you please tell me some details of the formulas.
Thanks a lot!
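
The sketch below shows one way the two quoted formulas behave when ratings is a users x items matrix rather than a 3-column table. The toy triples, the matrix sizes, and the cosine_similarity helper are assumptions made for illustration and are not taken from the notebook. Under that assumption, the first formula reads as a user-based prediction, \hat{r}_{u,i} = \bar{r}_u + \sum_v s(u,v)(r_{v,i} - \bar{r}_v) / \sum_v |s(u,v)|, and the second as an item-based prediction, \hat{r}_{u,i} = \sum_j r_{u,j} s(j,i) / \sum_j |s(i,j)|.

```python
import numpy as np

# Hypothetical (user_id, movie_id, rating) triples, like a 3-column train_data.
triples = np.array([
    [0, 0, 5], [0, 1, 3], [1, 0, 4], [1, 2, 2], [2, 1, 1], [2, 2, 5],
])
n_users, n_items = 3, 3

# Pivot the triples into a users x items matrix (0 = unrated).
ratings = np.zeros((n_users, n_items))
ratings[triples[:, 0], triples[:, 1]] = triples[:, 2]

def cosine_similarity(m):
    """Cosine similarity between the rows of m."""
    norms = np.linalg.norm(m, axis=1, keepdims=True)
    unit = m / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

user_similarity = cosine_similarity(ratings)    # users x users
item_similarity = cosine_similarity(ratings.T)  # items x items

def predict(ratings, similarity, kind="user"):
    if kind == "user":
        # Mean over all columns, unrated zeros included, as in the quoted code.
        mean_user_rating = ratings.mean(axis=1)
        ratings_diff = ratings - mean_user_rating[:, np.newaxis]
        weights = np.abs(similarity).sum(axis=1)[:, np.newaxis]
        return mean_user_rating[:, np.newaxis] + similarity.dot(ratings_diff) / weights
    # Item-based: similarity-weighted average of the user's own ratings.
    weights = np.abs(similarity).sum(axis=1)[np.newaxis, :]
    return ratings.dot(similarity) / weights

user_pred = predict(ratings, user_similarity, kind="user")
item_pred = predict(ratings, item_similarity, kind="item")
print(user_pred.shape, item_pred.shape)
```

With the pivoted matrix, user_similarity is users x users and item_similarity is items x items, which is the shape point (1) asks about. Note that ratings.mean(axis=1) here still averages over unrated zeros, so point (2) remains a fair concern about the per-user mean.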

content based filtering

How can I give recommendations to a specific user with the content-based model? You have only given recommendations based on movie titles.

unable to train deep learning model

TypeError: in user code:

D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\engine\training.py:805 train_function  *
    return step_function(self, iterator)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\engine\training.py:795 step_function  **
    outputs = model.distribute_strategy.run(run_step, args=(data,))
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:1259 run
    return self._extended.call_for_each_replica(fn, args=args, kwargs=kwargs)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:2730 call_for_each_replica
    return self._call_for_each_replica(fn, args, kwargs)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\distribute\distribute_lib.py:3417 _call_for_each_replica
    return fn(*args, **kwargs)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\engine\training.py:788 run_step  **
    outputs = model.train_step(data)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\engine\training.py:754 train_step
    y_pred = self(x, training=True)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:1012 __call__
    outputs = call_fn(inputs, *args, **kwargs)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\engine\sequential.py:389 call  **
    outputs = layer(inputs, **kwargs)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:1008 __call__
    self._maybe_build(inputs)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\engine\base_layer.py:2710 _maybe_build
    self.build(input_shapes)  # pylint:disable=not-callable
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\utils\tf_utils.py:272 wrapper
    output_shape = fn(instance, input_shape)
D:\anaconda3\envs\tr\lib\site-packages\tensorflow\python\keras\layers\merge.py:500 build
    del reduced_inputs_shapes[i][self.axis]

TypeError: list indices must be integers or slices, not ListWrapper

alternative solution for keras.layers.Merge

In your CFModel.py, you use keras.layers.Merge() to apply a dot product to the outputs of the two embedding layers, but Merge was deprecated a long time ago.
I'm now using Keras 2.2.0 with TensorFlow 1.9.0 as the backend, and found that keras.layers.dot([X, Y], axes=1) works as a replacement.
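
For readers unsure what the replacement computes, here is a NumPy stand-in (not Keras code) for a dot product along axis 1. It assumes X and Y are already-flattened embedding outputs of shape (batch, dim). The helper name batch_dot and the example vectors are hypothetical.

```python
import numpy as np

def batch_dot(user_vecs, item_vecs):
    """Row-wise dot product of two (batch, dim) arrays, returning (batch, 1)."""
    return np.sum(user_vecs * item_vecs, axis=1, keepdims=True)

# Hypothetical embedding outputs for a batch of two (user, movie) pairs.
user_vecs = np.array([[1.0, 2.0, 3.0], [0.5, 0.0, -1.0]])
item_vecs = np.array([[4.0, 5.0, 6.0], [2.0, 7.0, 1.0]])

scores = batch_dot(user_vecs, item_vecs)
print(scores)  # one predicted score per (user, movie) pair
```

Each row yields a single score, which is the quantity a dot-product collaborative filtering model would treat as the predicted rating for that pair.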