evancasey / spark-knn-recommender

126.0 126.0 52.0 12.02 MB

Item and User-based KNN recommendation algorithms using PySpark

License: MIT License

Python 93.73% Perl 3.30% Shell 2.97%

spark-knn-recommender's Issues

Can't run in spark 1.6.1

I fixed the path problem by modifying config.py and utils.py, but it still fails with this error:
"""
python2.7: can't open file '/usr/share/spark-1.6.1-bin-hadoop2.6/python/sparkler/kmeans.py': [Errno 2] No such file or directory

"""
I also don't understand why I need to copy the algorithms to the sparkler directory, because in Spark 1.6.1 I can't find any directory named sparkler.

use flatMap to find all pairs instead of map

in usercf

findItemPairs returning only first item pair for each user

findItemPairs returns only the first item pair for each user:
def findItemPairs(user_id,items_with_rating):
    '''
    For each user, find all item-item pair combos (i.e. items with the same user).
    '''
    for item1,item2 in combinations(items_with_rating,2):
        return ((item1[0],item2[0]),(item1[1],item2[1]))

Solution:
Change return to yield so that the function becomes a generator and produces every pair instead of exiting on the first one.
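
The fix can be sketched in plain Python as follows. The sample ratings and the name `find_item_pairs` are made up for illustration, and `itertools.chain.from_iterable` is used here only as a stand-in for PySpark's `RDD.flatMap`, so that the sketch runs without a Spark installation. With `map`, a generator-returning function would produce one generator object per user; flattening is what turns them into individual pairs.

```python
from itertools import chain, combinations


def find_item_pairs(user_id, items_with_rating):
    """Yield every item-item pair rated by one user, with both ratings."""
    for item1, item2 in combinations(items_with_rating, 2):
        # yield (not return) so the loop continues past the first pair
        yield (item1[0], item2[0]), (item1[1], item2[1])


# Hypothetical sample data: (user_id, [(item_id, rating), ...])
user_ratings = [
    ("u1", [("a", 5.0), ("b", 3.0), ("c", 4.0)]),
    ("u2", [("a", 2.0), ("c", 1.0)]),
]

# Plain-Python stand-in for:
#   rdd.flatMap(lambda p: find_item_pairs(p[0], p[1]))
all_pairs = list(
    chain.from_iterable(find_item_pairs(u, items) for u, items in user_ratings)
)

for pair in all_pairs:
    print(pair)
```

User u1 rated three items and therefore contributes three pairs; u2 rated two items and contributes one.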

Stuck on the Last Step of Launching Spark KNN Recommender

Hello everyone,

I am running on Windows and am stuck on the last step: launching the program from the pyspark folder, which calls the function "run_itemcf(DATA_CF_LOCAL)".

I have the following installed on my PC:
-Python 2.7.13
-pyspark 2.1.1
-NumPy

I have tried to install and run this project, and I have completed the first two steps:
-Install project to your local drive ..............(DONE)
-run "$ python setup.py"...............(DONE)
-python "train_and_test.py"...............(Error: file path does not exist)

Because I am working on Windows, I changed the working directory:
`

import pdb
import sys,os

import config
from utils import run_kmeans, run_usercf, run_itemcf
from algorithms import *

DATA_KMEANS = "data/kmeans_data.txt"
DATA_CF_LOCAL = "tests/data/cftrain.txt"
DATA_CF_S3 = "s3n://sparkler-data/ratings10m.txt"

if __name__ == "__main__":

    # Copy contents of algorithms into pyspark home
    # TODO: use spark_home from install.sh (make install.sh set it in config?)
    os.system("xcopy ../algorithms"+config.SPARKLER_HOME)

    # run_kmeans(DATA_KMEANS, 2, 5)

    # run_usercf(DATA_CF_LOCAL)

    run_itemcf(DATA_CF_LOCAL)

    # run_itemcf(DATA_CF_S3)

`
These are config.py values:

- PYSPARK_HOME = ".\spark\python\pyspark"
- PYSPARK_MODULE_HOME = ".\spark\python\pyspark"
- SPARKLER_HOME = "build\spark-0.7.0\python\sparkler"
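
One detail in the script above is worth checking: the `xcopy` command string is built by concatenating `"xcopy ../algorithms"` and `config.SPARKLER_HOME` with no space between them, so the source and destination run together into a single argument. Whether this is the cause of the "file path does not exist" error is not confirmed by the post. The sketch below shows the concatenation and one possible alternative that avoids the shell entirely by using `shutil.copytree`. The helper `copy_algorithms`, the temporary directories, and the file name `itemcf.py` are hypothetical and exist only for the demonstration.

```python
import os
import shutil
import tempfile


def copy_algorithms(src_dir, dest_dir):
    """Copy src_dir to dest_dir, replacing any earlier copy (hypothetical helper)."""
    if os.path.isdir(dest_dir):
        # copytree requires that the destination not exist yet;
        # note that this deletes whatever was there before
        shutil.rmtree(dest_dir)
    shutil.copytree(src_dir, dest_dir)
    return sorted(os.listdir(dest_dir))


# The posted command has no separator between source and destination:
sparkler_home = os.path.join("build", "spark-0.7.0", "python", "sparkler")
posted_command = "xcopy ../algorithms" + sparkler_home
print(posted_command)

# Demonstration on throwaway directories
work = tempfile.mkdtemp()
src = os.path.join(work, "algorithms")
os.makedirs(src)
open(os.path.join(src, "itemcf.py"), "w").close()

copied = copy_algorithms(src, os.path.join(work, "sparkler"))
print(copied)
```

Building paths with `os.path.join` also sidesteps the mix of forward slashes and backslashes in the posted script and config values.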

Refactor item_cf with join instead of combinations

see user_cf
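
The issue gives no detail beyond the pointer to user_cf, so the following is only one guess at what a join-based version could look like. In PySpark terms the idea would be to key the ratings by user, join the dataset with itself via `RDD.join`, and keep only pairs where the first item id sorts before the second. The sketch emulates that self-join in plain Python; the function name `self_join_on_user` and the sample ratings are assumptions, not code from the repository.

```python
from collections import defaultdict

# Hypothetical sample data: (user_id, item_id, rating)
ratings = [
    ("u1", "a", 5.0),
    ("u1", "b", 3.0),
    ("u2", "a", 2.0),
    ("u2", "b", 1.0),
    ("u2", "c", 4.0),
]


def self_join_on_user(rows):
    """Emulate keyed.join(keyed): every (left, right) rating combination per user."""
    by_user = defaultdict(list)
    for user, item, rating in rows:
        by_user[user].append((item, rating))
    for user, items in by_user.items():
        for left in items:
            for right in items:
                yield user, left, right


# Keep each unordered pair once by requiring item1 < item2,
# which also drops the (item, item) self-pairs the join produces.
item_pairs = [
    ((left[0], right[0]), (left[1], right[1]))
    for _, left, right in self_join_on_user(ratings)
    if left[0] < right[0]
]

for pair in item_pairs:
    print(pair)
```

The result matches what `combinations` would give when each user's items are in sorted order, but a join lets Spark distribute the pairing instead of materializing each user's full item list in one task.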