pranab / sifarish
Content based and collaborative filtering based recommendation and personalization engine implementation on Hadoop and Storm
Home Page: http://pkghosh.wordpress.com
The duplicate detection seems quite sweet, but I am having runtime problems with the tutorial example data. The application throws an exception near the output stage, shown (in part) below. sifarish is at svn rev 136 and chombo is at rev 94. There is no problem rebuilding sifarish with
mvn clean install
and I even tried explicitly setting the CLASSPATH to include the lang3 jar.
Thanks for any advice
--Bob
14/02/09 13:35:33 INFO mapred.LocalJobRunner:
14/02/09 13:35:33 INFO mapred.Task: Task 'attempt_local2000795973_0001_m_000000_0' done.
14/02/09 13:35:33 INFO mapred.LocalJobRunner: Finishing task: attempt_local2000795973_0001_m_000000_0
14/02/09 13:35:33 INFO mapred.LocalJobRunner: Map task executor complete.
14/02/09 13:35:33 INFO mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@6f7918f0
14/02/09 13:35:33 INFO mapred.LocalJobRunner:
14/02/09 13:35:33 INFO mapred.Merger: Merging 1 sorted segments
14/02/09 13:35:33 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 37652 bytes
14/02/09 13:35:33 INFO mapred.LocalJobRunner:
14/02/09 13:35:33 WARN mapred.LocalJobRunner: job_local2000795973_0001
java.lang.NoClassDefFoundError: org/apache/commons/lang3/StringUtils
at org.sifarish.feature.SameTypeSimilarity$SimilarityReducer.setup(SameTypeSimilarity.java:238)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:398)
Caused by: java.lang.ClassNotFoundException: org.apache.commons.lang3.StringUtils
at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
... 5 more
14/02/09 13:35:33 INFO mapred.JobClient: map 100% reduce 0%
14/02/09 13:35:33 INFO mapred.JobClient: Job complete: job_local2000795973_0001
14/02/09 13:35:33 INFO mapred.JobClient: Counters: 19
14/02/09 13:35:33 INFO mapred.JobClient: File Input Format Counters
14/02/09 13:35:33 INFO mapred.JobClient: Bytes Read=609
14/02/09 13:35:33 INFO mapred.JobClient: FileSystemCounters
14/02/09 13:35:33 INFO mapred.JobClient: FILE_BYTES_READ=148345
14/02/09 13:35:33 INFO mapred.JobClient: FILE_BYTES_WRITTEN=240118
14/02/09 13:35:33 INFO mapred.JobClient: Map-Reduce Framework
14/02/09 13:35:33 INFO mapred.JobClient: Map output materialized bytes=37656
14/02/09 13:35:33 INFO mapred.JobClient: Map input records=9
14/02/09 13:35:33 INFO mapred.JobClient: Reduce shuffle bytes=0
14/02/09 13:35:33 INFO mapred.JobClient: Spilled Records=450
14/02/09 13:35:33 INFO mapred.JobClient: Map output bytes=36750
14/02/09 13:35:33 INFO mapred.JobClient: CPU time spent (ms)=0
14/02/09 13:35:33 INFO mapred.JobClient: Total committed heap usage (bytes)=188743680
14/02/09 13:35:33 INFO mapred.JobClient: Combine input records=0
14/02/09 13:35:33 INFO mapred.JobClient: SPLIT_RAW_BYTES=95
14/02/09 13:35:33 INFO mapred.JobClient: Reduce input records=0
14/02/09 13:35:33 INFO mapred.JobClient: Reduce input groups=0
14/02/09 13:35:33 INFO mapred.JobClient: Combine output records=0
14/02/09 13:35:33 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
14/02/09 13:35:33 INFO mapred.JobClient: Reduce output records=0
14/02/09 13:35:33 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
14/02/09 13:35:33 INFO mapred.JobClient: Map output records=450
Pranab,
I am getting the error messages below; this is a summary.
I have also attached the detailed error messages.
Description Resource Path Location Type
DynamicAttrSimilarityStrategy cannot be resolved to a type ItemDynamicAttributeSimilarity.java /sifarish/src/main/java/org/sifarish/common line 171 Java Problem
DynamicAttrSimilarityStrategy cannot be resolved to a type ItemDynamicAttributeSimilarity.java /sifarish/src/main/java/org/sifarish/common line 192 Java Problem
DynamicAttrSimilarityStrategy cannot be resolved to a type JaccardSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 20 Java Problem
DynamicAttrSimilarityStrategy cannot be resolved to a type SameTypeSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 161 Java Problem
DynamicAttrSimilarityStrategy cannot be resolved to a type SameTypeSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 176 Java Problem
DynamicAttrSimilarityStrategy cannot be resolved to a type SameTypeSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 301 Java Problem
DynamicAttrSimilarityStrategy cannot be resolved to a type TypeSchema.java /sifarish/src/main/java/org/sifarish/feature line 96 Java Problem
DynamicAttrSimilarityStrategy cannot be resolved to a type TypeSchema.java /sifarish/src/main/java/org/sifarish/feature line 97 Java Problem
fieldDelimRegex cannot be resolved to a variable CosineSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 32 Java Problem
fieldDelimRegex cannot be resolved to a variable CosineSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 33 Java Problem
fieldDelimRegex cannot be resolved to a variable JaccardSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 36 Java Problem
fieldDelimRegex cannot be resolved to a variable JaccardSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 37 Java Problem
isBooleanVec cannot be resolved to a variable CosineSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 76 Java Problem
isCountIncluded cannot be resolved to a variable CosineSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 64 Java Problem
isCountIncluded cannot be resolved to a variable CosineSimilarity.java /sifarish/src/main/java/org/sifarish/feature line 76 Java Problem
Hello Pranab,
I am getting a number in front of the user ID in the final output. Can you help me understand the format of the final recommendation output? The recommendations here are for movies. My input data has movie names instead of item IDs and person names instead of user IDs, but I don't understand where that number comes from. The recommendations are also repeated with different numbers.
sample output:
1bhushan,3idiots,245
1bhushan,dilchahatahai,558
1bhushan,sarfarosh,463
1bhushan,satya,515
1bhushan,sholay,479
1krishna,3idiots,188
1krishna,dilchahatahai,248
1krishna,sarfarosh,415
1krishna,satya,216
1krishna,sholay,319
1neha,3idiots,234
1neha,dilchahatahai,404
1neha,sarfarosh,379
1neha,satya,378
1neha,sholay,407
1poonam,3idiots,186
1poonam,dilchahatahai,464
1poonam,sarfarosh,357
1poonam,satya,359
1poonam,sholay,433
bhushan,dilchahatahai,355
bhushan,sarfarosh,232
bhushan,satya,243
bhushan,sholay,364
2shubham,dilchahatahai,301
2shubham,sarfarosh,197
2shubham,satya,206
2shubham,sholay,309
3sayali,dilchahatahai,74
3sayali,sarfarosh,48
3sayali,satya,51
3sayali,sholay,76
3shubham,3idiots,120
3shubham,dilchahatahai,136
3shubham,sarfarosh,164
3shubham,satya,255
Stack trace:
java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
at org.sifarish.feature.SameTypeSimilarity$SimilarityReducer.reduce(SameTypeSimilarity.java:361)
at org.sifarish.feature.SameTypeSimilarity$SimilarityReducer.reduce(SameTypeSimilarity.java:237)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
This is the error I get when I execute org.sifarish.feature.SameTypeSimilarity.
I tried this http://stackoverflow.com/questions/18967147/cdh4-version-conflict-found-interface-org-apache-hadoop-mapreduce-counter-but but it didn't work. I'm pretty new to the mvn and Java world. Am I doing something wrong here?
Hi Pranab, I followed the tutorial for the implicit scenario and I am stuck at the genHistEvent step.
The tutorial gives the command as ./brec.sh genHistEvent <item_count> <user_count> <average_event_count_per_user>
I ran the following:
./brec.sh genHistEvent 100 100 9
and got this error:
./brec.sh: line 58: $5: ambiguous redirect
The schema I use is exactly engageEvent.json, and the variables I set in brec.sh are below.
JAR_NAME=/etc/recomlib/sifarish-1.0.jar
CHOMBO_JAR_NAME=/etc/recomlib/chombo-1.0.jar
HDFS_BASE_DIR=/user/pranab/reco
PROP_FILE=/etc/git/sifarish/reco.properties
HDFS_META_BASE_DIR=/user/pranab/meta/imra
I have already created the JAR_NAME, CHOMBO_JAR_NAME, and PROP_FILE paths on the local filesystem, and HDFS_BASE_DIR and HDFS_META_BASE_DIR in HDFS.
I have downloaded all the required dependencies.
I have been trying to solve this for a long time without success, so I am asking for your help here and would appreciate your answer.
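For reference, bash reports "ambiguous redirect" when the target of a redirection expands to nothing. The command above passes four arguments (genHistEvent, 100, 100, 9), so $5 is empty by the time line 58 runs. One plausible reading is that the script expects a fifth argument naming an output file, but the actual contents of brec.sh line 58 are not shown here, so the functions below are hypothetical stand-ins rather than the script's real code. A minimal sketch of the failure mode and a guarded alternative:

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins: the real brec.sh line 58 is not shown in the report.

# Unguarded form: when called with only four arguments, $5 expands to
# nothing and the redirect has no target, which bash reports as
# "$5: ambiguous redirect".
write_events_unguarded() {
    echo "$2,$3,$4" > $5
}

# Guarded form: check for the output file first and quote the target.
write_events() {
    if [ -z "${5:-}" ]; then
        echo "usage: $1 <item_count> <user_count> <avg_events> <output_file>" >&2
        return 1
    fi
    echo "$2,$3,$4" > "$5"
}
```

Under these assumptions, calling write_events genHistEvent 100 100 9 prints the usage line and returns a non-zero status instead of failing at the redirect, while adding a file name as the fifth argument writes to that file.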
Hello pranab,
I got the exception "Exception in thread "main" java.lang.NoClassDefFoundError: com/amazonaws/auth/AWSCredentials" while running ImplicitRatingEstimator to generate implicit ratings.
Please help. Also, why is AWSCredentials required? I didn't find it in your project.
Hi Pranab,
I am trying to set up the sifarish project on my local desktop and I am getting the error messages below. I think I need to add the repository details to pom.xml. Please advise on the right approach to fix this issue.
Detailed error messages:
Missing artifact mawazo:chombo:jar:1.0 pom.xml /sifarish line 107 Maven Dependency Problem
The container 'Maven Dependencies' references non existing library 'C:\Documents and Settings\Administrator.m2\repository\mawazo\chombo\1.0\chombo-1.0.jar' sifarish Build path Build Path Problem
The project cannot be built until build path errors are resolved sifarish Unknown Java Problem
Thanks and Regards
Vijay
Hi Pranab,
Please find below the exception I get while running the implicit rating generation script. Please let me know how to resolve the issue.
Exception in thread "main" java.lang.NoClassDefFoundError: org/chombo/util/SecondarySort$TuplePairGroupComprator
at org.sifarish.common.ImplicitRatingEstimator.run(ImplicitRatingEstimator.java:73)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
at org.sifarish.common.ImplicitRatingEstimator.main(ImplicitRatingEstimator.java:243)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
Caused by: java.lang.ClassNotFoundException: org.chombo.util.SecondarySort$TuplePairGroupComprator
Hello, Pranab! I encountered this problem when using eclipse-m2e to check out sifarish, and I cannot find mawazo or chombo.jar. I don't know how to solve it.
Hello Pranab,
This is user-to-item recommendation. Now I want to develop item-to-item recommendation, i.e. recommendations for similar products. Can you help me with this? Which part of your tutorial should I refer to?