beechung / latent-factor-models
R functions for fitting latent factor models with internal computation in C/C++
License: BSD 3-Clause "New" or "Revised" License
########################################################
    Research Code for Fitting Latent Factor Models
########################################################

Authors: Bee-Chung Chen, Deepak Agarwal and Liang Zhang
         Yahoo! Labs

I. Introduction

This code base consists of algorithms for fitting factor models written in R and C/C++. The entry point of any fitting algorithm is in R. The computationally intensive parts are written in C/C++. The models and algorithms have been described in the following papers.

[1] Bee-Chung Chen, Jian Guo, Belle Tseng, Jie Yang. User reputation in a comment rating environment. KDD 2011.
[2] Deepak Agarwal, Bee-Chung Chen. Regression-based latent factor models. KDD 2009.
[3] Deepak Agarwal, Bee-Chung Chen, Bo Long. Localized factor models for multi-context recommendation. KDD 2011.
[4] Deepak Agarwal, Bee-Chung Chen. Latent OLAP: Data cubes over latent variables. SIGMOD Conference 2011.
[5] Deepak Agarwal, Bee-Chung Chen. fLDA: Matrix factorization through latent Dirichlet allocation. WSDM 2010.

II. Tutorial

See doc/tutorial.pdf for a tutorial on how to use this package to fit the latent factor models described in [1,2].

III. Compilation

You need to have R installed before compiling the code. To install R, see:
    http://www.r-project.org/
You have to install R from source on a Linux machine. It is recommended to use R version >= 2.10.1.
The following R packages also need to be installed.
    Matrix
    glmnet
To compile the C/C++ code, just type make.

IV. Examples

Localized factor model (multi-context, multi-application factorization) [3]:
    src/multi-app/R/example/fitting.R
fLDA model (LDA topic modeling + Matrix factorization) [5]:
    src/LDA-RLFM/R/model/examples.R
Hello,
I'm currently studying localized latent factor models and I'm trying to run the code.
I prepared the data according to the manual and tried to run the multicontext function, but an error occurred.
This is my setting and code.
    setting = data.frame(
        name = 'uvw3-F',
        has.u = FALSE,
        has.gamma = FALSE,
        nFactors = 33,
        nLocalFactors = 3,
        is.logistic = FALSE
    )
    ans = run.multicontext(
        data.train = data.train, data.test = data.test,
        setting = setting,
        nSamples = 200,
        nBurnIn = 30,
        nIter = 30,
        out.dir = "multi_results/");
max(obs$edge.context) is 11, so I set nLocalFactors = 3 and nFactors = 3 * 11 = 33.
However, the following error occurs:
Error in run.multicontext(data.train = data.train, data.test = data.test, :
Please check input parameter 'setting' when calling function run.multicontext: setting$nFactors must = setting$nLocalFactors * max(obs$edge.context).
Am I misunderstanding the setting? I'd appreciate it if you could tell me what the problem is.
Thank you for providing good research and code.
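The condition quoted in the error message can be recomputed by hand to see which side of the equality is off. The following is a minimal sketch and not part of the package: `check_setting` is a hypothetical helper, and the toy `obs` table only assumes that the observation table has an `edge.context` column, as the message suggests.

```r
# Hypothetical helper (not from the package): recomputes the condition quoted
# in the error message so the values on both sides can be inspected.
check_setting <- function(setting, obs) {
  n_contexts <- max(obs$edge.context)
  expected <- setting$nLocalFactors * n_contexts
  data.frame(
    nFactors = setting$nFactors,
    nLocalFactors = setting$nLocalFactors,
    nContexts = n_contexts,
    expected = expected,
    ok = setting$nFactors == expected
  )
}

# Toy observation table standing in for the real one (assumed layout).
obs <- data.frame(edge.context = c(1L, 4L, 11L, 7L))
setting <- data.frame(name = "uvw3-F", nFactors = 33, nLocalFactors = 3)

result <- check_setting(setting, obs)
print(result)
```

Running the same helper on both the training and the test observation tables would show whether the two disagree on the largest context id, which is one possible (unconfirmed) source of a mismatch.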
Hi,
I followed the tutorial and got this issue:
R CMD SHLIB src/C/util.c src/C/factor_model_util.c src/C/pagerank.c src/C/hierarchical.c src/C/factor_model_multicontext.c src/C/factor_model_util2.cpp -o lib/c_funcs.so
clang++ -dynamiclib -Wl,-headerpad_max_install_names -undefined dynamic_lookup -single_module -multiply_defined suppress -L/Library/Frameworks/R.framework/Resources/lib -L/usr/local/lib -o lib/c_funcs.so src/C/util.o src/C/factor_model_util.o src/C/pagerank.o src/C/hierarchical.o src/C/factor_model_multicontext.o src/C/factor_model_util2.o -L/Library/Frameworks/R.framework/Resources/lib -lRlapack -L/Library/Frameworks/R.framework/Resources/lib -lRblas -L/usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2 -lgfortran -lquadmath -lm -Wall -F/Library/Frameworks/R.framework/.. -framework R -Wl,-framework -Wl,CoreFoundation
ld: warning: directory not found for option '-L/usr/local/lib/gcc/x86_64-apple-darwin13.0.0/4.8.2'
ld: library not found for -lgfortran
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make[1]: *** [lib/c_funcs.so] Error 1
make: *** [c_funcs] Error 1
Thanks!
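The linker message says that `-lgfortran` could not be resolved. A quick way to see what R expects is to ask it from within R. This is only a diagnostic sketch: it assumes that `gfortran` would be on the `PATH` if installed, and it does not fix the search path by itself.

```r
# Locate gfortran on the PATH; an empty string means it was not found.
gfortran_path <- unname(Sys.which("gfortran"))
if (nzchar(gfortran_path)) {
  message("gfortran found at: ", gfortran_path)
} else {
  message("gfortran was not found on the PATH.")
}

# Ask R which Fortran libraries it hands to the linker (the FLIBS setting).
r_bin <- file.path(R.home("bin"), "R")
flibs <- tryCatch(
  suppressWarnings(system2(r_bin, c("CMD", "config", "FLIBS"),
                           stdout = TRUE, stderr = TRUE)),
  error = function(e) character(0)
)
print(flibs)
```

If the directories printed for `FLIBS` do not exist on the machine, the `directory not found` warning in the log above is the expected result.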
Hi,
I followed the tutorial and got this error:
Error in .Call("sum_margin", out, A, as.integer(nrow(A)), as.integer(ncol(A)), :
C symbol name "sum_margin" not in load table
I have tried my best to find the solution, but failed. Please give me some help, thanks. I am running the code on my Mac with R version 3.3.2.
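R reports this error when the named symbol is not available in any loaded shared library. A minimal sketch for checking that before calling `.Call` is given below. The path `lib/c_funcs.so` is taken from the make output quoted in this thread and is relative to the repository root, so adjust it to your checkout.

```r
# Path of the shared library produced by make (relative to the repository root).
so_path <- "lib/c_funcs.so"

if (file.exists(so_path)) {
  dyn.load(so_path)
} else {
  message("Shared library not found at ", so_path, "; run make first.")
}

# TRUE only if some loaded library exports the symbol.
symbol_ok <- is.loaded("sum_margin")
print(symbol_ok)
```

If `symbol_ok` is `FALSE` after a successful `dyn.load`, the library was probably built without the source file that defines the symbol.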
Hi,
Following is the output of my make:
R CMD SHLIB src/C/util.c src/C/factor_model_util.c src/C/pagerank.c src/C/hierarchical.c src/C/factor_model_multicontext.c src/C/factor_model_util2.cpp -o lib/c_funcs.so
make[1]: Entering directory `/home/Desktop/Latent-Factor-Models-master'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/Desktop/Latent-Factor-Models-master'
make: *** No rule to make target `src/arslogistic/arms.c', needed by `arslogistic'. Stop.
Should there be an arms.c somewhere?
Thanks!
I finally got it compiled (thanks to the solutions provided by others), but when I ran it, I got this error:
Error in dyn.load(sprintf("%slib/c_funcs.so", code.dir)) :
unable to load shared object '/Users/patrickng/works/Latent-Factor-Models/lib/c_funcs.so':
dlopen(/Users/patrickng/works/Latent-Factor-Models/lib/c_funcs.so, 6): Symbol not found: _compute_szuBv_c_single_dense
Referenced from: /Users/patrickng/works/Latent-Factor-Models/lib/c_funcs.so
Expected in: flat namespace
I am using R version 3.2.4 on Mac OS X. Has anyone seen this?
ans = fit.bst(obs.train=obs.train, x.obs.train=x.obs.train, x_src=x_src, x_dst=x_dst, x_ctx=x_ctx,out.dir = "/tmp/unit-test/simulated-mtx-uvw-10K", model.name="uvw", nFactors=3, nIter=10);
WARNING: You did not specify the following components in factor: gamma
WARNING: You did not specify the following components in param: h0, var_gamma
================= START fit.MCEM =====================================
Initial loglik: CD = -19358.42494, E = NA (0.01 sec)
Error in data.frame(Method = "MCEM", Iter = iter, nSteps = nSamples, CDlogL = loglik, :
arguments imply differing number of rows: 1, 0
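`data.frame` raises this error when one of its arguments has length zero while the others have length one. The sketch below reproduces the message in isolation. Which argument is empty in `fit.MCEM` is not known from the log, so the empty `loglik` here is purely an assumption for illustration.

```r
# A zero-length value, standing in for whichever quantity came back empty.
loglik <- numeric(0)

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    data.frame(Method = "MCEM", Iter = 0, CDlogL = loglik)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Checking lengths before building the row makes the culprit visible.
args <- list(Method = "MCEM", Iter = 0, CDlogL = loglik)
empty_args <- names(args)[lengths(args) == 0]
print(empty_args)
```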
Running src/LDA-RLFM/R/model/examples.R, I got the following exception:
(B) Fit a model
dyn.load("c_funcs.so");
source("R/c_funcs.R");
source("R/utils.R");
source("R/model/MCEM_MStep.R");
source("R/model/fit.MCEM.R");
set.seed(2);
ans = fit.MCEM(
nIter=5, # Number of EM iterations
nSamples=100, # Number of samples drawn in each E-step: could be a vector of size nIter.
nBurnIn=10, # Number of burn-in draws before take samples for the E-step: could be a vector of size nIter.
factor=data.train$factor, # Initial factor values
obs=data.train$obs, # Observed rating
feature=data.train$feature, # Feature values
param=data.train$param, # Initial parameter values
corpus=data.train$corpus, # The text corpus
try=list(lambda=c(0.5,1,2,4,8), eta=c(0.5, 1, 2, 4)), # Values of lambda and eta that you want to try
out.level=1, # out.level=1: Save the factor & parameter values to out.dir/est.highestCDL and out.dir/est.last
out.dir="/tmp/test/lda-rlfm", # out.level=2: Save the factor & parameter values of each iteration i to out.dir/est.i
out.append=FALSE,
debug=1, # Set to 0 to disable internal sanity checking; Set to 100 for most detailed sanity checking
verbose=1, # Set to 0 to disable console output; Set to 100 to print everything to the console
verbose.M=1, # Verbose setting for the M-step
use.C=TRUE, # Whether to use the C implementation (R implementation does not have full functionalities)
lm=T # Whether to use lm to fit linear regression (otherwise bayesglm will be used, which will be slow)
);
WARNING: Some terms do not belong to any items in the corpus.
*** caught segfault ***
address 0x0, cause 'unknown'
Traceback:
1: .C("fillInTopicCounts", output$cnt_item_topic, output$cnt_topic_term, output$cnt_topic, output$z_avg, corpus_topic, corpus$item, corpus$term, corpus$weight, as.integer(nItems), as.integer(nrow(corpus)), as.integer(nTopics), as.integer(nTerms), as.integer(nCorpusWeights), as.integer(debug), DUP = FALSE)
2: getTopicCounts(corpus, factor$corpus_topic, nItems, nTopics, size$nTerms)
3: fit.MCEM(nIter = 5, nSamples = 100, nBurnIn = 10, factor = data.train$factor, obs = data.train$obs, feature = data.train$feature, param = data.train$param, corpus = data.train$corpus, try = list(lambda = c(0.5, 1, 2, 4, 8), eta = c(0.5, 1, 2, 4)), out.level = 1, out.dir = "/tmp/test/lda-rlfm", out.append = FALSE, debug = 1, verbose = 1, verbose.M = 1, use.C = TRUE, lm = T)
An irrecoverable exception occurred. R is aborting now ...
Thanks for any help!
The tutorial-BST.R gives the following error:
source("src/R/examples/tutorial-BST.R");
Loading required package: lattice
WARNING: You did not specify the following components in factor: gamma
WARNING: You did not specify the following components in param: h0, var_gamma
================= START fit.MCEM =====================================
Initial loglik: CD = -19358.42494, E = NA (0.00 sec)
test loss: 6.148298 (0.007 sec)
write a model & summary info on to disk (used 0.003 sec)
Iteration 1
start E-STEP
Error in MCEM_EStep.multicontext.C(factor = factor, obs = obs, feature = feature, :
invalid mode (NULL) to pass to C or Fortran (arg 3)
In addition: Warning messages:
1: You did not specify the following components in factor: gamma
2: You did not specify the following components in param: h0, var_gamma
I am using R version 3.0.1 (2013-05-16) -- "Good Sport", and the C and Fortran code compiles without any warning messages.
What are the possible reasons for this error? Thanks.
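The message `invalid mode (NULL)` means that a `NULL` value reached the C interface as one of the arguments. One way to narrow it down is to list the components that are `NULL` before the E-step is called. The sketch below is hypothetical: `null_components` is not a package function, and the component names `alpha` and `beta` are placeholders. Only `gamma` comes from the warning in the log.

```r
# Hypothetical helper: returns the expected component names that are NULL
# (or absent) in a list.
null_components <- function(x, expected) {
  is_missing <- vapply(expected, function(n) is.null(x[[n]]), logical(1))
  expected[is_missing]
}

# Toy factor list; 'alpha' and 'beta' are placeholder names.
factor_list <- list(alpha = c(0, 0), beta = c(0, 0))

missing_parts <- null_components(factor_list, c("alpha", "beta", "gamma"))
print(missing_parts)
```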
I want to translate your book into Chinese, and I hope I can get your permission.