tohtsky / myfm Goto Github PK
View Code? Open in Web Editor NEWA Python/C++ implementation of Bayesian Factorization Machines
Home Page: https://myfm.readthedocs.io
License: MIT License
A Python/C++ implementation of Bayesian Factorization Machines
Home Page: https://myfm.readthedocs.io
License: MIT License
The mapper library doesn't seem to have any type DefaultMapper...
ImportError Traceback (most recent call last)
<ipython-input-9-46b0ef8584af> in <module>
6 import pandas as pd
7 from scipy import sparse as sps
----> 8 from mapper import DefaultMapper
9 # read movielens 1m data.
10 from myfm.utils.benchmark_data import MovieLens1MDataManager
ImportError: cannot import name 'DefaultMapper' from 'mapper'
It looks as though both notebooks are loosely following the docs but all three are different. Is one example more up to date than the others or are they all intentionally different? Which notebook should readers step through?
How should the FM be used to make predictions? For example, say I train this model on 1000 user movie pairs. I want to make a prediction for an unseen user, which is a vector where the values are predicted ratings for all possible movies. However, in the examples it looks like the same users get used for training and testing. ie. for user A the model trains on 80% of the known movie ratings and then tries to predict the remaining 20%. How should we call the model when we want to predict 80% of ratings for an unseen user ie. one not in the training set?
In other words I would like to take a vector of length n where I have m known ratings and infer the remaining n-m? Would I have to include the m known ratings in the training set?
Hello Tomoki Ohtsuki,
I was following your ml-100k-extended exemplary notebook, but had problems running the myfm.MyFMOrderedProbit() model with use_date=False. The fitting works fine, but the problem arises during prediction. I tried to attach a screenshot of the error I get. I hope it worked. If not, the error I am getting is "ValueError: Relation blocks have inconsistent mapper size with case_size". In your notebook, if the use_date=False is set, then X is set to None and X_rel=test_blocks. The error message is based on this set None value. However, in the myfm.MyFMRegressor() model everything works as expected.
Thanks in advance for your help!
Hi, when I run 'python ml-100k-regression.py 1', the bug traceback is as follows:
df_train.shape = (80000, 4), df_test.shape = (20000, 4)
Traceback (most recent call last):
File "ml-100k-regression.py", line 233, in
target.append(RelationBlock(user_map, augment_user_id(unique_users)))
File "ml-100k-regression.py", line 189, in augment_user_id
col.append(movie_to_internal[mid])
TypeError: 'CategoryValueToSparseEncoder' object is not subscriptable
Can you give me some instructions on the bug?
Is it possible to multicore training as in the prediction?
We factorize a matrix into factors, could we reconstruct a matrix based on factors.
Trying out toy model from documentation causes segmentation fault.
import myfm
from sklearn.feature_extraction import DictVectorizer
import numpy as np
train = [
{"user": "1", "item": "5", "age": 19},
{"user": "2", "item": "43", "age": 33},
{"user": "3", "item": "20", "age": 55},
{"user": "4", "item": "10", "age": 20},
]
v = DictVectorizer()
X = v.fit_transform(train)
# Note that X is a sparse matrix
print(X.toarray())
# The target variable to be classified.
y = np.asarray([0, 1, 1, 0])
fm = myfm.MyFMClassifier(rank=4)
print("fit")
fm.fit(X,y)
print("fit done")
# It also supports prediction for new unseen items.
fm.predict_proba(v.transform([{"user": "1", "item": "10", "age": 24}]))
$ python test.py
[[19. 0. 0. 0. 1. 1. 0. 0. 0.]
[33. 0. 0. 1. 0. 0. 1. 0. 0.]
[55. 0. 1. 0. 0. 0. 0. 1. 0.]
[20. 1. 0. 0. 0. 0. 0. 0. 1.]]
fit
0%| | 0/100 [00:00<?, ?it/s]Segmentation fault (core dumped)
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.