lightfm-dataset-helper's People

Contributors

med-elomari

lightfm-dataset-helper's Issues

How to predict for new users?

Great package, I was able to run it and do a prediction for an existing user:

scores = model.predict(user_ids=6, item_ids=[1,2,3,5,6])
print(scores)

However, I would like to know how to make predictions for new users (cold start). I cannot find documentation, either here or in LightFM, on how to do it.

I tried this:

new_user_feature = [8,{'name:John', 'Age:33', 'los:IFS','ou:development', 'skills:sql'} ]    
new_user_feature = [8,new_user_feature]

#predict new users User-Id	name	age	los	ou	gender	skills
model.predict(0, item_ids=[1,2,3,5,6], user_features=new_user_feature)

But I get this error:

AttributeError: 'list' object has no attribute 'tocsr'

Any idea?
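
The error suggests that `user_features` has to be a SciPy sparse matrix rather than a Python list. Below is a minimal sketch of building a one-row CSR matrix for an unseen user. The helper `build_new_user_row`, the toy `feature_index` mapping and the feature names are illustrative assumptions, not part of LightFM or of this helper. The sketch assumes the column indices match the ones used when the training feature matrix was built, and it normalizes the row to sum to 1, which I believe mirrors the default of `lightfm.data.Dataset.build_user_features`.

```python
import numpy as np
from scipy import sparse


def build_new_user_row(feature_names, feature_index, n_user_features, normalize=True):
    """Return a 1 x n_user_features CSR row for a user unseen in training.

    feature_index maps feature name -> column index used at training time
    (hypothetical here; it must come from whatever built the training matrix).
    """
    # Keep only features the model has embeddings for; unknown ones are dropped.
    cols = sorted({feature_index[name] for name in feature_names if name in feature_index})
    data = np.ones(len(cols), dtype=np.float32)
    if normalize and len(cols) > 0:
        data /= data.sum()
    rows = np.zeros(len(cols), dtype=np.int32)
    return sparse.csr_matrix(
        (data, (rows, cols)), shape=(1, n_user_features), dtype=np.float32
    )


# Toy mapping standing in for the one built during training (assumption).
feature_index = {"los:IFS": 0, "ou:development": 1, "skills:sql": 2, "skills:python": 3}

new_user = build_new_user_row(
    ["los:IFS", "ou:development", "skills:sql", "Age:33"],  # "Age:33" is unknown, so it is skipped
    feature_index,
    n_user_features=len(feature_index),
)
print(new_user.toarray())

# With a trained model, row 0 of this matrix stands for the new user:
# scores = model.predict(0, np.array([1, 2, 3, 5, 6]), user_features=new_user)
```

When `user_features` is passed, `user_ids` indexes rows of that matrix, so the new user is row 0. If the model was trained with item features, they probably need to be passed to `predict` as well.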

How do I call fit_partial on new data?


The user feature matrix specifies more features than there are estimated feature embeddings

I have the following datasets:

Users:
10,000 rows.
Features: User-Id, name, age, los, ou, gender, skills, language, grade, career interests

Trainings:
Training-Id, training name, main skill

Trainings Taken:
User-Id, Training-Id, TrainingTaken
TrainingTaken is 10 when the user took the training; otherwise the row does not appear in the dataset.

The idea is to build a recommender for trainings :)

I used this helper class for the matrices.

from lightfm_dataset_helper.lightfm_dataset_helper import DatasetHelper

I defined the feature columns for user and trainings.

items_column = "Training-Id"
user_column = "User-Id"
ratings_column = "TrainingTaken"

items_feature_columns = [
    "training name",
    "main skill"
]

user_features_columns = ["name","age","los","ou", "gender", "skills", "language", "grade", "career interests"]

Then I build the matrices:

dataset_helper_instance = DatasetHelper(
    users_dataframe=usersdf,
    items_dataframe=trainingsdf,
    interactions_dataframe=trainingstakendf,
    item_id_column=items_column,
    items_feature_columns=items_feature_columns,
    user_id_column=user_column,
    user_features_columns=user_features_columns,
    interaction_column=ratings_column,
    clean_unknown_interactions=True,
)
dataset_helper_instance.routine()

Then I train:

from lightfm import LightFM
from lightfm.cross_validation import random_train_test_split
(train, test) = random_train_test_split(interactions=dataset_helper_instance.interactions, test_percentage=0.2)

model = LightFM(loss='warp')

model.fit(
    interactions=dataset_helper_instance.interactions,
    sample_weight=dataset_helper_instance.weights,
    item_features=dataset_helper_instance.item_features_list,
    user_features=dataset_helper_instance.user_features_list,
    verbose=True,
    epochs=20,
    num_threads=20,
)

Then I try to predict:

import numpy as np
from lightfm.data import Dataset
#predict existing users
scores = model.predict(user_ids=81727, item_ids=[1])
print(scores)

However, I get this error:
ValueError: The user feature matrix specifies more features than there are estimated feature embeddings: 19400 vs 81728.

What could be wrong?
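
One possible reading of the numbers in the message: the model holds 19400 user feature embeddings, while the `predict` call was made without `user_features`, so LightFM appears to fall back to an identity matrix sized by the largest user id (81727 + 1 = 81728 columns). The sketch below reproduces that arithmetic and shows a raw-id lookup. The fallback behaviour is my understanding of LightFM rather than something confirmed in this thread, and `internal_index` and `user_id_map` are hypothetical names.

```python
import numpy as np
from scipy import sparse

N_USER_EMBEDDINGS = 19400   # left-hand number in the error message
raw_user_id = 81727         # value passed as user_ids in the failing call

# Assumption: with no user_features argument, predict falls back to an
# identity matrix with max(user_ids) + 1 rows and columns.
fallback = sparse.identity(raw_user_id + 1, dtype=np.float32, format="csr")
too_wide = fallback.shape[1] > N_USER_EMBEDDINGS
print(fallback.shape[1], "columns vs", N_USER_EMBEDDINGS, "embeddings:", too_wide)


def internal_index(raw_id, id_map):
    """Translate a raw id (e.g. a User-Id value) to the row index used in training."""
    try:
        return id_map[raw_id]
    except KeyError:
        raise KeyError(f"id {raw_id!r} was not seen during training") from None


# Hypothetical mapping from raw User-Id to matrix row (assumption); it has to
# come from whatever object built the interaction matrix.
user_id_map = {1001: 0, 81727: 1, 90210: 2}
row = internal_index(81727, user_id_map)
print(row)

# The predict call would then pass the same feature matrices used in fit:
# scores = model.predict(
#     row,
#     np.array([0]),  # item row index, mapped the same way
#     user_features=dataset_helper_instance.user_features_list,
#     item_features=dataset_helper_instance.item_features_list,
# )
```

If this reading is right, two things need to change in the failing call: pass the same `user_features` and `item_features` that were used in `fit`, and use internal row indices instead of raw `User-Id` and `Training-Id` values.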
