deepgraphlearning / recommendersystems

License: MIT License

Python 99.10% Shell 0.90%

recommendersystems's Introduction

A library of Recommender Systems

This repository provides a summary of our research on Recommender Systems. It includes our code base for different recommendation topics, a comprehensive reading list, and a set of benchmark data sets.

Code Base

Currently, we are interested in sequential recommendation, feature-based recommendation and social recommendation.

Sequential Recommendation

Since users' interests are naturally dynamic, modeling users' sequential behaviors can learn contextual representations of users' current interests and therefore provide more accurate recommendations. In this project, we include some state-of-the-art sequential recommenders that employ advanced sequence modeling techniques, such as Markov Chains (MCs), Recurrent Neural Networks (RNNs), Temporal Convolutional Neural Networks (TCNs) and Self-attentive Neural Networks (Transformers).

Feature-based Recommendation

A general method for recommendation is to predict click probabilities given users' profiles and items' features, which is known as CTR prediction. For CTR prediction, a core task is to learn (high-order) feature interactions, because feature combinations are usually powerful indicators for prediction. However, enumerating all possible high-order features exponentially increases the dimension of the data, which makes model overfitting more severe. In this work, we propose to learn low-dimensional representations of combinatorial features with a self-attention mechanism, by which feature interactions are learned automatically. Quantitative results show that our model has good prediction performance as well as satisfactory efficiency.
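The idea of letting self-attention combine feature embeddings can be illustrated with a minimal NumPy sketch. It is not the repository's implementation: it assumes a single attention head, omits residual connections and any output activation, and the names `interacting_layer`, `W_q`, `W_k` and `W_v` are chosen here for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def interacting_layer(E, W_q, W_k, W_v):
    """Single-head self-attention over feature-field embeddings.

    E has shape (num_fields, d). Each output row is a weighted mix of all
    projected fields, i.e. a learned combination of features.
    """
    Q, K, V = E @ W_q, E @ W_k, E @ W_v
    attn = softmax(Q @ K.T, axis=-1)   # (num_fields, num_fields)
    return attn @ V, attn

rng = np.random.default_rng(0)
num_fields, d, d_out = 4, 8, 6         # hypothetical sizes
E = rng.normal(size=(num_fields, d))
W_q, W_k, W_v = (rng.normal(size=(d, d_out)) for _ in range(3))
out, attn = interacting_layer(E, W_q, W_k, W_v)
print(out.shape, attn.sum(axis=1))
```

Each row of `attn` sums to 1, so every output row is a convex combination of the projected field embeddings. Stacking such layers would combine already-combined features, which is one way to reach higher-order interactions without enumerating them.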

Social Recommendation

Online social communities are an essential part of today's online experience. What we do or choose may be explicitly or implicitly influenced by our friends. In this project, we study social influences in session-based recommendation by simultaneously modeling users' dynamic interests and context-dependent social influences. First, we model users' dynamic interests with recurrent neural networks. Then, to model context-dependent social influences, we employ attention-based graph convolutional neural networks to differentiate friends' dynamic influences across behavior sessions.
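A rough sketch of context-dependent friend weighting follows. It assumes that the user's current session and each friend are already summarized as fixed-size vectors (for example by an RNN) and uses plain dot-product attention. The function name `aggregate_friends` and the toy vectors are illustrative assumptions, not the project's code.

```python
import numpy as np

def aggregate_friends(user_state, friend_states):
    """Weight each friend by similarity to the user's current session state."""
    scores = friend_states @ user_state            # (num_friends,)
    scores = scores - scores.max()                 # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    # Weighted sum of friend vectors: the social influence for this session
    return weights @ friend_states, weights

user_state = np.array([1.0, 0.0])                  # current session interest
friend_states = np.array([[2.0, 0.0],              # similar interest
                          [0.0, 2.0],              # unrelated interest
                          [-2.0, 0.0]])            # opposite interest
social_vec, weights = aggregate_friends(user_state, friend_states)
print(weights)
```

Because the weights depend on `user_state`, the same friend can carry a large weight in one session and a small one in another.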

Reading List

We maintain a reading list of RecSys papers to keep track of up-to-date research.

Data List

We provide a summary of existing benchmark data sets for evaluating recommendation methods.

New Data

We contribute a new large-scale data set collected from Douban (www.douban.com), a popular movie/music/book review website. The data set could be useful for research on sequential recommendation, social recommendation and multi-domain recommendation. See details here.

Publications:

Weiping Song, Zhijian Duan, Ziqing Yang, Hao Zhu, Ming Zhang and Jian Tang. Explainable Knowledge Graph-based Recommendation via Deep Reinforcement Learning. arXiv'2019.
Weiping Song, Zhiping Xiao, Yifan Wang, Laurent Charlin, Ming Zhang and Jian Tang. Session-based Social Recommendation via Dynamic Graph Attention Networks. WSDM'19.
Weiping Song, Chence Shi, Zhiping Xiao, Zhijian Duan, Yewen Xu, Ming Zhang and Jian Tang. AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks. CIKM'2019.


recommendersystems's Issues

Bug when running preprocess.py on Avazu

When cnt_line = 23000000, line 80 raises "IndexError: list index out of range".
Thank you for your reply.

Item order within a session

Since sessions are split by sorting on the session ID only, are the items within a session out of order?
data = data.sort_values(by=['TimeId']).groupby('SessionId')['ItemId'].apply(list).to_dict()
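A small check of what this line does, using a toy DataFrame invented for illustration. In pandas, `groupby` keeps rows inside each group in the order they had before grouping, so sorting by `TimeId` first yields time-ordered item lists. The `kind="mergesort"` argument is an addition here that makes the sort stable, which only matters when two rows share the same `TimeId`.

```python
import pandas as pd

# Toy interaction log, deliberately shuffled
data = pd.DataFrame({
    "SessionId": [2, 1, 2, 1, 2],
    "ItemId":    [50, 10, 30, 20, 40],
    "TimeId":    [3, 1, 1, 2, 2],
})

sessions = (
    data.sort_values(by=["TimeId"], kind="mergesort")  # stable sort by time
        .groupby("SessionId")["ItemId"]                # row order within a group is kept
        .apply(list)
        .to_dict()
)
print(sessions)
```

In this toy case each session's items come out in increasing `TimeId` order even though the input rows were interleaved.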

doubts about the data processing of Criteo in AutoInt

@Songweiping, hello. I am very interested in your work (AutoInt) at CIKM'19, but I had some doubts when I tried to reproduce the experiments in the paper.
As we know, Criteo has its own benchmark, which consists of two CSV files (train.csv and test.csv). However, the AutoInt preprocessing splits a single file (train_samples.txt) into training, validation and test data.
I'm wondering how to transform the original data set (train.csv and test.csv) into a data set that the data_process code can handle.
Could you post your code for converting the original CSV files (train.csv and test.csv) into the single txt file (the data used in your paper, in a format similar to the sample file train_example.txt)?

Question about ReLU in Multi-Head Attention

In the multi-head attention code, a ReLU is applied to the queries, keys and values. Is this a correct implementation? The paper does not mention a ReLU in Eq. 5. Besides, it seems that the ReLU will make the attention matrix always positive.

# Linear projections
Q = tf.layers.dense(queries, num_units, activation=tf.nn.relu)
K = tf.layers.dense(keys, num_units, activation=tf.nn.relu)
V = tf.layers.dense(values, num_units, activation=tf.nn.relu)
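The effect raised in this question can be reproduced with a tiny NumPy example that uses hand-picked projection outputs (hypothetical values, not taken from the model). With a ReLU on both queries and keys, every entry of both factors is non-negative, so every pre-softmax score is non-negative too. Without it, scores can be negative.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

# Hypothetical outputs of the linear projections for two feature fields
Q_lin = np.array([[1.0, -1.0],
                  [-1.0, 1.0]])
K_lin = np.array([[1.0, -1.0],
                  [-1.0, 1.0]])

scores_linear = Q_lin @ K_lin.T            # no activation on Q and K
scores_relu = relu(Q_lin) @ relu(K_lin).T  # ReLU applied, as in the snippet above

print(scores_linear)
print(scores_relu)
```

After the softmax the attention weights are positive in either case. What the ReLU removes is the ability to assign a pair of fields a score below zero, so the lowest possible score is 0 rather than an arbitrarily negative value.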

Converting to top-n recommendation

I am very impressed by how useful AutoInt is for my personal project.
I'm wondering whether there is a handy way to convert AutoInt to top-n recommendation.
Would that be possible?
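One generic way to get top-n lists from any CTR-style scorer is to score every candidate item for a user and keep the n highest. The sketch below assumes this approach. `toy_score` is a hypothetical stand-in for a trained model's predicted click probability, and nothing here is part of the AutoInt code.

```python
import heapq

def top_n(user_features, candidates, score_fn, n=3):
    """Score every candidate item for one user and return the n best item ids."""
    scored = ((score_fn(user_features, feats), item_id)
              for item_id, feats in candidates.items())
    return [item_id for _, item_id in heapq.nlargest(n, scored)]

def toy_score(user, item):
    # Placeholder scorer: a dot product instead of a trained CTR model
    return sum(u * i for u, i in zip(user, item))

candidates = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [0.5, 0.5],
    "d": [-1.0, 0.0],
}
print(top_n([1.0, 0.2], candidates, toy_score, n=2))
```

Scoring all items for each user costs one model call per user-item pair, so for large catalogs a cheaper candidate-retrieval step is often placed in front of the ranker.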

Data format

What data format do these frameworks require?

Data processing

How is the Gowalla data set processed and split?

Does using all the padded data for each session ID cause information leakage?

Hi, each session ID uses all of the padded data. Does this cause information to leak through?

Batch size is set to 1 in SocialRec's test.py

Why do you set the batch size to 200 for training but to 1 for testing?
The code is on line 81 of test.py: args.batch_size = 1.

Code for your paper: Ekar: An Explainable Method for Knowledge Aware Recommendation?

May I know if you have published your code for the paper "Ekar: An Explainable Method for Knowledge Aware Recommendation"?

Thanks!

Session segmentation on the Delicious data set

Hi, I'm trying to reproduce the results of the paper Session-based Social Recommendation on the Delicious data set. From the paper and the data, I understand that you treat each session as the sequence of tags a user has assigned to a bookmark, and that all tagging actions for that bookmark share the same timestamp in the data.
So my question is: when you assign the time_id for each session (as in the preprocess_DoubanMovie.py file), do two sessions get the same id only if their timestamps have exactly the same date and time?
E.g., a session with timestamp '01/06/2020 2:30:00 pm' and another session with timestamp '01/06/2020 2:30:01 pm' would have different time_ids.
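The two readings of this question can be made concrete with a small sketch. It does not show how preprocess_DoubanMovie.py actually assigns ids. `assign_time_ids` is a hypothetical helper that gives equal keys the same id, once with the full timestamp as key and once with only the calendar date.

```python
from datetime import datetime

def assign_time_ids(timestamps, key=lambda t: t):
    """Map timestamps to consecutive integer ids; equal keys share an id."""
    keys = sorted({key(t) for t in timestamps})
    lookup = {k: i for i, k in enumerate(keys)}
    return [lookup[key(t)] for t in timestamps]

stamps = [
    datetime(2020, 6, 1, 14, 30, 0),
    datetime(2020, 6, 1, 14, 30, 1),   # one second later
    datetime(2020, 6, 2, 9, 0, 0),
]

exact_ids = assign_time_ids(stamps)                          # full timestamp as key
daily_ids = assign_time_ids(stamps, key=lambda t: t.date())  # bucketed by day
print(exact_ids, daily_ids)
```

Under exact matching the two sessions one second apart get different ids. Under daily bucketing they share one.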

UniformNeighborSampler in SocialRec

While I was looking through the code, I noticed that in neigh_samplers.py, users can sample themselves as the second-order neighbors because when we sample 10 neighbors of the first-order neighbor, the user is also included.

        adj = self.adj_info[node, :]
        neighbors = []
        for neighbor in adj:
            if first_or_second == 'second':
                if self.visible_time[neighbor] <= timeid:
                    neighbors.append(neighbor)

Is this intended, or is it a logic flaw? Thank you!
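If self-sampling at the second hop were unwanted, one possible change is to drop the source user from the candidate pool before sampling. The sketch below is a standalone illustration with a hypothetical `sample_neighbors` helper and a toy adjacency dict. It is not a patch for neigh_samplers.py and it leaves out the visible_time filter.

```python
import random

def sample_neighbors(adj, node, k, exclude=None, rng=random):
    """Sample k neighbors of `node` with replacement, optionally skipping one node."""
    pool = [n for n in adj[node] if n != exclude]
    if not pool:
        return []          # no eligible neighbors left
    return [rng.choice(pool) for _ in range(k)]

adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}   # toy friendship graph
user = 0

first_order = sample_neighbors(adj, user, k=2)
# Second hop: sample friends of friends, never returning the user themself
second_order = [sample_neighbors(adj, f, k=3, exclude=user) for f in first_order]
print(first_order, second_order)
```

Whether the exclusion is desirable depends on the model: some graph convolution variants deliberately keep a node's own representation in the aggregation.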

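Data set

Hello,
I am currently working on dynamic network representation learning. Could you share your experimental data? (The raw, unprocessed Douban data used in the paper would be fine. Thank you very much. My email is [email protected])

PyTorch version of the DGRec code

Is there a PyTorch version yet? If so, could you please share it? My email is [email protected]. Much appreciated!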

Extracting attention weights and visualizing them

After running the model, I am wondering whether the attention weights can be extracted and visualized in the way the attached file shows, which is taken from the original paper.

There are many examples of extracting and visualizing attention for seq2seq models, but I couldn't find one for feature explanation.

Are there any ideas or code that can be used to visualize attention over meaningful features?

local_features in the SocialRec model

How should the two layers initial_state_layer1 and initial_state_layer2 in local_features be understood?

Only one class present issue

I am working on my own data: the purchase history of financial products.
So my data records who (with features) purchased which products (with features), and when.
However, the model's metric is AUC, and, as expected, I get the following error:

ValueError: Only one class present in y_true. ROC AUC score is not defined in that case.

This is because all my labels are 1: I only know who purchased (clicked) what, and there are no 0 values (who did not purchase what).
As other CTR prediction models also use AUC and log loss, I assume there must be a way to use AUC with such a data set.

Do you have any idea how to solve this issue?
That would be really helpful!
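A common workaround for positive-only logs is negative sampling: for each observed purchase, draw items the same user did not purchase and label them 0. The sketch below assumes that unobserved user-item pairs may be treated as negatives, which is a modeling assumption rather than a fact about the data. The helper `add_negatives` and the toy catalog are hypothetical.

```python
import random

def add_negatives(positives, all_items, per_positive=1, seed=0):
    """Return (user, item, label) rows with sampled negatives added."""
    rng = random.Random(seed)
    seen = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)

    rows = []
    for user, item in positives:
        rows.append((user, item, 1))                        # observed purchase
        pool = [i for i in all_items if i not in seen[user]]
        for neg in rng.sample(pool, min(per_positive, len(pool))):
            rows.append((user, neg, 0))                     # sampled non-purchase
    return rows

purchases = [("u1", "fund_a"), ("u1", "fund_b"), ("u2", "fund_c")]
catalog = ["fund_a", "fund_b", "fund_c", "fund_d"]
rows = add_negatives(purchases, catalog)
for row in rows:
    print(row)
```

With both labels present, AUC and log loss become defined. Their values then depend on the sampling ratio and strategy, so the same sampling scheme should be used when comparing models.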