
sasrec's People

Contributors

kang205, pmixer


sasrec's Issues

key_masks

"key_masks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))) # (N, T_k)" at line 185 in modules.py indicates that the key_masks depend on the embedding of those keys, but why not use the original "key_masks = tf.sequence_mask(keys_length, tf.shape(keys)[1]) # (N, T_k)"?

Help, please

Line 199 of modules.py reads `outputs *= query_masks # broadcasting. (N, T_q, C)`.
Should it instead be `outputs = query_masks # broadcasting. (h*N, T_q, C)`?

Why put test[u][0] into the candidate sequences?

The evaluate function builds an item_index list and puts test[u][0] into it.
My understanding is that test[u][0] is what we want to predict, but this way the model knows it should predict from among these candidates, including the very item we want it to predict.
Is this a kind of data leakage? Or did I misunderstand something?
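
For reference, a hedged sketch of the sampled-ranking protocol the evaluate function follows (simplified; `model_score` is a stand-in for the model's predict call, and `rated` for the user's seen items):

```python
import random

def rank_ground_truth(model_score, seq, ground_truth, itemnum, rated, n_neg=100):
    """Rank the held-out item against n_neg sampled negatives.

    Placing the ground truth in the candidate list is deliberate: the metric
    asks where it ranks among the 101 candidates, not which item the model
    picks out of the full catalog, so no label is leaked to the model itself.
    """
    item_idx = [ground_truth]
    while len(item_idx) < n_neg + 1:
        t = random.randint(1, itemnum)
        if t not in rated:                  # skip items the user already saw
            item_idx.append(t)
    scores = [model_score(seq, i) for i in item_idx]
    return sum(s > scores[0] for s in scores[1:])   # 0 = best possible rank

# Hit@10 then checks rank < 10; NDCG@10 adds 1 / log2(rank + 2) on a hit.
```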

CUDA error

Hi, thanks for your excellent work.
When I run the code, the following error occurs:
E tensorflow/stream_executor/cuda/cuda_blas.cc:652] failed to run cuBLAS routine cublasGemmBatchedEx: CUBLAS_STATUS_NOT_SUPPORTED
2021-07-22 23:05:32.120816: E tensorflow/stream_executor/cuda/cuda_blas.cc:2574] Internal: failed BLAS call, see log for details

TensorFlow version = 1.12.0
Python version = 2.7.18
Looking forward to your reply!

Problem with num_batch

In your code, num_batch is calculated as follows:
num_batch = len(user_train) / args.batch_size
But this raises an error in
for step in tqdm(range(num_batch), total=num_batch, ncols=70, leave=False, unit='b')
because `/` returns a float under Python 3:

Traceback (most recent call last):
File "main.py", line 61, in
for step in tqdm(range(num_batch), total=num_batch, ncols=70, leave=False, unit='b'):
TypeError: 'float' object cannot be interpreted as an integer
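
Under Python 3, `/` always returns a float. A minimal fix (a sketch, not an official patch) is floor division, or rounding up if the final partial batch should be kept:

```python
import math

num_batch = len(user_train) // args.batch_size            # floor division -> int
# or, to keep the final partial batch instead of dropping it:
num_batch = math.ceil(len(user_train) / args.batch_size)
```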

Could you provide the original data of the Amazon dataset?

Thanks for your outstanding work, but I still have a question about the data.
When I open the Amazon dataset website, I find that the data was updated in 2018. If I run your preprocessing code on the new Beauty data with the 5-core setting (as in your code), only a few thousand records are kept, far fewer than in the preprocessed data in your repository.
I need to re-preprocess the original data because my model needs the timestamp data, not only the interaction order.
If you could provide the original Amazon data and the preprocessing code (for the Amazon game category), I would really appreciate it!
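
For anyone re-deriving the data, here is a hedged sketch of 5-core filtering that keeps timestamps. It assumes the Amazon review JSON schema (`reviewerID`, `asin`, `unixReviewTime`) and a single filtering pass; it is not necessarily identical to the repo's script, and a strict 5-core would iterate until the counts stabilize:

```python
import json
from collections import defaultdict

def five_core(path, k=5):
    user_count, item_count = defaultdict(int), defaultdict(int)
    reviews = []
    with open(path) as f:
        for line in f:
            r = json.loads(line)
            reviews.append((r['reviewerID'], r['asin'], r['unixReviewTime']))
            user_count[r['reviewerID']] += 1
            item_count[r['asin']] += 1
    # Keep only users/items with at least k interactions (single pass).
    kept = [(u, i, t) for (u, i, t) in reviews
            if user_count[u] >= k and item_count[i] >= k]
    kept.sort(key=lambda x: (x[0], x[2]))   # per-user chronological order
    return kept
```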

How to set the number of negative samples?

Quoting the paper:

For each user u, we randomly sample 100 negative items, and rank these items with the ground-truth item. Based on the rankings of these 101 items, Hit@10 and NDCG@10 can be evaluated.

How could I set the number of negative items in the code?
It seems the (pos, neg) pairs are generated independently.

if nxt != 0: neg[idx] = random_neq(1, itemnum + 1, ts)
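
For what it's worth, the quoted line is the training-time sampler: it draws one uniform negative per positive position, independently of the 100 evaluation negatives, which are drawn in a separate loop inside the evaluate function. A hedged sketch of lifting that count into a parameter (illustrative names, not the authors' API):

```python
import random

def sample_eval_negatives(rated, itemnum, n_neg=100):
    """Draw n_neg negative items the user has not interacted with."""
    item_idx = []
    while len(item_idx) < n_neg:
        t = random.randint(1, itemnum)
        if t not in rated:
            item_idx.append(t)
    return item_idx
```

Note that changing n_neg changes the candidate-set size and therefore the absolute Hit@k / NDCG@k values, so results are only comparable across runs using the same count.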

How to implement Caser?

Hi, Caser uses a softmax to get the interaction probability of each item, while in this paper's experiments you select 100 negative samples for testing. I would like to know how you handled this. Thank you very much.
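
One common way to reconcile a full-softmax model like Caser with the sampled protocol is to score all items as usual but rank only the 101 candidates (a sketch; `full_scores` is assumed to be the model's per-item output vector):

```python
import numpy as np

def sampled_rank(full_scores, ground_truth, negatives):
    """full_scores: (itemnum + 1,) model scores over the whole catalog.
    Returns the rank of the ground truth among the sampled candidates."""
    candidates = [ground_truth] + list(negatives)       # 1 + 100 items
    cand_scores = full_scores[candidates]
    return int((cand_scores > cand_scores[0]).sum())    # 0 = best rank
```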

Questions about Performance of Caser

I modified the code based on https://github.com/graytowne/caser_pytorch and ran it on the ML-1M dataset, but the end result is not ideal.

The summary of the running results is as follows:
Hit@10 = 0.622, NDCG@10 = 0.481
The above evaluation metrics are much lower than those reported in the paper:
Hit@10 = 0.7886, NDCG@10 = 0.5538

I was wondering whether the difference is due to some negligence on my part. Could you please share your code? Looking forward to your reply! Thanks!

Need help, please

Hi, my name is Mauro, and I am a computer science student. I am trying to run your SASRec code, but I am having trouble due to the TensorFlow versions. Would it be possible for you to run it on a lighter version of the Amazon sports reviews dataset (which I will give you; it is just 60 MB)? Your help would be really precious to me. If you want to help me, you can send an email to this account: [email protected]

Why num_heads = 1?

For the multi-head attention module, why do you set num_heads = 1 in the default args in main.py? Then it is not actually using the multi-head structure of the attention block, is it?

Thanks,
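
For context, in the borrowed modules.py the heads come from splitting the channel dimension and stacking along the batch axis; with num_heads = 1 both operations are identities, so the block degenerates to single-head attention. A numpy sketch of just that reshaping (illustrative shapes):

```python
import numpy as np

N, T, C, h = 2, 4, 8, 1                 # h = num_heads, 1 as in the default args
Q = np.random.randn(N, T, C)

# modules.py-style head split: channels -> h chunks, stacked on the batch axis.
Q_ = np.concatenate(np.split(Q, h, axis=2), axis=0)     # (h*N, T, C/h)

assert Q_.shape == (h * N, T, C // h)
assert h > 1 or np.allclose(Q_, Q)      # with h == 1, nothing changes
```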

Problem with multiprocessing (Sampler)

When I run the program I get the following error:

 An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.

To fix this issue I added
if __name__ == "__main__": main()
but because of the close function in the sampler, the program exits without training.
I used Python 3.6.
Thanks a lot in advance!
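
On platforms that spawn rather than fork child processes (Windows, recent macOS Pythons), the usual idiom is to put the whole training loop, including the sampler shutdown, under the guard so close() only runs after training finishes (a sketch assuming the repo's WarpSampler interface):

```python
if __name__ == '__main__':
    sampler = WarpSampler(user_train, usernum, itemnum,
                          batch_size=args.batch_size,
                          maxlen=args.maxlen, n_workers=3)
    try:
        for epoch in range(1, args.num_epochs + 1):
            for step in range(num_batch):
                u, seq, pos, neg = sampler.next_batch()
                # ... run one training step here ...
    finally:
        sampler.close()     # shut the worker processes down only at the end
```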

Training method of baseline GRU4Rec

Could you tell me the details of GRU4Rec?
For example, is the GRU4Rec model trained by BPTT or by normal backpropagation?
Additionally, could you share the code for the baselines?

Preprocessing for ml1m?

Hello.

Your data preprocessing code seems to cover only the Beauty reviews.

Can you also provide the code for preprocessing the ml-1m data?

Thank you.
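
For reference, a minimal sketch of the usual ml-1m conversion into the repo's one-pair-per-line format (assumes the standard ratings.dat layout `UserID::MovieID::Rating::Timestamp`; this is not the authors' script):

```python
from collections import defaultdict

user_seqs = defaultdict(list)
with open('ratings.dat', encoding='latin-1') as f:
    for line in f:
        u, i, _, t = line.rstrip().split('::')
        user_seqs[int(u)].append((int(t), i))

item_map = {}   # remap movie ids to contiguous 1..itemnum, as the loader expects
with open('ml-1m.txt', 'w') as out:
    for u in sorted(user_seqs):
        for _, i in sorted(user_seqs[u]):   # chronological order per user
            item_map.setdefault(i, len(item_map) + 1)
            out.write(f'{u} {item_map[i]}\n')
```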

How do we calculate AUC for SASRec?

Hello!

I am currently working on CTR and sequential recommendation tasks. I see that many recent papers, such as TallRec, use AUC to compare their results with SASRec. I am really curious how I can calculate AUC for SASRec; in my understanding, SASRec uses metrics like NDCG@k and HitRate@k. It would be really helpful if you could shed some light on this.

Thanks and Regards
Millennium Bismay
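
Not the authors, but one hedged way to get an AUC-style number out of a ranking model: with one held-out positive and N sampled negatives per user, the per-user AUC is the fraction of negatives the positive outscores (ties count half), averaged over users. A sketch:

```python
import numpy as np

def sampled_auc(pos_score, neg_scores):
    """AUC for one user: P(positive outranks a random negative).
    Equivalent to 1 - rank/N when the rank counts negatives scored higher."""
    neg_scores = np.asarray(neg_scores, dtype=np.float64)
    wins = (pos_score > neg_scores).sum() + 0.5 * (pos_score == neg_scores).sum()
    return wins / len(neg_scores)

# Illustrative call; real scores would come from model.predict over the same
# 1 + 100 candidate set used for Hit@10 / NDCG@10.
print(sampled_auc(2.5, [0.1, 3.0, -1.2, 2.5]))   # 0.625
```

Whether a given paper computes AUC over the full catalog or over a sampled candidate set varies, so it is worth checking the exact protocol before comparing numbers.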

Application for adding a third-party re-implementation link to the README

Hi Team, @kang205
I have prepared a PyTorch version of SASRec based on your TF implementation which behaves almost the same:

https://github.com/pmixer/SASRec.pytorch

Could you please consider adding a Third-party Re-implementation section, like the one in https://github.com/wy1iu/LargeMargin_Softmax_Loss, to let people know about the work?

I hope the PyTorch implementation proves useful to a wider audience, and I would appreciate help checking why it converges a bit more slowly than the TF implementation during training 🤣

Regards,
Zan

`tf.sign(tf.abs(tf.reduce_sum` vs `tf.sign(tf.reduce_sum(tf.abs(` for generating masks?

key_masks = tf.sign(tf.abs(tf.reduce_sum(keys, axis=-1))) # (N, T_k)

Hi Guys,
I'm reading the code in order to port the implementation to PyTorch for personal use. The code looks well written and documented; thanks for the great work :)

Moreover, since the self-attention module is borrowed from another project, some details may not be 100% right according to my observation (besides magic numbers like -2^32+1 used to force softmax to output 0 for an entry, which hurts readability). As an example, for query and key mask generation the code uses a tf.sign + tf.abs + tf.reduce_sum combination, but the order seems slightly wrong. Since we are trying to mask queries/keys that are all zero along the channel/embedding dimension, the right way might be to first apply abs, then reduce_sum, and finally sign; the current implementation first applies reduce_sum, then abs, and lastly sign. The two approaches generate the same results in most cases, because an exact sum to zero is unlikely for high-dimensional fp32 vectors, but it is still wrong and may produce incorrect masks in corner cases.

I just want to check the assumption stated above; please respond if you happen to have time @kang205 @JiachengLi1995, thanks!

Regards,
Zan
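
To make the corner case concrete, a small numpy check (illustrative values):

```python
import numpy as np

# One real token whose channels cancel exactly, plus one genuine padding token.
keys = np.array([[[0.5, -0.5],      # real embedding, but it sums to zero
                  [0.0,  0.0]]])    # all-zero padding

current  = np.sign(np.abs(keys.sum(axis=-1)))     # reduce_sum -> abs -> sign
proposed = np.sign(np.abs(keys).sum(axis=-1))     # abs -> reduce_sum -> sign

print(current)    # [[0. 0.]] -- the real token is wrongly masked out
print(proposed)   # [[1. 0.]] -- only the padding position is masked
```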
