dream's Introduction

Welcome! 👋

My name is Yihong Chen. I research AI knowledge acquisition, specifically how different AI systems can learn to abstract, represent, and use concepts and symbols efficiently.

I am open to collaborations on topics related to embedding learning, link prediction, and language modeling. If you would like to get in touch, you can reach me by emailing yihong-chen AT outlook DOT com, or simply booking a Zoom meeting with me.

Looking for Some Inspiration?

💥 Mar 2024, Quanta Magazine covers our research on periodical embedding forgetting. Check out the article here.

💥 Dec 2023, I will present our forgetting paper at NeurIPS 2023. Check out the poster here.

💥 Sep 2023, our latest work, Improving Language Plasticity via Pretraining with Active Forgetting, was accepted at NeurIPS 2023!

💥 Sep 2023, I presented our latest work on forgetting at the IST-Unbabel seminar.

💥 Jul 2023, I presented our latest work on forgetting in language modelling at the ELLIS Unconference 2023. The slides are available here. Feel free to leave your comments.

💥 Jul 2023, discover the power of forgetting in language modelling! Our latest work, Improving Language Plasticity via Pretraining with Active Forgetting, shows how pretraining a language model with active forgetting can help it quickly learn new languages. You'll be amazed by the model plasticity imbued via pretraining with forgetting. Check it out :)

💥 Nov 2022, our paper, REFACTOR GNNS: Revisiting Factorisation-based Models from a Message-Passing Perspective, will appear in NeurIPS 2022! If you're interested in understanding why factorisation-based models (FMs) can be seen as special GNNs, and how to make them usable on new graphs, check it out!

💥 Jun 2022, if you're looking for a hands-on repo to start experimenting with link prediction, check out our repo ssl-relation-prediction. Simple code, easy to hack 🚀

dream's People

Contributors

aflah02, yihong-chen

dream's Issues

The number of epochs

Hi, Lacey,

It's great to find your neat implementation of the DREAM model.
By updating several lines of code, I was able to run it on the Instacart dataset with PyTorch 0.4.1. How many epochs did you train the model for when you participated in the Kaggle competition, and how long did it take? In my case, the loss on the test set starts at 0.66 and drops to 0.11 after 5 epochs, and each epoch takes about 4.5 hours. I'm using an NVIDIA K80 GPU with 12 GB of memory. So far I have only tried the "train_dream" function, not "train_reorder_dream".

The hyper-parameters I'm using are:

  • 'basket_pool_type': 'max',
  • 'rnn_layers': 2,
  • 'rnn_type': 'LSTM',
  • 'dropout': 0.5,
  • 'num_product': 49688 + 1 + 1,
  • 'none_idx': 49689,
  • 'embedding_dim': 128,
  • 'cuda': True,
  • 'clip': 200,
  • 'epochs': 10,
  • 'batch_size': 64,
  • 'learning_rate': 0.001,
  • 'log_interval': 100,

Thanks a lot,
Xuan

TypeError: iteration over a 0-d tensor

Running the train file results in this error:

basket_pool_type max
rnn_layers 2
rnn_type RNN_RELU
dropout 0.5
num_product 49690
none_idx 49689
embedding_dim 128
cuda False
clip 200
epochs 100
batch_size 32
learning_rate 0.001
log_interval 1
checkpoint_dir ../dream/dream-{epoch:02d}-{loss:.4f}.model

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-423abe3e7cdf> in <module>()
    270             train_reorder_dream()
    271         else:
--> 272             train_dream()
    273         print('-' * 89)
    274         if constants.REORDER:

<ipython-input-1-423abe3e7cdf> in train_dream()
     82     for i, x in enumerate(batchify(train_ub, dr_config.batch_size)):
     83         baskets, lens, _ = x
---> 84         dr_hidden = repackage_hidden(dr_hidden)  # repackage hidden state for RNN
     85         dr_model.zero_grad()  # optim.zero_grad()
     86         dynamic_user, _ = dr_model(baskets, lens, dr_hidden)

~/SageMaker/MBA/Instacart/src/utils.py in repackage_hidden(h)
     92         return Variable(h.data)
     93     else:
---> 94         return tuple(repackage_hidden(v) for v in h)
     95 
     96 ###################### Summary

~/SageMaker/MBA/Instacart/src/utils.py in <genexpr>(.0)
     92         return Variable(h.data)
     93     else:
---> 94         return tuple(repackage_hidden(v) for v in h)
     95 
     96 ###################### Summary

~/SageMaker/MBA/Instacart/src/utils.py in repackage_hidden(h)
     92         return Variable(h.data)
     93     else:
---> 94         return tuple(repackage_hidden(v) for v in h)
     95 
     96 ###################### Summary

~/SageMaker/MBA/Instacart/src/utils.py in <genexpr>(.0)
     92         return Variable(h.data)
     93     else:
---> 94         return tuple(repackage_hidden(v) for v in h)
     95 
     96 ###################### Summary

~/SageMaker/MBA/Instacart/src/utils.py in repackage_hidden(h)
     92         return Variable(h.data)
     93     else:
---> 94         return tuple(repackage_hidden(v) for v in h)
     95 
     96 ###################### Summary

~/SageMaker/MBA/Instacart/src/utils.py in <genexpr>(.0)
     92         return Variable(h.data)
     93     else:
---> 94         return tuple(repackage_hidden(v) for v in h)
     95 
     96 ###################### Summary

~/SageMaker/MBA/Instacart/src/utils.py in repackage_hidden(h)
     92         return Variable(h.data)
     93     else:
---> 94         return tuple(repackage_hidden(v) for v in h)
     95 
     96 ###################### Summary

~/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/tensor.py in __iter__(self)
    382         # map will interleave them.)
    383         if self.dim() == 0:
--> 384             raise TypeError('iteration over a 0-d tensor')
    385         if torch._C._get_tracing_state():
    386             warnings.warn('Iterating over a tensor might cause the trace to be incorrect. '

TypeError: iteration over a 0-d tensor
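
A possible cause, offered as an assumption rather than a confirmed diagnosis: the traceback shows repackage_hidden returning Variable(h.data) in one branch and otherwise iterating over h. Since PyTorch 0.4 merged Variable into Tensor, a type check against Variable may no longer match a plain tensor, so the function keeps recursing into the tensor until it reaches a 0-d element. Below is a minimal sketch of a version that detaches anything tensor-like instead. The hasattr check and the FakeTensor class are illustrative choices so the snippet runs without PyTorch installed; with PyTorch one would more typically test isinstance(h, torch.Tensor).

```python
def repackage_hidden(h):
    """Detach hidden state(s) from the graph that produced them.

    Accepts a single tensor (RNN/GRU) or a nested tuple of tensors (LSTM).
    """
    # Assumption: anything exposing .detach() is treated as a tensor.
    # This duck typing replaces a type check against Variable.
    if hasattr(h, "detach"):
        return h.detach()
    return tuple(repackage_hidden(v) for v in h)


class FakeTensor:
    """Hypothetical stand-in for torch.Tensor, used only for this demo."""

    def __init__(self, name, attached=True):
        self.name = name
        self.attached = attached

    def detach(self):
        # Return a copy that no longer tracks history, like Tensor.detach().
        return FakeTensor(self.name, attached=False)


if __name__ == "__main__":
    lstm_hidden = (FakeTensor("h"), FakeTensor("c"))
    detached = repackage_hidden(lstm_hidden)
    print([(t.name, t.attached) for t in detached])
```

Because torch.Tensor also provides detach(), the same function should accept real hidden states unchanged, both the single tensor used by RNN_RELU and the (h, c) tuple used by LSTM.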

dataset

Hello, I would like to run your program, but I can't find the dataset. Could you please send a copy to my email ([email protected])? Thank you!

sampling negative samples

In the bpr_loss function, the neg_idx seems to be a random sample of 'all' the items instead of just the set of negative items. Why not just sample from the negative item set? Is this approximation done for performance reasons?

The code looks great and is simple to understand at the same time. Keep up the good work!
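
To make the sampling question above concrete, here is a minimal sketch contrasting the two strategies. The function names are hypothetical and this is not the repo's bpr_loss. Sampling from all items avoids a per-user membership check, and when a user has bought only a few of tens of thousands of products it rarely draws a positive; whether that is the author's actual reason is not stated in the thread.

```python
import random


def sample_from_all(num_items, k, rng):
    """Draw k item ids uniformly from all items; positives may slip in."""
    return [rng.randrange(num_items) for _ in range(k)]


def sample_true_negatives(num_items, positives, k, rng):
    """Draw k item ids uniformly from items not in `positives`.

    `positives` is a set of ids in [0, num_items). Uses rejection sampling:
    redraw whenever the candidate is a positive item.
    """
    if len(positives) >= num_items:
        raise ValueError("no negative items to sample from")
    negatives = []
    while len(negatives) < k:
        candidate = rng.randrange(num_items)
        if candidate not in positives:
            negatives.append(candidate)
    return negatives


if __name__ == "__main__":
    rng = random.Random(0)
    bought = {3, 7, 42}  # hypothetical positive items for one user
    print(sample_from_all(100, 5, rng))
    print(sample_true_negatives(100, bought, 5, rng))
```

Rejection sampling stays cheap as long as positives are a small fraction of the catalogue, since most draws are accepted on the first try.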

Training doesn't converge (loss going up and down)

Hello,

I downloaded this network and I'm trying to run it on the base dataset from Instacart, but training doesn't seem to converge. The loss keeps going up and down. I tried changing the learning rate from 0.1 to 0.0001 and switching optimizers, but nothing seems to work. I wanted to reproduce the Instacart output and start working on DREAM from there.

If you have any insights, that would be helpful. Thanks.

Evaluation metrics

Hello Yihong,

I was able to run your model after a few modifications. Great work; the code is clean and easy to understand. I see in your other repo that you implemented evaluation metrics such as the F1 score based on recall and precision. Do you have that for DREAM as well?

Furthermore, was your choice of parameters (clip, learning rate, ...) arbitrary, or did you do a grid search?

Thanks again!
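
For reference, the F1 score mentioned above can be computed per basket from precision and recall. The following is a minimal sketch under the assumption that predictions and ground truth are both sets of product ids; the function names are hypothetical and not taken from either repo.

```python
def basket_f1(predicted, actual):
    """Return (precision, recall, f1) for one predicted basket vs. the true one."""
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def mean_f1(pairs):
    """Average the per-basket F1 over (predicted, actual) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one (predicted, actual) pair")
    return sum(basket_f1(p, a)[2] for p, a in pairs) / len(pairs)


if __name__ == "__main__":
    # Two of four predicted products are correct; the true basket has three.
    precision, recall, f1 = basket_f1([1, 2, 3, 4], [2, 3, 5])
    print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Averaging F1 per basket, rather than pooling hits across all users, weights every user equally regardless of basket size.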

train.py

Hi, may I ask how you use train.py to train the DREAM model? I didn't see any code that calls the function evaluate_reorder_dream(), and I can't find any entry point that uses this function to train the model. Your answer would help a lot. Thank you!

size mismatch

Hello,

Thank you for making your code public. I've been trying to run DREAM, but I keep getting the error:

"size mismatch, m1: [1 x 256], m2: [64 x 256] at c:\anaconda2\conda-bld\pytorch_1513133520683\work\torch\lib\th\generic/THTensorMath.c:1416"

After examination, it comes from this part of the code:

    for i, x in enumerate(batchify(train_ub, dr_config.batch_size)):
        baskets, lens, _ = x
        dr_hidden = repackage_hidden(dr_hidden)  # repackage hidden state for RNN
        dr_model.zero_grad()  # optim.zero_grad()
        dynamic_user, _ = dr_model(baskets, lens, dr_hidden)

  • dr_hidden is of size 1x4x64
  • baskets is a list with index from 0 to 255, each having a size of 99 (padded)
  • lens is a list with index from 0 to 255, each having size dependent on the basket of the user (max at 99)

The parameters I used for the config are:

    DREAM_CONFIG = {
        'basket_pool_type': 'max',  # 'avg'
        'rnn_layers': 2,  # 2, 3
        'rnn_type': 'LSTM',  # 'RNN_TANH', 'GRU', 'LSTM', 'RNN_RELU'
        'dropout': 0.5,
        # 'num_product': 49688 + 1,  # padding idx = 0
        'num_product': 49688 + 1 + 1,
        # 49688 products, padding idx = 0, none idx = 49689, none idx indicates no products
        'none_idx': 49689,
        'embedding_dim': 64,  # 128
        'cuda': False,  # True
        'clip': 20,  # 0.25
        'epochs': 100,
        'batch_size': 256,
        'learning_rate': 0.01,  # 0.0001
        'log_interval': 1,  # num of batchs between two logging
        'checkpoint_dir': DREAM_MODEL_DIR + 'reorder-next-dream-{epoch:02d}-{loss:.4f}.model',
    }

I don't use ub_rbks or ub_ihis; only ub_basket.

Thanks for the help !
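
One thing that may be worth checking, as a guess from the numbers reported rather than a confirmed diagnosis: PyTorch's recurrent layers expect an initial hidden state of shape (num_layers * num_directions, batch, hidden_size), and an LSTM expects a tuple of two such tensors. With rnn_layers = 2 and batch_size = 256, a hidden state of size 1x4x64 would not match. The sketch below only does the shape arithmetic. It assumes a unidirectional RNN whose hidden size equals embedding_dim, which the post does not confirm, and the function names are hypothetical.

```python
def expected_hidden_shape(rnn_layers, batch_size, hidden_size, bidirectional=False):
    """Shape PyTorch RNN/GRU/LSTM layers expect for h_0 (and c_0 for LSTM)."""
    directions = 2 if bidirectional else 1
    return (rnn_layers * directions, batch_size, hidden_size)


def check_hidden_shape(actual_shape, rnn_layers, batch_size, hidden_size):
    """Return (matches, expected_shape) for a reported hidden-state shape."""
    expected = expected_hidden_shape(rnn_layers, batch_size, hidden_size)
    return tuple(actual_shape) == expected, expected


if __name__ == "__main__":
    # Values from the post: dr_hidden is 1x4x64, rnn_layers=2, batch_size=256.
    # Assumption: hidden_size equals embedding_dim (64).
    ok, expected = check_hidden_shape((1, 4, 64), rnn_layers=2, batch_size=256, hidden_size=64)
    print(ok, expected)
```

If the shapes disagree, the hidden state was probably initialised with a different layer count or batch size than the batches being fed to the model.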
