
ifl-tpp's Issues

Hyperparameters for reproducibility

Thank you for providing your code, it is very well organized.

Would it also be possible to release the hyperparameters used for each of the datasets in the paper? I ask because I'm having trouble reproducing the numbers in Table 3 of the appendix for the provided datasets (StackOverflow and the synthetic datasets). For example, running the code out of the box gives:

Breaking due to early stopping at epoch 172
Negative log-likelihood:

  • Train: 1151.5
  • Val: 1179.1
  • Test: 1131.5

for the StackOverflow dataset at seed = 0, which does not match the 14.4 figure from Table 3.

In addition, would it also be possible to release the code for event time prediction using history?

ATM dataset testing

Hi,

I'm new to TPP.

I was trying to run your interactive IPython notebook with the ATM dataset, which I have as a CSV file. I see that all the datasets you used are .npz files.
Do I have to preprocess the ATM CSV file into .npz? And how should I set the sequence length, given that the ATM dataset I have contains only 4 months of data?

Could you please clarify?
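In the meantime, here is a minimal sketch of the kind of CSV-to-.npz conversion I had in mind, assuming hypothetical column names (timestamp, atm_id, event_type) and assuming the .npz only needs per-sequence arrival_times and marks arrays, as mentioned in another issue below; this is not taken from the repo:

    import numpy as np
    import pandas as pd

    # Column names are placeholders for whatever the ATM CSV actually contains.
    df = pd.read_csv("atm.csv", parse_dates=["timestamp"]).sort_values("timestamp")

    arrival_times, marks = [], []
    # One sequence per ATM; with only 4 months of data, grouping by machine
    # (or by calendar week) is one way to obtain several shorter sequences.
    for _, g in df.groupby("atm_id"):
        t = (g["timestamp"] - g["timestamp"].iloc[0]).dt.total_seconds().to_numpy()
        arrival_times.append(t)
        marks.append(g["event_type"].to_numpy())

    # Keys mirror the fields mentioned for the example datasets (arrival_times, marks);
    # whether additional metadata is required is not verified here.
    np.savez("atm.npz",
             arrival_times=np.array(arrival_times, dtype=object),
             marks=np.array(marks, dtype=object))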

Regards,
kiahsa

Sampling points of a specific mark

Hi, thanks for the wonderful code. I am able to use it easily and reproduce the results.

I want to sample points of a specific mark over a given horizon. I would appreciate any pointers on how I could do this with your code.

How could I get the predicted results?

This repo has been very enlightening for me.

However, in my work I focus more on the predicted results.

It's hard for me to understand your implementation details well enough to derive the predicted results, as I only stepped into this area a few months ago.

Could you please tell me how to get the predicted results based on your code?

I'd appreciate it if you could help.

Thanks anyway.

all evaluation experiments code

Would you mind open-sourcing the code for the other evaluation experiments, including learning with marks, mark prediction, and event time prediction using history?

Code for using context vector in the models

@shchur - thank you so much for sharing the code, it is very readable and follows the paper closely. I was hoping you could give me pointers on how to incorporate the context vector $y_i$ in the models. This is described for the Yelp dataset in sec. F.3 in the paper but I was not able to find it in your codebase.
Thanks again for your help!
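In case it helps frame the question, here is a minimal sketch of one way such conditioning could be wired in: embed the extra features and concatenate them with the RNN history vector before computing the distribution parameters. The module, dimensions, and fusion strategy here are my assumptions, not the paper's exact implementation:

    import torch
    import torch.nn as nn

    class ContextWithFeatures(nn.Module):
        """Hypothetical module: fuse the RNN history h_i with extra features y_i."""

        def __init__(self, history_size, feature_size, context_size):
            super().__init__()
            self.feature_emb = nn.Linear(feature_size, context_size)
            self.fuse = nn.Linear(history_size + context_size, context_size)

        def forward(self, h, y):
            # h: (batch, seq_len, history_size) -- output of the RNN encoder
            # y: (batch, seq_len, feature_size) -- per-event metadata
            y_emb = torch.tanh(self.feature_emb(y))
            return torch.tanh(self.fuse(torch.cat([h, y_emb], dim=-1)))

The fused context would then be used wherever the plain RNN context is used to produce the mixture parameters.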

history

I want to know whether your method takes into account the influence of historical events? Thank you.

Implementation on missing data imputation

Hi @shchur, thanks for releasing the code! I'm wondering if it is possible that you could provide the code for the section of missing data imputation - Sec. 5.4 & F.4 MISSING DATA IMPUTATION from your paper.

I'm curious about your implementation of feeding imputations to the RNN. If I keep your current framework, then to include the imputations in the history, the batch keeps changing and we need to call get_features(batch) and get_context(features) every time a new event is imputed. This makes training very slow. Could you provide your implementation of this part, or give me some suggestions on implementing this 'training while imputing'? Thanks!

Missing data imputation

Hi, this is a very inspiring work, and I'm quite interested in section 5.4 (Missing data imputation). Could you please provide a pointer to the code implementation? Thanks very much!

Sampling with additional conditional information

Hi, Shchur! Thank you so much for sharing your code. It is pretty helpful for my current research, but I encountered a problem with sampling.

I trained an ifl-tpp model with additional conditional information (e.g., Yelp dataset). However, during the sampling process, there is no additional information for the model. So, I don't know how to get the final context embedding with additional conditional information during sampling.

Do you have some suggestions for this problem? Thanks!

Calculate the mean of the entire distribution.

According to your reply, I calculate the mean as follows:

    prior_logits, means, log_scales = base_dist.get_params(h, emb)
    s = torch.exp(log_scales)
    prior = torch.exp(prior_logits)
    expectation = torch.sum(prior * torch.exp(a * means + b + a * a * s * s / 2), dim=-1)

where a = std_in_train, b = mean_in_train, and base_dist = NormalMixtureDistribution(). But I get the warning WARNING:root:NaN or Inf found in input tensor. Could you help me find out why I get this error, and which step in my code is wrong?

Learning with Marks

Thanks for the release of your paper and code.
In trying to implement learning with marks with the provided interactive notebook, adapting the remarks in the paper, I'm also running into some trouble. Based on appendix F.2, I assume it's a case of just adding the terms?

model.log_prob in this case returns (time_log_prob, mark_nll, accuracy), so when adapting the training loop, is it as simple as changing the lines below?

        log_prob = model.log_prob(input)
        loss = -model.aggregate(log_prob, input.length)

for

        if use_marks:
            log_prob, mark_nll, mark_acc = model.log_prob(input)
            loss = -model.aggregate(log_prob + mark_nll, input.length)
        else:
            log_prob = model.log_prob(input)
            loss = -model.aggregate(log_prob, input.length)

As a side problem - when doing the above with my custom dataset (which conforms to the same formatting as the example datasets, so arrival_times and marks), all loss terms are NaN. I'm wondering if you might have some insight as to why this might be occurring! When using the reddit dataset with the above modifications, I get non-zero loss terms for both log_prob and mark_nll.

on log likelihood misunderstanding

In the log-likelihood, there are two terms:
$\log \lambda(\tau)$ and $-\int_0^\tau \lambda(s)\,ds$.
However, the first term in your code is calculated directly using the inter_time_dist.log_prob() function, which is already the log of the pdf, i.e. the sum of the two terms. The second term is then calculated again via log_survival_function. I want to make sure whether I have misunderstood something, or whether this is an error.
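For reference, the standard identity this hinges on (as I understand it) is

$$\log p(\tau) = \log \lambda(\tau) - \int_0^\tau \lambda(s)\,ds = \log \lambda(\tau) + \log S(\tau),$$

so a log-density evaluated via inter_time_dist.log_prob() already contains both terms, and a separate log_survival_function term should only be needed for an interval in which no event is observed, e.g. the right-censored interval after the last event of a sequence (my reading, not verified against the code).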

Calculate the mean of the entire distribution.

Thanks for your code.
I wonder how to calculate the mean of the entire distribution.
I want not only the NLL of $\tau$ but also the RMSE/MAE of $\tau$, and I will use the mean of $\tau$ to compute the RMSE/MAE.
I think I should use $E[\tau] = \sum_k w_k \exp(\mu_k + s_k^2/2)$, but I cannot get the right answer.
Could you tell me how to get the mean? Thank you!
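For what it's worth, here is a minimal sketch of that computation, reusing the parameter names from the NaN/Inf issue above (prior_logits, means, log_scales) and assuming the model was trained on log inter-event times standardized with mean b and std a. Whether prior_logits are raw logits that need a softmax, or already normalized log-weights, is an assumption here:

    import torch
    import torch.nn.functional as F

    def lognormal_mixture_mean(prior_logits, means, log_scales, a=1.0, b=0.0):
        # E[tau] = sum_k w_k * exp(a * mu_k + b + (a * s_k)^2 / 2)
        # for a mixture of log-normals whose log-times were standardized
        # as z = (log(tau) - b) / a.
        w = F.softmax(prior_logits, dim=-1)   # mixture weights (assumes a softmax is needed)
        s = torch.exp(log_scales)             # per-component std of the standardized log-time
        comp_mean = torch.exp(a * means + b + (a * s) ** 2 / 2)
        return torch.sum(w * comp_mean, dim=-1)

Note that the exponential can overflow to Inf for large means or scales, which is one easy way to end up with NaN values downstream.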

Loss with NLL of mark and MAE of inter-event time

Hi Oleksandr,

I want to model a marked TPP with LogNormMix. For that, I also want to use the MAE as a loss for the inter-event times, but I'm unsure how to combine it with the NLL loss of the predicted marks. I tried the following, and it worked quite well:

Loss = abs(E[tau] - tau) + NLL(mark) = MAE(tau) + NLL(mark)
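For concreteness, this is how I read that objective, with placeholder tensors (expected_tau would be E[tau] from the mixture, tau_true the observed inter-event times, mark_nll the per-event mark NLL); whether the two terms need a relative weighting is left open:

    import torch

    def mae_plus_mark_nll(expected_tau, tau_true, mark_nll):
        # Loss = MAE(tau) + NLL(mark), averaged over events.
        # All arguments are per-event tensors of the same shape
        # (placeholder names, not variables from the repo).
        return (torch.abs(expected_tau - tau_true) + mark_nll).mean()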

But is this a reasonable loss function?

Thank you for your help!

use other dataset

Hi, I am trying to use the Wikipedia data from the original code, but the data format is different. Could you tell me what I should do? I am new to this area; I'm sorry if the question is naive.

Understanding given datasets

Hi @shchur,

I'm not sure whether this is the right place to ask this question, but I need to understand the background of these datasets.

I see that in your data, the datasets are partitioned into multiple sequences. What is the reason behind partitioning a single sequence into multiple sequences? How do I partition my sequence into multiple sub-sequences? For your information, my data consists of multiple users (for the moment, around 50) posting on social media. Unique marks are assigned to every user.
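For what it's worth, here is a minimal sketch of one mechanical way to split a single long event stream into sub-sequences, here by fixed time windows; the window length and the choice to restart the clock per window are arbitrary illustration choices, not a recommendation from the authors:

    import numpy as np

    def split_into_windows(arrival_times, marks, window=7 * 24 * 3600.0):
        # Cut one long event stream into weekly sub-sequences (times in seconds).
        arrival_times = np.asarray(arrival_times, dtype=float)
        marks = np.asarray(marks)
        sequences = []
        start = arrival_times[0]
        while start < arrival_times[-1]:
            mask = (arrival_times >= start) & (arrival_times < start + window)
            if mask.sum() > 1:  # skip windows with too few events
                sequences.append({
                    "arrival_times": arrival_times[mask] - start,  # restart the clock per window
                    "marks": marks[mask],
                })
            start += window
        return sequences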

Any help would be greatly appreciated. Thank you.

How to get sequence embeddings?

Would you provide the code to get sequence embeddings? Or, if it is already in the codebase, could you please point me to the relevant part? It would save me a lot of time. Thanks!

LogNorm curiosity

Hi @shchur - my code has deviated somewhat from the repo as it currently stands, but we are still seeing a few oddities. Notably, we sometimes get negative inter-event times when calling decoder.sample with the LogNormMix implementation.

Having dug around the codebase, I think I've found the problem, but I just wanted to verify!

Looking at normal_sample:

    def normal_sample(means, log_scales):
        if means.shape != log_scales.shape:
            raise ValueError("Shapes of means and scales don't match.")
        z = torch.empty(means.shape).normal_(0., 1.)
        return torch.exp(log_scales) * z + means
as it is called from _sample, it seems that nothing prevents a negative result when the standard-normal draw z is negative enough relative to the mean and scale of the selected mixture component. Is my understanding correct?
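For reference, a log-normal draw would be obtained by exponentiating the normal draw, which is strictly positive; whether decoder.sample applies that exponentiation at a later step is not something I can tell from this snippet alone:

    import torch

    def lognormal_sample(means, log_scales):
        # exp of a Gaussian sample is log-normally distributed and always > 0.
        z = torch.randn_like(means)
        return torch.exp(torch.exp(log_scales) * z + means)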
