
ifl-tpp's Introduction

Intensity-Free Learning of Temporal Point Processes

PyTorch implementation of the paper "Intensity-Free Learning of Temporal Point Processes", Oleksandr Shchur, Marin Biloš and Stephan Günnemann, ICLR 2020.

Refactored code

The master branch contains a refactored version of the code. Some of the original functionality is missing, but the code is much cleaner and should be easier to extend.

You can find the original code (used for experiments in the paper) on branch original-code.

Usage

To run the code, you need to install the dpp library, which contains all the algorithms described in the paper:

cd code
python setup.py install

A Jupyter notebook code/interactive.ipynb contains the code for training models on the datasets used in the paper. The same code can also be run as a Python script code/train.py.

Using your own data

You can save your custom dataset in the format used in our code as follows:

import torch

dataset = {
    "sequences": [
        {"arrival_times": [0.2, 4.5, 9.1], "marks": [1, 0, 4], "t_start": 0.0, "t_end": 10.0},
        {"arrival_times": [2.3, 3.3, 5.5, 8.15], "marks": [4, 3, 2, 2], "t_start": 0.0, "t_end": 10.0},
    ],
    "num_marks": 5,
}
torch.save(dataset, "data/my_dataset.pkl")
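
The saved file can then be loaded back with torch.load (the dpp library may also provide its own dataset-loading utilities, which this one-liner does not assume):

    dataset = torch.load("data/my_dataset.pkl")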

Defining new models

RecurrentTPP is the base class for marked TPP models.

You just need to inherit from it and implement the get_inter_time_dist method that defines how to obtain the distribution (an instance of torch.distributions.Distribution) over the inter-event times given the context vector. For example, have a look at the LogNormMix model from our paper. You can also change the get_features and get_context methods of RecurrentTPP to, for example, use a transformer instead of an RNN.
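
For example, here is a minimal sketch of a custom model. The import path, the constructor arguments, and the context_size attribute are assumptions about the dpp API rather than its exact interface, and the base class may expect additional methods (e.g. a log survival function), so treat this as a sketch, not a drop-in implementation:

    import torch.nn as nn
    import torch.nn.functional as F
    from torch.distributions import Exponential

    from dpp.models import RecurrentTPP  # assumed import path


    class ExponentialTPP(RecurrentTPP):
        """Toy model: inter-event times follow an Exponential distribution
        whose rate is predicted from the context vector."""

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            # context_size is assumed to be an attribute set by RecurrentTPP
            self.linear_rate = nn.Linear(self.context_size, 1)

        def get_inter_time_dist(self, context):
            # context: (batch_size, seq_len, context_size)
            rate = F.softplus(self.linear_rate(context)).squeeze(-1)
            return Exponential(rate=rate)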

Mistakes in the old version

  • In the old code we used to normalize the NLL of each sequence by the number of events in that sequence; this was incorrect. When computing the NLL for multiple TPP sequences, we may only divide the NLL of each sequence by the same constant (see the sketch below).
  • In the old code we didn't include the survival time of the last event (i.e. the time from the last event until the end of the observed interval) in the NLL computation. This is fixed in the refactored version (and, by the way, this seems to be a common mistake in other TPP implementations online).
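
A minimal sketch of the normalization point above; model.log_prob and the sequence objects are placeholders, not the exact dpp API:

    import torch

    def batch_nll(model, batch):
        # `batch` is a list of event sequences; model.log_prob(seq) stands in for
        # the per-sequence log-likelihood computed by the model.
        per_seq_nll = torch.stack([-model.log_prob(seq) for seq in batch])
        # Old (incorrect): per_seq_nll[i] / num_events[i], which weights sequences
        # inconsistently. Correct: divide every sequence's NLL by the same constant,
        # e.g. simply average over sequences.
        return per_seq_nll.mean()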

Requirements

numpy=1.16.4
pytorch=1.2.0
scikit-learn=0.21.2
scipy=1.3.1

Cite

Please cite our paper if you use the code or datasets in your own work:

@article{
    shchur2020intensity,
    title={Intensity-Free Learning of Temporal Point Processes},
    author={Oleksandr Shchur and Marin Bilo\v{s} and Stephan G\"{u}nnemann},
    journal={International Conference on Learning Representations (ICLR)},
    year={2020},
}


ifl-tpp's Issues

Calculate the mean of the entire distribution.

Thanks for your code.
I wonder how to calculate the mean of the entire distribution.
Besides the NLL of \tau, I also want to compute the RMSE/MAE of \tau, and for that I need the mean of \tau.
I think I should use the equation E[\tau] = \sum_k w_k \exp(\mu_k + s_k^2/2), but I cannot get the right answer.
Could you tell me how to get the mean? Thank you!
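
For what it's worth, here is a minimal sketch of the log-normal mixture mean for generic tensors. This is not the dpp API; if the inter-event times were standardized during training, the component parameters must be un-standardized first, as in the follow-up below:

    import torch

    def lognormal_mixture_mean(weights, means, log_scales):
        # weights: (..., K) mixture weights summing to 1
        # means, log_scales: (..., K) Gaussian parameters in log-space
        # E[tau] = sum_k w_k * exp(mu_k + s_k^2 / 2)
        scales = torch.exp(log_scales)
        return torch.sum(weights * torch.exp(means + 0.5 * scales ** 2), dim=-1)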

Loss with NLL of mark and MAE of inter-event time

Hi Oleksandr,

I want to model a marked TPP with LogNormMix. For that I also want to use the MAE as a loss for the inter-event times, but I'm unsure how to combine it with the NLL loss of the predicted marks. I tried the following and it worked quite well:

Loss = abs(E[p(tau)] - tau) + NLL(mark) = MAE(tau) + NLL(mark)

But is this a reasonable loss function?

Thank you for your help!
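
A minimal sketch of the proposed loss, assuming the model's expected inter-event time E[p(tau)] and the mark NLL are already available as tensors; these names are placeholders, not the dpp API:

    import torch

    def combined_loss(expected_tau, observed_tau, mark_nll):
        # MAE between the predicted mean inter-event time and the observed one,
        # plus the negative log-likelihood of the predicted marks.
        mae = torch.abs(expected_tau - observed_tau).mean()
        return mae + mark_nll.mean()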

Calculate the mean of the entire distribution.

According to your reply, I calculate the mean as follows:

    prior_logits, means, log_scales = base_dist.get_params(h, emb)
    s = torch.exp(log_scales)
    prior = torch.exp(prior_logits)
    expectation = torch.sum(prior * torch.exp(a * means + b + a * a * s * s / 2), dim=-1)

where a = std_in_train, b = mean_in_train, and base_dist = NormalMixtureDistribution(). However, I get the error WARNING:root:NaN or Inf found in input tensor. Could you help me find out why I get this error? Which step in my code is wrong?

LogNorm curiosity

Hi @shchur - my code has deviated somewhat from the repo as it currently stands, but we are still experiencing a few oddities. Notably, we are sometimes getting negative inter-event times when calling decoder.sample with the LogNormMix implementation.

Having dug around the codebase, I think I've found the problem, but I just wanted to verify!

Looking at normal_sample:

    def normal_sample(means, log_scales):
        if means.shape != log_scales.shape:
            raise ValueError("Shapes of means and scales don't match.")
        z = torch.empty(means.shape).normal_(0., 1.)
        return torch.exp(log_scales) * z + means

As it is called in the _sample call, it seems that nothing prevents a negative result if the sample drawn at Line 16 is negative and large enough in magnitude relative to the mean of the corresponding mixture component. Is my understanding of this presently correct?
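
For context, the Gaussian sample above can indeed be negative; a log-normal sample is obtained by exponentiating it, which is always positive. A minimal sketch, not the dpp code:

    import torch

    def lognormal_sample(means, log_scales):
        # Exponentiating the Gaussian sample guarantees a positive inter-event time.
        z = torch.empty(means.shape).normal_(0., 1.)
        return torch.exp(torch.exp(log_scales) * z + means)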

How to get sequence embeddings?

Would you provide the code to get sequence embeddings? Or, if it is already provided in the code, would you please let me know which part? It would save me a lot of time. Thanks.

Hyperparameters for reproducibility

Thank you for providing your code, it is very well organized.

Would it also be possible to release the hyperparameters used for each of the datasets in the paper? I'm having trouble reproducing the numbers in Table 3 in the appendix for the provided datasets (StackOverflow and the synthetic datasets). For example, running the code out of the box gives:

Breaking due to early stopping at epoch 172
Negative log-likelihood:

  • Train: 1151.5
  • Val: 1179.1
  • Test: 1131.5

for the StackOverflow dataset at seed = 0, which does not match the 14.4 figure.

In addition, I was wondering if it would be possible to release the code for event time prediction using history?

Understanding given datasets

Hi @shchur,

I do not really know if this is the right place to ask this question, but I need to know the background of these datasets.

I see that in your data, the datasets are partitioned into multiple sequences. What is the reason behind partitioning a single sequence into multiple sequences? How do I partition my sequence into multiple sub-sequences? For your information, my data consists of multiple users (for the moment, around 50) posting on social media. Unique marks are assigned to every user.

Any help would be greatly appreciated. Thank you.

How could I get the predicted results?

This repo has been very enlightening for me.

However, in my work I focus more on the predicted results.

It's hard for me to understand the implementation details well enough to derive the predicted results, as I only stepped into this area a few months ago.

Could you please tell me how to get the predicted results based on your code?

I'd appreciate it if you could help.

Thanks anyway.

Implementation on missing data imputation

Hi @shchur, thanks for releasing the code! I'm wondering if it is possible that you could provide the code for the section of missing data imputation - Sec. 5.4 & F.4 MISSING DATA IMPUTATION from your paper.

I'm curious about your implementation of feeding imputations to the RNN. If we keep your current framework, then to include the imputations in the history, the batch keeps changing and we need to call get_features(batch) and get_context(features) every time we have a new imputed event. This makes training very slow. Could you provide your implementation of this part, or give me some suggestions on implementing this 'training while imputing'? Thanks!

Learning with Marks

Thanks for the release of your paper and code.
In trying to implement learning with marks with the provided interactive notebook, adapting the remarks in the paper, I'm also running into some trouble. Based on appendix F.2, I assume it's a case of just adding the terms?

model.log_prob in this case returns (time_log_prob, mark_nll, accuracy) - so, adapting the training loop, is it as simple as changing the lines below?

        log_prob = model.log_prob(input)
        loss = -model.aggregate(log_prob, input.length)

for

        if use_marks:
            log_prob, mark_nll, mark_acc = model.log_prob(input)
        else:
            log_prob = model.log_prob(input)
            mark_nll = 0.0  # no mark term when marks are not used
        loss = -model.aggregate(log_prob + mark_nll, input.length)

As a side problem - when doing the above with my custom dataset (which conforms to the same formatting as the example datasets, so arrival_times and marks), all loss terms are NaN. I'm wondering if you might have some insight as to why this might be occurring! When using the reddit dataset with the above modifications, I get non-zero loss terms for both log_prob and mark_nll.

Code for using context vector in the models

@shchur - thank you so much for sharing the code, it is very readable and follows the paper closely. I was hoping you could give me pointers on how to incorporate the context vector $y_i$ in the models. This is described for the Yelp dataset in sec. F.3 in the paper but I was not able to find it in your codebase.
Thanks again for your help!

on log likelihood misunderstanding

In the log-likelihood there are two terms:
\log\lambda(\tau) and -\int_0^\tau \lambda(s) ds.
However, the first term in your code is computed directly using the inter_time_dist.log_prob() function, which is already the log of the PDF, i.e. the sum of the two terms. Then the second term is computed again by log_survival_function. I want to make sure whether I have misunderstood, or whether this is an error.
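
For reference, the standard point-process identities relating the intensity, the inter-event time density, and the survival function are (textbook relations, not a claim about what the dpp code computes):

    \log f(\tau) = \log\lambda(\tau) - \int_0^\tau \lambda(s)\,ds,
    \qquad
    \log S(\tau) = -\int_0^\tau \lambda(s)\,ds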

Missing data imputation

Hi, this is a very inspiring work, and I'm quite interested in section 5.4 (Missing data imputation). Could you please provide a pointer to the code implementation? Thanks very much!

use other dataset

Hi, I am trying to use the Wikipedia data from the original code, but the type of the data is different. Please tell me what I should do. I am new to this area; I am sorry if the question is naive.

all evaluation experiments code

Would you mind releasing the code for the other evaluation experiments, including learning with marks, mark prediction, and event time prediction using history?

Sampling with additional conditional information

Hi, Shchur! Thank you so much for sharing your code. It is pretty helpful for my current research, but I encountered a problem with sampling.

I trained an ifl-tpp model with additional conditional information (e.g., Yelp dataset). However, during the sampling process, there is no additional information for the model. So, I don't know how to get the final context embedding with additional conditional information during sampling.

Do you have some suggestions for this problem? Thanks!

Sampling points of a specific mark

Hi, thanks for the wonderful code. I am able to use it easily and reproduce the results.

I wanted to sample points of a specific mark within a given horizon. I would appreciate any pointers on how I could do this with your code.

ATM dataset testing

Hi,

I'm new to TPP.

I was trying to run your interactive IPython notebook with the ATM dataset as a CSV file. I see that all the datasets you used are .npz files.
Do I have to preprocess the ATM CSV file into .npz? How do you set the sequence length, given that the ATM dataset I have contains only 4 months of data?

Could you please clarify?

Regards,
kiahsa

history

I want to know whether your method takes into account the influence of historical events. Thank you.
