
convolutional_lstm_pytorch's People

Contributors

automan000

convolutional_lstm_pytorch's Issues

Peephole connections (Wci, Wcf, Wco) gradient update

The LSTM paper defines a specific rule for gradient updates of the 'peephole' connections. Specifically:

[...] during learning no error signals are propagated back from gates via peephole connections to CEC

Based on my understanding of the code, the way these three variables are initialized (as asked in Issue 17) is an attempt at implementing this update rule, but I don't see how initializing them as Variables helps. From my reading of the quoted part of the LSTM paper, the peephole connections should be updated, but the gradient that updates them should stop there and not flow any further. If that is the case, then this implementation is incorrect, although it might be that PyTorch does not support such an operation, as .detach() is not suitable for the job.
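For illustration, a minimal standalone sketch of the update rule described above (hypothetical, not the repository's code): detaching the cell state only inside the peephole term lets the peephole weight receive a gradient while blocking the error signal from flowing back through the cell state to the CEC.

import torch

# c plays the role of the cell state (CEC), w the peephole weight.
c = torch.randn(3, requires_grad=True)
w = torch.randn(3, requires_grad=True)
gate = torch.sigmoid(c.detach() * w)   # peephole term with the CEC path cut
gate.sum().backward()
print(c.grad)  # None: no error propagated back to the CEC via the peephole
print(w.grad)  # populated: the peephole weight is still updated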

Using with a custom dataset

Hi, I'm trying to apply your code to a sequential image dataset.

However, whenever I try to input concatenated images (Batch x Timeseries x Channels x Width x Height), it gives the following error.



<ipython-input-40-c130ba752793> in forward(self, input, h, c)
     29         input = input.cuda()
     30 
---> 31         combined = torch.cat((input, h), dim=1)
     32 
     33         A = self.conv(combined)

TypeError: cat received an invalid combination of arguments - got (tuple, dim=int), but expected one of:
 * (sequence[torch.cuda.FloatTensor] seq)
 * (sequence[torch.cuda.FloatTensor] seq, int dim)
      didn't match because some of the arguments have invalid types: (tuple, dim=int)

It seems like something is wrong with the .cuda() declaration.

So I looked into the two variables, input and h:

input has type torch.FloatTensor,

h has type torch.autograd.variable.Variable.

According to this article, the two have to be converted to the same data type.

My questions are:

i) Have you encountered this same issue?

ii) I tried to change the data type of h with h.cuda(), but it didn't work. I'm not used to PyTorch, so is there any advice you can give?
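For what it's worth, a minimal sketch of one way to make the two operands compatible under pre-0.4 PyTorch, where a plain tensor and a Variable could not be concatenated (variable names taken from the traceback above):

import torch
from torch.autograd import Variable

# Wrap the raw FloatTensor in a Variable and move it to the GPU, so that
# both operands of torch.cat are cuda Variables of the same type.
input = Variable(input).cuda()
combined = torch.cat((input, h), dim=1)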

BTW, thank you for your update. It gave me a lot of help.

Input shape issue and lack of bias.

The first problem is that in ConvLSTM.forward, the code uses the same x = input at every timestep.
I think the input shape of the forward function should be changed to

[sequence, bsize, channel, x, y]

instead of the original

[bsize, channel, x, y]

and the x = input line should be changed to

x = input[step]

so that each step receives a different frame (see the sketch below).
I am still studying whether it is appropriate to loop over layers inside the loop over timesteps, but after training your current code (with the change mentioned above), I can get decent outcomes.
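A minimal sketch of the suggested loop shape (names like self.cells, h, and c are hypothetical, for illustration only):

# input: [sequence, bsize, channel, x, y]
for step in range(input.size(0)):
    x = input[step]                   # one frame per step: [bsize, channel, x, y]
    for i in range(self.num_layers):  # layers looped inside the timestep loop
        h[i], c[i] = self.cells[i](x, h[i], c[i])
        x = h[i]                      # this layer's output feeds the next layer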

The second problem is that in ConvLSTMCell there are no biases. For example, in

ci = torch.sigmoid(self.Wxi(x) + self.Whi(h) + c * self.Wci)

while it should be something like

ci = torch.sigmoid(self.Wxi(x) + self.Whi(h) + c * self.Wci + self.Bci)

But I don't know whether such constants would affect the backward pass.
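If the gate transforms self.Wxi, self.Whi, etc. are nn.Conv2d modules (an assumption about the implementation), a separate Bci constant may be unnecessary, since a learnable per-channel bias is already available through the constructor:

import torch.nn as nn

# Hypothetical construction: the bias of this convolution plays the role of Bci
# in the i-gate pre-activation; bias=True is in fact the nn.Conv2d default.
self.Wxi = nn.Conv2d(input_channels, hidden_channels, kernel_size=3,
                     padding=1, bias=True)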

P.S. I'm a beginner myself, so maybe I'm wrong. Please reply :)

RuntimeError: Jacobian mismatch for output 0 with respect to input 0

Running the default code snippet in PyTorch 0.4.0, it hangs for a bit and then this error shows up.

Complete Error:

RuntimeError                              Traceback (most recent call last)
in ()
    101 output = convlstm(input)
    102 output = output[0][0]
--> 103 res = torch.autograd.gradcheck(loss_fn, (output, target), raise_exception=True)
    104 print(res)

C:\Anaconda3\lib\site-packages\torch\autograd\gradcheck.py in gradcheck(func, inputs, eps, atol, rtol, raise_exception)
    190 if not ((a - n).abs() <= (atol + rtol * n.abs())).all():
    191     return fail_test('Jacobian mismatch for output %d with respect to input %d,\n'
--> 192                      'numerical:%s\nanalytical:%s\n' % (i, j, n, a))
    193
    194 if not reentrant:

C:\Anaconda3\lib\site-packages\torch\autograd\gradcheck.py in fail_test(msg)
    170 def fail_test(msg):
    171     if raise_exception:
--> 172         raise RuntimeError(msg)
    173     return False
    174

RuntimeError: Jacobian mismatch for output 0 with respect to input 0,
numerical: tensor(1.00000e-02 *
    [[ 0.0000],
     [ 0.0000],
     [ 0.0000],
     ...,
     [ 0.0000],
     [ 0.0000],
     [ 0.0000]])
analytical: tensor([[-3.6467e-05],
    [ 2.2621e-06],
    [-3.9878e-05],
    ...,
    [-4.3014e-05],
    [ 1.2278e-05],
    [ 3.6285e-06]])
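One likely cause (an assumption about this particular snippet, but documented behavior of gradcheck): torch.autograd.gradcheck compares a finite-difference Jacobian against the analytical one and is only reliable in double precision, so single-precision inputs commonly produce exactly this kind of mismatch. A sketch of a double-precision check, reusing the names from the snippet above:

import torch

# gradcheck's finite differences need float64 accuracy; cast everything first.
convlstm = convlstm.double()
input = torch.randn(1, 512, 64, 32, dtype=torch.float64, requires_grad=True)
output = convlstm(input)[0][0]
res = torch.autograd.gradcheck(loss_fn, (output, target.double()),
                               raise_exception=True)
print(res)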

The shape of the output

Hello,
I have a question about the output of the first conv_lstm layer: what is its shape?

About forward

I'm very happy to see that you have addressed the previous issue, but I suggest that your ConvLSTMCell forward pass use the following calculations:
ci = torch.sigmoid(self.Wxi(input) + self.Whi(hidden_state) + c * self.Wci)      # input gate
cf = torch.sigmoid(self.Wxf(input) + self.Whf(hidden_state) + c * self.Wcf)      # forget gate
new_c = cf * c + ci * torch.tanh(self.Wxc(input) + self.Whc(hidden_state))       # updated cell state
co = torch.sigmoid(self.Wxo(input) + self.Who(hidden_state) + new_c * self.Wco)  # output gate peeks at new_c
new_h = co * torch.tanh(new_c)                                                   # new hidden state

What's the shape of input?

This is a question rather than an issue, sorry for bothering.
I have questions about the input/output shape, as well as the meaning of "input_channel".
According to the #gradient check part of the code, the input has shape (1, 512, 64, 32), while the output shrinks to (1, 32, 64, 32).
I assume that for the input, 1 is the batch size, 512 is input_channel, and the image is of size 64x32.

The questions are: What do these channels mean? Are they filters, as in the Keras ConvLSTM library (https://keras.io/layers/recurrent/#convlstm2dcell)? How do we input a sequence of 5 images? And why is the output channel count smaller than the input's?
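For comparison, a sketch of how a 5-image sequence might be arranged for a per-step cell (the shapes are assumed from the gradient-check snippet, not this repository's documented API):

import torch

# A hypothetical 5-frame sequence: (time, batch, channels, height, width).
# A per-step ConvLSTM cell would consume one 4D slice per iteration, whereas
# Keras ConvLSTM2D takes the whole 5D tensor at once.
seq = torch.randn(5, 1, 512, 64, 32)
for t in range(seq.size(0)):
    x_t = seq[t]          # (1, 512, 64, 32), matching the snippet's input shape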

About concat

https://arxiv.org/pdf/1506.04214.pdf
In this paper, the ConvLSTM equations are (where * denotes convolution and ∘ the Hadamard product):

i_t = sigmoid(W_xi * X_t + W_hi * H_{t-1} + W_ci ∘ C_{t-1} + b_i)
f_t = sigmoid(W_xf * X_t + W_hf * H_{t-1} + W_cf ∘ C_{t-1} + b_f)
C_t = f_t ∘ C_{t-1} + i_t ∘ tanh(W_xc * X_t + W_hc * H_{t-1} + b_c)
o_t = sigmoid(W_xo * X_t + W_ho * H_{t-1} + W_co ∘ C_t + b_o)
H_t = o_t ∘ tanh(C_t)

So the LSTM computes i(t), f(t), o(t) using x(t), h(t-1), and c(t-1), but in your code, convolution_lstm.py line 24:

combined = torch.cat((input, h), dim=1)

you use only x and h. Why?
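For context, one common reading of this trick (an assumption about the design, not a statement from the author): concatenating x and h lets a single convolution produce the W_x*x + W_h*h terms of all gates at once, while the c-dependent peephole terms would still have to be added per gate after splitting. A hypothetical continuation:

import torch

# One conv over the concatenated channels yields the x- and h-terms of every
# gate; the peephole contributions from c are then added gate by gate.
combined = torch.cat((input, h), dim=1)              # channels: C_x + C_h
A = self.conv(combined)                              # all gates' Wx*x + Wh*h at once
ai, af, ao, ag = torch.split(A, self.hidden_channels, dim=1)
i = torch.sigmoid(ai + c * self.Wci)                 # peephole on the input gate
f = torch.sigmoid(af + c * self.Wcf)                 # peephole on the forget gate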

Where is the sequence length?

input = Variable(torch.randn(1, 512, 64, 32)).cuda()

One dimension is for the batch size, one for the channels, and the last two for H and W. Where is the sequence length?

Error in backward

Thanks for your implementation of ConvLSTM. However, there may be some bugs in the code. I use the code as part of my project: convolutional features are extracted from images and passed to the ConvLSTM, which is followed by a fully-connected layer and the loss. But loss.backward() reports an error saying that the retain_graph parameter should be True. However, setting retain_graph=True consumes more and more memory and slows the program down.
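A common remedy for this pattern (a sketch, assuming the hidden state is carried across training iterations): detach h and c after each optimization step, so each backward() only needs the current iteration's graph and retain_graph can stay False.

# Truncate backpropagation at the batch boundary: keep the values of h and c
# but drop their autograd history, so the next backward() never reaches into
# the previous iteration's (already freed) graph.
loss.backward()                    # retain_graph left at its default (False)
optimizer.step()
h, c = h.detach(), c.detach()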

Why should Wci, Wcf, Wco be initialized at the beginning of each batch?

@automan000 Thanks a lot for your work! I am a little confused by this code:

def init_hidden(self, batch_size, hidden, shape):
    self.Wci = Variable(torch.zeros(1, hidden, shape[0], shape[1])).cuda()
    self.Wcf = Variable(torch.zeros(1, hidden, shape[0], shape[1])).cuda()
    self.Wco = Variable(torch.zeros(1, hidden, shape[0], shape[1])).cuda()
    return (Variable(torch.zeros(batch_size, hidden, shape[0], shape[1])).cuda(),
            Variable(torch.zeros(batch_size, hidden, shape[0], shape[1])).cuda())

Why should Wci, Wcf, Wco be initialized here at all?
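One possible alternative (a sketch of an assumption, not the author's stated intent): register the peephole weights once as nn.Parameter in __init__, so their learned values persist instead of being reset to zero Variables on every init_hidden call.

import torch
import torch.nn as nn

# Created once in __init__, the peephole weights become ordinary learnable
# parameters and are no longer zeroed at the start of each batch.
self.Wci = nn.Parameter(torch.zeros(1, hidden, shape[0], shape[1]))
self.Wcf = nn.Parameter(torch.zeros(1, hidden, shape[0], shape[1]))
self.Wco = nn.Parameter(torch.zeros(1, hidden, shape[0], shape[1]))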

Question about input and step

The input data in the code is defined as input = Variable(torch.randn(1, 512, 64, 32)).cuda(), where the dimensions are (batch_size, channels, height, width).
When the input is passed through the ConvLSTM network, there is the following for loop:

for step in range(self.step):
    x = input
    for i in range(self.num_layers):
        ...

Is the input x the same at every iteration of this loop? Where is the sequence reflected? If I have misunderstood anything, I hope you can correct me. Thank you!
