
convolutional_lstm_pytorch's People

Contributors

automan000

convolutional_lstm_pytorch's Issues

Peephole connections (Wci, Wcf, Wco) gradient update

The LSTM paper defines a specific rule for gradient updates of the 'peephole' connections. Specifically:

[...] during learning no error signals are propagated back from gates via peephole connections to CEC

Based on my understanding of the code, the way these three variables are initialized (as asked in Issue 17) is an attempt at implementing this update rule, but I don't see how initializing them as Variables helps. From my reading of the quoted part of the LSTM paper, the peephole connections should be updated, but the gradient that updates them should stop there and not flow any further. If that is the case, then this implementation is incorrect, although it might be that PyTorch does not support such an operation, as .detach() is not suitable for the job.
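For illustration, a minimal standalone sketch of the update rule described above (hypothetical, not the repository's code): detaching the cell state only inside the peephole term lets the peephole weight receive a gradient while blocking the error signal from flowing back through the cell state to the CEC.

import torch

# c plays the role of the cell state (CEC), w the peephole weight.
c = torch.randn(3, requires_grad=True)
w = torch.randn(3, requires_grad=True)
gate = torch.sigmoid(c.detach() * w)   # peephole term with the CEC path cut
gate.sum().backward()
print(c.grad)  # None: no error propagated back to the CEC via the peephole
print(w.grad)  # populated: the peephole weight is still updated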

Using with a custom dataset

Hi, I'm trying to apply your code to a sequential image dataset.

However, whenever I try to input concatenated images (Batch x Timeseries x Channels x Width x Height), it gives the following error.



<ipython-input-40-c130ba752793> in forward(self, input, h, c)
     29         input = input.cuda()
     30 
---> 31         combined = torch.cat((input, h), dim=1)
     32 
     33         A = self.conv(combined)

TypeError: cat received an invalid combination of arguments - got (tuple, dim=int), but expected one of:
 * (sequence[torch.cuda.FloatTensor] seq)
 * (sequence[torch.cuda.FloatTensor] seq, int dim)
      didn't match because some of the arguments have invalid types: (tuple, dim=int)

It seems like something is wrong with the .cuda() declaration.

So I looked into the two variables, input and h:

input has type torch.FloatTensor,

h has type torch.autograd.variable.Variable.

According to this article, the two have to be converted to the same data type.

My questions are:

i) Have you encountered this same issue?

ii) I tried to change the data type of h with h.cuda(), but it didn't work. I'm not used to PyTorch, so is there any advice you can give?
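For what it's worth, a minimal sketch of one way to make the two operands compatible under pre-0.4 PyTorch, where a plain tensor and a Variable could not be concatenated (variable names taken from the traceback above):

import torch
from torch.autograd import Variable

# Wrap the raw FloatTensor in a Variable and move it to the GPU, so that
# both operands of torch.cat are cuda Variables of the same type.
input = Variable(input).cuda()
combined = torch.cat((input, h), dim=1)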

BTW, thank you for your update. It gave me a lot of help.

Input shape issue and lack of bias.

The first problem is that in ConvLSTM.forward, the code uses the same x = input at every timestep.
I think the input shape of the forward function should be changed to

[sequence, bsize, channel, x, y]

instead of the original

[bsize, channel, x, y]

and the x = input line should be changed to

x = input[step]

so that each step receives a different frame (see the sketch below).
I am still studying whether it is appropriate to loop over layers inside the loop over timesteps, but after training your current code (with the change mentioned above), I can get decent outcomes.
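A minimal sketch of the suggested loop shape (names like self.cells, h, and c are hypothetical, for illustration only):

# input: [sequence, bsize, channel, x, y]
for step in range(input.size(0)):
    x = input[step]                   # one frame per step: [bsize, channel, x, y]
    for i in range(self.num_layers):  # layers looped inside the timestep loop
        h[i], c[i] = self.cells[i](x, h[i], c[i])
        x = h[i]                      # this layer's output feeds the next layer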

The second problem is that in ConvLSTMCell there are no biases. For example, in

ci = torch.sigmoid(self.Wxi(x) + self.Whi(h) + c * self.Wci)

while it should be something like

ci = torch.sigmoid(self.Wxi(x) + self.Whi(h) + c * self.Wci + self.Bci)

But I don't know whether such constants would affect the backward pass.
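If the gate transforms self.Wxi, self.Whi, etc. are nn.Conv2d modules (an assumption about the implementation), a separate Bci constant may be unnecessary, since a learnable per-channel bias is already available through the constructor:

import torch.nn as nn

# Hypothetical construction: the bias of this convolution plays the role of Bci
# in the i-gate pre-activation; bias=True is in fact the nn.Conv2d default.
self.Wxi = nn.Conv2d(input_channels, hidden_channels, kernel_size=3,
                     padding=1, bias=True)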

P.S. I'm a beginner myself, so maybe I'm wrong. Please reply :)

RuntimeError: Jacobian mismatch for output 0 with respect to input 0

Running the default code snippet in PyTorch 0.4.0, it hangs for a bit and then this error shows up.

Complete Error:

RuntimeError                              Traceback (most recent call last)
in ()
    101 output = convlstm(input)
    102 output = output[0][0]
--> 103 res = torch.autograd.gradcheck(loss_fn, (output, target), raise_exception=True)
    104 print(res)

C:\Anaconda3\lib\site-packages\torch\autograd\gradcheck.py in gradcheck(func, inputs, eps, atol, rtol, raise_exception)
    190 if not ((a - n).abs() <= (atol + rtol * n.abs())).all():
    191     return fail_test('Jacobian mismatch for output %d with respect to input %d,\n'
--> 192                      'numerical:%s\nanalytical:%s\n' % (i, j, n, a))
    193
    194 if not reentrant:

C:\Anaconda3\lib\site-packages\torch\autograd\gradcheck.py in fail_test(msg)
    170 def fail_test(msg):
    171     if raise_exception:
--> 172         raise RuntimeError(msg)
    173     return False
    174

RuntimeError: Jacobian mismatch for output 0 with respect to input 0,
numerical: tensor(1.00000e-02 *
    [[ 0.0000],
     [ 0.0000],
     [ 0.0000],
     ...,
     [ 0.0000],
     [ 0.0000],
     [ 0.0000]])
analytical: tensor([[-3.6467e-05],
    [ 2.2621e-06],
    [-3.9878e-05],
    ...,
    [-4.3014e-05],
    [ 1.2278e-05],
    [ 3.6285e-06]])
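One likely cause (an assumption about this particular snippet, but documented behavior of gradcheck): torch.autograd.gradcheck compares a finite-difference Jacobian against the analytical one and is only reliable in double precision, so single-precision inputs commonly produce exactly this kind of mismatch. A sketch of a double-precision check, reusing the names from the snippet above:

import torch

# gradcheck's finite differences need float64 accuracy; cast everything first.
convlstm = convlstm.double()
input = torch.randn(1, 512, 64, 32, dtype=torch.float64, requires_grad=True)
output = convlstm(input)[0][0]
res = torch.autograd.gradcheck(loss_fn, (output, target.double()),
                               raise_exception=True)
print(res)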

The shape of the output

Hello,
I have a question about the output of the first conv_lstm layer: what is its shape?

About forward

I'm very happy to see that you have addressed the previous issue, but I suggest that your ConvLSTMCell forward pass use the following calculations:
ci = torch.sigmoid(self.Wxi(input) + self.Whi(hidden_state) + c * self.Wci)      # input gate
cf = torch.sigmoid(self.Wxf(input) + self.Whf(hidden_state) + c * self.Wcf)      # forget gate
new_c = cf * c + ci * torch.tanh(self.Wxc(input) + self.Whc(hidden_state))       # updated cell state
co = torch.sigmoid(self.Wxo(input) + self.Who(hidden_state) + new_c * self.Wco)  # output gate peeks at new_c
new_h = co * torch.tanh(new_c)                                                   # new hidden state

What's the shape of input?

This is a question rather than an issue, sorry for bothering.
I have questions about the input/output shape, as well as the meaning of "input_channel".
According to the #gradient check part of the code, the input has shape (1, 512, 64, 32), while the output shrinks to (1, 32, 64, 32).
I assume that for the input, 1 is the batch size, 512 is input_channel, and the image is of size 64x32.

The questions are: What do these channels mean? Are they filters, as in the Keras ConvLSTM library (https://keras.io/layers/recurrent/#convlstm2dcell)? How do we input a sequence of 5 images? And why is the output channel count smaller than the input's?
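For comparison, a sketch of how a 5-image sequence might be arranged for a per-step cell (the shapes are assumed from the gradient-check snippet, not this repository's documented API):

import torch

# A hypothetical 5-frame sequence: (time, batch, channels, height, width).
# A per-step ConvLSTM cell would consume one 4D slice per iteration, whereas
# Keras ConvLSTM2D takes the whole 5D tensor at once.
seq = torch.randn(5, 1, 512, 64, 32)
for t in range(seq.size(0)):
    x_t = seq[t]          # (1, 512, 64, 32), matching the snippet's input shape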

About concat

https://arxiv.org/pdf/1506.04214.pdf
In this paper, the ConvLSTM equations are (where * denotes convolution and ∘ the Hadamard product):

i_t = sigmoid(W_xi * X_t + W_hi * H_{t-1} + W_ci ∘ C_{t-1} + b_i)
f_t = sigmoid(W_xf * X_t + W_hf * H_{t-1} + W_cf ∘ C_{t-1} + b_f)
C_t = f_t ∘ C_{t-1} + i_t ∘ tanh(W_xc * X_t + W_hc * H_{t-1} + b_c)
o_t = sigmoid(W_xo * X_t + W_ho * H_{t-1} + W_co ∘ C_t + b_o)
H_t = o_t ∘ tanh(C_t)

So the LSTM computes i(t), f(t), o(t) using x(t), h(t-1), and c(t-1), but in your code, convolution_lstm.py line 24:

combined = torch.cat((input, h), dim=1)

you use only x and h. Why?
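For context, one common reading of this trick (an assumption about the design, not a statement from the author): concatenating x and h lets a single convolution produce the W_x*x + W_h*h terms of all gates at once, while the c-dependent peephole terms would still have to be added per gate after splitting. A hypothetical continuation:

import torch

# One conv over the concatenated channels yields the x- and h-terms of every
# gate; the peephole contributions from c are then added gate by gate.
combined = torch.cat((input, h), dim=1)              # channels: C_x + C_h
A = self.conv(combined)                              # all gates' Wx*x + Wh*h at once
ai, af, ao, ag = torch.split(A, self.hidden_channels, dim=1)
i = torch.sigmoid(ai + c * self.Wci)                 # peephole on the input gate
f = torch.sigmoid(af + c * self.Wcf)                 # peephole on the forget gate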

Where is the sequence length?

input = Variable(torch.randn(1, 512, 64, 32)).cuda()

One dimension is for the batch size, one for the channels, and the last two for H and W. Where is the sequence length?

Error in backward

Thanks for your implementation of ConvLSTM. However, there may be some bugs in the code. I use the code as part of my project: convolutional features are extracted from images and passed to the ConvLSTM, which is followed by a fully-connected layer and the loss. But loss.backward() reports an error saying that the retain_graph parameter should be True. However, setting retain_graph=True consumes more and more memory and slows the program down.
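A common remedy for this pattern (a sketch, assuming the hidden state is carried across training iterations): detach h and c after each optimization step, so each backward() only needs the current iteration's graph and retain_graph can stay False.

# Truncate backpropagation at the batch boundary: keep the values of h and c
# but drop their autograd history, so the next backward() never reaches into
# the previous iteration's (already freed) graph.
loss.backward()                    # retain_graph left at its default (False)
optimizer.step()
h, c = h.detach(), c.detach()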

Why should Wci, Wcf, Wco be initialized at the beginning of each batch?

@automan000 Thanks a lot for your work! I am a little confused by this code:

def init_hidden(self, batch_size, hidden, shape):
    self.Wci = Variable(torch.zeros(1, hidden, shape[0], shape[1])).cuda()
    self.Wcf = Variable(torch.zeros(1, hidden, shape[0], shape[1])).cuda()
    self.Wco = Variable(torch.zeros(1, hidden, shape[0], shape[1])).cuda()
    return (Variable(torch.zeros(batch_size, hidden, shape[0], shape[1])).cuda(),
            Variable(torch.zeros(batch_size, hidden, shape[0], shape[1])).cuda())

Why should Wci, Wcf, Wco be initialized here at all?
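One possible alternative (a sketch of an assumption, not the author's stated intent): register the peephole weights once as nn.Parameter in __init__, so their learned values persist instead of being reset to zero Variables on every init_hidden call.

import torch
import torch.nn as nn

# Created once in __init__, the peephole weights become ordinary learnable
# parameters and are no longer zeroed at the start of each batch.
self.Wci = nn.Parameter(torch.zeros(1, hidden, shape[0], shape[1]))
self.Wcf = nn.Parameter(torch.zeros(1, hidden, shape[0], shape[1]))
self.Wco = nn.Parameter(torch.zeros(1, hidden, shape[0], shape[1]))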

Question about input and step

The input data in the code is defined as input = Variable(torch.randn(1, 512, 64, 32)).cuda(), where the dimensions are (batch_size, channels, height, width).
When the input is passed through the ConvLSTM network, there is the following for loop:

for step in range(self.step):
    x = input
    for i in range(self.num_layers):
        ...

Is the input x the same at every iteration of this loop? Where is the sequence reflected? If I have misunderstood anything, I hope you can correct me. Thank you!
