jthsieh / ddpae-video-prediction
Learning to Decompose and Disentangle Representations for Video Prediction, NIPS 2018
License: MIT License
Hi,
This is really interesting and wonderful work!
I would like to know how you decompose the video into components. I have read the paper carefully, but I still cannot quite figure it out.
My other question is: how long does it take to complete a Moving MNIST experiment on your device?
Dear authors and maintainers,
Would you please be so kind as to add a license to the repo, so that the terms of use are clear?
Thank you in advance!
Hi,
I cannot run the bouncing ball experiment. I only created a new data set of 500 sequences and updated the paths and file names related to the data set. Could you take a look at what's going on?
[2019-02-20,14:11:07] Arguments:
- batch_size: 100
- ckpt_dir: /l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/ckpt
- ckpt_name: 200k
- ckpt_path: /l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/ckpt/bouncing_balls/crop_NC4_lr1.0e-03_bt100_200k
- content_latent_size: 128
- dset_dir: /l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/datasets
- dset_name: bouncing_balls
- dset_path: /l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/datasets/bouncing_balls
- evaluate_every: 10
- gpus: 0
- hidden_size: 128
- image_latent_size: 256
- image_size: (128, 128)
- independent_components: 0
- is_train: True
- load_ckpt_dir:
- load_ckpt_epoch: 0
- log_every: 400
- lr_decay: 1
- lr_init: 0.001
- model: crop
- n_channels: 1
- n_components: 4
- n_epochs: 50
- n_frames_input: 10
- n_frames_output: 10
- n_iters: 200000
- n_workers: 4
- ngf: 8
- num_objects: [2]
- pose_latent_size: 3
- save_every: 50
- split: train
- start_epoch: 0
- stn_scale_prior: 4.0
- when_to_predict_only: 0
Val dataset: 500
Image size: 128
[2019-02-20,14:11:09] Total epochs: 40000
Train epoch 0
Traceback (most recent call last):
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/pyro/poutine/trace_messenger.py", line 147, in __call__
ret = self.fn(*args, **kwargs)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/DDPAE.py", line 380, in guide
self.encode(input, sample=True)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/DDPAE.py", line 315, in encode
initial_pose_mu, initial_pose_sigma = self.pose_model(input)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
return self.module(*inputs[0], **kwargs[0])
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/networks/pose_rnn.py", line 152, in forward
encoder_outputs, hidden_states = self.encode(input)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/networks/pose_rnn.py", line 80, in encode
h = torch.cat([hidden[0][0:1], hidden[0][1:]], dim=2)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 2. Got 0 and 1 in dimension 0 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:83
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 38, in <module>
_, loss_dict = model.train(*data)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/DDPAE.py", line 399, in train
loss = svi.loss_and_grads(svi.model, svi.guide, input, output)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/pyro/infer/trace_elbo.py", line 125, in loss_and_grads
for model_trace, guide_trace in self._get_traces(model, guide, *args, **kwargs):
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/pyro/infer/elbo.py", line 164, in _get_traces
yield self._get_trace(model, guide, *args, **kwargs)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/pyro/infer/trace_elbo.py", line 52, in _get_trace
"flat", self.max_plate_nesting, model, guide, *args, **kwargs)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/pyro/infer/enum.py", line 42, in get_importance_trace
guide_trace = poutine.trace(guide, graph_type=graph_type).get_trace(*args, **kwargs)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/pyro/poutine/trace_messenger.py", line 169, in get_trace
self(*args, **kwargs)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/pyro/poutine/trace_messenger.py", line 153, in __call__
traceback)
File "/m/work/modules/Ubuntu/14.04/amd64/common/anaconda3/latest/lib/python3.6/site-packages/six.py", line 692, in reraise
raise value.with_traceback(tb)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/pyro/poutine/trace_messenger.py", line 147, in __call__
ret = self.fn(*args, **kwargs)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/DDPAE.py", line 380, in guide
self.encode(input, sample=True)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/DDPAE.py", line 315, in encode
initial_pose_mu, initial_pose_sigma = self.pose_model(input)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
return self.module(*inputs[0], **kwargs[0])
File "/u/65/yildizc1/unix/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
result = self.forward(*input, **kwargs)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/networks/pose_rnn.py", line 152, in forward
encoder_outputs, hidden_states = self.encode(input)
File "/l/yildizc1/Dropbox/Academic_Stuff/gp/odegp/others/DDPAE-video-prediction_original/models/networks/pose_rnn.py", line 80, in encode
h = torch.cat([hidden[0][0:1], hidden[0][1:]], dim=2)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 2. Got 0 and 1 in dimension 0 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:83
Trace Shapes:
Param Sites:
pose_model$$$module.image_encoder.main.0.weight 8 1 4 4
pose_model$$$module.image_encoder.main.2.weight 16 8 4 4
pose_model$$$module.image_encoder.main.3.weight 16
pose_model$$$module.image_encoder.main.3.bias 16
pose_model$$$module.image_encoder.main.5.weight 32 16 4 4
pose_model$$$module.image_encoder.main.6.weight 32
pose_model$$$module.image_encoder.main.6.bias 32
pose_model$$$module.image_encoder.main.8.weight 64 32 4 4
pose_model$$$module.image_encoder.main.9.weight 64
pose_model$$$module.image_encoder.main.9.bias 64
pose_model$$$module.image_encoder.main.11.weight 128 64 4 4
pose_model$$$module.image_encoder.main.12.weight 128
pose_model$$$module.image_encoder.main.12.bias 128
pose_model$$$module.image_encoder.main.14.weight 256 128 4 4
pose_model$$$module.encode_rnn.weight_ih_l0 512 384
pose_model$$$module.encode_rnn.weight_hh_l0 512 128
pose_model$$$module.encode_rnn.bias_ih_l0 512
pose_model$$$module.encode_rnn.bias_hh_l0 512
pose_model$$$module.predict_rnn.weight_ih_l0 512 256
pose_model$$$module.predict_rnn.weight_hh_l0 512 128
pose_model$$$module.predict_rnn.bias_ih_l0 512
pose_model$$$module.predict_rnn.bias_hh_l0 512
pose_model$$$module.beta_mu_layer.weight 3 128
pose_model$$$module.beta_mu_layer.bias 3
pose_model$$$module.beta_sigma_layer.weight 3 128
pose_model$$$module.beta_sigma_layer.bias 3
pose_model$$$module.initial_pose_rnn.weight_ih_l0 512 128
pose_model$$$module.initial_pose_rnn.weight_hh_l0 512 128
pose_model$$$module.initial_pose_rnn.bias_ih_l0 512
pose_model$$$module.initial_pose_rnn.bias_hh_l0 512
pose_model$$$module.initial_pose_mu.weight 3 128
pose_model$$$module.initial_pose_mu.bias 3
pose_model$$$module.initial_pose_sigma.weight 3 128
pose_model$$$module.initial_pose_sigma.bias 3
encoder$$$module.main.0.weight 8 1 4 4
encoder$$$module.main.2.weight 16 8 4 4
encoder$$$module.main.3.weight 16
encoder$$$module.main.3.bias 16
encoder$$$module.main.5.weight 32 16 4 4
encoder$$$module.main.6.weight 32
encoder$$$module.main.6.bias 32
encoder$$$module.main.8.weight 64 32 4 4
encoder$$$module.main.9.weight 64
encoder$$$module.main.9.bias 64
encoder$$$module.main.11.weight 128 64 4 4
Sample Sites:
In pose_rnn.py, line 80:
# h = torch.cat([hidden[0][0:1], hidden[0][1:]], dim=2)
# c = torch.cat([hidden[1][0:1], hidden[1][1:]], dim=2)
# These two lines throw a dimension error. Is it supposed to be:
h = hidden[0]
c = hidden[1]
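For anyone hitting the same error, here is a rough sketch of why those two lines fail. It is plain Python that only tracks shapes, not the repo's code and not PyTorch itself; `slice_dim0` and `cat_shape` are hypothetical helpers written for this illustration. It assumes the LSTM hidden state has shape (num_layers * num_directions, batch, hidden) and that `encode_rnn` is unidirectional, which seems likely because the parameter list in the trace above shows no `_reverse` weights. The batch size 100 and hidden size 128 are taken from the logged arguments, but the batch dimension actually seen inside `pose_rnn.py` may differ.

```python
def slice_dim0(shape, start, stop=None):
    """Shape after slicing the first dimension with [start:stop]."""
    n = len(range(*slice(start, stop).indices(shape[0])))
    return (n,) + tuple(shape[1:])


def cat_shape(shapes, dim):
    """Shape after concatenating along `dim`.

    Mimics the rule quoted in the error message: every dimension
    other than `dim` has to match across all inputs.
    """
    first = shapes[0]
    for shape in shapes[1:]:
        for axis, (a, b) in enumerate(zip(first, shape)):
            if axis != dim and a != b:
                raise ValueError(
                    f"sizes must match except in dimension {dim}: "
                    f"got {a} and {b} in dimension {axis}"
                )
    total = sum(shape[dim] for shape in shapes)
    return tuple(total if axis == dim else size
                 for axis, size in enumerate(first))


# Assumed hidden-state shapes (illustrative values from the logged arguments).
unidirectional = (1, 100, 128)  # one layer, one direction
bidirectional = (2, 100, 128)   # one layer, two directions

# Two directions: both slices keep one row, so the concatenation is valid.
ok = cat_shape([slice_dim0(bidirectional, 0, 1),
                slice_dim0(bidirectional, 1)], dim=2)
print("bidirectional:", ok)

# One direction: hidden[0][1:] is empty in dimension 0, so the sizes clash.
try:
    cat_shape([slice_dim0(unidirectional, 0, 1),
               slice_dim0(unidirectional, 1)], dim=2)
except ValueError as err:
    print("unidirectional:", err)
```

If that assumption holds, `h = hidden[0]` and `c = hidden[1]` keep the (1, batch, hidden) shape that the slice-and-concatenate would have produced for a single direction, so the proposed change looks reasonable. Whether it matches what the authors intended is something only they can confirm.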