chiweihsiao / deepvo-pytorch

PyTorch Implementation of DeepVO

Python 96.74% Shell 3.26%

deepvo-pytorch's Issues

Need too much time

I am a noob and had trouble training the net because of limited resources. Could someone offer me a trained model to test the result? That would be very nice, thanks.

ZeroDivisionError why ?

DeepVO-pytorch-master/main.py", line 112, in
loss_mean /= len(train_dl)
ZeroDivisionError: division by zero
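
The traceback shows that `len(train_dl)` is zero, meaning the training data loader produced no batches. One possible cause (an assumption, not confirmed in this thread) is that no image sequences were found at the configured data path. A minimal sketch of a guard that fails with a clearer message is below; `mean_epoch_loss` is a hypothetical helper, not a function from the repository.

```python
def mean_epoch_loss(batch_losses):
    """Average per-batch losses, failing loudly when there were no batches.

    `batch_losses` is a plain list of floats collected during one epoch.
    This helper is hypothetical and only illustrates the guard.
    """
    if len(batch_losses) == 0:
        raise ValueError(
            "no training batches were produced; check that the dataset "
            "paths and preprocessing output exist before training"
        )
    return sum(batch_losses) / len(batch_losses)


if __name__ == "__main__":
    print(mean_epoch_loss([0.5, 1.5]))  # average of two batch losses
```

Checking the number of batches before the training loop starts gives the same information earlier than the division at the end of the epoch.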

Execution error

Thanks a lot for your work; I'm learning from this implementation. However, the following error occurs during execution and I don't know why. Can you help me? Thanks.

ValueError: Caught ValueError in DataLoader worker process 0.
groundtruth_rotation = raw_groundtruth[1][0].reshape((3, 3)).T # opposite rotation of the first frame
ValueError: cannot reshape array of size 0 into shape (3,3)
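
An array of size 0 at this point suggests that each ground-truth row carries fewer values than the code expects. According to the "Data preprocessing and provided model" issue on this page, `__getitem__` needs 15 values per pose (3 angles, 3 positions, and a flattened 3x3 rotation matrix), while the committed pickle files hold only 6. A minimal sketch of a width check, assuming that 6 + 9 layout; `split_pose_row` is a hypothetical helper, not repository code.

```python
def split_pose_row(row, n_pose=6, n_rot=9):
    """Split one ground-truth row into (pose, 3x3 rotation).

    Assumes the layout described elsewhere on this page: 6 pose values
    followed by 9 values of a flattened rotation matrix.
    """
    row = list(row)
    if len(row) != n_pose + n_rot:
        raise ValueError(
            f"expected {n_pose + n_rot} values per pose, got {len(row)}; "
            "the pose file may lack the flattened rotation matrix"
        )
    pose = row[:n_pose]
    flat = row[n_pose:]
    # Rebuild the 3x3 matrix row by row from the flattened values.
    rotation = [flat[i * 3:(i + 1) * 3] for i in range(3)]
    return pose, rotation
```

If this check fails on your data, regenerating the pose files with the current preprocessing script is one thing to try.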

Questions about the inputs of LSTM in the DeepVO

In the official documentation, the input of LSTM is (seq_len, batch, input_size). However, in your code the input of LSTM is (batch, seq_len, input_size). Should torch.transpose(x, 1, 0) be used?

Concatenating poses in test.py

DeepVO-pytorch/test.py

Lines 81 to 104 in 14d8790

    if i == 0:
        for pose in batch_predict_pose[0]:
            # use all predicted pose in the first prediction
            for i in range(len(pose)):
                # Convert predicted relative pose to absolute pose by adding last pose
                pose[i] += answer[-1][i]
            answer.append(pose.tolist())
        batch_predict_pose = batch_predict_pose[1:]

    # transform from relative to absolute
    for predict_pose_seq in batch_predict_pose:
        # predict_pose_seq[1:] = predict_pose_seq[1:] + predict_pose_seq[0:-1]
        ang = eulerAnglesToRotationMatrix([0, answer[-1][0], 0]) #eulerAnglesToRotationMatrix([answer[-1][1], answer[-1][0], answer[-1][2]])
        location = ang.dot(predict_pose_seq[-1][3:])
        predict_pose_seq[-1][3:] = location[:]

        # use only last predicted pose in the following prediction
        last_pose = predict_pose_seq[-1]
        for i in range(len(last_pose)):
            last_pose[i] += answer[-1][i]
        # normalize angle to -Pi...Pi over y axis
        last_pose[0] = (last_pose[0] + np.pi) % (2 * np.pi) - np.pi
        answer.append(last_pose.tolist())

For the section of the code here, what is the significance of checking for i == 0?

Specifically, at line 99 shown below, why is only the last pose composed? Since all poses returned by the network are relative poses, shouldn't you compose all of the relative poses returned by the network?

DeepVO-pytorch/test.py

Line 99 in 14d8790

last_pose = predict_pose_seq[-1]

Suggestion for improvements

Thank you for your decent work! Actually your model and your code can do a bit better if you address a few logical issues in the code.

The issues are the following:

In the dataloader you split a track to sequences and calculate relative coordinates and orientation. That is fine, but not enough. You should also rotate the sequence relative to the starting point angles.
Look more closely at how the Euler angles change when the car makes a turn of more than 90 degrees. In such cases you can't simply subtract the angles of the previous point from those of the current point to get the delta, because the angle values are no longer continuous.
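
The discontinuity described in the second point can be illustrated with a wrapped angle difference. The sketch below reuses the normalization formula that appears in the test.py excerpt on this page, `(a + pi) % (2 * pi) - pi`; the function names are hypothetical, and this is only one possible way to handle the wrap-around, not the fix the poster necessarily had in mind.

```python
import math


def wrap_angle(a):
    """Map an angle in radians into the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def angle_delta(current, previous):
    """Smallest signed rotation that takes `previous` to `current`."""
    return wrap_angle(current - previous)


if __name__ == "__main__":
    # Heading crosses the +pi / -pi boundary between two frames.
    naive = -3.1 - 3.1                # about -6.2 rad, a spurious jump
    wrapped = angle_delta(-3.1, 3.1)  # small positive rotation
    print(naive, wrapped)
```

Without the wrap, a small physical turn across the boundary shows up as a jump of almost a full revolution in the training target.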

Hello, the following website cannot be opened.

Hello, the following website cannot be opened: "https://drive.google.com/file/d/1l0s3rYWgN8bL0Fyofee8IhN-0knxJF22/view"

Training takes too much time?

Hi, can anyone please help me? I ran the main.py code and it has been running for more than a day without showing anything. I want to train the model myself rather than use the trained model. I have an Nvidia GTX-1050ti.

Problem about performance

As shown in the results, the performance of the trained model is very unsatisfying. After 100 epochs the validation loss and the training loss are so close that simply increasing the number of epochs does nothing for performance. My question is: are there any ways to improve the performance of the model?

Effects of BatchNorm

Hi Chiwei, thanks for sharing your code, which is very helpful.

I was looking into it and saw that you are using BatchNorm. I don't see the DeepVO paper mentioning it. Did you do an experiment on this?

Also, why did you use more training data than the settings in the paper? And did you evaluate the RMSE of your model?

Thank you very much!

Performance

Hello,
Can you upload pretrained weights?
Because I trained it for 72 epochs. But the testing loss is too high and the results are not visually convincing at all.
Thanks

Problem about running main.py

@ChiWeiHsiao Hi, when I try to run main.py, it always raises the error below:

Traceback (most recent call last):
File "main.py", line 114, in
ls = M_deepvo.step(t_x, t_y, optimizer).data.cpu().numpy()
File "/root/DeepVO3/model.py", line 133, in step
optimizer.step()
File "/root/anaconda3/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 15, in decorate_context
return func(*args, **kwargs)
File "/root/anaconda3/lib/python3.8/site-packages/torch/optim/adagrad.py", line 99, in step
std = state['sum'].sqrt().add_(group['eps'])
KeyError: 'eps'

I want to know how to solve this exception.
Thank you for your time.

Transformation of relative pose to absolute pose

In line 94 in test.py, the Rotation Matrix is computed using only one Euler angle:

ang = eulerAnglesToRotationMatrix([0, answer[-1][0], 0])

I'm wondering, why are theta_x and theta_z set to zero?

clean_unused_images() function is redundant and wrong on preprocess.py file ?

For example, for '08': ['001100', '005170'], sequence 08 has 4071 (5170-1100+1) pictures. The first file is still named 0000000000.png and the last file is named 0000004070.png, so the program shouldn't run the clean_unused_images() function.

./test.py

thanks

images

How do you get the result for 02?
I only get results for 04, 05, 07, 09 and 10; there is no result for 02. What should I do to get the result for 02?

DeepVO on MVSECD Dataset

Has anyone tried training DeepVO on MVSECD dataset? The problem is MVSECD only provides grayscale images and I'm unsure if DeepVO can work well with them.

array question

Hi, guys! I have this problem:
Traceback (most recent call last):
File "/home/yxh/src/DeepVO-pytorch-master/main.py", line 102, in
for _, t_x, t_y in train_dl:
File "/home/yxh/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
return self._process_next_batch(batch)
File "/home/yxh/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
ValueError: Traceback (most recent call last):
File "/home/yxh/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/yxh/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/yxh/src/DeepVO-pytorch-master/data_helper.py", line 201, in __getitem__
groundtruth_rotation = raw_groundtruth[1][0].reshape((3, 3)).T # opposite rotation of the first frame
ValueError: cannot reshape array of size 0 into shape (3,3)

What should I do? Thanks

the result

In out_04.txt, which value is x and which one is y?

0.016617655754089355, -0.002593049081042409, -3.987591480836272e-05, 0.03105238452553749, -0.023164983838796616, 0.9082887172698975
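
Judging only from the test.py excerpt on this page, index 0 of each pose is normalized as the rotation about the y axis and indices 3 to 5 are treated as the location. A minimal parsing sketch under that reading follows; the split into three angles followed by three translation components, and their order as x, y, z, are assumptions, and `parse_pose_line` is a hypothetical helper.

```python
def parse_pose_line(line):
    """Parse one comma-separated output line into angles and translation.

    Assumed layout (not confirmed by the repository author):
    values[0:3] are angles, values[3:6] are the translation.
    """
    values = [float(v) for v in line.split(",")]
    if len(values) != 6:
        raise ValueError(f"expected 6 values per line, got {len(values)}")
    return {"angles": values[:3], "translation": values[3:]}


if __name__ == "__main__":
    sample = ("0.016617655754089355, -0.002593049081042409, "
              "-3.987591480836272e-05, 0.03105238452553749, "
              "-0.023164983838796616, 0.9082887172698975")
    print(parse_pose_line(sample))
```

In KITTI camera coordinates, x points right, y points down and z points forward, so a top-down trajectory plot would use the x and z components if the assumed ordering holds.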

Preprocess doesn't create .npy file

Hi,

I followed the instructions provided in the readme file. I noticed that the shell also removed the 03 images (00,01,02,04,....).
On running the preprocess.py. I couldn't get any .npy file saved.

how to draw the picture?

FlowNet link cracked

The link to the pretrained FlowNet weights (CNN part) is broken. Could you please share the file?

Hello,i got result about video 05

I read the paper and found that it mentions overfitting but does not give the details.

I want to know your results. Can you tell me what measures you took?

Large time per iteration / slow computation

Hello, can you tell me how long does your training takes per epoch with raw code from repo, without changing crucial parts? Because for me, it is running a long time, 2 hours per epoch, or something like that. I have imported pretrained FlowNet from the link in description, and set num_workers=0 because of the error I get when I set them to any number higher than zero.
What are possible ways of improving training speed, for this code in particular?
I am trying to learn as much as I can about Deep Learning, and this repo is going to help me a lot in my study field (robotics).

Edit:
I just tested a few things. It seems that get_loss(), and the step() function in general, take a relatively long time to compute. Is that the case for you too?
With batch_size = 8, n_workers = 8 I get around 10-12 sec per iteration, and I even resized the images to a very small resolution.
Any ideas on what is going on?

ZeroDivisionError: division by zero

Traceback (most recent call last):
File "main.py", line 117, in
loss_mean /= len(train_dl)
ZeroDivisionError: division by zero

Data preprocessing and provided model

Hi and thanks for sharing the code. Makes a great school for me!

I have noticed one issue with regards to the currently committed code in conjunction with the committed pickle files for the train and validation data used.

Although everything runs smoothly when it comes to fetching and preprocessing data, as well as using the preprocessed data to train a new model from scratch, it is impossible for me to reproduce a model with performance, even remotely, close to the performance your provided pretrained model has. So in my effort to reproduce exactly what you do, I tried using your data pickle files, which you kindly provide in the datainfo subdir. Of course, the (absolute) path of the images needed some fixing to work for me but this was no issue. The issue is that the poses in your provided pickle files are lists of length 6, presumably angles and positions. But, the ImageSequenceDataset __getitem__ method needs a 15-long list (which, I have come to understand, is the 6-number part above, i.e. 3 angles and 3 positions, along with the flattened out rotational matrix.) Your ImageSequenceDataset class seems to use the rotational matrix to rotate all poses of a given sequence w.r.t. the first frame of that sequence.

What I am, effectively, saying is that the poses in the committed pickle files don't work with the committed code, at the moment of writing this. And when I use the committed code without your pickle files, I get seemingly some convergence in the training, when looking at train/valid losses, but the final model produces absolutely non-sensical results on the xy-plane (or is it xz-plane?).

Could you give a hint as to which hyper parameters and code commit you used to produce the provided model?

Is the ground truth of pose "relative" or "absolute"?

What does the original paper say about that? And what is the output pose of the network: relative or absolute? I am quite confused. I hope to get your reply, thanks :-)

Results Using Pretrained Model

Hi~

I've used the pretrained weights that you provided for evaluation on sequence 09 and 10, but the results are quite different from what you showed on the repo page. The only thing I modified is the batch size during inference, which I changed from 8 to 2. Below is my visualization on sequence 09, do you know why this is the case?

Another question: do you have quantitative evaluation results for every sequence using the KITTI eval code? The results stated in the original paper are so good that I cannot reproduce them at all, so I rather doubt them.

Thank you~

with cpu

I have this question.
FileNotFoundError: [Errno 2] No such file or directory: 'models/t000102050809_v04060710_s5x7_rnn1000_lr0.0005_optAdagrad.model.train'

I don't have this model. What should I do?

Pretrained FlowNet weight missing

result02

Why did I get a result like this?

Hello, could you tell me how you fixed the problem you met? Thx

> try flownets_EPE1.951.pth.tar. Also it seems that your NN is undertrained.

Thank you. I've got the normal trajectory now. How to calculate the evaluation index of KITTI SLAM?

Originally posted by @ujgygy656565 in #21 (comment)

Pre-trained model load error

Hi @ChiWeiHsiao @alexart13 @daiyk

Thanks for your great work. I have a question about running test.py. When I load the trained model, it works well. But when I change the path to the optimizer file, it breaks. It shows...

Traceback (most recent call last):
File "test.py", line 32, in
M_deepvo.load_state_dict(torch.load(load_optimizer_path))
File "/home/yanhzhan/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 777, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DeepVO:
Missing key(s) in state_dict: "conv1.0.weight", "conv1.1.weight", "conv1.1.bias", "conv1.1.running_mean", "conv1.1.running_var", "conv2.0.weight", "conv2.1.weight", "conv2.1.bias", "conv2.1.running_mean", "conv2.1.running_var", "conv3.0.weight", "conv3.1.weight", "conv3.1.bias", "conv3.1.running_mean", "conv3.1.running_var", "conv3_1.0.weight", "conv3_1.1.weight", "conv3_1.1.bias", "conv3_1.1.running_mean", "conv3_1.1.running_var", "conv4.0.weight", "conv4.1.weight", "conv4.1.bias", "conv4.1.running_mean", "conv4.1.running_var", "conv4_1.0.weight", "conv4_1.1.weight", "conv4_1.1.bias", "conv4_1.1.running_mean", "conv4_1.1.running_var", "conv5.0.weight", "conv5.1.weight", "conv5.1.bias", "conv5.1.running_mean", "conv5.1.running_var", "conv5_1.0.weight", "conv5_1.1.weight", "conv5_1.1.bias", "conv5_1.1.running_mean", "conv5_1.1.running_var", "conv6.0.weight", "conv6.1.weight", "conv6.1.bias", "conv6.1.running_mean", "conv6.1.running_var", "rnn.weight_ih_l0", "rnn.weight_hh_l0", "rnn.bias_ih_l0", "rnn.bias_hh_l0", "rnn.weight_ih_l1", "rnn.weight_hh_l1", "rnn.bias_ih_l1", "rnn.bias_hh_l1", "linear.weight", "linear.bias".
Unexpected key(s) in state_dict: "param_groups", "state".

I don't know how to fix it.

Best regards,
Yu

loss

Dear author, I have some trouble with the loss. Can you tell me your WeChat account? I have some questions to ask you. Thanks.

No license?

Thanks for publishing your work. I can't find any license attached to the repo, though. By default, it means "If a repository has no license, then all rights are reserved and it is not Open Source or Free. You cannot modify or redistribute this code without explicit permission from the copyright holder." Is it what you really meant? If not, could you, please, add some more permissive license otherwise?

Groundtruth reshape error

I'm getting this error when I try to run main.py, can anybody please help out?

groundtruth_rotation = raw_groundtruth[1][0].reshape((3, 3)).T # opposite rotation of the first frame
ValueError: cannot reshape array of size 0 into shape (3,3)

params problem

Hello, I want to know how to calculate the img-means and stds. The value calculated by preprocess is not the value in params, and the value calculated there has no good training effect. Thank you

The call of LSTM may be different from official documentation of Pytorch

Hello, I have cloned your code and run it successfully using the trained model you provide. I got reasonable results. However, I noticed that the call of LSTM in your implementation is different from what the official documentation says.
According to official documentation, the input of LSTM should be of shape (seq_len, batch, input_size), however in your code, it's of shape (batch_size, seq_len, input_size).
I wonder whether I referenced documentation with mismatched version or there may be some error in your code.
Thanks for the share of your implementation.

ref documentation: https://pytorch.org/docs/master/generated/torch.nn.LSTM.html

Sorting Issue of input images

Hello,
Thanks for your awesome work. However, I think there is a major issue in your data_helper.py:

DeepVO-pytorch/data_helper.py

Line 22 in bb43825

fpaths.sort()

When loading image paths and sorting them, suppose your single frames are named:

li = ['00000.png', '00001.png', '00002.png', '000010.png', '0000100.png']
li.sort()

This will give you:

['00000.png', '00001.png', '000010.png', '0000100.png', '00002.png']

which is not the natural order when someone has sliced a video into single frames.

I'm using https://pypi.org/project/natsort/ for this issue, giving you

from natsort import natsorted
li_sorted = natsorted(li)
['00000.png', '00001.png', '00002.png', '000010.png', '0000100.png']
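
For readers who prefer not to add a dependency, a similar ordering can be sketched with the standard library only. The key function below is a hypothetical stand-in, not part of natsort or of the repository, and it only handles runs of decimal digits.

```python
import re


def natural_key(name):
    """Sort key that compares runs of digits by their numeric value.

    re.split with a capturing group returns alternating text and digit
    tokens, so tokens at the same position always have the same type.
    """
    return [int(tok) if tok.isdigit() else tok
            for tok in re.split(r"(\d+)", name)]


if __name__ == "__main__":
    li = ['00000.png', '00001.png', '00002.png', '000010.png', '0000100.png']
    print(sorted(li))                   # plain lexicographic order
    print(sorted(li, key=natural_key))  # numeric-aware order
```

When every file name is zero-padded to the same width, plain `sort()` already matches numeric order, so the difference only appears with inconsistent padding such as in the example above.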

is the relative pose computation correct?

    groundtruth_sequence[1:] = groundtruth_sequence[1:] - groundtruth_sequence[0] # get relative pose w.r.t. the first frame in the sequence

I saw that you get the relative pose by subtraction, but shouldn't we get the relative pose by multiplying the two absolute poses?

such as 2T1 = inv(wT2) dot (wT1)
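
The composition the poster describes can be sketched with a rotation matrix and a translation per pose. For a reference pose (R_ref, t_ref) and another pose (R, t), both expressed in the world frame, inv(T_ref) · T has rotation R_ref^T · R and translation R_ref^T · (t - t_ref). The helper names below are hypothetical. Note that another issue on this page reports that the dataset class also rotates each sequence by the first frame's rotation, so this sketch is not a statement about what the repository does or does not do.

```python
import math


def transpose(R):
    return [list(col) for col in zip(*R)]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def mat_vec(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]


def relative_pose(R_ref, t_ref, R, t):
    """Return (R_rel, t_rel) of inv(T_ref) * T for world-frame poses."""
    R_ref_T = transpose(R_ref)  # inverse of a rotation is its transpose
    R_rel = mat_mul(R_ref_T, R)
    t_rel = mat_vec(R_ref_T, [b - a for a, b in zip(t_ref, t)])
    return R_rel, t_rel


def rot_y(theta):
    """Rotation by `theta` radians about the y axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


if __name__ == "__main__":
    R1, t1 = rot_y(math.pi / 2), [1.0, 0.0, 0.0]
    R2, t2 = rot_y(math.pi / 2), [1.0, 0.0, 2.0]
    R_rel, t_rel = relative_pose(R1, t1, R2, t2)
    print(t_rel)  # differs from the plain difference t2 - t1 = [0, 0, 2]
```

The issue's 2T1 = inv(wT2) dot (wT1) corresponds to calling `relative_pose` with frame 2 as the reference and frame 1 as the other pose.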

Train loss and test loss

During training the train loss is very small, around 0.3, but the smallest test loss is around 90.
Then I used the trained model provided by alexart13 and tested it on the training dataset ['01','02', '05', '08', '09']. test.py still gave me a three-digit or even four-digit loss.
Does anyone know why?

I cannot download the data via your "DeepVO-pytorch/KITTI/downloader.sh".

Can you share your data with me?
This is my email: [email protected]
I hope you can contact me!
Thank you!

train problem

Hello, I am a novice. I have two questions. I set resume to false. I want to train it from the beginning. But when I trained 100 to 200 epochs, the results of the tests and visualize are very poor. I haven't changed your parameters and code. What's the reason? The second question is, the img-means and stds in params are calculated by preprocess? Thank you very much.

Question about KITTI sequences used

Hello,
The README file says that KITTI/downloader.sh "will only keep the left camera colour images (image_03 folder) and delete other data" but that folder actually contains the right colour camera images.

Is there an error in the description/code?

Thanks for your work and attention.

I want to know the meaning of 'seq_len' in param.py

Hello, guys. There's a parameter 'seq_len' in param.py. I want to know what 'seq_len' means. Why is it (5,7)? Shouldn't it be a single number if it's the sequence length? @ChiWeiHsiao

downloader.sh is stuck at raw images download

This is the output

--2022-07-07 12:45:42-- https://s3.eu-central-1.amazonaws.com/avg-kitti/raw_data/2011_09_26_drive_0067/2011_09_26_drive_0067_sync.zip
Resolving s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)... 52.219.171.53
Connecting to s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)|52.219.171.53|:443... connected.
HTTP request sent, awaiting response... 404 Not Found
2022-07-07 12:45:42 ERROR 404: Not Found.

--2022-07-07 12:45:42-- http://kitti.is.tue.mpg.de/kitti/raw_data/2011_09_26_drive_0067/2011_09_26_drive_0067_sync.zip
Resolving kitti.is.tue.mpg.de (kitti.is.tue.mpg.de)... 35.152.66.6
Connecting to kitti.is.tue.mpg.de (kitti.is.tue.mpg.de)|35.152.66.6|:80... failed: Operation timed out.
Retrying.

I stopped the process after a bunch of tries
