
self-supervised-depth-completion's People

Contributors

fangchangma, fengziyue, weblucas


self-supervised-depth-completion's Issues

About extracting trained model

When I downloaded the trained model, I could not extract the 'tar' file. I was wondering if there is something wrong with your 'tar' file.
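(In case it helps others: if the file in question is model_best.pth.tar, my understanding (which may be wrong) is that despite the .tar extension it is a serialized PyTorch checkpoint rather than an archive, so it is meant to be loaded with torch.load instead of being extracted. A minimal sketch, with a placeholder path:)

    import torch

    # Placeholder path; point this at wherever the downloaded checkpoint lives.
    checkpoint = torch.load('pretrained/supervised/model_best.pth.tar',
                            map_location='cpu')
    print(checkpoint.keys())  # inspect what the checkpoint actually contains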

Updated data_structure

Hey! @fangchangma I find your work really interesting and I am evaluating your approach for the task of single-view depth completion.

I have a query regarding the data structure that you updated on the 1st of October.
Initially, the data structure was data/kitti_depth and data/kitti_rgb.

In your updated data structure, you have included two new subfolders under data, namely data/data_depth_velodyne and data/depth_selection.

Can you explain where we need to download the data for the respective subfolders from?

Pretrained model got poor result (RMSE=1343.609)

Hi @fangchangma Thanks for sharing the code. I evaluated the pretrained model provided in readme. The result is not as good as reported in the paper (rmse 1343 vs 814). It was a clean clone and I followed the data folder structure. I attached the command and the screenshot of the results. Please let me know if there is an error or if I missed something here. Thank you.

python main.py --evaluate pretrain/mode=sparse+photo.w1=0.1.w2=0.1.input=gd.resnet34.criterion=l2.lr=1e-05.bs=16.wd=0.pretrained=False.jitter=0.1.time=2019-02-26@07-50/model_best.pth.tar

=> output: ../results/mode=sparse+photo.w1=0.1.w2=0.1.input=gd.resnet34.criterion=l2.lr=1e-05.bs=16.wd=0.pretrained=False.jitter=0.1.time=2019-05-08@10-21
Val Epoch: 8 [990/1000]	lr=0 t_Data=0.001(0.001) t_GPU=0.014(0.023)
	RMSE=1086.59(1347.03) MAE=308.10(359.76) iRMSE=4.29(4.27) iMAE=1.50(1.64)
	silog=4.67(5.24) squared_rel=0.00(0.01) Delta1=0.994(0.992) REL=0.018(0.020)
	Lg10=0.007(0.008) Photometric=0.000(0.000) 
=> output: ../results/mode=sparse+photo.w1=0.1.w2=0.1.input=gd.resnet34.criterion=l2.lr=1e-05.bs=16.wd=0.pretrained=False.jitter=0.1.time=2019-05-08@10-21
Val Epoch: 8 [1000/1000]	lr=0 t_Data=0.001(0.001) t_GPU=0.014(0.023)
	RMSE=1005.17(1343.61) MAE=262.70(358.79) iRMSE=4.57(4.28) iMAE=1.57(1.64)
	silog=4.98(5.24) squared_rel=0.00(0.01) Delta1=0.993(0.992) REL=0.018(0.020)
	Lg10=0.007(0.008) Photometric=0.000(0.000) 
*
Summary of  val round
RMSE=1343.609
MAE=358.790
Photo=0.000
iRMSE=4.277
iMAE=1.642
squared_rel=0.006554501281207195
silog=5.2404233943858145
Delta1=0.992
REL=0.020
Lg10=0.008
t_GPU=0.023
(best rmse is 1343.609)
*

question about depth-estimation results

Hi, great papers, thanks a lot for sharing!

I have a question: have you done any evaluations of your latest work in a depth-estimation (not completion) setup, either unsupervised or supervised?

Any ideas or thoughts on how such SOTA methods would compare to your work?

Thanks a lot!
Z.

dataset

From the KITTI dataset, how should the data be distributed in the data folder, with kitti_depth and kitti_rgb as its sub-folders? I only want to test the pre-trained model. Please tell me what data I need.

cuda memory problem

RuntimeError: CUDA out of memory. Tried to allocate 52.25 MiB (GPU 0; 7.92 GiB total capacity; 6.71 GiB already allocated; 53.94 MiB free; 35.80 MiB cached)

  File "main.py", line 247, in <module>
    main()
  File "main.py", line 230, in main
    result, is_best = iterate("val", args, val_loader, model, None, logger, checkpoint['epoch'])
  File "main.py", line 105, in iterate
    pred = model(batch_data)
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jadoo/dense_lidar/self-supervised-depth-completion/model.py", line 119, in forward
    conv3 = self.conv3(conv2)  # batchsize * ? * 176 * 608
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torchvision/models/resnet.py", line 45, in forward
    out = self.conv1(x)
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/jadoo/pytorch/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 320, in forward
    self.padding, self.dilation, self.groups)

Save output depth map

Hey,
I want to save the depth maps estimated by your approach. How is that possible? Right now I am running

python3 main.py --evaluate /home/username/Downloads/model_best.pth.tar --val select

I just get the error metrics printed in the terminal and a summary in the results folder. I want to access each estimated depth map. How is that possible?
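For reference, this is the kind of dump I have in mind (my own sketch, not code from this repo; it assumes pred is the network output for one image, in metres, and uses OpenCV only because it writes 16-bit PNGs, following the KITTI convention of storing depth * 256 as uint16):

    import numpy as np
    import cv2

    def save_depth_png(pred, filename):
        # pred: torch tensor of shape (1, 1, H, W) or (H, W), depth in metres
        depth = np.squeeze(pred.detach().cpu().numpy())
        depth_png = (depth * 256.0).astype(np.uint16)  # KITTI-style 16-bit encoding
        cv2.imwrite(filename, depth_png)               # .png keeps the 16-bit values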

Best

Error when import the pre-train model

I downloaded the model trained with semi-dense LiDAR ground truth, from the 'supervised model' link.

When I run

torch.load('supervised/model_best.pth.tar')

I get this error:

Traceback (most recent call last):
  File "model.py", line 173, in <module>
    main()
  File "model.py", line 164, in main
    torch.load('supervised/model_best.pth.tar')
  File "/home/S/.local/lib/python3.5/site-packages/torch/serialization.py", line 367, in load
    return _load(f, map_location, pickle_module)
  File "/home/S/.local/lib/python3.5/site-packages/torch/serialization.py", line 538, in _load
    result = unpickler.load()
ImportError: No module named 'metrics'

Could you tell me why?
Thanks~
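In case it helps: my guess (only a guess) is that the checkpoint was pickled together with objects from this repository, so unpickling needs the repo's own modules (metrics.py etc.) to be importable. Loading works for me when the script runs from the repository root, or when the repo is put on sys.path first:

    import sys
    import torch

    # Placeholder path: the clone of self-supervised-depth-completion.
    sys.path.insert(0, '/path/to/self-supervised-depth-completion')

    checkpoint = torch.load('supervised/model_best.pth.tar', map_location='cpu')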

The result cannot be reproduced

Hi, fangchangma.
Thank you for your nice work, but I can't seem to reproduce the self-supervised results. Could you provide your training log, along with the detailed training schedule and hyperparameter configuration?
Looking forward to your reply.

Why is self-supervised worse than supervised?

In Figure 6b on page 9 of the paper, at the rightmost point, the self-supervised method receives semi-dense LiDAR ground truth, so the depth loss is no longer a "sparse depth loss". I don't understand why it performs worse than the supervised method, which has the same ground-truth supervision. The self-supervised one has additional losses, such as the photometric loss, so in my opinion it should perform at least as well as the supervised one.

How do you explain this?

More Information

Hi, I am curious about your work and I have already read your paper. Is there anything new?
When will you update the repository?

About batch_size and cuda memory

Hello,
When I am training the model, I run into a "CUDA out of memory" problem. I tried reducing the batch size, but presumably the batch size should not be too small for this work. Can you give me some advice about the minimum batch size?

inference

What should I do if I only want to use your code for a few images?

Use Stereo Pair Instead of Temporal Pair for Self-Supervised Training?

Hi @fangchangma,

Thank you for your impressive work. I just have one question regarding your training setting.
In the paper, you used photometric consistency between the current frame and the next frame as the cue for self-supervised training.
Following a similar idea, couldn't a stereo pair provide the same kind of supervision by enforcing photometric consistency between the left and right images?
Have you tried training that way? I think it would be computationally cheaper.
I am looking forward to your reply.

Best,

training with vlp-16 dataset

Hi Fangchang,

Our lab is currently working on a project which requires generating depth maps from our VLP-16 LiDAR and camera setup. Your work looks like a great fit for the depth-map part. Since our input images have a different size, I think what we need to do to use this network is (1) read in our own calibration information (K) and (2) crop the input images so that width and height are both multiples of 16 (since we got errors in the decoder layers with some other sizes). Is that right?

We've tested with a rather small dataset (only ~700 frames) and got results like the figure shown below.
We are wondering whether the dataset is too small or the depth from the VLP-16 is too sparse, since the results still show clearly visible projected scan lines. It would be great if you have any suggestions, thanks!
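For (2), what we currently do is roughly the following (our own sketch, not something taken from this repo): crop the height from the top and the width from the right so that both become multiples of 16 before feeding the network.

    def crop_to_multiple_of_16(img):
        # img: H x W x C array; the encoder/decoder stack needs H and W divisible by 16
        h, w = img.shape[:2]
        new_h, new_w = (h // 16) * 16, (w // 16) * 16
        return img[h - new_h:, :new_w]   # bottom crop in height, left-aligned in width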

[screenshot: comparison_best]

Error while loading "calib_cam_to_cam.txt" - cannot reshape the array

I am using your pre-trained models for testing on the validation set "val_selection_cropped", but while loading "calib_cam_to_cam.txt" I get errors. I am attaching a screenshot of the error.

[screenshot: error_calib_to_cam]

I am running the script with this command in the terminal:

python3 /scratch/gjain2s/Approaches/sparse_dense/self-supervised-depth-completion/main.py --data-folder /scratch/gjain2s/Approaches/sparse_dense/data --evaluate /home/gjain2s/self_supervised/model_best.pth.tar --val select

Any help would be highly appreciated.

python main.py --evaluate [checkpoint-path]

An error occurred when entering this parameter on the command line. What should we pass for [checkpoint-path]? Can you give us an example?

Namespace(batch_size=1, criterion='l2', epochs=11, evaluate='[checkpoint-path]', input='gd', jitter=0.1, layers=34, lr=1e-05, pretrained=False, print_freq=10, rank_metric='rmse', result='..\results', resume='', start_epoch=0, train_mode='dense', use_d=True, use_g=True, use_pose=False, use_rgb=False, val='select', w1=0, w2=0, weight_decay=0, workers=4)
=> no model found at '[checkpoint-path]'
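(Not an official example, just an illustration: [checkpoint-path] is meant to be replaced by the path to a downloaded checkpoint file, e.g. something like

python main.py --evaluate ../pretrained/supervised/model_best.pth.tar --val select

where the pretrained folder name is simply wherever the download was extracted to.)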

Training doesn't converge

Hi Fangchang:

Thank you so much for sharing this great project!

I have tested your pre-trained self-supervised model; its RMSE is around 1300, which matches your paper.
But when I try to train the model with this command:
python main.py --train-mode sparse+photo
on 2 Tesla V100 GPUs for around 15 epochs, it only converges to an RMSE of ~8k-9k and no further. I didn't change any hyperparameter in your code; only the batch size is smaller than the one you mentioned (8).

Are there any parameters or options I need to change from this Github repo? Or do you have any suggestions on training?

Thank you so much!

Sincerely,
Ziyue Feng

Mismatch between comment and code

Hi,
Thank you for sharing your code with us! I am trying to evaluate the method on our own dataset. We gathered larger images and thus have to crop/resize them. When looking at the code, the comment in kitti_loader.py states:

note: we will take the center crop of the images during augmentation
# that changes the optical centers, but not focal lengths
https://github.com/fangchangma/self-supervised-depth-completion/blob/master/dataloaders/kitti_loader.py#L29

The optical center is then adjusted. However, in lines 145 and 168, a bottom crop is applied to the images. Thus, if I understand the code correctly, the full crop distance has to be subtracted from the optical centers, not half of it as a center crop would imply.

Can you check if my understanding in this regard is correct?
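To make it concrete, this is the adjustment I have in mind (my own sketch; H is the original image height, h the crop height, and cy the original principal point in pixels from the top):

    def adjusted_cy(cy, H, h, crop='bottom'):
        # The principal point shifts by the number of rows removed above the crop window.
        if crop == 'bottom':            # keep rows [H - h, H): all removed rows are on top
            return cy - (H - h)
        if crop == 'center':            # keep rows [(H - h) / 2, (H + h) / 2)
            return cy - (H - h) / 2.0
        raise ValueError(crop)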

Kind regards,
Chris

About result visualization

[screenshot]
Hello,
In the evaluation results, I found visualizations of the results, as shown in the picture. What I don't know is what the fourth column is. It seems to be a semi-dense depth map from the annotations, but I used the self-supervised mode (sparse+photo), which should not use annotation data. Can you clear up my confusion? Thank you.

Too many warnings

When I train the network, the following warning is raised:
[W IndexingUtils.h:20] Warning: indexing with dtype torch.uint8 is now deprecated, please use a dtype torch.bool instead. (function expandTensors)
Can you tell me how to fix it?
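If it is the same warning I ran into, it comes from indexing a tensor with a uint8 mask; my workaround (not an official fix from this repo) was simply to cast such masks to bool wherever they are used as an index:

    import torch

    depth = torch.rand(2, 1, 352, 1216)
    mask = (depth > 0).to(torch.uint8)   # a uint8 mask triggers the deprecation warning
    vals = depth[mask.bool()]            # casting the mask to bool silences it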

Running Error in train mode sparse+photo

Hi, Fangchang Ma:
After downloading the required dataset and arranging it in the tree structure shown in the README, I try to run the demo using the command "python main.py --train-mode sparse+photo -b 6". It fails with the following error:

'''
=> output: ../results/mode=sparse+photo.w1=0.1.w2=0.1.input=gd.resnet34.criterion=l2.lr=1e-05.bs=6.wd=0.pretrained=False.jitter=0.1.time=2021-05-10@12-35
Train Epoch: 0 [9290/14317] lr=1e-05 t_Data=0.009(0.008) t_GPU=0.395(0.402)
RMSE=3337.93(10104.38) MAE=1233.75(5881.38) iRMSE=11.95(inf) iMAE=5.93(inf)
silog=11.96(nan) squared_rel=0.02(0.24) Delta1=0.954(0.600) REL=0.066(0.333)
Lg10=0.029(nan) Photometric=39.012(56.714)

Traceback (most recent call last):
File "main.py", line 362, in
main()
File "main.py", line 349, in main
epoch) # train for one epoch
File "main.py", line 171, in iterate
for i, batch_data in enumerate(loader):
File "/home/hsg/data/software/anaconda/envs/hsg/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 435, in next
data = self._next_data()
File "/home/hsg/data/software/anaconda/envs/hsg/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1065, in _next_data
return self._process_data(data)
File "/home/hsg/data/software/anaconda/envs/hsg/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/home/hsg/data/software/anaconda/envs/hsg/lib/python3.6/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 2.
Original Traceback (most recent call last):
File "/home/hsg/data/software/anaconda/envs/hsg/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/home/hsg/data/software/anaconda/envs/hsg/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/hsg/data/software/anaconda/envs/hsg/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/hsg/data/proj/SSDC/self-supervised-depth-completion/dataloaders/kitti_loader.py", line 306, in getitem
rgb, sparse, target, rgb_near = self.getraw(index)
File "/home/hsg/data/proj/SSDC/self-supervised-depth-completion/dataloaders/kitti_loader.py", line 300, in getraw
self.paths['gt'][index] is not None else None
File "/home/hsg/data/proj/SSDC/self-supervised-depth-completion/dataloaders/kitti_loader.py", line 156, in depth_read
depth_png = np.array(img_file, dtype=int)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'PngImageFile'
'''

Could anyone help me solve this problem, or offer some suggestions?

Thanks!

How to test with KITTI object detection dataset

I am trying to use your code with the KITTI object detection dataset.

  1. Generate a depth map by projecting the LiDAR points onto the image plane (roughly as in the sketch right after this list)
  2. Put image_2 and the generated depth maps into the test_depth_completion_anonymous folder
  3. Run with test_completion mode
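For step 1, the projection on my side looks roughly like this (my own sketch, assuming the standard KITTI object-detection calibration, with R0_rect and Tr_velo_to_cam already padded to 4x4 matrices; reading the calib files is omitted):

    import numpy as np

    def lidar_to_depth_map(points, P2, R0_rect, Tr_velo_to_cam, h, w):
        # points: N x 3 Velodyne points; returns an h x w depth map in metres (0 = no return)
        pts = np.hstack([points, np.ones((len(points), 1))]).T   # 4 x N homogeneous
        cam = R0_rect @ Tr_velo_to_cam @ pts                     # rectified camera coordinates
        cam = cam[:, cam[2] > 0]                                 # keep points in front of the camera
        proj = P2 @ cam                                          # 3 x N image-plane coordinates
        u = np.round(proj[0] / proj[2]).astype(int)
        v = np.round(proj[1] / proj[2]).astype(int)
        depth = np.zeros((h, w), dtype=np.float32)
        ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        depth[v[ok], u[ok]] = cam[2, ok]     # last point wins; a min-reduction would handle occlusion
        return depth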

However, I got the following error

Traceback (most recent call last):
  File "main.py", line 248, in <module>
    main()
  File "main.py", line 231, in main
    result, is_best = iterate("test_completion", args, val_loader, model, None, logger, checkpoint['epoch'])
  File "main.py", line 105, in iterate
    pred = model(batch_data)
  File "/data/ssd/public/jlliu/pythonlib/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/ssd/public/jlliu/pythonlib/lib/python2.7/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/data/ssd/public/jlliu/pythonlib/lib/python2.7/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/ssd/public/jlliu/depth_completion/self-supervised-depth-completion/model.py", line 126, in forward
    y = torch.cat((convt5, conv5), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 47 and 48 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:83

Did I miss something?

colorize the depth map

Hello fangchangma,
Would you please tell me how to colorize the depth map as in your paper?
I get a grey depth image and don't know how to compare it with your results, which are colorized like the monodepth results.
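Not the authors' code, but one simple way to get a monodepth-style colour image from a grey depth map is to normalize the valid depths and push them through a matplotlib colormap (a sketch; the colormap choice is arbitrary):

    import numpy as np
    import matplotlib.pyplot as plt

    def colorize_depth(depth, cmap='plasma'):
        # depth: H x W array in metres, 0 where there is no prediction/measurement
        valid = depth > 0
        norm = np.zeros_like(depth, dtype=np.float32)
        if valid.any():
            d = depth[valid]
            norm[valid] = (d - d.min()) / (d.max() - d.min() + 1e-6)
        rgba = plt.get_cmap(cmap)(norm)       # H x W x 4, values in [0, 1]
        rgba[~valid] = 0.0                    # black where depth is missing
        return (rgba[..., :3] * 255).astype(np.uint8)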

Creating the data folder structure

@fangchangma I have two questions

  1. I downloaded the KITTI dataset, which has a folder structure like this:

[screenshot]

Now, do I have to delete the |-- val folder and replace it with the following?

[screenshot]

If my assumption is correct, then the above steps will produce a directory structure like this:

[screenshot]

  2. Where can I get the kitti_rgb data? Do I need to download from here all the drive data that is in the KITTI depth dataset? If yes, do I need to copy the same folder structure (as in the screenshot above) again?

some problem about photometric_loss

Hi,

Thanks for open-sourcing this great piece of work!

I am trying to understand your code and ran into some questions. I see that in main.py, when calculating loss2 (the photometric_loss), you use rgb_curr_ (the image of the current frame), warped_ (the image of the neighbouring frame as predicted from the current frame), and a mask to calculate the photometric_loss. So my questions are:

  1. Is my understanding of rgb_curr_ and warped_ correct? If not, I hope to get your corrections.

  2. Why use the current frame and the predicted neighbouring frame to calculate the photometric_loss, instead of the current frame and a predicted current frame?

  3. In your paper, I saw guidance from RGB images being used for depth prediction. Is using the RGB guidance the only way to calculate the photometric_loss? If not, I hope you can give me your advice.
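To make question 2 concrete, the warping I am referring to is (in my own notation, which may differ from the paper's): given the predicted depth d(p) at pixel p of the current frame, intrinsics K, and the relative pose (R, t) from the current to the neighbouring frame, the corresponding pixel in the neighbouring image is

$$p_{near} \sim K\left(R\, d(p)\, K^{-1} \tilde{p} + t\right),$$

where \tilde{p} is the homogeneous pixel coordinate; warped_ would then be rgb_near sampled at these locations, so the photometric loss compares rgb_curr_(p) against that sampled value.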

I have only just started learning about this topic, so there may be mistakes in the above; I hope you will forgive them.

Thanks for the help!

The dataset used

Hi, I wonder how much data you used: the whole KITTI depth-completion dataset of about 85898 training samples, or just one sequence such as 2011_09_26?
Also, you feed the network images of size 352x1216. Given hardware limits, is it reasonable to downsample to 176x608 or 88x304 before feeding the network?
Is it necessary to do data augmentation? It is not mentioned in your paper.

RuntimeError

When I try to run with the test_completion split, I get the following error:

Traceback (most recent call last):
  File "main.py", line 248, in <module>
    main()
  File "main.py", line 231, in main
    result, is_best = iterate("test_completion", args, val_loader, model, None, logger, checkpoint['epoch'])
  File "main.py", line 105, in iterate
    pred = model(batch_data)
  File "/data/ssd/public/jlliu/pythonlib/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/ssd/public/jlliu/pythonlib/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 141, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/data/ssd/public/jlliu/pythonlib/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/ssd/public/jlliu/self-supervised-depth-completion/model.py", line 126, in forward
    y = torch.cat((convt5, conv5), 1)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 1. Got 47 and 48 in dimension 2 at /pytorch/aten/src/THC/generic/THCTensorMath.cu:83

Any update on the release?

Hi Fangchang,
Wondering whether there is any update on the release? Is it possible to release a partial implementation of the paper, for example training and evaluation for the supervised setting? Also, releasing the best model of your network would be very helpful.

Thanks!

About dataset

Hello,
This is a great project and I am very interested in it, but I found that there is no dataset that can be used directly. Can you share it?
Thank you

Depth from image

How is the depth image generated from the .png files for the Velodyne scans?
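My understanding of the KITTI convention (which the loaders here appear to follow) is that each depth .png stores depth in metres multiplied by 256 as a 16-bit image, with 0 meaning "no measurement". A minimal reader, as a sketch:

    import numpy as np
    from PIL import Image

    def depth_read(filename):
        depth_png = np.array(Image.open(filename), dtype=int)
        assert depth_png.max() > 255, "probably not a 16-bit KITTI depth map"
        depth = depth_png.astype(np.float32) / 256.0   # back to metres
        depth[depth_png == 0] = -1.0                   # mark pixels with no LiDAR return
        return depth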

Clip output in model.py

Hey,

Why are you clipping the output of the model during eval (0.9m) but not during training?

Thanks

AttributeError

'Namespace' object has no attribute 'data_folder'

The above error occurred during testing.

What should I fix?

A question about '6.4 On Input Sparsity' in your ICRA paper

Hello!

Thank you for your great work. When I read the paper, I had a question about Section 6, On Input Sparsity.

In Figure 6 you show that, when training with the self-supervised framework, 'using both RGB and sparse depth yields the same level of accuracy as using sparse depth only'.

Could you tell me if the following guess is correct?

When we have only the sparse LiDAR input, we have only the depth loss and the smoothness loss during training, and the network architecture in Figure 2 has only the 32-channel LiDAR input branch.

If my guess is right, the input in your paper degenerates to the same setting as Sparsity Invariant CNNs (LiDAR only), yet your network gets better results in this case. So how do you show that the improvement comes from the self-supervised framework or the photometric loss, and not from your network simply being better suited to LiDAR-only input?

Thank you for your help!

Error running main.py

When running main.py, I get the following error in line 262

Error message: Invalid syntax end='')
I'm not sure why it doesn't like the end='') part of the code
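If it is the same thing I hit, my guess (only a guess) is that the script is being run with Python 2, where print is a statement and the end= keyword form is a syntax error; the snippet below parses on Python 3 but fails on Python 2 unless the __future__ import is added:

    # Fine on Python 3. On Python 2 the first line is required, otherwise
    # `print("...", end='')` raises "SyntaxError: invalid syntax".
    from __future__ import print_function

    print("Epoch 0 ", end='')
    print("done")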

dataset extracting

The scripts in the download folder only extract data_rgb; how can I extract the other files?

Tools for estimating cam pose

Hi,

Great work! I wonder which tools you are using to solve PnP with RANSAC to estimate the camera pose. Could you please provide a snippet of that part of the code?

Data augmentation and normalization

Your paper is really interesting, and I have tried implementing the network following your descriptions there.

For the supervised network, I was wondering if you have used any data augmentation during training. If so, what kind? Also, do you normalize the depths in any way (both the input and the ground truth)?

Use your pretrained model: GPU runs out of memory, 8.95 GB already allocated

Hey,
I just want to use your pretrained model and create some results (on val_selection_cropped). For some strange reason I get the following error:

RuntimeError: CUDA out of memory. Tried to allocate 210.00 MiB (GPU 0; 10.91 GiB total capacity; 8.95 GiB already allocated; 194.06 MiB free; 9.59 GiB reserved in total by PyTorch)

How can I avoid this? When I am not running your code, the GPU usage is low (approx. 500 MB). It seems strange that an 11 GB GPU is not enough for your code.

I use the following command:

python3 main.py --evaluate /home/username/Downloads/model_best.pth.tar --val select

I have nothing changed in any file.
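One thing that helped on my side (my own change; I have not checked whether the repo already does this): wrapping the evaluation forward pass in torch.no_grad(), so no activations are kept around for backprop. A self-contained illustration:

    import torch
    import torch.nn as nn

    model = nn.Conv2d(3, 1, 3, padding=1).cuda().eval()   # stand-in for the depth network
    batch = torch.rand(1, 3, 352, 1216).cuda()

    with torch.no_grad():        # activations are not stored, so peak memory drops a lot
        pred = model(batch)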
