
deltar's Introduction

DELTAR: Depth Estimation from a Light-weight ToF Sensor And RGB Image
Yijin Li, Xinyang Liu, Wenqi Dong, Han Zhou, Hujun Bao, Guofeng Zhang, Yinda Zhang, Zhaopeng Cui
ECCV 2022

Demo Video

Download Link

We provide download links [Google Drive, Baidu (code: 1i11)] for:

  • the pretrained model trained on NYU;
  • the ZJUL5 dataset;
  • the demo data.

Run DELTAR

Installation

conda create --name deltar --file requirements.txt

Prepare the data and pretrained model

Download from the above link, and place the data and model as below:

deltar
├── data
│   ├── demo
│   └── ZJUL5
└── weights
    └── nyu.pt

Evaluate on ZJUL5 dataset

python evaluate.py configs/test_zjuL5.txt

Run the demo

python evaluate.py configs/test_demo.txt
python scripts/make_gif.py --data_folder data/demo/room --pred_folder tmp/room

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{deltar,
  title={DELTAR: Depth Estimation from a Light-weight ToF Sensor and RGB Image},
  author={Li, Yijin and Liu, Xinyang and Dong, Wenqi and Zhou, Han and Bao, Hujun and Zhang, Guofeng and Zhang, Yinda and Cui, Zhaopeng},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2022}
}

Acknowledgements

We would like to thank the authors of Adabins, LoFTR and Twins for open-sourcing their projects.

deltar's People

Contributors

eugenelyj


deltar's Issues

discretize the distribution by sampling depth hypotheses

Hello professor, your work has really enlightened me. However, I have a question. In Section 4.1 of your paper, you write "discretize the distribution by sampling depth hypotheses". Could you please explain what that means, or how it is done? Thank you!
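One common way to discretize a continuous per-zone depth distribution is to evaluate it at a fixed set of depth hypotheses and renormalize. A minimal sketch of that idea, assuming a Gaussian per zone (this is an illustration, not the authors' actual implementation; all names and defaults here are assumptions):

```python
import numpy as np

def discretize_zone_distribution(mean, std, d_min=0.0, d_max=4.0, n_hyp=16):
    """Turn a zone's continuous (Gaussian) depth distribution into a
    discrete one by sampling depth hypotheses and normalizing.

    The sensor reports each zone's depth as a mean/variance pair; here
    we model it as a Gaussian, evaluate its density at n_hyp uniformly
    spaced depth hypotheses, and renormalize so the weights sum to 1.
    """
    hypotheses = np.linspace(d_min, d_max, n_hyp)
    density = np.exp(-0.5 * ((hypotheses - mean) / std) ** 2)
    weights = density / density.sum()
    return hypotheses, weights

hyps, w = discretize_zone_distribution(mean=1.5, std=0.1)
```

The discrete weights can then serve as a prior over candidate depths for each zone.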

Problem reproducing the test demo

evaluate.py: error: unrecognized arguments: --checkpoint_path /home/user/fyy/deltar/checkpoints/nyu.pt
Is this argument supposed to load the trained weights? Why is it not recognized?

Capture system for L5

Hi,
Thanks for the great work! I'm curious how to access the histogram data from the L5 sensor. I would really appreciate it if you could share your capture system :)

Regards.

About data augmentation

Hello, I noticed that your data augmentation seems to include only ToTensor. Are the other augmentations intentionally left out?

Different results when I train my own model & bugs

Thank you for the nice work. The evaluation and demo with your weights and configuration work as in the paper. I have installed the library versions provided in requirements.txt.

My problem now is that I want to adapt the model (e.g. shrink it) to fit my purpose, so I have to train my own models. I tried to train the entire DELTAR model as you provide it, using 25k NYU images. I tweaked the code here and there to fit my data and applied the two additional remarks below. In this setting I trained for 50 epochs, but I have also trained for 25 epochs with the settings provided in the configs for NYU. I then get the following results for validation:
[validation metrics image]

and training loss:
[training loss curve image]

Images from the training set look good, though:
[training set predictions image]

but it does not generalize that well:
[generalization example image]
My questions now are:

Additional remarks:

  • I found that the PointNet parameters were not trained in the network before, as they are not added to the optimizer's parameter list in
    def get_10x_lr_params(self):  # lr learning rate
        modules = [self.decoder, self.depth_head, self.conv_out]
  • In the NYU dataloader in "train" mode, the image should be normalized to [0, 1] (image = np.array(image, dtype=np.float32) / 255.0) before the random_crop(...) and train_preprocess(...) augmentations, as it is clipped to [0, 1] there.
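The second remark above can be sketched as follows (function and argument names are illustrative, not the repo's actual API): normalize the uint8 image to [0, 1] before any augmentation that clips to [0, 1], so the clipping does not destroy the values.

```python
import numpy as np

def load_train_image(image_uint8):
    """Illustrative sketch of the suggested loader order: convert the
    uint8 image to float32 in [0, 1] *before* augmentations that clip
    to [0, 1]; without this, clipping a 0..255 image would saturate
    almost every pixel to 1.0.
    """
    image = np.array(image_uint8, dtype=np.float32) / 255.0
    # ... random_crop(image, ...) and train_preprocess(image, ...)
    # would run here; clipping to [0, 1] is now harmless.
    return np.clip(image, 0.0, 1.0)
```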

cannot make the same result as the repo example image

Hello,
First, thanks for sharing your work.

I attempted to reproduce the result shown in the example image in readme.md, but there is a big difference in quality.

I used the provided weights, data, and config. Should I use different parameters to reproduce the result?

I attached the result that I got:
[attached: demo_cut]
Please note that the file is compressed due to file size limits.

About fine-tuning the model

Your work is really meaningful. I have a question: I noticed in the paper that the model is finally fine-tuned on the ZJU-L5 data, which is very small (only about 1000 images in total). Doesn't fine-tuning on such a small dataset overfit? I am also fine-tuning a monocular depth estimation model: I mixed several public datasets (about 200k images) to train a model, then fine-tuned it on about 2000 images I collected myself, and it overfit. I am not sure whether the cause was the small amount of data or something else. If possible, could you share more fine-tuning details, e.g. some example fine-tuning code?

About the ToF data

Hello, I found that the ToF data simulated during training is the distance from each pixel to the camera plane, but the data produced by a real ToF sensor is the distance from each pixel to the camera center. When processing the ToF data, do you convert the raw ToF measurements into distances relative to the camera plane?
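The conversion asked about here is standard pinhole geometry. A generic sketch (not the repo's actual preprocessing code) that converts an along-ray distance to plane depth, given assumed intrinsics fx, fy, cx, cy:

```python
import math

def ray_distance_to_plane_depth(d, u, v, fx, fy, cx, cy):
    """Convert a per-pixel distance measured along the viewing ray
    (pixel to camera center) into depth relative to the camera plane
    (the z coordinate), using pinhole intrinsics.
    """
    x = (u - cx) / fx  # normalized ray direction components
    y = (v - cy) / fy
    # The ray through pixel (u, v) has direction (x, y, 1); its length
    # is sqrt(x^2 + y^2 + 1), so z = d / |ray direction|.
    return d / math.sqrt(1.0 + x * x + y * y)
```

At the principal point (u = cx, v = cy) the ray coincides with the optical axis, so the two distances agree; toward the image corners the plane depth is strictly smaller than the ray distance.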

About the ToF input

Hello, is the ToF input normalized? Looking at the code, it seems to still be the raw depth value?

About the --simu_max_distance parameter

Thanks for your work!
In my experiments I found that the histogram construction uses --simu_max_distance=4, while --max_depth=10.
Could you explain why it is set this way, and what the relationship between these two parameters is?

Setting up conda environment fails

conda create --name deltar --file requirements.txt does not work for me, as many packages cannot be found: PackagesNotFoundError: The following packages are not available from current channels:
I installed conda on WSL with Ubuntu 22.04.
Which conda channels do you use?

Can I retrain the same algorithm for a different resolution ToF sensor?

Hi authors, awesome paper! My question is essentially the one in the subject line: I have a ToF sensor with a depth image resolution of 224x172. Can I retrain the algorithm for this sensor and use it for real-time RGB-D mapping, given that the algorithm produces super-resolved depth images in real time?

Open-source release

Your work is really meaningful. When do you plan to open-source the code?

Code Release Date

Hi,
Congratulations on the amazing work! When do you plan to release the code in public?

About obtaining the selected NYU Depth data

Great work! I would like to ask how to obtain the selected subset of the dataset described in the paper (is it available on Kaggle?). I want to further build my own dataset for training. Thanks!

About simulating ToF data from RGB-D data

Hello, I see that during training you use RGB-D data to simulate ToF data. Could you explain how the ToF data is derived from the RGB-D data? Thanks for your help.

Also, can your DELTAR algorithm be extended to other ToF resolutions, e.g. 30x40 instead of 8x8? What changes would be needed?

Thanks, looking forward to your reply.
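Simulating a zone-level ToF reading from a dense RGB-D depth map can be sketched as a grid-and-summarize operation. This is a generic illustration with assumed function and parameter names, not the repo's actual simulation code (which additionally builds per-zone histograms):

```python
import numpy as np

def simulate_tof_zones(depth, zones=8, max_distance=4.0):
    """Split a dense depth map into a zones x zones grid and summarize
    each cell's valid depths by their mean and standard deviation,
    mimicking a low-resolution ToF sensor's per-zone output.
    Depths of 0 or beyond max_distance are treated as out of range.
    """
    h, w = depth.shape
    means = np.zeros((zones, zones))
    stds = np.zeros((zones, zones))
    for i in range(zones):
        for j in range(zones):
            cell = depth[i * h // zones:(i + 1) * h // zones,
                         j * w // zones:(j + 1) * w // zones]
            valid = cell[(cell > 0) & (cell <= max_distance)]
            if valid.size:
                means[i, j] = valid.mean()
                stds[i, j] = valid.std()
    return means, stds
```

Under this view, supporting a different sensor resolution such as 30x40 would mainly mean changing the grid shape, plus retraining so the network sees the new zone layout.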
