
Multi-resolution Monocular Depth Map Fusion by Self-supervised Gradient-based Composition

This repository contains code and models for our paper:

[1] Yaqiao Dai, Renjiao Yi, Chenyang Zhu, Hongjun He, Kai Xu, Multi-resolution Monocular Depth Map Fusion by Self-supervised Gradient-based Composition, AAAI 2023

Changelog

  • [November 2022] Initial release of code and models

Setup

  1. Download the code.
 git clone https://github.com/YuiNsky/Gradient-based-depth-map-fusion.git
 cd Gradient-based-depth-map-fusion
  2. Set up dependencies.

    2.1 Create the conda virtual environment.

    conda env create -f GBDF.yaml
    conda activate GBDF

    2.2 Install PyTorch in the virtual environment.

    pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 torchaudio==0.9.1 -f https://download.pytorch.org/whl/torch_stable.html
  3. Download the fusion model model_dict.pt and place it in the folder models.

  4. Download one or more pretrained backbone models (see the directory layout sketch after this list).

    LeRes: res50.pth or res101.pth, place in the folder LeRes.

    DPT: dpt_hybrid-midas-501f0c75.pt, place in the folder dpt/weights.

    SGR: model.pth.tar, place in the folder SGR.

    MiDaS: model.pt, place in the folder MiDaS.

    NeWCRFs: model_nyu.ckpt, place in the folder newcrfs.

  5. The code was tested with Python 3.8, PyTorch 1.9.1, and OpenCV 4.6.0.
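
After the downloads, the repository should look roughly like this (a sketch assuming all backbones are installed, with the folder names listed above):

    Gradient-based-depth-map-fusion/
    ├── models/
    │   └── model_dict.pt
    ├── LeRes/
    │   ├── res50.pth
    │   └── res101.pth
    ├── dpt/
    │   └── weights/
    │       └── dpt_hybrid-midas-501f0c75.pt
    ├── SGR/
    │   └── model.pth.tar
    ├── MiDaS/
    │   └── model.pt
    └── newcrfs/
        └── model_nyu.ckpt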

Usage

  1. Place one or more input images in the folder input.

  2. Run our model with a monocular depth estimation method:

    python run.py -p LeRes50
  3. The results are written to the folder output; each result combines the input image, the backbone prediction, and our prediction.

Use the flag -p to switch between different backbones. Possible options are LeRes50 (default), LeRes101, SGR, MiDaS, DPT, and NeWCRFs.
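
To run the inputs through every backbone in one go, here is a shell sketch built only on the documented run.py flag (it assumes all pretrained models from the setup step are in place):

    for p in LeRes50 LeRes101 SGR MiDaS DPT NeWCRFs; do
        python run.py -p $p
    done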

Evaluation

Our evaluation covers three published high-resolution datasets: Multiscopic, Middlebury 2021, and Hypersim.

To evaluate our model on Multiscopic, you can download the dataset here. Download the test dataset, rename it multiscopic, and place it in the folder datasets.

To evaluate our model on Middlebury 2021, you can download the dataset here. Unzip it, rename it 2021mobile, and place it in the folder datasets.

To evaluate our model on Hypersim, you can download the whole dataset here. We also provide the evaluation subsets hypersim; download the subsets and place them in the folder datasets.

Then you can evaluate our fusion model with a specified monocular depth estimation method and dataset:

python eval.py -p LeRes50 -d middleburry2021

Use the flag -p to switch between different backbones. Possible options are LeRes50 (default), SGR, MiDaS, DPT and NeWCRFs.

Use the flag -d to switch between different datasets. Possible options are middleburry2021 (default), multiscopic and hypersim.
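
To sweep every documented backbone and dataset combination, here is a shell sketch using only the eval.py flags listed above (it assumes all three datasets are prepared in the folder datasets):

    for d in middleburry2021 multiscopic hypersim; do
        for p in LeRes50 SGR MiDaS DPT NeWCRFs; do
            python eval.py -p $p -d $d
        done
    done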

Training

Our model was trained with the LeRes backbone on the HR-WSI dataset. We preprocess the dataset with a guided filter and select high-quality samples as training data based on Canny edge detection (see the sketch below). You need to download the preprocessed dataset HR and place it in the folder datasets.
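
The preprocessing script itself is not included in this repository, but the described pipeline can be sketched as follows. This is an illustration only: it assumes OpenCV's contrib guidedFilter, and the radius, eps, Canny thresholds, and edge-ratio acceptance test are placeholder choices, not the authors' settings.

    import cv2
    import numpy as np

    # Illustrative sketch of the described preprocessing: smooth each HR-WSI
    # depth map with a guided filter (guided by the RGB image), then keep only
    # samples whose filtered depth still shows crisp edges under Canny.
    # All parameter values below are placeholders, not the authors' settings.
    def preprocess(rgb_path, depth_path, edge_ratio_min=0.01):
        rgb = cv2.imread(rgb_path)
        depth = cv2.imread(depth_path, cv2.IMREAD_GRAYSCALE)
        # Edge-preserving smoothing: the RGB image guides the depth map.
        filtered = cv2.ximgproc.guidedFilter(guide=rgb, src=depth, radius=8, eps=1e-2)
        # Quality check: fraction of pixels that Canny marks as depth edges.
        edges = cv2.Canny(filtered, 50, 150)
        ratio = float(np.count_nonzero(edges)) / edges.size
        return filtered if ratio >= edge_ratio_min else None  # drop low-quality samples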

Then you can train the fusion model on a GPU:

python train.py

Citation

Please cite our paper if you use this code.

@article{dai2022multi,
  title={Multi-resolution Monocular Depth Map Fusion by Self-supervised Gradient-based Composition},
  author={Dai, Yaqiao and Yi, Renjiao and Zhu, Chenyang and He, Hongjun and Xu, Kai},
  journal={arXiv preprint arXiv:2212.01538},
  year={2022}
}

License

MIT License


Issues

Link to paper

Looks very good! I would love to read the paper. Do you have a link?

ZoeDepth

It would be cool if you added the latest SOTA, https://github.com/isl-org/ZoeDepth, to this repo.
I've tried it myself but failed to run ZoeDepth with different resolutions (it seems the image is always resized to the same size somewhere inside the model).

About the D3R and ORD metrics

Is there any possibility of releasing the D3R and ORD evaluation code? I tried using the code directly from previous work and found that the results were not consistent.

Work with omnidata normal/depth DPT network weights?

Excuse my ignorance, but I would love to use the weights from omnidata; my understanding is that they use the same hybrid DPT network architecture to generate high-resolution normal and depth maps.

I see that they use PyTorch Lightning, which would require converting the ckpt models to .pt (a conversion sketch follows below), but my bigger question is: would this also work for the normal estimation network? I don't fully understand how this model works, so I wanted to ask. Thanks!
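
For reference, converting a PyTorch Lightning checkpoint to a plain .pt is usually just a matter of extracting and re-keying its state_dict. A minimal sketch, assuming the typical Lightning layout with a 'model.' key prefix (the actual key layout of the omnidata checkpoints, and the file names used here, may differ):

    import torch

    # Load the Lightning checkpoint and pull out the raw weights.
    ckpt = torch.load("omnidata_dpt_depth.ckpt", map_location="cpu")
    state_dict = ckpt.get("state_dict", ckpt)
    # Strip the Lightning module prefix so keys match the bare DPT network.
    cleaned = {k.replace("model.", "", 1): v for k, v in state_dict.items()}
    torch.save(cleaned, "omnidata_dpt_depth.pt")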

Evaluation on KITTI

Hi,

I am trying to reproduce the KITTI experiment from your paper to compare to in my upcoming paper.
A couple of questions:

  1. Did you use the same fusion model (the one trained with LeRes on the HR dataset linked in this GitHub repo) on all datasets in the paper, including KITTI?
  2. Can you share the specific details of the KITTI experiment (the low-res and high-res input dimensions to the fusion net, and the scale-shift alignment method used for evaluation; a sketch of the standard variant follows below)? Also, if you could share the evaluation script for KITTI, it would be much appreciated.

Thanks!
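
For context, the scale-shift alignment commonly used in affine-invariant depth evaluation (e.g. in MiDaS) is a closed-form least-squares fit; whether this repository uses exactly that variant is part of what is being asked. A numpy sketch:

    import numpy as np

    # Least-squares scale-and-shift alignment of a predicted depth map to
    # ground truth over valid pixels: solve min over (s, t) of
    # || s * pred[mask] + t - gt[mask] ||^2, then apply (s, t) everywhere.
    def align_scale_shift(pred, gt, mask):
        p, g = pred[mask], gt[mask]
        A = np.stack([p, np.ones_like(p)], axis=1)  # design matrix [pred, 1]
        (s, t), *rest = np.linalg.lstsq(A, g, rcond=None)
        return s * pred + t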
