
CoMoDA: Continuous Monocular Depth Adaptation Using Past Experiences

Official pytorch repository for the depth adaptation method described in
CoMoDA: Continuous Monocular Depth Adaptation Using Past Experiences
By Yevhen Kuznietsov, Marc Proesmans and Luc Van Gool at ESAT-PSI, KU Leuven

Environment setup

The main prerequisites are Python 3 and PyTorch 1.3.
For conda users run:

conda create --name <env_name> --file requirements.txt

The code might work with other package versions as well.

Data preparation

The method expects the following data format:

|-- <data_path>
    |-- <sequence_1>
        |-- %10d.png
        |-- %10d_kin.txt
        |-- %10d_<segsuffix>.png
    ...
    |-- <sequence_n>

Where <data_path> is the root directory for the data;

<sequence> is the name of a sequence / scene, as listed in test_scenes.txt;

%10d.png is a triplet of consecutive sequence frames stitched horizontally. The triplet index corresponds to the index of the central frame;

%10d_kin.txt is a text file containing 4 values: the velocity at the previous frame, the time between the previous and current frames, the velocity at the current frame, and the time between the current and next frames. It is possible to run the adaptation without velocity supervision;

%10d_<segsuffix>.png is a triplet of binary masks for possibly moving objects, corresponding to the respective RGB triplet. These masks can be generated using either off-the-shelf semantic segmentation or object detection models (bounding-box level is enough; see the sketch below). We tested the bounding boxes generated by Mask R-CNN and YOLOv5m. It is possible to run the adaptation without semantic information.
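
A minimal sketch of rasterizing detector output into such masks; the function name and the (x1, y1, x2, y2) box format are assumptions, not part of the official code:

import numpy as np
from PIL import Image

def boxes_to_mask(boxes, height, width):
    """Rasterize bounding boxes of possibly moving objects into a binary mask."""
    mask = np.zeros((height, width), dtype=np.uint8)
    for x1, y1, x2, y2 in boxes:
        mask[int(y1):int(y2), int(x1):int(x2)] = 255
    return mask

# Example: one 375x1242 KITTI frame with a single detected car.
# The three per-frame masks would then be stitched horizontally,
# just like the RGB triplet.
mask = boxes_to_mask([(100, 180, 300, 280)], 375, 1242)
Image.fromarray(mask).save("mask.png")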

The script for preparing the data will be uploaded soon.
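
Until then, the following minimal sketch illustrates the expected layout; the helper names, the zero-padded 10-digit indices, and the whitespace separating the kin.txt values are assumptions:

from PIL import Image

def make_triplet(frame_paths, idx, out_dir):
    """Stitch frames idx-1, idx, idx+1 horizontally; the output file is
    named after the index of the central frame."""
    imgs = [Image.open(frame_paths[i]) for i in (idx - 1, idx, idx + 1)]
    w, h = imgs[0].size
    triplet = Image.new("RGB", (3 * w, h))
    for k, img in enumerate(imgs):
        triplet.paste(img, (k * w, 0))
    triplet.save(f"{out_dir}/{idx:010d}.png")

def write_kin(idx, v_prev, dt_prev, v_cur, dt_next, out_dir):
    """Write the 4 values of %10d_kin.txt in the order described above."""
    with open(f"{out_dir}/{idx:010d}_kin.txt", "w") as f:
        f.write(f"{v_prev} {dt_prev} {v_cur} {dt_next}\n")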

Running the method

Due to licensing issues, running CoMoDA is currently a bit tricky. We hope to resolve them in the near future.

Follow these steps:

  1. Make the third-party modules layers and networks available (the implementation of the latter may vary, e.g., if a different architecture is used);
  2. Download the pre-trained depth estimation model (e.g., Monodepth2). In the future, we might provide the weights for the pre-trained scale-aware modification of Monodepth2 from the paper;
  3. Run
python CoMoDA.py --data_path <data_root_dir> --seq_file <seq_file> --seq_dir <seq_dir> --buf_path <buf_path> --log_dir <log_dir> --load_weights_folder <path_to_pretrained_model> --png

Where <seq_file> is the path to the file listing the videos or sequences to perform adaptation on (e.g., test_scenes.txt);

<seq_dir> is a directory containing, for every adapted sequence or video, a file with the ordered frame list (e.g., test_seqs);

<buf_path> is the path to the file with the pre-generated list of samples for experience replay (e.g., rb_1234.txt);

<log_dir> is the directory to write logs to; the predictions will be saved to <log_dir>/preds.

The core of the method is model-agnostic, so the code can be modified for adapting any reasonable depth estimation model.
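
For orientation, the adaptation loop roughly follows the pattern sketched below; replay_buffer, loss_fn, and the batch layout are hypothetical stand-ins, not the actual interfaces in CoMoDA.py:

import torch

def adapt(model, optimizer, video_loader, replay_buffer, loss_fn):
    """Continual adaptation: one gradient step per incoming frame triplet,
    on a batch mixing the current sample with replayed past experiences."""
    model.train()
    predictions = []
    for sample in video_loader:               # triplets arrive in video order
        batch = replay_buffer.mix(sample)     # current sample + replayed ones
        optimizer.zero_grad()
        loss_fn(model, batch).backward()      # self-supervised loss
        optimizer.step()
        with torch.no_grad():                 # predict with the updated weights
            predictions.append(model(sample["image"]))
    return predictions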

Evaluation on Eigen split of KITTI

  1. Run
python seqpred2eigen.py <sequence_list_file> <predictions_dir> <output_file_path>

    to generate inverse depths for the Eigen test set,

    where <sequence_list_file> is the path to the file containing the list of videos or sequences adaptation was performed on (e.g., test_scenes.txt);

    <predictions_dir> is the directory with the predictions produced by the method;

    <output_file_path> is the desired path and file name for the depth predictions compatible with the Eigen test set (e.g., "eigen.npz").

  2. Run the evaluation script from Monodepth2 with the file from the previous step (e.g., eigen.npz) as input.
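
Before running the evaluation, the generated archive can be sanity-checked with NumPy; the key name inside the .npz depends on how seqpred2eigen.py saves it, so this sketch reads it from the archive rather than assuming it:

import numpy as np

archive = np.load("eigen.npz")       # file produced by seqpred2eigen.py
print(archive.files)                 # key(s) stored in the archive
disps = archive[archive.files[0]]    # inverse depth maps, one per test image
print(disps.shape)                   # the Eigen test split has 697 images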

Results on Eigen split of KITTI

The numbers for CoMoDA were obtained using a ResNet18 backbone and an image resolution of 640x192 px.

* means that no median rescaling is applied. Ref indicates whether any kind of test-time model update is performed.

Citation

If you find the depth adaptation method useful, please consider citing:

@InProceedings{comoda,
  author    = {Kuznietsov, Yevhen and Proesmans, Marc and Van Gool, Luc},
  title     = {CoMoDA: Continuous Monocular Depth Adaptation Using Past Experiences},
  booktitle = {Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  month     = {January},
  year      = {2021},
  pages     = {2907-2917}
}

License

This work is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License.

Acknowledgements

This work was supported by Toyota, and was carried out at the TRACE Lab at KU Leuven (Toyota Research on Automated Cars in Europe - Leuven).
