MLF-VO

This repo implements the network described in the ICRA 2022 paper:

Self-Supervised Ego-Motion Estimation Based on Multi-Layer Fusion of RGB and Inferred Depth

Zijie Jiang, Hajime Taira, Naoyuki Miyashita and Masatoshi Okutomi

Please also visit our project page.

If you find our work useful for your research, please consider citing the following paper:

@inproceedings{jiang2022mlfvo,
  title={Self-Supervised Ego-Motion Estimation Based on Multi-Layer Fusion of RGB and Inferred Depth},
  author={Jiang, Zijie and Taira, Hajime and Miyashita, Naoyuki and Okutomi, Masatoshi},
  booktitle={2022 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2022}
}

Updates

  • 2022.05.20. Pretrained weights and a demo for testing are now available.

Requirements

Our experiments were conducted on a machine running Ubuntu 18.04, PyTorch 1.7.1, and CUDA 10.1. You can install the other required packages with:

pip install -r requirements.txt

Since our demo partly depends on Monodepth2 and CEN, clone this repository with the --recurse-submodules option. Please also refer to their original repositories.

Prepare dataset

Please download the KITTI odometry benchmark from the official site and create a soft link to it by executing the following command:

ln -s /path/to/dataset/data_odometry_color/dataset data
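
As a quick sanity check after creating the link, the sketch below verifies that each sequence folder sits where the standard KITTI odometry color layout puts it (sequences/<id>/image_2). The helper name `missing_sequences` and the choice of the left color camera folder `image_2` are assumptions made for illustration; they are not part of this repository.

```python
from pathlib import Path


def missing_sequences(data_root, sequence_ids, camera="image_2"):
    """Return the sequence ids whose image folder is absent under data_root."""
    root = Path(data_root)
    missing = []
    for seq in sequence_ids:
        # Standard KITTI odometry layout: <root>/sequences/<seq>/<camera>/
        if not (root / "sequences" / seq / camera).is_dir():
            missing.append(seq)
    return missing


if __name__ == "__main__":
    # "data" is the soft link created above; 09 and 10 are example ids.
    print(missing_sequences("data", ["09", "10"]))
```

An empty list means every requested sequence was found; any id that is printed points to a wrong link target or an incomplete download.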

Odometry evaluation

Download the pretrained model weights and place them under data as follows:

wget http://www.ok.sc.e.titech.ac.jp/res/MLF-VO/models.zip
unzip -d data models.zip
rm models.zip

The demo reads its paths and settings from configs/run_odometry.yaml. Please modify lines 2, 6, or 9 of that file to use your own paths to the dataset, model weights, or outputs, respectively.

python run_odometry.py -c configs/run_odometry.yaml

The estimated trajectory will be exported to data/outputs/09.txt. Each line in the file represents the estimated camera pose for the corresponding frame in the sequence, in the following format:

T11 T12 T13 T14 T21 T22 T23 T24 T31 T32 T33 T34
...

where T11, T12, ... , T34 are the elements of the 3x4 camera transformation matrix (from the camera to the world coordinate system), stored in row-major order:

T11 T12 T13 T14
T21 T22 T23 T24
T31 T32 T33 T34
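
To make the format concrete, here is a minimal sketch of reading such a file in plain Python. The function names (`parse_pose_line`, `camera_position`, `load_trajectory`) are hypothetical helpers, not part of this repository. Because the matrix maps camera coordinates to world coordinates, its last column (T14, T24, T34) is the camera position in the world frame.

```python
def parse_pose_line(line):
    """Parse 12 whitespace-separated floats into a 4x4 row-major matrix."""
    values = [float(v) for v in line.split()]
    if len(values) != 12:
        raise ValueError(f"expected 12 values, got {len(values)}")
    rows = [values[0:4], values[4:8], values[8:12]]
    # Append the homogeneous row so poses can be composed or inverted.
    rows.append([0.0, 0.0, 0.0, 1.0])
    return rows


def camera_position(pose):
    """Camera position in world coordinates: the last column of the pose."""
    return [pose[0][3], pose[1][3], pose[2][3]]


def load_trajectory(path):
    """Read one pose per non-empty line of a trajectory file."""
    with open(path) as f:
        return [parse_pose_line(line) for line in f if line.strip()]
```

For example, `load_trajectory("data/outputs/09.txt")` would return one 4x4 matrix per frame, and applying `camera_position` to each gives the estimated camera path.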

We recommend using this toolbox to evaluate the inferred trajectory.

Training

Please check out the dev branch and run:

python train.py --data_path ~/Documents/datasets/kitti_odom/dataset --log_dir ~/Documents/MLF-VO/KITTI --model_name test --num_epochs 20 --dataset kitti_odom --split odom

Please change line 55 in trainer.py to select the desired PoseNet type:

self.models['pose'] = posenet_type_dict[config_dict.model.type]()
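
The line above picks a PoseNet class by looking up a type name in a dictionary and instantiating the result. The following is a minimal sketch of that pattern only. The classes `RgbPoseNet` and `FusionPoseNet`, the keys "rgb" and "fusion", and the helper `build_posenet` are placeholders invented for illustration; the actual classes and keys are defined in the dev branch.

```python
from types import SimpleNamespace


class RgbPoseNet:
    """Placeholder for one pose network variant (hypothetical)."""


class FusionPoseNet:
    """Placeholder for another pose network variant (hypothetical)."""


# Hypothetical registry; the real keys and classes live in the repository.
posenet_type_dict = {"rgb": RgbPoseNet, "fusion": FusionPoseNet}


def build_posenet(config_dict):
    """Look up the class named by config_dict.model.type and instantiate it."""
    model_type = config_dict.model.type
    if model_type not in posenet_type_dict:
        raise KeyError(f"unknown PoseNet type: {model_type!r}")
    return posenet_type_dict[model_type]()


# Stand-in for the parsed configuration object used by the trainer.
config_dict = SimpleNamespace(model=SimpleNamespace(type="fusion"))
models = {"pose": build_posenet(config_dict)}
```

With this kind of registry, switching the pose network means changing the type value in the configuration (or the key used on that line) rather than editing the model construction code.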

Acknowledgement

We are grateful to the authors of Monodepth2 and CEN for publicly sharing their code.

License

MLF-VO is released under the GPLv3 License.
