rel_pose's Introduction

The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs (3DV 2022)

Chris Rockwell, Justin Johnson and David F. Fouhey

Project Website | Paper | Supplemental

Overview

We propose three small modifications to a ViT via the Essential Matrix Module, enabling computations similar to the Eight-Point algorithm. The resulting mix of visual and positional features is a good inductive bias for pose estimation.
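For readers unfamiliar with the Eight-Point algorithm mentioned above, here is a minimal NumPy sketch of the classical version. It is not the Essential Matrix Module from this repository; it only illustrates the computation the module is designed to resemble. The function name `eight_point_essential` is hypothetical, and the sketch assumes correspondences are already in normalized image coordinates (intrinsics removed) and skips Hartley normalization for brevity.

```python
import numpy as np


def eight_point_essential(x1, x2):
    """Estimate an essential matrix E such that x2_h^T @ E @ x1_h = 0.

    x1, x2: (N, 2) arrays of matching points in normalized image
    coordinates, N >= 8. This is the textbook algorithm, not this repo's code.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    if x1.shape != x2.shape or x1.ndim != 2 or x1.shape[1] != 2:
        raise ValueError("x1 and x2 must both have shape (N, 2)")
    if x1.shape[0] < 8:
        raise ValueError("need at least 8 correspondences")

    u1, v1 = x1[:, 0], x1[:, 1]
    u2, v2 = x2[:, 0], x2[:, 1]
    ones = np.ones_like(u1)

    # Each row is kron(x2_h, x1_h), so that A @ E.ravel() = 0 (row-major E).
    A = np.stack(
        [u2 * u1, u2 * v1, u2, v2 * u1, v2 * v1, v2, u1, v1, ones], axis=1
    )

    # Least-squares null vector: right singular vector of the smallest
    # singular value.
    _, _, vt = np.linalg.svd(A)
    E = vt[-1].reshape(3, 3)

    # Project onto the set of essential matrices: two equal singular
    # values and one zero singular value.
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt
```

The result is defined only up to scale and sign, which is why relative translation recovered from an essential matrix has no absolute magnitude.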

Installation and Demo

Anaconda install:

  • You'll need to build lietorch using C++; the C++ compiler version needs to be compatible with the version of g++ that built PyTorch.
  • My setup steps on a GTX 1080 Ti (gcc 8.4.0, CUDA 10.2):

Download & extract pretrained models replicating paper results:

wget https://fouheylab.eecs.umich.edu/~cnris/rel_pose/modelcheckpoints/pretrained_models.zip --no-check-certificate
unzip pretrained_models.zip

Demo script to predict pose on arbitrary image pair:

python demo.py --img1 demo/matterport_1.png --img2 demo/matterport_2.png --ckpt pretrained_models/matterport.pth
python demo.py --img1 demo/interiornet_t_1.png --img2 demo/interiornet_t_2.png --ckpt pretrained_models/interiornet_t.pth
python demo.py --img1 demo/streetlearn_t_1.png --img2 demo/streetlearn_t_2.png --ckpt pretrained_models/streetlearn_t.pth

Script to generate epipolar lines (modify the pose and image pair inside the script):

python generate_epipolar_imgs.py
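As background for what an epipolar-line script computes, the sketch below derives the epipolar line in the second image for a pixel in the first, given a relative pose. It is an illustration under stated assumptions, not the contents of `generate_epipolar_imgs.py`: the helper names are hypothetical, the pose is assumed to map camera-1 coordinates to camera-2 coordinates as `X2 = R @ X1 + t`, and the intrinsics `K1`, `K2` follow the usual pinhole convention.

```python
import numpy as np


def skew(t):
    """Return the 3x3 cross-product matrix [t]_x so that skew(t) @ v = t x v."""
    tx, ty, tz = np.asarray(t, dtype=float)
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def fundamental_from_pose(R, t, K1, K2):
    """Fundamental matrix F with p2_h^T @ F @ p1_h = 0, assuming X2 = R @ X1 + t."""
    E = skew(t) @ np.asarray(R, dtype=float)  # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)


def epipolar_line(F, pt1):
    """Line (a, b, c) in image 2, with a*u + b*v + c = 0, for pixel pt1 in image 1.

    The line is scaled so that (a, b) has unit norm, which makes
    a*u + b*v + c the signed pixel distance from (u, v) to the line.
    """
    line = F @ np.array([pt1[0], pt1[1], 1.0])
    return line / np.linalg.norm(line[:2])
```

To draw the line, one can evaluate `v = -(a * u + c) / b` at the left and right image borders (when `b` is not close to zero) and connect the two points.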

Evaluation

Download and set up the data following the steps of Jin et al. (Matterport) and Cai et al. (InteriorNet and StreetLearn).

  • StreetLearn and InteriorNet require rendering. See here for more details. You'll need to clone their repo for rendering as well as to get dataset metadata.
  • Matterport does not require rendering:
wget https://fouheylab.eecs.umich.edu/~jinlinyi/2021/sparsePlanesICCV21/split/mp3d_planercnn_json.zip
wget https://fouheylab.eecs.umich.edu/~jinlinyi/2021/sparsePlanesICCV21/data/rgb.zip
unzip mp3d_planercnn_json.zip; unzip rgb.zip
  • Once the datasets are set up, update MATTERPORT_PATH and INTERIORNET_STREETLEARN_PATH to point to their root directories.

Evaluation scripts are as follows:

sh scripts/eval_matterport.sh
sh scripts/eval_interiornet.sh
sh scripts/eval_interiornet_t.sh
sh scripts/eval_streetlearn.sh
sh scripts/eval_streetlearn_t.sh

Training

Data setup is the same as in evaluation. Training scripts are as follows:

sh scripts/train_matterport.sh
sh scripts/train_interiornet.sh
sh scripts/train_interiornet_t.sh
sh scripts/train_streetlearn.sh
sh scripts/train_streetlearn_t.sh

Small Data

To train with small data, add the argument --use_mini_dataset to the train script; evaluation is unchanged.

Pretrained models trained on the small InteriorNet and StreetLearn datasets are here.

wget https://fouheylab.eecs.umich.edu/~cnris/rel_pose/modelcheckpoints/pretrained_models_mini_dataset.zip --no-check-certificate
unzip pretrained_models_mini_dataset.zip

Coordinate Convention

  • Our Matterport-trained model follows the Habitat camera coordinate convention. The circle represents "in towards you".

drawing
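If you need the predicted pose in a different camera convention, a change of basis is enough. The sketch below is an illustration, not code from this repository. It assumes the Habitat camera frame is x right, y up, with the camera looking along -z, and converts a 4x4 relative pose to the common OpenCV-style frame (x right, y down, z forward). The function name `habitat_to_opencv` is hypothetical; please verify the axis assumption against the figure above before relying on it.

```python
import numpy as np

# Assumed change of basis between the two camera frames: flip y and z.
# This matrix is its own inverse.
FLIP_YZ = np.diag([1.0, -1.0, -1.0, 1.0])


def habitat_to_opencv(T_habitat):
    """Re-express a 4x4 camera-to-camera transform in the OpenCV-style frame.

    If p2 = T @ p1 in the assumed Habitat frame, and p_cv = FLIP_YZ @ p_hab
    for both cameras, then p2_cv = (FLIP_YZ @ T @ FLIP_YZ) @ p1_cv.
    """
    T = np.asarray(T_habitat, dtype=float)
    if T.shape != (4, 4):
        raise ValueError("expected a 4x4 homogeneous transform")
    return FLIP_YZ @ T @ FLIP_YZ
```

Under this assumption, a translation along -z in the Habitat frame becomes a translation along +z after conversion, and rotations about the x axis are left unchanged.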

Citation

If you use this code for your research, please consider citing:

@inProceedings{Rockwell2022,
  author = {Chris Rockwell and Justin Johnson and David F. Fouhey},
  title = {The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs},
  booktitle = {3DV},
  year = 2022
}

Special Thanks

Thanks to Linyi Jin, Ruojin Cai and Zach Teed for help replicating and building upon their works. Thanks to Mohamed El Banani, Karan Desai and Nilesh Kulkarni for their many helpful suggestions. Thanks to Laura Fink and UM DCO for their tireless support with computing!
