
relpose-gnn's Introduction

Visual Camera Re-Localization using Graph Neural Networks and Relative Pose Supervision

Mehmet Özgür Türkoğlu, Eric Brachmann, Konrad Schindler, Gabriel J. Brostow, Áron Monszpart - 3DV 2021.

[Paper on ArXiv] [Paper on IEEE Xplore] [Presentation (long)] [Presentation (short)] [Poster]

🌌 Overview

Method overview

Relative pose regression. We combine the efficiency of image retrieval with the ability of graph neural networks to selectively and iteratively refine estimates, to solve the challenging relative pose regression problem. Given a query image, we first find similar images using NetVLAD, a differentiable image-retrieval method. We preserve the diversity of neighbors by strided subsampling before building a fully connected Graph Neural Network (GNN). Node representations x_i are initialized from ResNet34 and combined by MLPs into edge features e_ij. Finally, the relative pose regression layer maps the refined edge representations to relative poses between image pairs. Edge dropout is applied only at training time.
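
To make the data flow concrete, here is a minimal PyTorch sketch of the edge initialization and edge dropout described above. It is illustrative only: module names, feature dimensions, and the pose parametrization are assumptions, not the repository's actual API.

import torch
import torch.nn as nn

# Illustrative sketch (not the repo's API): initial edge features e_ij from
# node features x_i, with train-time edge dropout and a relative pose head.
class RelPoseEdges(nn.Module):
    def __init__(self, node_dim=512, edge_dim=256):
        super().__init__()
        self.edge_mlp = nn.Sequential(              # (x_i, x_j) -> e_ij
            nn.Linear(2 * node_dim, edge_dim), nn.ReLU(),
            nn.Linear(edge_dim, edge_dim))
        self.pose_reg = nn.Linear(edge_dim, 7)      # 3-D translation + 4-D rotation

    def forward(self, x, edge_index, edge_keep=0.5):
        # x: [N, node_dim] node features (e.g., ResNet34 embeddings)
        # edge_index: [2, E] index pairs of the fully connected graph
        if self.training and edge_keep < 1.0:       # edge dropout, train time only
            mask = torch.rand(edge_index.shape[1]) < edge_keep
            edge_index = edge_index[:, mask]
        e = self.edge_mlp(torch.cat([x[edge_index[0]], x[edge_index[1]]], dim=1))
        return self.pose_reg(e)                     # one relative pose per edge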

📈 Results

Trained on all 7 training sets. Median translation / rotation errors, as encoded in the predicted-pose file names:

Scene      Median error     Predicted poses
Chess      0.09 m / 2.9°    relpose_gnn__multi_39_chess_0.09_2.9.npz
Fire       0.23 m / 7.4°    relpose_gnn__multi_39_fire_0.23_7.4.npz
Heads      0.13 m / 8.5°    relpose_gnn__multi_39_heads_0.13_8.5.npz
Office     0.15 m / 4.1°    relpose_gnn__multi_39_office_0.15_4.1.npz
Pumpkin    0.17 m / 3.3°    relpose_gnn__multi_39_pumpkin_0.17_3.3.npz
Kitchen    0.20 m / 3.6°    relpose_gnn__multi_39_redkitchen_0.20_3.6.npz
Stairs     0.23 m / 6.4°    relpose_gnn__multi_39_stairs_0.23_6.4.npz

✏️ 📄 Citation

If you find our work useful or interesting, please cite our paper:

@inproceedings{turkoglu2021visual,
  title={{Visual Camera Re-Localization Using Graph Neural Networks and Relative Pose Supervision}},
  author={T{\"{u}}rko\u{g}lu, Mehmet {\"{O}}zg{\"{u}}r and 
          Brachmann, Eric and 
          Schindler, Konrad and 
          Brostow, Gabriel and 
          Monszpart, \'{A}ron},
  booktitle={International Conference on 3D Vision ({3DV})},
  year={2021},
  organization={IEEE}
}

Reproducing results: 7-Scenes

Source code

export RELPOSEGNN="${HOME}/relpose_gnn" 
git clone --recurse-submodules --depth 1 https://github.com/nianticlabs/relpose-gnn.git ${RELPOSEGNN}

Setup

We use a Conda environment that makes it easy to install all dependencies. Our code has been tested on Ubuntu 20.04 with PyTorch 1.8.2 and CUDA 11.1.

  1. Install miniconda with Python 3.8.
  2. Create the conda environment:
    conda env create -f environment-cu111.yml
  3. Activate and verify the environment:
    conda activate relpose_gnn
    python -c 'import torch; \
               print(f"torch.version: {torch.__version__}"); \
               print(f"torch.cuda.is_available(): {torch.cuda.is_available()}"); \
               import torch_scatter; \
               print(f"torch_scatter: {torch_scatter.__version__}")'

Set more paths

export SEVENSCENES="/mnt/disks/data-7scenes/7scenes"
export DATADIR="/mnt/disks/data"
export SEVENSCENESRW="${DATADIR}/7scenes-rw"
export PYTHONPATH="${RELPOSEGNN}:${RELPOSEGNN}/python:${PYTHONPATH}"

I. Prepare the 7-Scenes dataset

  1. Download

    mkdir -p "${SEVENSCENES}" || (mkdir -p "${SEVENSCENES}" && chmod go+w -R "${SEVENSCENES}")
    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
      test -f "${SEVENSCENES}/${SCENE}.zip" || \
        (wget -c "http://download.microsoft.com/download/2/8/5/28564B23-0828-408F-8631-23B1EFF1DAC8/${SCENE}.zip" -O "$SEVENSCENES/$SCENE.zip" &)
    done
  2. Extract

    find "${SEVENSCENES}" -maxdepth 1 -name "*.zip" | xargs -P 7 -I fileName sh -c 'unzip -o -d "$(dirname "fileName")" "fileName"'
    find "${SEVENSCENES}" -mindepth 2 -name "*.zip" | xargs -P 7 -I fileName sh -c 'unzip -o -d "$(dirname "fileName")" "fileName"'

II. Image retrieval

For graph construction we incorporate the NetVLAD image-retrieval CNN. Our retrieval code is based on this repository: https://github.com/sfu-gruvi-3dv/sanet_relocal_demo. You'll need preprocessed .bin files (train_frames.bin, test_frames.bin) for each scene.
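
Conceptually, the retrieval step reduces to a nearest-neighbor search over global NetVLAD descriptors followed by the strided subsampling mentioned in the overview. A rough sketch, assuming you already have precomputed descriptors (the actual pipeline lives in the sanet_relocal_demo submodule):

import torch
import torch.nn.functional as F

def retrieve_neighbors(query_desc, db_descs, k=8, stride=5):
    # query_desc: [D] NetVLAD descriptor of the query image
    # db_descs:   [N, D] descriptors of the database (training) images
    sim = F.cosine_similarity(db_descs, query_desc.unsqueeze(0), dim=1)  # [N]
    ranked = torch.argsort(sim, descending=True)
    # Strided subsampling keeps the neighbor set diverse instead of
    # returning the k most similar (often near-duplicate) frames.
    return ranked[::stride][:k]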

Pre-processed

Coming soon...

Generate yourself

for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
  python python/external/sanet_relocal_demo/seq_data/seven_scenes/scenes2seq.py \
    "${SEVENSCENES}/${SCENE}" \
    --dst-dir "${SEVENSCENESRW}/${SCENE}"
done

III. Graph generation

Before training the model, the train and test graphs should be generated. This speeds up the dataloaders, which then don't have to run nearest-neighbor (NN) search during training.
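
Each generated graph is fully connected over the retrieved frames. As a point of reference, here is a hypothetical helper producing the [2, E] edge_index such a graph uses (the real generation logic lives in dataset_7Scenes_multi.py below):

import torch

def fully_connected_edges(num_nodes: int) -> torch.Tensor:
    # All ordered pairs (i, j) with i != j, in edge_index convention.
    idx = torch.arange(num_nodes)
    i = idx.repeat_interleave(num_nodes)
    j = idx.repeat(num_nodes)
    mask = i != j
    return torch.stack([i[mask], j[mask]])

edge_index = fully_connected_edges(8)  # matches --seq-len 8 below
print(edge_index.shape)                # torch.Size([2, 56])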

III.A. Pre-processed

  1. Download

    • Test

      mkdir -p "${SEVENSCENESRW}" || (mkdir -p "${SEVENSCENESRW}" && chmod go+w -R "${SEVENSCENESRW}")
      for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
        wget -c "https://storage.googleapis.com/niantic-lon-static/research/relpose-gnn/data/${SCENE}_fc8_sp5_test.tar" \
             -O "${SEVENSCENESRW}/${SCENE}_fc8_sp5_test.tar"
      done
    • Train

      for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
        wget -c "https://storage.googleapis.com/niantic-lon-static/research/relpose-gnn/data/${SCENE}_fc8_sp5_train.tar" \
             -O "${SEVENSCENESRW}/${SCENE}_fc8_sp5_train.tar"
      done
  2. Extract

    (cd "${SEVENSCENESRW}"; \
     find "${SEVENSCENESRW}" -mindepth 1 -maxdepth 1 -name "*.tar" | xargs -P 7 -I fileName sh -c 'tar -I pigz -xvf "fileName"')

III.B. Generate yourself

  • For testing a model

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
       python python/niantic/datasets/dataset_7Scenes_multi.py \
         "${SCENE}" \
         "test" \
         --data-path "${SEVENSCENES}" \
         --graph-data-path "${SEVENSCENESRW}" \
         --seq-len 8 \
         --sampling-period 5 \
         --gpu 0
    done
  • For training a multi-scene model (Table 1 in the paper)

    python python/niantic/datasets/dataset_7Scenes_multi.py \
      multi \
      "train" \
      --data-path "${SEVENSCENES}" \
      --graph-data-path "${SEVENSCENESRW}" \
      --seq-len 8 \
      --sampling-period 5 \
      --gpu 0
  • For training a single-scene model (Table 1 in the supplementary)

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
       python python/niantic/datasets/dataset_7Scenes_multi.py \
         "${SCENE}" \
         train \
         --data-path "${SEVENSCENES}" \
         --graph-data-path "${SEVENSCENESRW}" \
         --seq-len 8 \
         --sampling-period 5 \
         --gpu 0
    done

Evaluation

Pre-trained

  1. Download the pre-trained model trained on all 7-Scenes training scenes (Table 1 in the paper)

    wget \
     -c "https://storage.googleapis.com/niantic-lon-static/research/relpose-gnn/models/relpose_gnn__multi_39.pth.tar" \
     -O "${DATADIR}/relpose_gnn__multi_39.pth.tar"
  2. Evaluate on each 7-Scenes test scene

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
       python -u ${RELPOSEGNN}/python/niantic/testing/test.py \
         --dataset-dir "${SEVENSCENES}" \
         --test-data-dir "${SEVENSCENESRW}" \
         --weights "${DATADIR}/relpose_gnn__multi_39.pth.tar" \
         --save-dir "${DATADIR}" \
         --gpu 0 \
         --test-scene "${SCENE}"
    done
  3. Download the pre-trained models, each trained on 6 of the 7-Scenes training scenes with the named scene held out (Table 2 in the paper)

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
      wget \
       -c "https://storage.googleapis.com/niantic-lon-static/research/relpose-gnn/models/6Scenes_${SCENE}_epoch_039.pth.tar" \
       -O "${DATADIR}/6Scenes_${SCENE}_epoch_039.pth.tar"
    done
  4. Evaluate each model on the corresponding held-out scene

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
       python -u ${RELPOSEGNN}/python/niantic/testing/test.py \
         --dataset-dir "${SEVENSCENES}" \
         --test-data-dir "${SEVENSCENESRW}" \
         --weights "${DATADIR}/6Scenes_${SCENE}_epoch_039.pth.tar" \
         --save-dir "${DATADIR}" \
         --gpu 0 \
         --test-scene "${SCENE}"
    done
  5. Download the pre-trained single-scene models, each trained on one 7-Scenes training scene (Table 1 in the supplementary)

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
      wget \
       -c "https://storage.googleapis.com/niantic-lon-static/research/relpose-gnn/models/1Scenes_${SCENE}_epoch_039.pth.tar" \
       -O "${DATADIR}/1Scenes_${SCENE}_epoch_039.pth.tar"
    done
  6. Evaluate each model on the same scene

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
       python -u ${RELPOSEGNN}/python/niantic/testing/test.py \
         --dataset-dir "${SEVENSCENES}" \
         --test-data-dir "${SEVENSCENESRW}" \
         --weights "${DATADIR}/1Scenes_${SCENE}_epoch_039.pth.tar" \
         --save-dir "${DATADIR}" \
         --gpu 0 \
         --test-scene "${SCENE}"
    done

Train yourself

  1. 7-scene training (Table 1 in the paper)

    python -u ${RELPOSEGNN}/python/niantic/training/train.py \
      --dataset-dir "${SEVENSCENES}" \
      --train-data-dir "${SEVENSCENESRW}" \
      --test-data-dir "${SEVENSCENESRW}" \
      --save-dir "${DATADIR}" \
      --gpu 0 \
      --experiment 0 \
      --test-scene multi
  2. 6-scene training (Table 2 in the paper)

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
       python -u ${RELPOSEGNN}/python/niantic/training/train.py \
         --dataset-dir "${SEVENSCENES}" \
         --train-data-dir "${SEVENSCENESRW}" \
         --test-data-dir "${SEVENSCENESRW}" \
         --save-dir "${DATADIR}" \
         --gpu 0 \
         --experiment 1 \
         --test-scene "${SCENE}"
    done
    
  3. Single-scene training (Table 1 in the supplementary)

    for SCENE in "chess" "fire" "heads" "office" "pumpkin" "redkitchen" "stairs"; do
       python -u ${RELPOSEGNN}/python/niantic/training/train.py \
         --dataset-dir "${SEVENSCENES}" \
         --train-data-dir "${SEVENSCENESRW}" \
         --test-data-dir "${SEVENSCENESRW}" \
         --save-dir "${DATADIR}" \
         --gpu 0 \
         --experiment 2 \
         --train-scene "${SCENE}" \
         --test-scene "${SCENE}" \
         --max-epoch 100
    done
    
    

Reproducing results: Cambridge Landmarks
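
Set paths

The commands below refer to ${CAMBRIDGE} and ${CAMBRIDGERW} by analogy with the 7-Scenes setup. The exact locations are an assumption; point them at wherever you keep the dataset and the generated graphs:

export CAMBRIDGE="${DATADIR}/cambridge"
export CAMBRIDGERW="${DATADIR}/cambridge-rw"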

Graph generation

Before training the model, the train and test graphs should be generated. This speeds up the dataloaders, which then don't have to run nearest-neighbor (NN) search during training.

A. Pre-processed

  1. Download

    • Test
      mkdir -p "${CAMBRIDGERW}" || (mkdir -p "${CAMBRIDGERW}" && chmod go+w -R "${CAMBRIDGERW}")
      for SCENE in "KingsCollege", "OldHospital", "StMarysChurch", "ShopFacade", "GreatCourt"; do
        wget -c "https://storage.googleapis.com/niantic-lon-static/research/relpose-gnn/data/${SCENE}_fc8_sp3_test.tar" \
             -O "${CAMBRIDGERW}/${SCENE}_fc8_sp3_test.tar"
      done
    • Train
      for SCENE in "KingsCollege", "OldHospital", "StMarysChurch", "ShopFacade", "GreatCourt"; do
        wget -c "https://storage.googleapis.com/niantic-lon-static/research/relpose-gnn/data/${SCENE}_fc8_sp3_train.tar" \
             -O "${CAMBRIDGERW}/${SCENE}_fc8_sp3_train.tar"
      done
  2. Extract

    (cd "${CAMBRIDGERW}"; \
     find "${CAMBRIDGERW}" -mindepth 1 -maxdepth 1 -name "*.tar" | xargs -P 7 -I fileName sh -c 'tar -I pigz -xvf "fileName"')

Evaluation

Pre-trained

  1. Download the pre-trained model trained on all Cambridge training scenes (Table 3 in the paper)
    wget \
     -c "https://storage.googleapis.com/niantic-lon-static/research/relpose-gnn/models/relpose_gnn_cambridge_epoch_149.pth.tar" \
     -O "${DATADIR}/relpose_gnn_cambridge_epoch_149.pth.tar"
  2. Evaluate on each Cambridge test scene
    for SCENE in "KingsCollege", "OldHospital", "StMarysChurch", "ShopFacade", "GreatCourt"; do
       python -u ${RELPOSEGNN}/python/niantic/testing/test.py \
         --dataset-dir "${CAMBRIDGE}" \
         --test-data-dir "${CAMBRIDGERW}" \
         --weights "${DATADIR}/relpose_gnn_cambridge_epoch_149.pth.tar" \
         --save-dir "${DATADIR}" \
         --gpu 0 \
         --test-scene "${SCENE}"
    done

Train yourself

  1. Cambridge training (Table 3 in the paper)
    python -u ${RELPOSEGNN}/python/niantic/training/train.py \
      --dataset-dir "${CAMBRIDGE}" \
      --train-data-dir "${CAMBRIDGERW}" \
      --test-data-dir "${CAMBRIDGERW}" \
      --save-dir "${DATADIR}" \
      --gpu 0 \
      --experiment 0 \
      --test-scene multi

🤝 Acknowledgements

We would like to thank Galen Han for his extensive help with this project.
We also thank Qunjie Zhou, Luwei Yang, Dominik Winkelbauer, Torsten Sattler, and Soham Saha for their help and advice with baselines.

👩‍⚖️ License

Copyright © Niantic, Inc. 2021. Patent Pending. All rights reserved. Please see the license file for terms.


relpose-gnn's Issues

Image query

Thank you for your great work, but I still have a question about the image retrieval process: does the retrieval database include the query image itself? That is, can the query image be returned as one of its own nearest neighbors?

Node order invariance in generating initial edge features

edge_feat = torch.cat(
    (e[torch.min(edge_index, 0)[0], ...],
     e[torch.max(edge_index, 0)[0], ...]),
    dim=1)

Hello,
Thanks for sharing the code for this cool paper. I was going through your implementation and can't really understand why edge direction invariance is introduced in creating the initial edge features.

Since we are regressing relative pose, shouldn't we ensure that features are different for different edge directions? The later edge updates in the GNN do account for the edge direction. But why is this invariance explicitly introduced in creating the initial edge features?
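
(A tiny standalone check of what the snippet above does, with made-up shapes: sorting each edge's endpoints via min/max indeed yields identical initial features for both directions of an edge.)

import torch

# Made-up demo, not repo code: 4 nodes with 8-D features, one edge in
# both directions; min/max ordering makes the initial features equal.
e = torch.randn(4, 8)
ij = torch.tensor([[0], [1]])   # edge 0 -> 1, shape [2, 1]
ji = torch.tensor([[1], [0]])   # edge 1 -> 0
feat = lambda idx: torch.cat((e[torch.min(idx, 0)[0], ...],
                              e[torch.max(idx, 0)[0], ...]), dim=1)
print(torch.equal(feat(ij), feat(ji)))  # True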

unable to reproduce paper's results

Hi,

thanks a lot for your great paper.
I am trying to run your pre-trained model on 7-Scenes.

When I try to run the original code, I get the error below:
RuntimeError: Error(s) in loading state_dict for PoseNetX_R2:
Unexpected key(s) in state_dict: "gnn2.mlp.0.weight", "gnn2.mlp.0.bias", "gnn2.mlp.2.weight", "gnn2.mlp.2.bias", "gnn2.mlp_updating.0.weight", "gnn2.mlp_updating.0.bias", "gnn2.mlp_updating.2.weight", "gnn2.mlp_updating.2.bias", "gnn2.edge_model.edge_mlp.0.weight", "gnn2.edge_model.edge_mlp.0.bias", "gnn2.edge_model.edge_mlp.2.weight", "gnn2.edge_model.edge_mlp.2.bias", "gnn2.att.g.weight", "gnn2.att.g.bias", "gnn2.att.theta.weight", "gnn2.att.theta.bias", "gnn2.att.phi.weight", "gnn2.att.phi.bias", "gnn2.att.W.weight", "gnn2.att.W.bias".

In order for the model to load properly, I changed line 162 of test.py from R2 to R3:

from:

    elif self.model_name == 'R3':
        self.model = PoseNetX_R2(

to:

    elif self.model_name == 'R3':
        self.model = PoseNetX_R3(

After that the model loads properly, but I am getting results different from the paper; for example, for the chess scene I get 0.31 m / 14.18 degrees vs. 0.08 m / 2.7 degrees in the paper.

python -u ${RELPOSEGNN}/python/niantic/testing/test.py \
  --dataset-dir "${SEVENSCENES}" \
  --test-data-dir "${SEVENSCENESRW}" \
  --weights "${DATADIR}/relpose_gnn__multi_39.pth.tar" \
  --save-dir "${DATADIR}" \
  --gpu 0 \
  --test-scene "${SCENE}"

2023-09-09 18:16:05.220 | INFO | __main__:__init__:137 - Dataset: 7Scenes
2023-09-09 18:16:05.221 | INFO | __main__:__init__:138 - Test scene: chess
2023-09-09 18:16:05.221 | INFO | __main__:__init__:139 - Test data dir: relpose_gnn/7scenes-rw/
2023-09-09 18:16:05.221 | INFO | __main__:__init__:140 - Test dataset size: 2000
2023-09-09 18:16:05.221 | INFO | __main__:__init__:141 - Images sizes: 256, 341
2023-09-09 18:16:05.221 | INFO | __main__:__init__:142 - Number of nodes in the graph: 8, fc
2023-09-09 18:16:05.221 | INFO | __main__:__init__:143 - Number of nodes in the graph - test: 8 fc
2023-09-09 18:16:05.221 | INFO | __main__:__init__:145 - Use RP loss: True
2023-09-09 18:16:05.221 | INFO | __main__:__init__:146 - Use RP model: True
2023-09-09 18:16:05.221 | INFO | __main__:__init__:147 - srx: 0.0
2023-09-09 18:16:05.221 | INFO | __main__:__init__:148 - srq: -3
2023-09-09 18:16:05.221 | INFO | __main__:__init__:149 - edge_keep_factor: 0.5
2023-09-09 18:16:05.221 | INFO | __main__:__init__:150 - gnn_recursion: 2
2023-09-09 18:16:05.222 | INFO | __main__:__init__:151 - droprate: 0.5
2023-09-09 18:16:05.222 | INFO | __main__:__init__:152 - gpu: 0
2023-09-09 18:16:06.938 | INFO | __main__:__init__:175 - Num parameters: 118861132
2023-09-09 18:16:53.399 | INFO | __main__:eval_RP:275 - [Scene: chess, set: test, relpose_gnn__multi_39.pth.tar] Error in translation: median 0.31 m, mean 0.39 m; Error in rotation: median 14.18 degrees, mean 16.58 degrees

Any ideas?
What is the difference between PoseNetX_R2 and PoseNetX_R3?
Are you sure you uploaded the correct pre-trained model to the repo?

Thanks,
Ofer

Question about pose normalization

Thanks for sharing the code for the paper!
However, when going through the data-loading code, I got confused by the normalization procedure:
in python/niantic/datasets/seven_scenes.py, line 121, the poses seem to undergo a normalization operation. Why?
Will it influence the evaluation, since the scale of the camera translation is modified?
