Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation

This repository is the official implementation of the ECCV 2022 paper "Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation" by Ziming Wang*, Xiaoliang Huo*, Zhenghao Chen, Jing Zhang, Lu Sheng† and Dong Xu (*equal contribution, †corresponding author).

Contact

If you have any questions, please let us know:

Introduction

In this work, we propose a new Geometry-Aware Visual Feature Extractor (GAVE) that employs multi-scale local linear transformation to progressively fuse the RGB and depth modalities: the geometric features from the depth data act as geometry-dependent convolution kernels that transform the visual features from the RGB data. The resulting visual-geometric features lie in canonical feature spaces where the visual dissimilarity caused by geometric changes is alleviated, which yields more reliable correspondences. The proposed GAVE module can be readily plugged into recent RGB-D point cloud registration frameworks.
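To make the idea of a local linear transformation concrete, here is a minimal single-scale sketch in PyTorch. It is not the implementation from this repository: the class name `LocalLinearTransform`, the channel sizes, and the choice of a per-pixel scale and bias predicted by 1x1 convolutions are all assumptions made for illustration. The paper's module may parameterize the transformation differently and applies it at multiple scales.

```python
import torch
import torch.nn as nn


class LocalLinearTransform(nn.Module):
    """Hypothetical sketch: per-pixel linear transform of visual features,
    with coefficients predicted from geometric features."""

    def __init__(self, visual_channels: int, geometric_channels: int):
        super().__init__()
        # 1x1 convolutions map geometric features to a per-pixel scale and bias.
        self.to_scale = nn.Conv2d(geometric_channels, visual_channels, kernel_size=1)
        self.to_bias = nn.Conv2d(geometric_channels, visual_channels, kernel_size=1)

    def forward(self, visual: torch.Tensor, geometric: torch.Tensor) -> torch.Tensor:
        # visual:    (B, C_v, H, W) features from the RGB branch
        # geometric: (B, C_g, H, W) features from the depth branch
        scale = self.to_scale(geometric)
        bias = self.to_bias(geometric)
        # Each spatial location gets its own linear map, conditioned on geometry.
        return scale * visual + bias


def demo() -> torch.Tensor:
    torch.manual_seed(0)
    llt = LocalLinearTransform(visual_channels=32, geometric_channels=16)
    visual = torch.randn(2, 32, 24, 32)
    geometric = torch.randn(2, 16, 24, 32)
    return llt(visual, geometric)
```

The output has the same shape as the visual input, so a block like this could in principle be stacked at several resolutions of a feature pyramid to obtain a multi-scale variant.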

[Figure: pipeline]

Main Results

We train our module under two different setups:

  • Training on the 3DMatch dataset for 14 epochs, and testing on the ScanNet test set.
  • Training on the ScanNet dataset for 1 epoch, and testing on the ScanNet test set.

The overall results are shown in the table below, along with download links for the checkpoints:

Train Set | Rotation                        | Translation                     | Chamfer Distance                | Ckpts
          | Accuracy          | Error       | Accuracy          | Error       | Accuracy          | Error       |
          | 5°    10°   45°   | Mean  Med.  | 5cm   10cm  25cm  | Mean  Med.  | 1mm   5mm   10mm  | Mean  Med.  |
3DMatch   | 93.4  96.5  98.8  | 3.0   0.9   | 76.9  90.2  96.7  | 6.4   2.4   | 86.4  95.1  96.8  | 5.3   0.1   | ckpt
ScanNet   | 95.5  97.6  99.1  | 2.5   0.8   | 80.4  92.2  97.6  | 5.5   2.2   | 88.9  96.4  97.6  | 4.6   0.1   | ckpt
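For readers unfamiliar with these metrics, the sketch below shows one common way to compute a rotation error, a translation error, and an accuracy at a threshold from predicted and ground-truth poses. It is an illustration only, not the evaluation code of this repository: the function names are hypothetical, translations are assumed to be given in meters, and the strict `<` comparison is an assumption.

```python
import numpy as np


def rotation_error_deg(R_pred: np.ndarray, R_gt: np.ndarray) -> float:
    """Angle (in degrees) of the relative rotation between two 3x3 rotation matrices."""
    cos_angle = (np.trace(R_pred.T @ R_gt) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] due to rounding.
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def translation_error_cm(t_pred: np.ndarray, t_gt: np.ndarray) -> float:
    """Euclidean distance between translations, converted from meters to centimeters."""
    return float(np.linalg.norm(t_pred - t_gt) * 100.0)


def accuracy_at(errors, threshold: float) -> float:
    """Percentage of samples whose error is below the threshold."""
    errors = np.asarray(errors, dtype=float)
    return float((errors < threshold).mean() * 100.0)
```

Under these assumptions, the accuracy columns would correspond to `accuracy_at` evaluated at each threshold over the test pairs, and the error columns to the mean and median of the per-pair errors.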

Here are several visualization examples of our method compared with our baseline and the ground truth:

This code has been trained/tested on:

  • Python 3.6.13, PyTorch 1.7.1, CUDA 11.0.3, gcc 9.3.0, Tesla V100-PCIE-32GB

Environment Setup

# create a conda environment and activate it
conda create --name GAVE python=3.10
conda activate GAVE

# install pytorch (any version that matches your CUDA version)
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
conda install matplotlib tensorboard

# install pytorch3d
conda install pytorch3d -c pytorch3d

# install open3d
python -m pip install open3d

# install Minkowski Engine 
git clone https://github.com/NVIDIA/MinkowskiEngine
cd MinkowskiEngine
python setup.py install --blas=openblas --blas_include_dirs=${CONDA_PREFIX}/include

# install other dependencies
python -m pip install nibabel opencv-python easydict pre-commit

If installing the Minkowski Engine fails, please follow the official instructions.
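After the installation, a quick check such as the following can confirm that the packages are visible to the active environment. This is a convenience sketch, not part of the repository; the helper name `find_missing` is hypothetical, and the list uses Python import names (for example `cv2` for opencv-python). It only looks the modules up and does not import them, so it does not verify CUDA or build compatibility.

```python
import importlib.util

# Import names of the packages installed in the steps above.
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "matplotlib",
    "tensorboard",
    "pytorch3d",
    "open3d",
    "MinkowskiEngine",
    "nibabel",
    "cv2",
    "easydict",
]


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages were found.")
```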

Datasets Setup

Following UnsupervisedR&R, we use two datasets for training in our work, 3DMatch and ScanNet, and evaluate only on the ScanNet test set.

For the download and pre-processing procedure, please refer to UR&R's instructions.

After downloading the datasets, make sure to update the paths in GAVE/datasets/builder.py with the dataset root directories.
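Before launching a long training run, it can help to verify that the configured root directories actually exist. The helper below is a hypothetical sketch that is independent of builder.py; the function name and the dictionary keys are placeholders, and the paths must be replaced with your own.

```python
from pathlib import Path


def check_dataset_roots(roots: dict) -> dict:
    """Map each dataset name to True if its root directory exists, else False."""
    return {name: Path(path).is_dir() for name, path in roots.items()}


if __name__ == "__main__":
    # Placeholder paths for illustration; use the roots you set in builder.py.
    status = check_dataset_roots({
        "3DMatch": "/path/to/3dmatch",
        "ScanNet": "/path/to/scannet",
    })
    for name, ok in status.items():
        print(f"{name}: {'found' if ok else 'NOT FOUND'}")
```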

Get Started

Training

You can modify the model settings in GAVE/configs/config.py and specify the GPU in train.py:

# Training on 3DMatch dataset
python -u train.py --mode train --config_path 3DMatch.yaml
# Training on ScanNet dataset
python -u train.py --mode train --config_path ScanNet.yaml
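One common way to pin a script to a particular GPU is to set the `CUDA_VISIBLE_DEVICES` environment variable before CUDA is initialized. The sketch below illustrates that general technique; it is an assumption that it matches how train.py selects the device, and the helper name `select_gpu` is hypothetical.

```python
import os


def select_gpu(gpu_id: int) -> str:
    """Restrict the process to a single GPU by index.

    This must run before any CUDA call (ideally before importing torch),
    because the variable is read when CUDA is first initialized.
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return os.environ["CUDA_VISIBLE_DEVICES"]


if __name__ == "__main__":
    print("Visible GPU:", select_gpu(0))
```

The same effect can be obtained from the shell by prefixing the training command with `CUDA_VISIBLE_DEVICES=0`.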

Inference

You can evaluate checkpoints using the following command:

python -u evaluate.py mine --checkpoint ckpt_path/ckpt_name.pkl --progress_bar --boost_alignment

Acknowledgments

This repo benefits heavily from UnsupervisedR&R. We would like to thank Mohamed El for his excellent work.

Citation

@inproceedings{wang2022improving,
  title={Improving rgb-d point cloud registration by learning multi-scale local linear transformation},
  author={Wang, Ziming and Huo, Xiaoliang and Chen, Zhenghao and Zhang, Jing and Sheng, Lu and Xu, Dong},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part XXXII},
  pages={175--191},
  year={2022},
  organization={Springer}
}
