
scannotate's Introduction

Automatically Annotating Indoor Images with CAD Models via RGB-D Scans



This repository contains the code and data for the WACV23 paper "Automatically Annotating Indoor Images with CAD Models via RGB-D Scans".

CAD Model and Pose Annotations for ScanNet

CAD model and pose annotations for the ScanNet dataset are available here. Annotations are automatically generated using scannotate and HOC-Search. The quality of these annotations was checked in several verification passes, with manual re-annotation of outliers to ensure that the final annotations are of high quality.

Installation Requirements and Setup

  • Clone this repository, then create and activate the virtual environment as described below.

Note: We tested the code using PyTorch v1.7.1, PyTorch3D v0.6.2 and CUDA 10.1. The following installation guide is tailored to these specific versions; you may have to install different versions according to your system specifications. For general information about how to install PyTorch3D, see the official installation guide.

The runtime dependencies can be installed by running:

conda create -n scannotate python=3.9
conda activate scannotate
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch -c nvidia -c conda-forge
conda install -c fvcore -c iopath -c conda-forge fvcore iopath

For the CUB build-time dependency, which is only needed for CUDA versions older than 11.7, run:

conda install -c bottler nvidiacub

After installing the above dependencies, run the following commands:

pip install scikit-image matplotlib imageio plotly opencv-python open3d trimesh==3.10.2
conda install pytorch3d==0.6.2 -c pytorch3d

The corresponding environment file can be found at environment.yml.
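
To verify the setup, the following minimal sanity check (not part of the repository) prints the installed versions and checks CUDA visibility:

import torch
import pytorch3d

print(torch.__version__)          # expected: 1.7.1
print(pytorch3d.__version__)      # expected: 0.6.2
print(torch.cuda.is_available())  # should be True for GPU execution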

Data Preprocessing

  • Download the ScanNet example here. Extract the folders extracted, preprocessed, and scans, and copy them to /data/ScanNet. Note that by downloading the example you agree to the ScanNet Terms of Use.

    This example additionally contains the already preprocessed input scan, i.e. 3D bounding boxes and instance segmentations for the target objects, as well as the 3D scan transformed into the PyTorch3D coordinate system.

  • Download the ShapeNet v2 dataset by signing up on the website. Extract ShapeNetCore.v2.zip to /data/ShapeNet.

Preprocessing ShapeNet CAD Models

To center and scale-normalize the downloaded ShapeNet CAD models, run:

bash run_shapenet_prepro.sh gpu=0

The gpu argument specifies which GPU should be used for processing; if it is omitted, the code runs on the CPU by default.
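
For intuition, centering and scale-normalizing a single CAD model can be sketched with trimesh (installed above). This is an illustration only, not the repository's actual preprocessing script, and the file paths are placeholders:

import trimesh

# Placeholder path; ShapeNet models live under data/ShapeNet/ShapeNetCore.v2.
mesh = trimesh.load("path/to/model_normalized.obj", force="mesh")

# Center the mesh on its axis-aligned bounding-box center.
bounds = mesh.bounds  # (2, 3) array holding the min and max corners
mesh.apply_translation(-bounds.mean(axis=0))

# Scale so the longest bounding-box side has unit length.
mesh.apply_scale(1.0 / (bounds[1] - bounds[0]).max())

mesh.export("model_preprocessed.obj")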

After the above-mentioned steps the /data folder should contain the following directories:

- data
    - ScanNet
        - extracted
        - preprocessed
        - scans
    - ShapeNet
        - ShapeNet_preprocessed            
        - ShapeNetCore.v2

Run CAD Model Retrieval

Our pipeline for automatic CAD model retrieval consists of three steps. Results after each step will be saved to /results.

Note that we use PyTorch3D as the rendering pipeline; hence, all 3D data are transformed into the PyTorch3D coordinate system. Information about this coordinate system can be found here.

The configuration file is a simple text file in .ini format. Default values for configuration parameters are available in /config. Note that these are just an indication of what a "reasonable" value for each parameter could be, and are not meant as a way to reproduce any of the results from our paper.
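
Since the configuration is a standard .ini file, the active parameters can be inspected with Python's built-in configparser, for example (a minimal sketch; see /config for the actual parameter names):

import configparser

config = configparser.ConfigParser()
config.read("config/ScanNet.ini")  # path assumed relative to the repository root

# Print every section and parameter to see which values are in effect.
for section in config.sections():
    for key, value in config[section].items():
        print(f"[{section}] {key} = {value}")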

1) CAD Model Retrieval

Run CAD model retrieval with:

bash run_cad_retrieval.sh config=ScanNet.ini gpu=0

The results will be written to /results/ScanNet/$scene_name/retrieval. Results contain the top-5 retrieved CAD models for each target object, as well as the combined top-1 results for all target objects. Additionally, the scene mesh without the target objects is written to /results/ScanNet/$scene_name, which can be useful for visualization.
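
For orientation, rendering a candidate CAD model's depth with PyTorch3D looks roughly like the sketch below (illustrative, not the repository's code; the model path is a placeholder). The rendered depth is the kind of quantity a render-and-compare scheme scores against the observed depth maps:

import torch
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.renderer import (
    FoVPerspectiveCameras,
    MeshRasterizer,
    RasterizationSettings,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder path to a preprocessed CAD model.
mesh = load_objs_as_meshes(["path/to/model.obj"], device=device)

cameras = FoVPerspectiveCameras(device=device)
rasterizer = MeshRasterizer(
    cameras=cameras,
    raster_settings=RasterizationSettings(image_size=256),
)

# zbuf holds the per-pixel depth of the rasterized mesh.
fragments = rasterizer(mesh)
depth = fragments.zbuf[..., 0]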

2) CAD Model Clustering and Cloning

Run CAD model clustering and cloning with:

bash run_cad_similarity.sh config=ScanNet.ini gpu=0

Results after CAD model clustering and cloning will be written to /results/ScanNet/$scene_name/similarity.
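
As a rough illustration of the idea behind clustering (not the repository's actual criterion, which is described in the paper), shape similarity between two retrieved CAD models can be scored with the chamfer distance between sampled surface points:

import torch
from pytorch3d.loss import chamfer_distance

# Stand-in point samples; in practice these would be sampled from the
# surfaces of two retrieved CAD models.
pts_a = torch.rand(1, 2048, 3)
pts_b = torch.rand(1, 2048, 3)

dist, _ = chamfer_distance(pts_a, pts_b)
print(float(dist))  # lower distance = more similar shapes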

3) Pose Refinement

Run 9DOF differentiable pose refinement with:

bash run_cad_pose_refine.sh config=ScanNet.ini gpu=0

Final results after 9DOF pose refinement will be written to /results/ScanNet/$scene_name/refinement.
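
Conceptually, 9DOF refinement optimizes 3 translation, 3 rotation, and 3 scale parameters by gradient descent. The sketch below uses a stand-in point-alignment loss purely for illustration; the actual pipeline uses render-and-compare objectives:

import torch
from pytorch3d.transforms import euler_angles_to_matrix

# Stand-in data: the "scan" is a scaled and shifted copy of the CAD points.
cad_pts = torch.rand(1000, 3)
target_pts = cad_pts * 1.2 + 0.3

trans = torch.zeros(3, requires_grad=True)
angles = torch.zeros(3, requires_grad=True)
log_scale = torch.zeros(3, requires_grad=True)  # log parametrization keeps scale positive

opt = torch.optim.Adam([trans, angles, log_scale], lr=0.05)
for step in range(200):
    R = euler_angles_to_matrix(angles, "XYZ")
    pts = (cad_pts * log_scale.exp()) @ R.T + trans
    loss = ((pts - target_pts) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()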

Citation

If you find this work useful for your research, please consider citing:

@inproceedings{ainetter2023automatically,
  title={Automatically Annotating Indoor Images with CAD Models via RGB-D Scans},
  author={Ainetter, Stefan and Stekovic, Sinisa and Fraundorfer, Friedrich and Lepetit, Vincent},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={3156--3164},
  year={2023}
}


scannotate's Issues

Regarding Rendering ScanNet Point Clouds

Hi,

Thank you so much for your wonderful work!

We are trying to use the point cloud rendering code for ScanNet from your repository. We found the point rasterizer and SfMPerspectiveCamerasScanNet, but I don't fully understand the connection between the inputs to these classes and the ScanNet extrinsics. Could you let us know what processing you do on the poses/intrinsics from ScanNet before passing them to these functions? PyTorch3D's documentation has been quite confusing, so any help from you would be very appreciated!
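
As general background for this question (a sketch, not necessarily the repository's exact processing): ScanNet poses are camera-to-world matrices in the OpenCV convention, and PyTorch3D (0.5+) provides a utility that builds PerspectiveCameras from OpenCV-style parameters:

import torch
from pytorch3d.utils import cameras_from_opencv_projection

pose_c2w = torch.eye(4)        # stand-in for a ScanNet pose file (camera-to-world)
K = torch.eye(3).unsqueeze(0)  # stand-in 3x3 intrinsics matrix

w2c = torch.inverse(pose_c2w)  # PyTorch3D expects world-to-camera
R = w2c[:3, :3].unsqueeze(0)
tvec = w2c[:3, 3].unsqueeze(0)
image_size = torch.tensor([[480, 640]])  # (height, width) of the frames

cameras = cameras_from_opencv_projection(R, tvec, K, image_size)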

Issue with Creating PKL File for SCANnotate from Preprocessed ScanNet Data

I am currently working on a project where I need to apply SCANnotate to a point cloud dataset that I have already classified using RandLA-Net. My point cloud data is derived from RGB images, and I have all the pose matrices for the camera poses as well as the depth images.

My main issue is understanding how to create the required PKL file to make it compatible with SCANnotate, similar to the preprocessed ScanNet information.

I have reviewed the documentation but am struggling to understand the exact structure and content needed for the PKL file. Could you provide guidance or an example of how to construct this PKL file? Any help or pointers to relevant parts of the documentation would be greatly appreciated.

Thank you!

Question about bounding box orientation in PyTorch3D space

Hello,

I was wondering how the basis of the bounding boxes is transformed in your preprocessing step.
Indeed, even though the example scene comes with a "bbox_all.ply" mesh showing the bounding boxes on the preprocessed mesh, when I try to manually place boxes in Blender with the center, scale, and orientation stored in the scene pickle file, the boxes are off...

For instance, I tried to place two tables which have a non-zero angle along the Y axis:
[screenshot: bbox_base]

After trying to understand what was off, it seems like simply inverting the rotation about the Y axis allows a near-perfect alignment of the placed boxes with the original bounding boxes:
[screenshot: bbox_modified]
It is visually near-perfect, although on close inspection the corners of the box don't line up exactly with yours. This can be seen in the following screenshot, where the top-left corner doesn't quite align and the top edge isn't perfectly parallel to yours. (Sorry for the stray orange circle; it is a Blender tool.)
[screenshot: corner_weird]

This is strange to me, as I was thinking that some sort of change in the coordinate systems would induce this, but I don't think such a change would produce a "mirroring effect" of the Y-axis rotation.
Am I doing something wrong here or is there a step I didn't understand?

For reproducibility, I imported "mesh_py3d_textured.ply" and "bbox_all.ply" into Blender and, with a small script, read the center, basis, and scale of these two tables. I then placed cubes in Blender at the given position with the given scale (halved, as Blender spawns cubes of size 2) and the given rotation (converted from a 3x3 rotation matrix to Euler angles in degrees).

ScanNet Preprocessing Scene

Hello,

Thanks for answering me and uploading the code so promptly.

Currently, I am working on detecting instances and replacing them with CAD models. So far, I have detected instances using the Mask3D algorithm on our own point clouds, scanned and reconstructed with BundleFusion, similar to the ScanNet dataset.

Thus, my question is: how can I create the preprocessed ScanNet information (the extracted and preprocessed folders) for our own point clouds?

Many thanks for considering my request.

Waiting for the code!

First of all, thank you for your great work and congratulations on the paper!

When are you planning to release the code? Looking forward to it!

Code Release

Hello. First of all, congrats on your amazing work. Currently, I am working on the alignment between CAD models and RGB-D scans.

For this reason, I am interested in when you plan to release the code.

I appreciate any help you can provide.

Code for ScanNet preprocessed

Hello,

Thanks for your nice work! I had one question: how do you preprocess the ScanNet dataset as in the provided example, specifically the preprocessed folder, and could the code for that be made publicly available? Please let me know, thank you.
