
scannotate's Introduction

Automatically Annotating Indoor Images with CAD Models via RGB-D Scans



This repository contains the code and data for the WACV23 paper "Automatically Annotating Indoor Images with CAD Models via RGB-D Scans".

CAD Model and Pose Annotations for ScanNet

CAD model and pose annotations for the ScanNet dataset are available here. Annotations are automatically generated using scannotate and HOC-Search. The quality of these annotations was checked in several verification passes, with manual re-annotation of outliers to ensure that the final annotations are of high quality.

Installation Requirements and Setup

  • Clone this repository, then create and activate the virtual environment as described below.

Note: We tested the code using PyTorch v1.7.1, PyTorch3D v0.6.2 and CUDA 10.1. The following installation guide is tailored to these specific versions; you may have to install different versions according to your system specifications. For general information about how to install PyTorch3D, see the official installation guide.

The runtime dependencies can be installed by running:

conda create -n scannotate python=3.9
conda activate scannotate
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch -c nvidia -c conda-forge
conda install -c fvcore -c iopath -c conda-forge fvcore iopath

For the CUB build-time dependency, which is only needed for CUDA versions older than 11.7, run:

conda install -c bottler nvidiacub

After installing the above dependencies, run the following commands:

pip install scikit-image matplotlib imageio plotly opencv-python open3d trimesh==3.10.2
conda install pytorch3d==0.6.2 -c pytorch3d

The corresponding environment file can be found at environment.yml.
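
To verify the setup, the following minimal sanity check (not part of the repository) prints the installed versions and checks CUDA visibility:

import torch
import pytorch3d

print(torch.__version__)          # expected: 1.7.1
print(pytorch3d.__version__)      # expected: 0.6.2
print(torch.cuda.is_available())  # should be True for GPU execution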

Data Preprocessing

  • Download the ScanNet example here. Extract the folders extracted, preprocessed, and scans, and copy them to /data/ScanNet. Note that by downloading the example you agree to the ScanNet Terms of Use.

    This example additionally contains the already preprocessed input scan, i.e. 3D bounding boxes and instance segmentations for the target objects, as well as the 3D scan transformed into the PyTorch3D coordinate system.

  • Download the ShapeNet v2 dataset by signing up on the website. Extract ShapeNetCore.v2.zip to /data/ShapeNet.

Preprocessing ShapeNet CAD Models

To center and scale-normalize the downloaded ShapeNet CAD models, run:

bash run_shapenet_prepro.sh gpu=0

The gpu argument specifies which GPU should be used for processing; if it is omitted, the code runs on the CPU by default.
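
For intuition, centering and scale-normalizing a single CAD model can be sketched with trimesh (installed above). This is an illustration only, not the repository's actual preprocessing script, and the file paths are placeholders:

import trimesh

# Placeholder path; ShapeNet models live under data/ShapeNet/ShapeNetCore.v2.
mesh = trimesh.load("path/to/model_normalized.obj", force="mesh")

# Center the mesh on its axis-aligned bounding-box center.
bounds = mesh.bounds  # (2, 3) array holding the min and max corners
mesh.apply_translation(-bounds.mean(axis=0))

# Scale so the longest bounding-box side has unit length.
mesh.apply_scale(1.0 / (bounds[1] - bounds[0]).max())

mesh.export("model_preprocessed.obj")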

After the above-mentioned steps the /data folder should contain the following directories:

- data
    - ScanNet
        - extracted
        - preprocessed
        - scans
    - ShapeNet
        - ShapeNet_preprocessed            
        - ShapeNetCore.v2

Run CAD Model Retrieval

Our pipeline for automatic CAD model retrieval consists of three steps. Results after each step will be saved to /results.

Note that we use PyTorch3D as the rendering pipeline; hence, all 3D data are transformed into the PyTorch3D coordinate system. Information about this coordinate system can be found here.

The configuration file is a simple text file in .ini format. Default values for configuration parameters are available in /config. Note that these are just an indication of what a "reasonable" value for each parameter could be, and are not meant as a way to reproduce any of the results from our paper.
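
Since the configuration is a standard .ini file, the active parameters can be inspected with Python's built-in configparser, for example (a minimal sketch; see /config for the actual parameter names):

import configparser

config = configparser.ConfigParser()
config.read("config/ScanNet.ini")  # path assumed relative to the repository root

# Print every section and parameter to see which values are in effect.
for section in config.sections():
    for key, value in config[section].items():
        print(f"[{section}] {key} = {value}")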

1) CAD Model Retrieval

Run CAD model retrieval with:

bash run_cad_retrieval.sh config=ScanNet.ini gpu=0

The results will be written to /results/ScanNet/$scene_name/retrieval. Results contain the top-5 retrieved CAD models for each target object, as well as the combined top-1 results for all target objects. Additionally, the scene mesh without the target objects is written to /results/ScanNet/$scene_name, which can be useful for visualization.
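
For orientation, rendering a candidate CAD model's depth with PyTorch3D looks roughly like the sketch below (illustrative, not the repository's code; the model path is a placeholder). The rendered depth is the kind of quantity a render-and-compare scheme scores against the observed depth maps:

import torch
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.renderer import (
    FoVPerspectiveCameras,
    MeshRasterizer,
    RasterizationSettings,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Placeholder path to a preprocessed CAD model.
mesh = load_objs_as_meshes(["path/to/model.obj"], device=device)

cameras = FoVPerspectiveCameras(device=device)
rasterizer = MeshRasterizer(
    cameras=cameras,
    raster_settings=RasterizationSettings(image_size=256),
)

# zbuf holds the per-pixel depth of the rasterized mesh.
fragments = rasterizer(mesh)
depth = fragments.zbuf[..., 0]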

2) CAD Model Clustering and Cloning

Run CAD model clustering and cloning with:

bash run_cad_similarity.sh config=ScanNet.ini gpu=0

Results after CAD model clustering and cloning will be written to /results/ScanNet/$scene_name/similarity.
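
As a rough illustration of the idea behind clustering (not the repository's actual criterion, which is described in the paper), shape similarity between two retrieved CAD models can be scored with the chamfer distance between sampled surface points:

import torch
from pytorch3d.loss import chamfer_distance

# Stand-in point samples; in practice these would be sampled from the
# surfaces of two retrieved CAD models.
pts_a = torch.rand(1, 2048, 3)
pts_b = torch.rand(1, 2048, 3)

dist, _ = chamfer_distance(pts_a, pts_b)
print(float(dist))  # lower distance = more similar shapes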

3) Pose Refinement

Run 9DOF differentiable pose refinement with:

bash run_cad_pose_refine.sh config=ScanNet.ini gpu=0

Final results after 9DOF pose refinement will be written to /results/ScanNet/$scene_name/refinement.
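
Conceptually, 9DOF refinement optimizes 3 translation, 3 rotation, and 3 scale parameters by gradient descent. The sketch below uses a stand-in point-alignment loss purely for illustration; the actual pipeline uses render-and-compare objectives:

import torch
from pytorch3d.transforms import euler_angles_to_matrix

# Stand-in data: the "scan" is a scaled and shifted copy of the CAD points.
cad_pts = torch.rand(1000, 3)
target_pts = cad_pts * 1.2 + 0.3

trans = torch.zeros(3, requires_grad=True)
angles = torch.zeros(3, requires_grad=True)
log_scale = torch.zeros(3, requires_grad=True)  # log parametrization keeps scale positive

opt = torch.optim.Adam([trans, angles, log_scale], lr=0.05)
for step in range(200):
    R = euler_angles_to_matrix(angles, "XYZ")
    pts = (cad_pts * log_scale.exp()) @ R.T + trans
    loss = ((pts - target_pts) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()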

Citation

If you find this work useful for your research, please consider citing:

@inproceedings{ainetter2023automatically,
  title={Automatically Annotating Indoor Images with CAD Models via RGB-D Scans},
  author={Ainetter, Stefan and Stekovic, Sinisa and Fraundorfer, Friedrich and Lepetit, Vincent},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={3156--3164},
  year={2023}
}


scannotate's Issues

Regarding Rendering ScanNet Point Clouds

Hi,

Thank you so much for your wonderful work!

We are trying to use the point cloud rendering code for ScanNet from your repository. We found the point rasterizer and SfMPerspectiveCamerasScanNet, but I don't fully understand the connection between the inputs to these classes and the ScanNet extrinsics. Could you let us know what processing you do on the poses/intrinsics from ScanNet before passing them to these functions? PyTorch3D's documentation has been quite confusing, so any help from you would be very appreciated!
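
As general background for this question (a sketch, not necessarily the repository's exact processing): ScanNet poses are camera-to-world matrices in the OpenCV convention, and PyTorch3D (0.5+) provides a utility that builds PerspectiveCameras from OpenCV-style parameters:

import torch
from pytorch3d.utils import cameras_from_opencv_projection

pose_c2w = torch.eye(4)        # stand-in for a ScanNet pose file (camera-to-world)
K = torch.eye(3).unsqueeze(0)  # stand-in 3x3 intrinsics matrix

w2c = torch.inverse(pose_c2w)  # PyTorch3D expects world-to-camera
R = w2c[:3, :3].unsqueeze(0)
tvec = w2c[:3, 3].unsqueeze(0)
image_size = torch.tensor([[480, 640]])  # (height, width) of the frames

cameras = cameras_from_opencv_projection(R, tvec, K, image_size)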

Issue with Creating PKL File for SCANnotate from Preprocessed ScanNet Data

I am currently working on a project where I need to apply SCANnotate to a point cloud dataset that I have already classified using RandLA-Net. My point cloud data is derived from RGB images, and I have all the pose matrices for the camera poses as well as the depth images.

My main issue is understanding how to create the required PKL file to make it compatible with SCANnotate, similar to the preprocessed ScanNet information.

I have reviewed the documentation but am struggling to understand the exact structure and content needed for the PKL file. Could you provide guidance or an example of how to construct this PKL file? Any help or pointers to relevant parts of the documentation would be greatly appreciated.

Thank you!

Question about bounding box orientation in PyTorch3D space

Hello,

I was wondering how the basis of the bounding boxes is transformed in your preprocessing step.
Indeed, even though the example scene comes with a "bbox_all.ply" mesh showing the bounding boxes on the preprocessed mesh, when I try to manually place boxes in Blender with the center, scale, and orientation stored in the scene pickle file, the boxes are off...

For instance, I tried to place two tables which have a non-zero angle along the Y axis:
[screenshot: bbox_base]

After trying to understand what was off, it seems like simply inverting the rotation about the Y axis allows a near-perfect alignment of the placed boxes with the original bounding boxes:
[screenshot: bbox_modified]
It is visually near-perfect, although on close inspection the corners of the box don't line up exactly with yours. This can be seen in the following screenshot, where the top-left corner doesn't quite align and the top edge isn't perfectly parallel to yours. (Sorry for the stray orange circle; it is a Blender tool.)
[screenshot: corner_weird]

This is strange to me, as I was thinking that some sort of change in the coordinate systems would induce this, but I don't think such a change would produce a "mirroring effect" of the Y-axis rotation.
Am I doing something wrong here or is there a step I didn't understand?

For reproducibility, I imported "mesh_py3d_textured.ply" and "bbox_all.ply" into Blender and, with a small script, read the center, basis, and scale of these two tables. I then placed cubes in Blender at the given position with the given scale (halved, as Blender spawns cubes of size 2) and the given rotation (converted from a 3x3 rotation matrix to Euler angles in degrees).

ScanNet Preprocessing Scene

Hello,

Thanks for answering me and uploading the code so promptly.

Currently, I am working on detecting instances and replacing them with CAD models. So far, I have detected instances using the Mask3D algorithm on our own point clouds, scanned and reconstructed with BundleFusion, similar to the ScanNet dataset.

Thus, my question is: how can I create the preprocessed ScanNet information (the extracted and preprocessed folders) for our own point clouds?

Many thanks for considering my request.

Waiting for the code!

First of all, thank you for your great work and congratulations on the paper!

When are you planning to release the code? Looking forward to it!

Code Release

Hello. First of all, congrats on your amazing work. Currently, I am working on the alignment between CAD models and RGB-D scans.

For this reason, I am interested in when you plan to release the code.

I appreciate any help you can provide.

Code for ScanNet preprocessed

Hello,

Thanks for your nice work! I had one question: how do you preprocess the ScanNet dataset as in the provided example, specifically the preprocessed folder, and could the code for that be made publicly available? Please let me know, thank you.
