
ssen's Introduction

Spatial Semantic Embedding Network

This repository contains code for Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning by Dongsu Zhang, Junha Chun, Sang Kyun Cha, and Young Min Kim. We are currently ranked 3rd on the ScanNet 3D Instance Segmentation Challenge in terms of AP.

Code for running the pretrained network, visualizing the validation results on ScanNet, and training from scratch is available. The code has been tested on Ubuntu 18.04 with CUDA 10.0. The following are guides for installing and running the code. Our project uses the Minkowski Engine to construct the sparse convolutional network.

Installation

Anaconda and environment installations

conda create -n ssen python=3.7
conda activate ssen
conda install openblas
pip install -r requirements.txt
conda install pytorch==1.3.1 torchvision==0.4.2 cudatoolkit=10.0 -c pytorch
pip install MinkowskiEngine==0.4.2
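
After installation, a quick sanity check can confirm that PyTorch sees the GPU and that MinkowskiEngine imports. This is a minimal sketch; it only assumes the packages above installed correctly and that MinkowskiEngine exposes a version attribute.

# sanity_check.py -- verify the environment after installation (illustrative only)
import torch
import MinkowskiEngine as ME

print("PyTorch:", torch.__version__)          # expected 1.3.1
print("CUDA available:", torch.cuda.is_available())
print("MinkowskiEngine:", ME.__version__)     # expected 0.4.2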

Evaluate

To visualize the outputs of our model,

  1. Download the semantic segmentation model, the instance segmentation model, and the example point cloud from Google Drive. The semantic segmentation model was pretrained with Spatio Temporal Segmentation, but it differs slightly from the released Spatio Temporal Segmentation model, since the semantic labels used for instance segmentation differ slightly from the semantic segmentation labels in ScanNet. Both models were pretrained on the train and validation sets, and example_scene.pt is a preprocessed point cloud from the test set.
  2. Place the files as below.
{repo_root}/
 - data/
   - models/
     - instance_model.pt
     - semantic_model.pt
   - example_scene.pt

Then run

python eval.py
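
Before running eval.py, you can quickly confirm that the downloaded files deserialize as expected. This is a minimal sketch that assumes the files are ordinary torch.save outputs and follows the layout above.

# check_downloads.py -- quick sanity check of the downloaded files (illustrative only)
import torch

scene = torch.load("data/example_scene.pt", map_location="cpu")
print("example scene:", type(scene), getattr(scene, "shape", None))

for name in ("data/models/instance_model.pt", "data/models/semantic_model.pt"):
    ckpt = torch.load(name, map_location="cpu")
    print(name, "->", type(ckpt))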

Visualize validation results on ScanNet scenes

The validation results are available on Google Drive. Download the scenes and run

python visualize.py --scene_path {scene_path}

The semantic labels in the visualizations were generated with 10 rotations.
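
The 10-rotation evaluation is standard test-time augmentation: the scene is rotated about the z axis, semantic logits are predicted for each rotation, and the per-point predictions are averaged. The following is a sketch of that idea, not the repository's exact code; model, coords, and feats are hypothetical placeholders.

import numpy as np
import torch

def rotation_averaged_logits(model, coords, feats, num_rotations=10):
    """Average per-point semantic logits over rotations about the z axis (sketch)."""
    logits_sum = None
    for k in range(num_rotations):
        theta = 2 * np.pi * k / num_rotations
        rot = torch.tensor([[np.cos(theta), -np.sin(theta), 0.0],
                            [np.sin(theta),  np.cos(theta), 0.0],
                            [0.0,            0.0,           1.0]], dtype=coords.dtype)
        rotated = coords @ rot.T               # rotate xyz coordinates
        logits = model(rotated, feats)         # hypothetical forward signature
        logits_sum = logits if logits_sum is None else logits_sum + logits
    return logits_sum / num_rotations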

Train on ScanNet from scratch

Download ScanNet dataset

Download ScanNet from the homepage and place it under ./data/scannet. You need to sign the terms of use. The data folders should be laid out as follows.

{repo_root}/
 - data/
   - scannet/
     - scans/
     - scans_test/
     - scannet_combined.txt
     - scannet_train.txt
     - scannet_val.txt

Our model preprocesses all point clouds into .pt files. To preprocess the data, run

python -m utils.preprocess_data
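
The preprocessing script above is the source of truth; the following is only an illustrative sketch of what such a step does. It assumes the plyfile package, ignores the label files, and simply reads a scan's mesh and serializes per-point coordinates and colors into a single tensor.

# Illustrative sketch only -- the real logic lives in utils/preprocess_data.py.
import numpy as np
import torch
from plyfile import PlyData

def ply_to_pt(ply_path, out_path):
    vertex = PlyData.read(ply_path)["vertex"]
    coords = np.stack([vertex["x"], vertex["y"], vertex["z"]], axis=1)
    colors = np.stack([vertex["red"], vertex["green"], vertex["blue"]], axis=1)
    tensor = torch.from_numpy(
        np.concatenate([coords, colors], axis=1).astype(np.float32))
    torch.save(tensor, out_path)  # labels are omitted in this sketch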

Our network also requires predicted semantic labels for training. From Spatio Temporal Segmentation's model zoo, download the ScanNet pretrained model (train only) and place it at ./data/models/MinkUNet34C-train-conv1-5.pth. We found that transfer learning from the semantic segmentation model greatly reduced training time. Then run

python -m utils.preprocess_semantic

The created output tensor has the postfix {scene_name}_semantic_segment.pt and is of size N x 9, where N is the number of points. Indices [0:3] are the coordinates, [3:6] are the color features, [6:8] are the (ground-truth) semantic and instance labels, and [8] is the semantic label predicted by the pretrained semantic model.
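
In other words, a preprocessed scene can be unpacked as below. This is a minimal sketch based on the layout described above; the file path is only an example.

import torch

scene = torch.load("data/scene0000_00_semantic_segment.pt")  # example path, tensor of N x 9
coords        = scene[:, 0:3]  # xyz coordinates
colors        = scene[:, 3:6]  # RGB features
gt_semantic   = scene[:, 6]    # ground-truth semantic label
gt_instance   = scene[:, 7]    # ground-truth instance label
pred_semantic = scene[:, 8]    # semantic label predicted by the pretrained model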

Train

To train the instance segmentation model, run

python main.py

To use transfer learning, download the pretrained model from Spatio Temporal Segmentation.
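
The usual pattern for such a partial initialization is a filtered state-dict load. The following is a generic sketch rather than this repository's exact loading code; model is a placeholder for the instance segmentation network.

import torch

ckpt = torch.load("data/models/MinkUNet34C-train-conv1-5.pth", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)          # some checkpoints wrap the weights
model_state = model.state_dict()                   # `model` is a placeholder instance
# keep only weights whose names and shapes match the target model
filtered = {k: v for k, v in state_dict.items()
            if k in model_state and v.shape == model_state[k].shape}
model_state.update(filtered)
model.load_state_dict(model_state)                 # strict load of the merged dict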

For different training hyperparameters, you may change the configs in the configs/ folder. To log training and visualize the embedding space, run

tensorboard --logdir log --port 8123
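
Embedding-space visualization in TensorBoard can be driven by the standard SummaryWriter projector API. A minimal sketch, assuming per-point embeddings and instance labels are available as tensors; embeddings, labels, and step are placeholders.

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter(log_dir="log")
# embeddings: (N, D) per-point embeddings, labels: (N,) instance ids (placeholders)
writer.add_embedding(embeddings, metadata=labels.tolist(),
                     tag="instance_embeddings", global_step=step)
writer.close()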

Citing SSEN

If you use SSEN, please cite:

@inproceedings{zhang2020ssen,
    author = {Zhang, Dongsu and Chun, Junha and Cha, Sang and Kim, Young Min},
    title = "{Spatial Semantic Embedding Network: Fast 3D Instance Segmentation with Deep Metric Learning}",
    booktitle={arXiv preprint arXiv:2007.03169},
    year={2020}
}

ssen's People

Contributors

96lives


ssen's Issues

How to visualize the instance segmentation results?

Hello. Thank you for your nice work.
I have a question: how do I use the trained model to visualize the results? The trained model is different from the instance segmentation model that you uploaded to Google Drive.
Hoping for your reply. Thank you.

Question about how to train the instance segmentation model

First of all, thank you for sharing the nice code! I have a few questions ...

  • In the paper, it is specified that the semantic model requires a pre-trained MinkowskiNet.
    However, for the instance segmentation model, I'm not sure whether I have to use pre-trained weights or not.
    Is training the instance segmentation model from scratch enough to reproduce the results? Which setting did you use in your paper?

  • In the code, it seems like SSEN loads the pre-trained MinkowskiNet weights at the start of training. However, I found that loading the weight files in the init() code of the SemanticSegModel or InstantSegModel class does not work properly (maybe strict=False is the reason, since preprocess_semantic.py works well).

  • Do you have any plans to release the evaluation code (which computes mAP) for the validation split?

License

First of all thanks for providing the awesome code.

Under which license is it being published? E.g., is it allowed to use it (or parts of it) for research, as well as in commercial products?
