labelmaker-mix3d's Introduction

Mix3D: Out-of-Context Data Augmentation for 3D Scenes (3DV 2021)

Alexey Nekrasov*, Jonas Schult*, Or Litany, Bastian Leibe, Francis Engelmann

Mix3D is a data augmentation technique for 3D segmentation methods that improves generalization.

PyTorch Lightning Config: Hydra Code style: black


[Project Webpage] [arXiv] [Demo]

Demo

We also host a demo page where you can upload your own scene and receive a prediction from the model in 30 minutes. We created it to collect the most challenging scenes captured in the wild and to test the limits of current state-of-the-art models.

News

  • 18. April 2022: For the latest Minkowski Engine release, use the dedicated branch or pull request.
  • 12. October 2021: Code released.
  • 6. October 2021: Mix3D accepted for oral presentation at 3DV 2021. Paper on [arXiv].
  • 30. July 2021: Mix3D ranks 1st on the ScanNet semantic labeling benchmark.

Leaderboard

Running the code

This repository contains the code for the analysis experiments of Section 4.2 (Motivation and Analysis Experiments) of the paper. For the ScanNet benchmark and Table 1 (main paper), we use the original SpatioTemporalSegmentation-Scannet code. To add Mix3D to the original MinkowskiNet codebase, we provide the patch file SpatioTemporalSegmentation.patch. With the patch file kpconv_tensorflow_mix3d.patch, you can add Mix3D to the official TensorFlow code release of KPConv on ScanNet and S3DIS. Analogously, you can patch the official PyTorch reimplementation of KPConv with the patch file kpconv_pytorch_mix3d.patch. See the supplementary material for more details.

Code structure

├── mix3d
│   ├── __init__.py
│   ├── __main__.py     <- the main file
│   ├── conf            <- hydra configuration files
│   ├── datasets
│   │   ├── outdoor_semseg.py       <- outdoor dataset
│   │   ├── preprocessing       <- folder with preprocessing scripts
│   │   ├── semseg.py       <- indoor dataset
│   │   └── utils.py        <- code for mixing point clouds
│   ├── logger
│   ├── models      <- MinkowskiNet models
│   ├── trainer
│   │   ├── __init__.py
│   │   └── trainer.py      <- train loop
│   └── utils
├── data
│   ├── processed       <- folder for preprocessed datasets
│   └── raw     <- folder for raw datasets
├── scripts
│   ├── experiments
│   │   └── 1000_scene_merging.bash
│   ├── init.bash
│   ├── local_run.bash
│   ├── preprocess_matterport.bash
│   ├── preprocess_rio.bash
│   ├── preprocess_scannet.bash
│   └── preprocess_semantic_kitti.bash
├── docs
├── dvc.lock
├── dvc.yaml        <- dvc file to reproduce the data
├── poetry.lock
├── pyproject.toml      <- project dependencies
├── README.md
├── saved       <- folder that stores models and logs
└── SpatioTemporalSegmentation-ScanNet.patch        <- patch file for original repo

Dependencies

The main dependencies of the project are the following:

python: 3.7
cuda: 10.1

All other dependencies are managed with the poetry dependency management tool. Everything can be installed with the command:

poetry install

Check scripts/init.bash for more details.

Data preprocessing

After the dependencies are installed, run the preprocessing scripts. They convert the scannet, matterport, rio, and semantic_kitti datasets to a single format. By default, the scripts expect to find the datasets in the data/raw/ folder. Check scripts/preprocess_*.bash for more details.

dvc repro scannet # matterport, rio, semantic_kitti

This command will run the preprocessing for scannet and will save the result using the dvc data versioning system.
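
Since the scripts look for the datasets under data/raw/, it can help to confirm that the folders exist before starting a long preprocessing run. The following is a minimal sketch, not part of the repository. It assumes, hypothetically, that each dataset lives in a subfolder of data/raw/ named after its dvc stage; the helper `missing_raw_datasets` and the `DATASETS` tuple are illustrative names only. Check scripts/preprocess_*.bash for the paths the scripts actually use.

```python
from pathlib import Path
from typing import Iterable, List

# Stage names taken from the dvc command above. The matching folder
# names under data/raw/ are an assumption of this sketch.
DATASETS = ("scannet", "matterport", "rio", "semantic_kitti")


def missing_raw_datasets(raw_root: Path, names: Iterable[str] = DATASETS) -> List[str]:
    """Return the dataset names that have no folder under raw_root."""
    return [name for name in names if not (raw_root / name).is_dir()]


missing = missing_raw_datasets(Path("data/raw"))
if missing:
    print("No raw data folder found for:", ", ".join(missing))
else:
    print("All raw dataset folders are present.")
```

If the list is empty, every assumed folder is present and `dvc repro` can be started for the corresponding stage.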

Training and testing

Train MinkowskiNet on the scannet dataset without Mix3D with a voxel size of 5cm:

poetry run train

Train MinkowskiNet on the scannet dataset with Mix3D with a voxel size of 5cm:

poetry run train data/collation_functions=voxelize_collate_merge
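
To make the difference between the two training commands concrete, the sketch below shows the general idea of mixing two scenes and then voxelizing the result at 5 cm. It is not the repository's voxelize_collate_merge function. It is a minimal NumPy illustration that assumes mixing amounts to centering two point clouds at the origin and concatenating their points and labels. The function names `center`, `mix_scenes`, and `voxelize`, as well as the random toy scenes, are hypothetical; see mix3d/datasets/utils.py for the actual implementation.

```python
import numpy as np


def center(points: np.ndarray) -> np.ndarray:
    """Shift a point cloud so that its centroid sits at the origin."""
    return points - points.mean(axis=0, keepdims=True)


def mix_scenes(points_a, labels_a, points_b, labels_b):
    """Overlay two labeled scenes by concatenating their centered points."""
    points = np.concatenate([center(points_a), center(points_b)], axis=0)
    labels = np.concatenate([labels_a, labels_b], axis=0)
    return points, labels


def voxelize(points, labels, voxel_size=0.05):
    """Quantize coordinates (in meters) to a grid and keep one point per voxel."""
    grid = np.floor(points / voxel_size).astype(np.int64)
    # return_index gives the first occurrence of every unique voxel
    _, keep = np.unique(grid, axis=0, return_index=True)
    keep = np.sort(keep)  # preserve the original point order
    return grid[keep], labels[keep]


# Toy data standing in for two scans; real scenes come from the preprocessed datasets.
rng = np.random.default_rng(0)
scene_a = rng.uniform(0.0, 4.0, size=(1000, 3))
scene_b = rng.uniform(10.0, 13.0, size=(800, 3))
labels_a = rng.integers(0, 20, size=1000)
labels_b = rng.integers(0, 20, size=800)

mixed_points, mixed_labels = mix_scenes(scene_a, labels_a, scene_b, labels_b)
voxels, voxel_labels = voxelize(mixed_points, mixed_labels)
print(mixed_points.shape, voxels.shape)
```

Centering both scenes before concatenation makes them overlap in space, which is what places objects into out-of-context surroundings. Without it, the two toy scenes here would sit several meters apart and never interact.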

BibTeX

@inproceedings{Nekrasov213DV,
  title     = {{Mix3D: Out-of-Context Data Augmentation for 3D Scenes}},
  author    = {Nekrasov, Alexey and Schult, Jonas and Litany, Or and Leibe, Bastian and Engelmann, Francis},
  booktitle = {{International Conference on 3D Vision (3DV)}},
  year      = {2021}
}
