Mask R-CNN for Object Detection and Segmentation

This is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow, based on Matterport's version. The model generates bounding boxes and segmentation masks for each object instance in an image, using a Feature Pyramid Network (FPN) with a ResNet-101 backbone.

Features:

  • Mask R-CNN implementation built on TensorFlow and Keras.
  • Model training with data augmentation and various configuration options.
  • Custom mAP callback during training for preliminary evaluation (a minimal sketch follows this list).
  • Training with a 5-fold cross-validation strategy.
  • Evaluation with mean Average Precision (mAP) on the COCO metric mAP@0.5:.05:.95 and the PASCAL VOC metric mAP@0.5. For more information, read here.
  • Jupyter notebook examples to visualize the detection pipeline at every step and better understand how Mask R-CNN works.
  • Convert predicted results to the VGG annotation format to expand the dataset for further training. This yields finer, more accurate masks with less labeling time than manual annotation (a saving of roughly three quarters at large scale).
    • For instance, compare the mask annotated by hand (left) with the mask predicted by the model (right)
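
The custom mAP callback above is not spelled out in this README, so the snippet below is only a minimal sketch of how such a callback can be built with Matterport's mrcnn utilities. It assumes a second MaskRCNN model created in inference mode and a checkpoint path that the training loop keeps up to date; those names and defaults are illustrative, not the repository's exact code:

    import numpy as np
    import keras
    import mrcnn.model as modellib
    from mrcnn import utils

    class MeanAveragePrecisionCallback(keras.callbacks.Callback):
        def __init__(self, inference_model, dataset_val, inference_config,
                     checkpoint_path, n_images=20):
            super().__init__()
            self.inference_model = inference_model   # MaskRCNN built with mode="inference"
            self.dataset_val = dataset_val           # validation split (an mrcnn utils.Dataset)
            self.config = inference_config
            self.checkpoint_path = checkpoint_path   # weight file written by the training loop (assumed)
            self.n_images = n_images                 # evaluate a subset of images for speed

        def on_epoch_end(self, epoch, logs=None):
            # Refresh the inference model with the weights just saved during training.
            self.inference_model.load_weights(self.checkpoint_path, by_name=True)
            image_ids = np.random.choice(self.dataset_val.image_ids, self.n_images)
            aps = []
            for image_id in image_ids:
                # Ground-truth boxes, class ids, and masks for this image.
                image, _, gt_class_id, gt_bbox, gt_mask = modellib.load_image_gt(
                    self.dataset_val, self.config, image_id, use_mini_mask=False)
                # Run detection and score it against the ground truth at IoU 0.5.
                r = self.inference_model.detect([image], verbose=0)[0]
                ap, _, _, _ = utils.compute_ap(
                    gt_bbox, gt_class_id, gt_mask,
                    r["rois"], r["class_ids"], r["scores"], r["masks"])
                aps.append(ap)
            print("Epoch {}: validation mAP@0.5 = {:.3f}".format(epoch + 1, np.mean(aps)))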

Structure:

It is recommended to organize the dataset folder, the test image/video folders, and the model weights under the same root folder, following the structure below:

├── notebooks                                 # several notebooks from Matterport's Mask R-CNN
├── dataset                                   # place the dataset here
│   └── <dataset_name>              
│       ├── train
│       │   ├── <image_file 1>                # accepts .jpg or .jpeg files
│       │   ├── <image_file 2>
│       │   ├── ...
│       │   └── via_export_json.json          # corresponding single annotation file; must use this exact name
│       ├── val
│       └── test         
├── logs                                      # log folder
├── mrcnn                                     # model folder
├── test                                      # test folder
│   ├── image
│   └── video
├── trained_weight                            # pre-trained model and trained weight folder
│   ...
├── environment.yml                           # environment setup file
├── README.md
├── dataset.py                                # dataset configuration
├── evaluation.py                             # weight evaluation
└── training.py                               # training model

Usage:

  • Conda environment setup:
        conda env create -f environment.yml
        conda activate mask-rcnn
  • Training:
    * Train a new model starting from pre-trained weights
        python3 training.py --dataset=/path/to/dataset --weight=/path/to/pretrained/weight.h5
    
    * Resume training a model
        python3 training.py --dataset=/path/to/dataset --continue_train=/path/to/latest/weights.h5
  • Evaluating:
    python3 evaluation.py --dataset=/path/to/dataset --weights=/path/to/pretrained/weight.h5
  • Testing:
    * Image (see the detection sketch after this list)
        python3 image_detection.py --dataset=/path/to/dataset --weights=/path/to/pretrained/weight.h5 --image=/path/to/image/directory
    
    * Video (update weight path and dataset path in mrcnn.visualize_cv2)
        python3 video_detection.py --video_path=/path/to/testing/video/dir/
    
  • Annotation generating:
    python3 annotating_generation.py --dataset=/path/to/dataset --weights=/path/to/pretrained/weight.h5 --image=/path/to/image/directory
    
  • View training plot:
    tensorboard --logdir=logs/path/to/trained/dir
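
For reference, the image-detection step above amounts to building the model in inference mode, loading the trained weights, and visualizing the detections. The sketch below illustrates this with Matterport's API and is not necessarily the exact contents of image_detection.py; the config values, weight file name, class names, and image path are placeholders:

    import skimage.io
    import mrcnn.model as modellib
    from mrcnn import visualize
    from mrcnn.config import Config

    class InferenceConfig(Config):
        NAME = "custom"         # placeholder; must match the training configuration
        NUM_CLASSES = 1 + 1     # background + number of custom classes
        GPU_COUNT = 1
        IMAGES_PER_GPU = 1      # detect one image at a time

    config = InferenceConfig()
    model = modellib.MaskRCNN(mode="inference", config=config, model_dir="logs")
    model.load_weights("trained_weight/mask_rcnn_custom.h5", by_name=True)  # placeholder weight file

    image = skimage.io.imread("test/image/sample.jpg")                      # placeholder test image
    r = model.detect([image], verbose=1)[0]
    visualize.display_instances(image, r["rois"], r["masks"], r["class_ids"],
                                ["BG", "object"], r["scores"])              # placeholder class names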
    

Annotation format:

Annotated images for this implementation are created with the VGG Image Annotator (VIA), which exports annotations in the following structure:

{ 'filename': '<image_name>.jpg',
  'regions': {
      '0': {
          'region_attributes': {},
          'shape_attributes': {
              'all_points_x': [...],
              'all_points_y': [...],
              'name': <class_name>}},
      ... more regions ...
  },
  'size': <image_size>
}
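
The exported via_export_json.json is a dictionary keyed by image, and each entry follows the structure above. Below is a minimal sketch of walking through it in Python (the dataset path is a placeholder; newer VIA versions export 'regions' as a list instead of a dict):

    import json

    with open("dataset/<dataset_name>/train/via_export_json.json") as f:
        annotations = json.load(f)

    for key, a in annotations.items():
        regions = a["regions"].values() if isinstance(a["regions"], dict) else a["regions"]
        for region in regions:
            shape = region["shape_attributes"]
            class_name = shape["name"]    # class label, as in the structure above
            polygon = list(zip(shape["all_points_x"], shape["all_points_y"]))
            print(a["filename"], class_name, len(polygon), "points")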

Notes:

  • This implementation works well with TensorFlow 1.14.0, Keras 2.2.5, CUDA 10.0.130, and cuDNN 7.6.5.
  • dataset.py must be modified for other custom datasets (see the sketch after this list).
  • Further training parameter configuration can be found here.
  • Pre-trained COCO weights: download here and place them in trained_weight/
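
As a starting point for adapting dataset.py, the sketch below shows the two parts that usually change when building on Matterport's utils.Dataset: registering the images listed in the VIA annotation file and rasterizing their polygons into per-instance masks. The "custom" source name and the single placeholder class are assumptions, not the repository's exact code:

    import os
    import json
    import numpy as np
    import skimage.draw
    import skimage.io
    from mrcnn import utils

    class CustomDataset(utils.Dataset):
        def load_custom(self, dataset_dir, subset):
            # Register classes as ("source", integer id, "class name").
            self.add_class("custom", 1, "object")              # placeholder single class

            subset_dir = os.path.join(dataset_dir, subset)     # "train" or "val"
            with open(os.path.join(subset_dir, "via_export_json.json")) as f:
                annotations = json.load(f)
            for a in annotations.values():
                regions = a["regions"].values() if isinstance(a["regions"], dict) else a["regions"]
                polygons = [r["shape_attributes"] for r in regions]
                image_path = os.path.join(subset_dir, a["filename"])
                height, width = skimage.io.imread(image_path).shape[:2]
                self.add_image("custom", image_id=a["filename"], path=image_path,
                               width=width, height=height, polygons=polygons)

        def load_mask(self, image_id):
            # One boolean mask channel per annotated polygon.
            info = self.image_info[image_id]
            mask = np.zeros([info["height"], info["width"], len(info["polygons"])], dtype=np.uint8)
            for i, p in enumerate(info["polygons"]):
                rr, cc = skimage.draw.polygon(p["all_points_y"], p["all_points_x"],
                                              (info["height"], info["width"]))
                mask[rr, cc, i] = 1
            # Every instance is class id 1 in this single-class sketch.
            return mask.astype(bool), np.ones([mask.shape[-1]], dtype=np.int32)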
