Keras MobileDetectNet

MobileDetectNet is an object detector which uses a MobileNet feature extractor to predict bounding boxes. It was designed to be computationally efficient for deployment on embedded systems and easy to train with limited data. It was inspired by the simple yet effective design of DetectNet and enhanced with the anchor system from Faster R-CNN.

Network Architecture

[Figure: network architecture diagram]

Training

python train.py --help

Label Format

MobileDetectNet uses the KITTI label format and directory structure. See the KITTI object detection dataset documentation for more details.
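For orientation, a KITTI label file holds one object per line with 15 space-separated fields: the class name, truncation, occlusion, observation angle, the 2D box as left, top, right, bottom in pixels, and then seven 3D fields. The sketch below reads only the class name and the 2D box from one line. The names `KittiBox` and `parse_kitti_line` and the sample line are illustrative and are not taken from this repository.

```python
from typing import NamedTuple


class KittiBox(NamedTuple):
    label: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_line(line: str) -> KittiBox:
    """Extract the class name and 2D bounding box from one KITTI label line."""
    fields = line.split()
    if len(fields) < 15:
        raise ValueError(f"expected at least 15 fields, got {len(fields)}")
    # Field 0 is the class name; fields 4..7 are left, top, right, bottom in pixels.
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return KittiBox(fields[0], left, top, right, bottom)


# Illustrative line: the 3D fields are zeroed because a 2D detector does not need them.
sample = "car 0.0 0 0.0 10.0 20.0 110.0 220.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0"
box = parse_kitti_line(sample)
```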

Preprocessing

Pixel values are scaled to the range [-1, 1], the input range the pretrained MobileNet weights expect, so that transfer learning works as intended.
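A minimal sketch of this scaling, assuming 8-bit input images: divide by 127.5 and subtract 1. The helper name `scale_to_unit_range` is hypothetical; the repository may perform this step differently, for example through a Keras preprocessing function.

```python
import numpy as np


def scale_to_unit_range(image: np.ndarray) -> np.ndarray:
    """Map uint8 pixel values in [0, 255] to float32 values in [-1, 1]."""
    return image.astype(np.float32) / 127.5 - 1.0


# The darkest and brightest possible pixels map to the two ends of the range.
pixels = np.array([[0, 255]], dtype=np.uint8)
scaled = scale_to_unit_range(pixels)
```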

Anchors

MobileNet outputs a 7x7x256 feature map from its last layer for a 224x224x3 input. In each of the 7x7 cells we place 9 anchors, one for each combination of the following settings:

  • Scale 1, 2, and 3
  • Aspect Ratio 1, 4/3, and 3/4
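The anchor layout above can be sketched as follows. Several details here are assumptions rather than facts about this repository: that a scale is a multiple of one grid cell, that the aspect ratio means width divided by height, that the area is held constant across aspect ratios, and that boxes are expressed in normalized image coordinates. The function name `make_anchors` is hypothetical.

```python
from itertools import product

GRID = 7                           # 7x7 output grid for a 224x224 input
SCALES = (1, 2, 3)                 # assumed to be multiples of one grid cell
ASPECT_RATIOS = (1, 4 / 3, 3 / 4)  # assumed to mean width / height


def make_anchors():
    """Return (x_min, y_min, x_max, y_max) anchors in normalized image coordinates."""
    cell = 1.0 / GRID
    anchors = []
    for row, col in product(range(GRID), repeat=2):
        # Each anchor is centered on its grid cell.
        cx, cy = (col + 0.5) * cell, (row + 0.5) * cell
        for scale, ratio in product(SCALES, ASPECT_RATIOS):
            # Keep the area at (scale * cell)^2 while changing the shape.
            width = scale * cell * ratio ** 0.5
            height = scale * cell / ratio ** 0.5
            anchors.append((cx - width / 2, cy - height / 2,
                            cx + width / 2, cy + height / 2))
    return anchors


anchors = make_anchors()
```

Under these assumptions the grid yields 7 * 7 * 9 = 441 anchors, and the larger anchors in border cells extend past the image edges.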

We set an anchor's class target to 1 if a ground-truth rectangle has an IoU greater than 0.3 with that anchor. The anchor's bounding box target is taken from the rectangle with the highest IoU above 0.3.
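One way to read this matching rule is sketched below, using plain tuples for boxes. The function names `iou` and `assign_targets` and the sample boxes are illustrative, and the repository's own implementation may differ, for example by vectorizing the computation.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign_targets(anchors, rectangles, threshold=0.3):
    """For each anchor return (class_target, matched_rectangle_or_None)."""
    targets = []
    for anchor in anchors:
        # Pick the rectangle that overlaps this anchor the most.
        best = max(rectangles, key=lambda r: iou(anchor, r), default=None)
        if best is not None and iou(anchor, best) > threshold:
            targets.append((1, best))
        else:
            targets.append((0, None))
    return targets


# Two illustrative anchors and one labelled rectangle in normalized coordinates.
anchors = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)]
rectangles = [(0.0, 0.0, 0.4, 0.5)]
targets = assign_targets(anchors, rectangles)
```

Here the first anchor overlaps the rectangle well enough to become a positive, while the second does not overlap it at all and stays negative.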

Due to the smaller receptive field of the network and the low spatial dimension of MobileNet's output, anchors partially outside the image can be used.

Augmentation

python3 test_augment.py --help

Training is done with imgaug utilizing Keras Sequences for multicore preprocessing and online data augmentation:

return iaa.Sequential([
    iaa.Fliplr(0.5),
    iaa.CropAndPad(px=(0, 112), sample_independently=False),
    iaa.Affine(translate_percent={"x": (-0.4, 0.4), "y": (-0.4, 0.4)}),
    iaa.SomeOf((0, 3), [
        iaa.AddToHueAndSaturation((-10, 10)),
        iaa.Affine(scale={"x": (0.9, 1.1), "y": (0.9, 1.1)}),
        iaa.GaussianBlur(sigma=(0, 1.0)),
        iaa.AdditiveGaussianNoise(scale=0.05 * 255)
    ])
])

Data augmentation is also applied during validation to make sure smaller objects are detected:

return iaa.Sequential([
    iaa.CropAndPad(px=(0, 112), sample_independently=False),
    iaa.Affine(translate_percent={"x": (-0.4, 0.4), "y": (-0.4, 0.4)}),
])

If a dataset already contains many small bounding boxes, or detecting small objects is not a concern, adjust these crop, pad, and translation settings in both the training and validation augmentation.

Loss

Standard loss functions are used for every output except bounding box regression: class loss is binary cross-entropy and region loss is mean absolute error. Bounding box regression uses 10*class_(ij)*|y_pred_(ij) - y_true_(ij)|. Multiplying by class_(ij) avoids penalizing the network for bounding box predictions where no object is present, and the factor of 10 normalizes this loss against the class loss.
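The bounding box term can be illustrated with NumPy as below. This is a sketch under stated assumptions, not the repository's Keras loss: it takes class_(ij) to be the ground-truth class target, assumes four box coordinates per anchor, and averages over all elements. The function name `masked_box_loss` and the sample values are hypothetical.

```python
import numpy as np


def masked_box_loss(class_true, box_true, box_pred, weight=10.0):
    """Mean of weight * class_ij * |box_pred_ij - box_true_ij| over all elements."""
    # Broadcast the per-anchor object mask over the 4 box coordinates.
    mask = class_true[..., np.newaxis]
    return float(np.mean(weight * mask * np.abs(box_pred - box_true)))


# Two anchors: the first contains an object, the second does not.
class_true = np.array([1.0, 0.0])
box_true = np.array([[0.1, 0.1, 0.5, 0.5], [0.0, 0.0, 0.0, 0.0]])
box_pred = np.array([[0.2, 0.1, 0.5, 0.5], [0.9, 0.9, 0.9, 0.9]])
loss = masked_box_loss(class_true, box_true, box_pred)
```

The second anchor's prediction is far from its all-zero target, yet it contributes nothing to the loss because its class target is 0.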

Optimization

SGD with warm restarts seems to converge effectively for this application, but standard Adam with a learning rate of 0.0001 also works fine.
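For readers unfamiliar with warm restarts, the idea is a cosine-annealed learning rate that jumps back to its maximum at the start of each cycle. The sketch below uses a fixed cycle length for simplicity. The function name and every hyperparameter value (`lr_max`, `lr_min`, `cycle_length`) are assumptions chosen for illustration and are not the settings used by `train.py`.

```python
import math


def sgdr_learning_rate(epoch, lr_max=0.01, lr_min=1e-5, cycle_length=10):
    """Cosine-annealed learning rate that resets to lr_max at the start of each cycle."""
    # Progress through the current cycle, 0 <= position < 1.
    position = (epoch % cycle_length) / cycle_length
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * position))


# Two full cycles: the rate decays over epochs 0..9, then restarts at epoch 10.
schedule = [sgdr_learning_rate(e) for e in range(20)]
```

A function of this shape could be passed to Keras's `LearningRateScheduler` callback, which calls it with the epoch index.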

Inference

python test_inference.py --help

TensorRT

A TF-TRT helper function has been integrated into the model which allows for easy inference acceleration on the NVIDIA Jetson platform. In model.py, MobileDetectNet.tftrt_engine() will create a TensorRT-accelerated TensorFlow graph. An example of how to use it is included in test_inference.py.

Performance

Using an FP16 TF-TRT graph, the model runs at ~55 FPS on the Jetson Nano in mode 1 (5W). Performance doesn't seem to be affected by running it in mode 0 (10W).
