
Keras inference time optimizer (KITO)

This code takes a trained Keras model as input and optimizes its layer structure and weights so that the model becomes noticeably faster (~10-30%) while working identically to the initial model. It can be extremely useful when you need to process a large number of images with a trained model. The reduce operation was tested on the whole Keras model zoo; see the comparison table below.

Installation

pip install kito

How it works

The current version applies only a single type of optimization: it reduces a Conv2D + BatchNormalization pair of layers to a single Conv2D layer. Since Conv2D + BatchNormalization is a very common combination, this optimization works well on almost all modern CNNs for image processing. A minimal sketch of the folding arithmetic is shown after the list below.

Also supported:

  • DepthwiseConv2D + BatchNormalization => DepthwiseConv2D
  • SeparableConv2D + BatchNormalization => SeparableConv2D
  • Conv2DTranspose + BatchNormalization => Conv2DTranspose
  • Conv3D + BatchNormalization => Conv3D
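
The folding itself is simple per-channel weight arithmetic. Below is a minimal NumPy sketch of the idea (the function name fold_conv_bn and the variable names are illustrative, not KITO's internal API, and the default epsilon is an assumption that must match the BatchNormalization layer's actual epsilon):

import numpy as np

def fold_conv_bn(kernel, bias, gamma, beta, mean, var, eps=1e-3):
    # kernel: (kh, kw, c_in, c_out) Conv2D kernel
    # bias:   (c_out,) Conv2D bias (zeros if the layer had use_bias=False)
    # gamma, beta, mean, var: (c_out,) BatchNormalization weights and moving statistics
    scale = gamma / np.sqrt(var + eps)           # per-output-channel scale
    folded_kernel = kernel * scale               # broadcasts over the last (c_out) axis
    folded_bias = (bias - mean) * scale + beta   # BN shift absorbed into the bias
    return folded_kernel, folded_bias

A Conv2D layer loaded with folded_kernel and folded_bias computes the same function as the original Conv2D + BatchNormalization pair (up to floating-point error), so the BatchNormalization layer can be dropped.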

How to use

Typical code:

model.fit(...)
...
model.predict(...)

must be replaced with the following block:

from kito import reduce_keras_model
model.fit(...)
...
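# fold Conv2D + BatchNormalization pairs into single Conv2D layers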
model_reduced = reduce_keras_model(model)
model_reduced.predict(...)

So basically you only need to insert two lines into your code to speed up inference. Note, however, that the conversion itself takes some time. You can see a usage example in test_bench.py.

Comparison table

| Neural net | Input shape | Layers (init) | Layers (reduced) | Params (init) | Params (reduced) | Time for 10000 images, init (sec) | Time for 10000 images, reduced (sec) | Conversion time (sec) | Max diff on final layer | Avg diff on final layer |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MobileNet (1.0) | (224, 224, 3) | 102 | 75 | 4,253,864 | 4,221,032 | 32.38 | 22.13 | 12.45 | 2.80e-06 | 4.41e-09 |
| MobileNetV2 (1.4) | (224, 224, 3) | 152 | 100 | 6,156,712 | 6,084,808 | 52.53 | 37.71 | 87.00 | 3.99e-06 | 6.88e-09 |
| ResNet50 | (224, 224, 3) | 177 | 124 | 25,636,712 | 25,530,472 | 58.87 | 35.81 | 45.28 | 5.06e-07 | 1.24e-09 |
| Inception_v3 | (299, 299, 3) | 313 | 219 | 23,851,784 | 23,817,352 | 79.15 | 59.55 | 126.02 | 7.74e-07 | 1.26e-09 |
| Inception_Resnet_v2 | (299, 299, 3) | 782 | 578 | 55,873,736 | 55,813,192 | 131.16 | 102.38 | 766.14 | 8.04e-07 | 9.26e-10 |
| Xception | (299, 299, 3) | 134 | 94 | 22,910,480 | 22,828,688 | 115.56 | 76.17 | 28.15 | 3.65e-07 | 9.69e-10 |
| DenseNet121 | (224, 224, 3) | 428 | 369 | 8,062,504 | 8,040,040 | 68.25 | 57.57 | 392.24 | 4.61e-07 | 8.69e-09 |
| DenseNet169 | (224, 224, 3) | 596 | 513 | 14,307,880 | 14,276,200 | 80.56 | 68.74 | 772.54 | 2.14e-06 | 1.79e-09 |
| DenseNet201 | (224, 224, 3) | 708 | 609 | 20,242,984 | 20,205,160 | 98.99 | 87.04 | 1120.88 | 7.00e-07 | 1.27e-09 |
| NasNetMobile | (224, 224, 3) | 751 | 563 | 5,326,716 | 5,272,599 | 46.05 | 31.76 | 728.96 | 1.10e-06 | 1.60e-09 |
| NasNetLarge | (331, 331, 3) | 1021 | 761 | 88,949,818 | 88,658,596 | 445.58 | 328.16 | 1402.61 | 1.43e-07 | 5.88e-10 |
| ZF_UNET_224 | (224, 224, 3) | 85 | 63 | 31,466,753 | 31,442,689 | 96.76 | 69.17 | 9.93 | 4.72e-05 | 7.54e-09 |
| DeepLabV3+ (mobile) | (512, 512, 3) | 162 | 108 | 2,146,645 | 2,097,013 | 583.63 | 432.71 | 48.00 | 4.72e-05 | 1.00e-05 |
| DeepLabV3+ (xception) | (512, 512, 3) | 409 | 263 | 41,258,213 | 40,954,013 | 1000.36 | 699.24 | 333.1 | 8.63e-05 | 5.22e-06 |
| ResNet152 | (224, 224, 3) | 566 | 411 | 60,344,232 | 60,117,096 | 107.92 | 68.53 | 357.65 | 8.94e-07 | 1.27e-09 |

Configuration: a single NVIDIA GTX 1080 (8 GB). Timings were obtained with the TensorFlow 1.4 (+ CUDA 8.0) backend.
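
For reference, the timing and difference columns above can be reproduced approximately with a benchmark like the sketch below. This is not the exact script from test_bench.py; the choice of MobileNet, the batch size, and the use of 1000 random images (instead of 10000) are all assumptions made for brevity:

import time
import numpy as np
from keras.applications.mobilenet import MobileNet
from kito import reduce_keras_model

model = MobileNet(input_shape=(224, 224, 3), weights='imagenet')
model_reduced = reduce_keras_model(model)

# random input data is used here only to measure speed and output differences
images = np.random.uniform(size=(1000, 224, 224, 3)).astype(np.float32)

start = time.time()
out_init = model.predict(images, batch_size=64)
print('Initial model: {:.2f} sec'.format(time.time() - start))

start = time.time()
out_reduced = model_reduced.predict(images, batch_size=64)
print('Reduced model: {:.2f} sec'.format(time.time() - start))

print('Maximum diff on final layer: {:.2e}'.format(np.abs(out_init - out_reduced).max()))
print('Average diff on final layer: {:.2e}'.format(np.abs(out_init - out_reduced).mean()))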

Notes

  • Conversion currently feels slower than it should be: all manipulations with layers and weights are themselves fast, so the bottleneck is most likely some slow Keras operation used in the process. Advice on how to make it faster is welcome.
  • You can check that both models produce the same results with the function compare_two_models_results(model, model_reduced, 10000).
  • The non-zero difference on the final layer accumulates because of the large number of floating-point operations, which are not exact.
  • Non-standard layers or parameters (ones not used in any keras.applications CNN) can produce wrong results. Most likely the code will simply fail in these cases, and the Python error message will show the layer that caused it.

Requirements

  • The code was tested on Keras 2.1.6 (TensorFlow 1.4 backend) and Keras 2.2.0 (TensorFlow 1.8.0 backend)

Formulas

Base formulas
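
The original formula images are not reproduced here. For reference, the standard batch-normalization folding that the reduction implements can be written per output channel as follows, where W and b are the convolution kernel and bias and \gamma, \beta, \mu, \sigma^2, \epsilon are the BatchNormalization parameters:

y = W * x + b
z = \gamma \frac{y - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta

W' = \frac{\gamma}{\sqrt{\sigma^2 + \epsilon}} W
b' = \frac{\gamma (b - \mu)}{\sqrt{\sigma^2 + \epsilon}} + \beta

so that z = W' * x + b' for any input x.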

Other implementations

PyTorch BN Fusion - with support for VGG, ResNet and SENet.

