omidghadami95 / efficientnetv2_quantization_ck


EfficientNetV2 (EfficientNetV2-B2) with int8 and fp32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling class imbalance, etc.

Jupyter Notebook 100.00%
efficientnet post-training-quantization ptq qat quantization quantization-aware-training efficientnetv2 ckplus efficientnetv2-b2 emotion-recognition

efficientnetv2_quantization_ck's Introduction

EfficientNetV2_Quantization_CKplus (TensorFlow/Keras)

EfficientNetV2 (EfficientNetV2-B2) with int8 and fp32 quantization (QAT and PTQ) on the CK+ dataset: fine-tuning, augmentation, handling class imbalance, and more.


Real-time facial emotion recognition using an EfficientNetV2 CNN with quantization on the CK+ dataset. This code includes:
1- Data loading (downloading and splitting the dataset).
2- Preprocessing of the CK+ dataset (normalization, resizing, augmentation, and handling the class-imbalance problem).
3- Fine-tuning (using ImageNet pre-trained weights as the initialization for training); see the sketch after this list.
4- Quantization to int8 and fp32, plus fine-tuning after quantization (quantization-aware training to int8 (QAT) and post-training quantization to float32 (PTQ)); a sketch follows the note below.
5- Macro, micro, and weighted averaging of precision, recall, and F1-score; see the metrics sketch below.
6- Confusion matrix.
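For the fine-tuning step (item 3), a minimal sketch is shown below. It is not the repository's exact notebook code: the input size, head layers, optimizer settings, class count (7 CK+ emotion classes), and class-weight values are assumptions.

```python
import tensorflow as tf

NUM_CLASSES = 7  # assumed: 7 CK+ emotion classes

# EfficientNetV2-B2 backbone with ImageNet weights, classification head removed
backbone = tf.keras.applications.EfficientNetV2B2(
    include_top=False, weights="imagenet",
    input_shape=(260, 260, 3), pooling="avg")

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dropout(0.3),  # assumed regularization
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# class_weight counteracts the imbalanced CK+ class distribution (item 2);
# train_ds / val_ds and the weight values are placeholders.
# model.fit(train_ds, validation_data=val_ds, epochs=30,
#           class_weight={0: 1.0, 1: 2.5, ...})
```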

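For the evaluation metrics (items 5 and 6), a typical scikit-learn sketch looks like the following; the `y_true`/`y_pred` arrays are placeholders for the test-set labels and the model's predicted classes.

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support, confusion_matrix

# Placeholder labels/predictions; in the notebook these come from the test split and model.predict().
y_true = np.array([0, 1, 2, 2, 3, 4, 5, 6, 6, 1])
y_pred = np.array([0, 1, 2, 3, 3, 4, 5, 6, 5, 1])

# Macro, micro, and weighted averaging of precision, recall, and F1-score
for avg in ("macro", "micro", "weighted"):
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred, average=avg, zero_division=0)
    print(f"{avg:>8}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")

# Confusion matrix
print(confusion_matrix(y_true, y_pred))
```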
Note that integer computation is much faster than float computation, especially on ARM architectures. Also, a float32 value takes four times as much memory as an int8 value, so int8 quantization reduces both inference time and model size. However, it sometimes lowers accuracy (as with PTQ). To compensate for this loss, we use the quantization-aware training approach, i.e. fine-tuning after quantization to recover the lost accuracy. Finally, we compared the int8 QAT and fp32 PTQ models in terms of accuracy, model size, and inference time.

An important caveat: the inference times measured here are not representative, because x86 and ARM architectures differ and our model targets ARM, not x86. On x86, the int8 TFLite model can therefore appear slower than the float32 TFLite model, or even slower than the plain TF model without quantization. In fact, based on our measurements on a Samsung Galaxy A54 smartphone, the int8 TFLite model is roughly two times faster than the fp32 TFLite model.
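A rough sketch of the two quantization paths described above is given below, assuming `model` is the fine-tuned Keras model from the earlier sketch and `train_ds`/`val_ds` are placeholder datasets. It is not the repository's exact code: whether `tfmot.quantization.keras.quantize_model` supports every EfficientNetV2 layer out of the box depends on the TF-MOT version, and the "fp32 PTQ" path is interpreted here as a plain float32 TFLite export.

```python
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# --- QAT (int8): insert fake-quantization nodes into the trained model, then fine-tune briefly ---
qat_model = tfmot.quantization.keras.quantize_model(model)
qat_model.compile(optimizer=tf.keras.optimizers.Adam(1e-5),
                  loss="categorical_crossentropy", metrics=["accuracy"])
# qat_model.fit(train_ds, validation_data=val_ds, epochs=5)  # placeholder fine-tuning pass

converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]  # QAT graph converts to an int8-quantized model
int8_tflite = converter.convert()

# --- "fp32 PTQ": assumed here to be a plain float32 TFLite export of the trained model ---
fp32_tflite = tf.lite.TFLiteConverter.from_keras_model(model).convert()

with open("model_int8.tflite", "wb") as f:
    f.write(int8_tflite)
with open("model_fp32.tflite", "wb") as f:
    f.write(fp32_tflite)

# Compare model sizes; inference-time comparison should be done on an ARM device (see note above).
print(f"int8: {len(int8_tflite) / 1e6:.2f} MB   fp32: {len(fp32_tflite) / 1e6:.2f} MB")
```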

efficientnetv2_quantization_ck's People

Contributors

omidghadami95

