spp_net_image_classification's Introduction

CNN Architectures(with and without SPP Layer) for Image Classification

Author

Arpit Aggarwal

Introduction to the Project

In this project, three CNN architectures, VGG-16, VGG-19, and ResNet-50, each with and without an SPP (Spatial Pyramid Pooling) layer, were used for the task of Dog-Cat image classification. The input to each network was a (224 x 224 x 3) image and the number of classes was 2, where '0' denoted a cat and '1' denoted a dog. The CNN architectures were implemented in PyTorch and the loss function was Cross Entropy Loss. The hyperparameters to be tuned were: number of epochs (e), learning rate (lr), momentum (m), weight decay (wd), and batch size (bs).
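To illustrate what an SPP layer does, here is a minimal PyTorch sketch. It is not the code from this project's notebooks: the class name `SpatialPyramidPooling`, the pyramid levels `(1, 2, 4)`, and the use of adaptive max pooling in place of explicitly computed window sizes are all assumptions made for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SpatialPyramidPooling(nn.Module):
    """Pools a feature map into a fixed-length vector for any input height/width."""

    def __init__(self, levels=(1, 2, 4)):  # pyramid levels are an assumption
        super().__init__()
        self.levels = levels

    def forward(self, x):
        # x has shape (batch, channels, height, width)
        batch = x.size(0)
        pooled = [
            # Each level pools to an n x n grid, then flattens to (batch, channels * n * n)
            F.adaptive_max_pool2d(x, output_size=n).reshape(batch, -1)
            for n in self.levels
        ]
        # Concatenate all levels into one fixed-length vector per image
        return torch.cat(pooled, dim=1)


spp = SpatialPyramidPooling()
small = spp(torch.randn(2, 512, 7, 7))
large = spp(torch.randn(2, 512, 10, 13))
print(small.shape, large.shape)
```

With 512 channels and levels (1, 2, 4), each image yields 512 x (1 + 4 + 16) = 10752 features regardless of the spatial size of the feature map, which is what lets a fully connected layer follow the SPP layer.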

Data

The data for the task of Dog-Cat image classification can be downloaded from: https://drive.google.com/drive/folders/1EdVqRCT1NSYT6Ge-SvAIu7R5i9Og2tiO?usp=sharing. The dataset is divided into three sets: Training data, Validation data, and Testing data. The different CNN architectures were analysed by comparing their Training Accuracy and Validation Accuracy values.
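As a rough sketch of how such an accuracy value can be computed from a batch of network outputs, assuming the network returns one logit per class: the helper name `accuracy_percent` and the toy tensors are hypothetical and are not taken from the notebooks.

```python
import torch


def accuracy_percent(logits, labels):
    """Percentage of samples whose highest-scoring class matches the label."""
    predictions = logits.argmax(dim=1)  # predicted class index per sample
    return 100.0 * (predictions == labels).float().mean().item()


# Toy batch: 3 samples, 2 classes (0 = cat, 1 = dog)
logits = torch.tensor([[2.0, 1.0], [0.0, 3.0], [1.0, 0.0]])
labels = torch.tensor([0, 1, 1])
print(accuracy_percent(logits, labels))  # 2 of 3 predictions are correct
```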

Results

The results after using different CNN architectures are given below:

  1. VGG-16 (without SPP Layer, pretrained on ImageNet)

Training Accuracy = 99.27% and Validation Accuracy = 96.73% (e = 100, lr = 0.005, m = 0.9, bs = 32, wd = 0.001)

  2. VGG-16 (with SPP Layer)

Training Accuracy = 99.61% and Validation Accuracy = 97.23% (e = 100, lr = 1e-3, m = 0.9, bs = 32, wd = 5e-4)

  3. VGG-19 (without SPP Layer, pretrained on ImageNet)

Training Accuracy = 99.13% and Validation Accuracy = 97.25% (e = 100, lr = 0.005, m = 0.9, bs = 32, wd = 5e-4)

  4. VGG-19 (with SPP Layer)

Training Accuracy = 99.51% and Validation Accuracy = 97.45% (e = 100, lr = 1e-3, m = 0.9, bs = 32, wd = 5e-4)

  5. ResNet-50 (without SPP Layer, pretrained on ImageNet)

Training Accuracy = 99.43% and Validation Accuracy = 98.43% (e = 100, lr = 0.005, m = 0.9, bs = 32, wd = 5e-4)
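The hyperparameter values reported above could map onto a PyTorch training step roughly as follows. This is a sketch under assumptions: the use of `torch.optim.SGD` is inferred from the momentum and weight decay settings rather than stated in this README, the tiny linear `model` is a stand-in for the real networks, and the random batch replaces the actual data loader. The values shown are the ones listed for VGG-16 with the SPP layer.

```python
import torch
import torch.nn as nn

# Values reported for VGG-16 with SPP Layer
hp = {"epochs": 100, "lr": 1e-3, "momentum": 0.9, "batch_size": 32, "weight_decay": 5e-4}

# Stand-in model (assumption): any network mapping (N, 3, 224, 224) to 2 logits
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(  # SGD is assumed, not confirmed by the source
    model.parameters(),
    lr=hp["lr"],
    momentum=hp["momentum"],
    weight_decay=hp["weight_decay"],
)

# One training step on a random batch; labels are 0 (cat) or 1 (dog)
images = torch.randn(hp["batch_size"], 3, 224, 224)
labels = torch.randint(0, 2, (hp["batch_size"],))

optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(loss.item())
```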

Software Required

To run the Jupyter notebooks, use Python 3. Standard libraries such as NumPy and PyTorch are used.

Credits

The following links were helpful for this project:

  1. https://github.com/yueruchen/sppnet-pytorch/
  2. https://discuss.pytorch.org/t/elegant-implementation-of-spatial-pyramid-pooling-layer/831

spp_net_image_classification's Issues

Hello there! I have a few questions about spp_net and look forward to your answers.

  1. If I take a pre-trained network and add the SPP layer before the final fully connected layer, can I still use the parameters of the pre-trained network for transfer learning?
  2. I saw that your spp_ResNet_50 network also applies ReSize(224,224). Isn’t it possible to use the SPP layer without resizing? SPP can handle images of various sizes, right?
  3. When adding SPP to a PyTorch pre-trained network, can I avoid re-constructing the network structure? Is it possible to just add an SPP layer to the pre-trained network?
