apple-detection's Introduction

Improving Apple Detection and Counting Using RetinaNet

This work investigates the apple detection problem by deploying the RetinaNet object detection framework in conjunction with the VGG architecture. Following hyper-parameter optimisation, the scaling of performance with the backbone network's depth is examined through four proposed deployments of the side-network. Analysis of the relationship between performance and training-set size establishes that 10 samples are enough to achieve adequate performance, while 200 samples are enough to achieve state-of-the-art performance. Moreover, a novel lightweight model is proposed that achieves an F1-score of 0.908 and an inference speed of nearly 70 FPS. These results outperform previous state-of-the-art models in both performance and detection rates. Finally, the model's limitations are discussed and insights for future work are provided.
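For readers unfamiliar with the metric, the F1-score quoted above is the harmonic mean of precision and recall. The following is a minimal sketch of that standard formula; the function name and the counts in the usage note are illustrative assumptions, not values or code from this project, and the rule used to match detections to ground-truth apples is not shown.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from detection counts.

    The counts are assumed to come from some matching of predicted boxes to
    ground-truth boxes (e.g. an IoU threshold); that step is not covered here.
    """
    detected = true_pos + false_pos   # everything the model predicted
    actual = true_pos + false_neg     # everything that is really there
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

For example, with hypothetical counts of 90 true positives, 10 false positives and 10 false negatives, precision and recall are both 0.9, so the F1-score is also 0.9.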

Dataset

The dataset used for this project is the ACFR fruit dataset (listed under Sources). It consists of images of three different fruits (apples, mangoes and almonds), but only the apple set was used. The original train/val/test split was preserved in order to allow comparisons with previous studies.

The dataset contains 1120 apple images of 308x202 pixels. The annotations are given in #item, x0, y0, x1, y1, class format (circular) and can be converted to square bounding boxes with the examples/convert_annotations.py script. More information is available in the readme.txt file in the dataset folder.
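As a rough illustration of what converting a circular annotation to a square one involves, here is a minimal sketch. It assumes each circle is described by a centre (cx, cy) and a radius r in pixels; that layout is an assumption, so check the dataset's readme.txt for the actual field meanings. The function name is hypothetical, and this is not the code in examples/convert_annotations.py.

```python
from typing import Tuple


def circle_to_box(cx: float, cy: float, r: float,
                  width: int = 308, height: int = 202
                  ) -> Tuple[float, float, float, float]:
    """Return (x0, y0, x1, y1) of the square enclosing a circle.

    Assumption: the circle is given as centre (cx, cy) and radius r.
    The box is clipped to the image bounds (308x202 by default, the
    sample size stated for the apple set).
    """
    x0 = max(0.0, float(cx) - float(r))
    y0 = max(0.0, float(cy) - float(r))
    x1 = min(float(width), float(cx) + float(r))
    y1 = min(float(height), float(cy) + float(r))
    return x0, y0, x1, y1
```

Clipping matters for apples near the image border, where the enclosing square would otherwise extend outside the image.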

Architectures

The repository contains four side-network architectures, each implemented on its own branch.

  • master : The original side-network architecture.

  • retinanet_p3p4p5 : The original side-network architecture without the strided convolutional filters right after the VGG network.

  • retinanet_ci_multiclassifiers : The retinanet_p3p4p5 implementation with separate classification and regression heads for the predictions.

  • retinanet_ci : A lightweight implementation where common classification and regression heads make predictions right after the Ci reduced blocks, without the upsampling-merging technique.
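To give a feel for how removing pyramid levels shrinks the model's output, the sketch below counts anchors per image for a given set of levels. It rests on several assumptions that this README does not confirm: that the strided convolutions dropped in retinanet_p3p4p5 are the ones producing levels P6 and P7 (inferred from the branch name), that level l has stride 2**l with feature-map sizes rounded up, and that each location carries 9 anchors. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def pyramid_shapes(height: int, width: int,
                   levels: Sequence[int]) -> List[Tuple[int, int]]:
    """Feature-map (rows, cols) per level, assuming stride 2**level and ceil rounding."""
    return [(math.ceil(height / 2 ** lvl), math.ceil(width / 2 ** lvl))
            for lvl in levels]


def count_anchors(height: int, width: int, levels: Sequence[int],
                  anchors_per_location: int = 9) -> int:
    """Total anchors over all levels (9 per location is an assumed default)."""
    return sum(rows * cols * anchors_per_location
               for rows, cols in pyramid_shapes(height, width, levels))


# Compare a five-level pyramid with a three-level one on a 202x308 input.
full = count_anchors(202, 308, levels=[3, 4, 5, 6, 7])
reduced = count_anchors(202, 308, levels=[3, 4, 5])
```

Under these assumptions the coarse levels contribute only a small share of the anchors, because their feature maps are tiny for images of this size. Note that the input may be resized before it reaches the network, in which case the counts change accordingly.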

Installation

Clone the repo and follow the instructions in: fizyr/keras-retinanet

Sources

  1. fizyr/keras-retinanet
  2. Martin Zlocha
  3. ACFR FRUIT DATASET

Contributors

nikostsagk
