debu97 / object-localization-using-bag-of-visual-words

This project was done as part of a Machine Learning course. It builds a bag of visual words from unlabeled, real-time data in an unsupervised way. The data is clustered into 3 classes (objects), and the most significant object is chosen. That object is then located in the input image and plotted.

Python 100.00%

Object Localization using Bag of Visual Words

This project was done as part of the Machine Learning course at the Indian Institute of Space Science and Technology.

It aims to detect and localize significant objects in real-time moving frames (e.g. a live capture from a laptop's front camera).

Prerequisites

The following libraries are used in this project: Matplotlib, Glob2, Imageio, NumPy, SciPy, and Math (the Python standard library module).

Step 1: Corner Detection

The code uses OpenCV's Harris corner detector to locate key points in the input image (a frame captured from the laptop's front camera). The number of key points (corners) can be varied. The corners are plotted on the original image.
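To make the corner step concrete, here is a minimal sketch of the Harris response. The project itself uses OpenCV's detector; this is a NumPy-only stand-in, so it is not the repository's code. The names `window_sum` and `harris_corners`, the window radius, the constant `k=0.04`, and the greedy `min_distance` suppression are all assumptions made for illustration.

```python
import numpy as np

def window_sum(values, radius):
    """Sum each pixel's (2*radius+1)^2 neighbourhood, zero-padded at the border."""
    padded = np.pad(values, radius)
    height, width = values.shape
    total = np.zeros((height, width), dtype=float)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            total += padded[dy:dy + height, dx:dx + width]
    return total

def harris_corners(gray, max_corners=50, k=0.04, radius=1, min_distance=3):
    """Return up to max_corners (row, col) points with the strongest Harris response."""
    gray = np.asarray(gray, dtype=float)
    grad_y, grad_x = np.gradient(gray)

    # Entries of the structure tensor, summed over a local window.
    sxx = window_sum(grad_x * grad_x, radius)
    syy = window_sum(grad_y * grad_y, radius)
    sxy = window_sum(grad_x * grad_y, radius)

    # Harris response: det(M) - k * trace(M)^2.
    response = sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2

    corners = []
    for flat_index in np.argsort(response, axis=None)[::-1]:
        row, col = np.unravel_index(flat_index, response.shape)
        if response[row, col] <= 0:
            break  # everything left is a flat region or an edge
        # Greedy suppression: skip points too close to an accepted corner.
        if all(max(abs(row - r), abs(col - c)) >= min_distance for r, c in corners):
            corners.append((int(row), int(col)))
        if len(corners) == max_corners:
            break
    return corners
```

Here `max_corners` plays the role of the adjustable number of key points mentioned above.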

Step 2: Feature Extraction

The next step constructs a square patch around each corner point and passes it to a HOG feature descriptor. Gradient magnitudes and directions are computed for each image patch. The magnitudes are accumulated into bins by gradient direction, which gives a histogram of features for the patch. Running this for all image patches, followed by normalization, across all RGB channels gives the feature space.
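The histogram step can be sketched as follows. This is a simplified, single-cell version of HOG written with NumPy only, not the project's descriptor. The patch size (`half_size`), the 9 unsigned orientation bins, per-channel L2 normalization, and edge padding for corners near the border are assumptions chosen for illustration.

```python
import numpy as np

def patch_histogram(patch, n_bins=9):
    """Orientation histogram of one 2-D patch, weighted by gradient magnitude."""
    patch = np.asarray(patch, dtype=float)
    grad_y, grad_x = np.gradient(patch)
    magnitude = np.hypot(grad_x, grad_y)
    # Unsigned gradient direction in degrees, folded into [0, 180).
    angle = np.degrees(np.arctan2(grad_y, grad_x)) % 180.0
    hist, _ = np.histogram(angle, bins=n_bins, range=(0.0, 180.0), weights=magnitude)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

def extract_features(image, corners, half_size=4, n_bins=9):
    """Build one feature vector per corner from an H x W x 3 image."""
    image = np.asarray(image, dtype=float)
    # Pad so that corners near the border still get a full square patch.
    pad = ((half_size, half_size), (half_size, half_size), (0, 0))
    padded = np.pad(image, pad, mode="edge")
    size = 2 * half_size + 1
    features = []
    for row, col in corners:
        patch = padded[row:row + size, col:col + size]
        per_channel = [patch_histogram(patch[:, :, ch], n_bins)
                       for ch in range(patch.shape[2])]
        features.append(np.concatenate(per_channel))
    return np.array(features)
```

With these defaults each corner yields a 27-dimensional vector (9 bins for each of the 3 channels), and the rows of the returned array stay in the same order as `corners`.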

Step 3: Clustering

K-means clustering is used to group similar feature points. Each feature point is assigned to the class whose centroid is nearest to it. K, the total number of clusters (objects), can be varied.
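A minimal sketch of the nearest-centroid loop, written with NumPy only. The function name `kmeans`, the Euclidean distance, the iteration cap, and the deterministic farthest-point initialization are assumptions; the source does not say how the project initializes its centroids.

```python
import numpy as np

def kmeans(features, k=3, n_iter=100):
    """Cluster feature vectors; returns (labels, centroids)."""
    features = np.asarray(features, dtype=float)

    # Farthest-point initialization (an assumption): start from the first
    # point, then repeatedly add the point farthest from the chosen centroids.
    centroids = [features[0]]
    for _ in range(1, k):
        nearest = np.min([np.linalg.norm(features - c, axis=1) for c in centroids], axis=0)
        centroids.append(features[nearest.argmax()])
    centroids = np.array(centroids)

    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        distances = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its points (keep it if empty).
        updated = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return labels, centroids
```

Because `labels[i]` belongs to row `i` of `features`, the cluster assignments can later be mapped back to the corner points in the same order.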

Before Clustering Results

After Clustering Results

Step 4: Labelling

Each feature vector is matched back to its corner point, so the clusters in feature space become clusters of corner points. A probability is computed for each cluster, and the cluster with the highest probability is taken as the most significant object.
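The labelling step might look like the sketch below. The source does not say how the cluster probability is computed; this example assumes it is the fraction of corner points that fall into each cluster. The function name `most_significant_cluster` is hypothetical.

```python
import numpy as np

def most_significant_cluster(corners, labels):
    """Return (best_label, probabilities, corners_of_best_cluster).

    corners: sequence of (row, col) points; labels: cluster id per corner,
    in the same order.
    """
    corners = np.asarray(corners)
    labels = np.asarray(labels)
    ids, counts = np.unique(labels, return_counts=True)
    # Assumption: probability of a cluster = its share of all corner points.
    shares = counts / counts.sum()
    probabilities = dict(zip(ids.tolist(), shares.tolist()))
    best = int(ids[shares.argmax()])
    return best, probabilities, corners[labels == best]
```

If two clusters tie, `argmax` picks the one with the smaller label, which is an arbitrary choice for this sketch.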

Results

Step 5: Localization

The final step plots the most significant object, that is, the corner points of the most probable cluster, on the input image.
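As a rough illustration of this step, the sketch below draws the selected corner points straight into a copy of the image array. The project plots its results with Matplotlib; drawing with NumPy is a substitution made so the example has no plotting dependency. The function name `mark_points`, the marker size, and the red color are assumptions.

```python
import numpy as np

def mark_points(image, points, radius=2, color=(255, 0, 0)):
    """Return a copy of an H x W x 3 image with a square marker at each (row, col)."""
    marked = image.copy()
    height, width = marked.shape[:2]
    for row, col in points:
        # Clip the marker so points near the border do not wrap around.
        r0, r1 = max(row - radius, 0), min(row + radius + 1, height)
        c0, c1 = max(col - radius, 0), min(col + radius + 1, width)
        marked[r0:r1, c0:c1] = color
    return marked
```

The returned array can then be displayed or saved with whichever image library is in use, for example `matplotlib.pyplot.imshow`.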

Input Image Results

Final (localized) image Results
