The grab from mkairanbay

https://www.aiforsea.com/computer-vision

https://drive.google.com/open?id=1h3zx7i8IYNFY6BZp717Ljq8FwYtW9THN

train.py - contains the code for fine-tuning the Inception v3 CNN taken from https://github.com/fchollet/deep-learning-models.
- The Inception v3 CNN, pre-trained on the ImageNet dataset, was first fine-tuned on the VMMRdb dataset (http://vmmrdb.cecsresearch.org/), using 3040 classes for training. According to the authors of the paper (http://vmmrdb.cecsresearch.org/papers/VMMR_TSWC.pdf), they selected the vehicle classes that have more than 20 samples per class. Following their experimental setup, we selected 3040 vehicle classes and fine-tuned the Inception v3 CNN, changing the number of nodes in the last layer from 1000 to 3040. All layers of the CNN (earlier as well as later) were trained. Because GitHub limits uploaded files to 100 MB, we uploaded the trained weights to Google Drive: https://bit.ly/2KR9Mne
- The next step was fine-tuning this CNN on the Cars dataset (https://ai.stanford.edu/~jkrause/cars/car_dataset.html). In this case, the number of nodes in the last layer was changed from 3040 to 196, and only this last fully connected layer was fine-tuned. We assumed that the earlier layers of the network had already learned suitable low-level visual features in the previous step, so there was no need to train them again. Because of the same GitHub file size limit, the trained weights are available from Google Drive: https://bit.ly/2WI889G
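The two fine-tuning stages described above could be sketched roughly as follows. This is a minimal sketch and not the code in train.py: it assumes the `tensorflow.keras.applications.InceptionV3` model instead of the fchollet/deep-learning-models implementation with Theano that was actually used, and the names `STAGES` and `build_finetune_model`, the 299x299 input size, and the SGD optimizer are illustrative assumptions.

```python
# Hypothetical sketch of the two fine-tuning stages; not the repository's train.py.
# Each entry: (dataset, number of output classes, train all layers?)
STAGES = [
    ("VMMRdb", 3040, True),         # stage 1: every layer is trained
    ("Stanford Cars", 196, False),  # stage 2: only the new last layer is trained
]


def build_finetune_model(num_classes, train_all_layers, weights=None):
    """Return an Inception v3 network with a fresh `num_classes`-way softmax head."""
    # Imported lazily so this file can be read and imported without TensorFlow.
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import InceptionV3

    # Convolutional base without the original 1000-way ImageNet classifier.
    base = InceptionV3(include_top=False, weights=weights,
                       input_shape=(299, 299, 3), pooling="avg")

    # Freeze the base layers when only the last layer should be fine-tuned.
    for layer in base.layers:
        layer.trainable = train_all_layers

    # New fully connected output layer sized for the target dataset.
    outputs = layers.Dense(num_classes, activation="softmax")(base.output)
    model = models.Model(inputs=base.input, outputs=outputs)

    # Optimizer and loss are assumptions; the source does not state them.
    model.compile(optimizer="sgd", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Under these assumptions, stage 1 would call `build_finetune_model(3040, True, weights="imagenet")`, and stage 2 would build a 196-class model with `train_all_layers=False` after initialising the base from the stage 1 weights.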
test.py - contains the source code for testing the trained CNN model. It loads the trained model and reads the test samples with ImageDataGenerator. There are 8041 test samples and we chose a batch size of 40, which gives 201 feed-forward iterations through the model (201 × 40 = 8040, and 8040 mod 40 = 0), covering all but one of the samples. For each test sample we predicted the top-5, top-3 and top-1 most probable vehicle classes (each vehicle class consists of the vehicle make, model and year). The top-5, top-3 and top-1 accuracies can be found in the accuracy.txt file.
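The batch arithmetic and the top-k accuracy measure can be illustrated with a small sketch in plain Python. It is not taken from test.py; the function names and the toy scores are assumptions made for illustration only.

```python
def num_full_batches(num_samples, batch_size):
    """Return the number of complete batches and how many samples they cover."""
    batches = num_samples // batch_size
    return batches, batches * batch_size


def top_k_accuracy(scores, true_labels, k):
    """Fraction of samples whose true class index is among the k highest scores."""
    hits = 0
    for row, truth in zip(scores, true_labels):
        # Class indices ordered from the highest score to the lowest.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


if __name__ == "__main__":
    # 8041 test samples with a batch size of 40.
    print(num_full_batches(8041, 40))

    # Toy example with three samples and three classes (made-up scores).
    toy_scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    toy_labels = [1, 1, 0]
    print(top_k_accuracy(toy_scores, toy_labels, 1))
    print(top_k_accuracy(toy_scores, toy_labels, 2))
```

A sample counts as correct for top-k when its ground-truth class appears anywhere among the k highest-scoring classes, so top-5 accuracy is always at least as high as top-1 accuracy.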
The inception_v3_cars_main.py source file is also used for testing; however, unlike test.py, it reads only one image file and produces the top-5 most probable predictions with confidence scores.
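Picking the top-5 classes from a single prediction vector can be sketched as below. This is an illustrative sketch, not the code in inception_v3_cars_main.py; the function name `top_k_predictions`, the placeholder class names, and the scores are assumptions.

```python
def top_k_predictions(scores, class_names, k=5):
    """Return the k (class_name, score) pairs with the highest scores, best first."""
    ranked = sorted(zip(class_names, scores),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Placeholder class names and made-up confidence scores for one image.
    names = ["class A", "class B", "class C", "class D", "class E", "class F"]
    scores = [0.05, 0.40, 0.10, 0.30, 0.02, 0.13]
    for name, score in top_k_predictions(scores, names, k=5):
        print(f"{name}: {score:.2f}")
```

In practice `scores` would be the softmax output of the model for one image and `class_names` would be the class names read from cars.txt.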
The test sample with its top-5 prediction results is illustrated in the following figure:
accuracy.txt - contains the top-5, top-3 and top-1 accuracies:
top-5 accuracy: 0.9587064676616915
top-3 accuracy: 0.9335820895522388
top-1 accuracy: 0.8105721393034826
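As a quick sanity check, the reported fractions can be converted back into counts of correctly classified samples. The sketch below assumes that 8040 samples were evaluated (201 batches of 40); the names `REPORTED`, `NUM_EVALUATED`, and `correct_counts` are illustrative.

```python
# Accuracies as listed in accuracy.txt.
REPORTED = {
    "top-5": 0.9587064676616915,
    "top-3": 0.9335820895522388,
    "top-1": 0.8105721393034826,
}

# Assumption: 201 batches of 40 samples were evaluated.
NUM_EVALUATED = 8040


def correct_counts(accuracies, num_samples):
    """Convert accuracy fractions to the nearest whole number of correct samples."""
    return {name: round(acc * num_samples) for name, acc in accuracies.items()}


if __name__ == "__main__":
    for name, count in correct_counts(REPORTED, NUM_EVALUATED).items():
        print(f"{name}: {count} of {NUM_EVALUATED}")
```

Under this assumption the three values correspond to 7708, 7506 and 6517 correct predictions out of 8040, respectively.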
output.txt - contains the output of test.py. Each line of the file consists of the following comma-separated fields:
- index of the test case (starting from 0)
- input image name. For example: 00076.jpg
- ground truth image label. For example: AM General Hummer SUV 200
- top five prediction results. For example: [AM General Hummer SUV 200,Geo Metro Convertible 199,Lamborghini Reventon Coupe 200,BMW 6 Series Convertible 200,Mazda Tribute SUV 201]
- top one prediction. For example: [AM General Hummer SUV 200]
The first line of the output.txt file is shown below:
0,00076.jpg,AM General Hummer SUV 200,[AM General Hummer SUV 200,Geo Metro Convertible 199,Lamborghini Reventon Coupe 200,BMW 6 Series Convertible 200,Mazda Tribute SUV 201],[AM General Hummer SUV 200]
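Because the top-five list itself contains commas, splitting a line on every comma does not recover the fields. One possible way to read a line is a regular expression that treats the bracketed parts as units. This is a minimal sketch, assuming that class labels contain no commas or square brackets; `LINE_PATTERN` and `parse_output_line` are hypothetical names.

```python
import re

# One line of output.txt: index, image, ground truth, [top five], [top one].
LINE_PATTERN = re.compile(
    r"^(?P<index>\d+),"        # test case index
    r"(?P<image>[^,]+),"       # input image name
    r"(?P<truth>[^,\[]+),"     # ground-truth label
    r"\[(?P<top5>[^\]]*)\],"   # bracketed top-five predictions
    r"\[(?P<top1>[^\]]*)\]$"   # bracketed top-one prediction
)


def parse_output_line(line):
    """Split one output.txt line into its five fields."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised line: {line!r}")
    return {
        "index": int(match.group("index")),
        "image": match.group("image"),
        "truth": match.group("truth"),
        "top5": match.group("top5").split(","),
        "top1": match.group("top1"),
    }


if __name__ == "__main__":
    sample = ("0,00076.jpg,AM General Hummer SUV 200,"
              "[AM General Hummer SUV 200,Geo Metro Convertible 199,"
              "Lamborghini Reventon Coupe 200,BMW 6 Series Convertible 200,"
              "Mazda Tribute SUV 201],[AM General Hummer SUV 200]")
    record = parse_output_line(sample)
    print(record["truth"] == record["top1"], record["truth"] in record["top5"])
```

With the lines parsed this way, top-1 and top-5 accuracy can be recomputed by counting how often `truth` equals `top1` or appears in `top5`.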
output_top1_with_confidense_score.txt - contains the output of test.py. Each line of the file consists of the following comma-separated fields:
- index of the test case (starting from 0)
- input image name. For example: 00076.jpg
- ground truth image label. For example: AM General Hummer SUV 200
- top one prediction. For example: [AM General Hummer SUV 200]
- the confidence score of top one prediction. For example: 0.996206
The first line of the output_top1_with_confidense_score.txt file is shown below:
0,00076.jpg,AM General Hummer SUV 200,0.996206
cars.txt - contains the enumerated class names.

Using this trained model, we attempted to build a web service for smartphones that predicts a vehicle's make, model and year. The web service is available at http://lattaes.herokuapp.com/.

The user has to upload or take a photo of a vehicle and press the "Predict" button.

The photo is then uploaded to the server, which predicts the vehicle's make, model and year.

The result shows the top-5 predicted classes with confidence scores.

However, the prediction quality of the web service is poor. The CNN was trained with Theano, whereas the web service runs on TensorFlow. Loading the Theano model on Heroku takes a very long time (more than 120 seconds), which exceeds Heroku's limits, so we decided to use TensorFlow as the backend. However, this degraded the prediction performance. In the future, we need to solve this issue (train with TensorFlow or convert the weights from Theano to TensorFlow) and upload the proper weights.
