Giter VIP home page Giter VIP logo

carnd-vehicle-detection's Introduction

Vehicle Detection Project


Vehicle Detection Project

The goals / steps of this project are the following:

  • Perform a Histogram of Oriented Gradients (HOG) feature extraction on a labeled training set of images and train a classifier Linear SVM classifier
  • a color transform and append binned color features to my HOG feature vector. features normalization and randomization of training and testing dataset are conducted
  • a sliding-window technique is adopted to propose candidate regions for car, and the regions are input into a trained classifier
  • a pipeline works on a video stream and a heat map is created for recurring detections frame by frame to reject outliers and follow detected vehicles.
  • Estimate a bounding box for vehicles detected.

Submissiones

  • CarND-Vehicle-Dection.ipynb is the project solution , all the code contains in it.
  • svc_pickle.p is the model and feature parameters
  • README.md it the writeup file of the project
  • test_result.mp4 is the performance on the test_video.mp4
  • result.mp4 is the performance on the project_video.mp4


Histogram of Oriented Gradients (HOG)

1. Explain how (and identify where in your code) you extracted HOG features from the training images.

I started by reading in all the vehicle and non-vehicle images. there are 8792 cars and 8968 non-cars for training and testing, Here is some examples of vehicle and non-vehicle classes:

alt text

Here is an example using the RGB color space and HOG parameters of orientations=8, pixels_per_cell=(8, 8) and cells_per_block=(2, 2):

alt text

2. Explain how you settled on your final choice of HOG parameters.

The 6th cell of CarND-Vehicle-Dection.ipynb show how I settel on the final choice of HOG parameters, I get a smaller data set of car and notcar, small_car number is 1125 and small_notcar is 1125, to investigate the best HOG parameters.

Only to consider the HOG features, experiments are conducted and the result is shown as below:

NO. Colorspace Orient Pixels perCell Cells per Block HOG Channel Extract Time Classifier Accuracy Train Time
1 RGB 9 8 2 ALL 8.44 Linear SVC 0.9511 5.0503
2 RGB 12 8 2 ALL 8.77 Linear SVC 0.9489 0.8531
3 RGB 18 8 2 ALL 9.69 Linear SVC 0.9511 0.9631
4 RGB 36 8 2 ALL 11.49 Linear SVC 0.9578 1.4608
5 HSV 9 8 2 ALL 8.74 Linear SVC 0.9733 0.3411
6 HSV 12 8 2 ALL 8.98 Linear SVC 0.9822 0.3944
7 HSV 18 8 2 ALL 9.72 Linear SVC 0.9778 0.6515
8 HSV 36 8 2 ALL 11.87 Linear SVC 0.9822 1.1518
9 LUV 9 8 2 ALL 9.15 Linear SVC 0.9578 5.7852
10 LUV 12 8 2 ALL 9.46 Linear SVC 0.9756 0.4846
11 LUV 18 8 2 ALL 10.01 Linear SVC 0.9711 0.6345
12 LUV 36 8 2 ALL 12.07 Linear SVC 0.9867 1.1297
13 HLS 9 8 2 ALL 8.66 Linear SVC 0.9778 0.3291
14 HLS 12 8 2 ALL 9.02 Linear SVC 0.98 0.4273
15 HLS 18 8 2 ALL 9.8 Linear SVC 0.9756 0.602
16 HLS 36 8 2 ALL 11.91 Linear SVC 0.9778 1.1222
17 YUV 9 8 2 ALL 8.59 Linear SVC 0.9733 0.3299
18 YUV 12 8 2 ALL 8.93 Linear SVC 0.9867 0.4075
19 YUV 18 8 2 ALL 9.64 Linear SVC 0.9778 0.5656
20 YUV 36 8 2 ALL 11.77 Linear SVC 0.9867 1.1095
21 YCrCb 9 8 2 ALL 8.64 Linear SVC 0.9756 0.3924
22 YCrCb 12 8 2 ALL 8.93 Linear SVC 0.9667 0.3848
23 YCrCb 18 8 2 ALL 9.66 Linear SVC 0.9756 0.5941
24 YCrCb 36 8 2 ALL 11.78 Linear SVC 0.98 1.2059

we can find that RGB color space is not good features for hog to detect vehicles, Time cost increases with the orientation numbers, Orientation num of 12 performorms good at aspect of accuracy and time costs.

but we still cannot find which Color space is the best. So I Run the HOG+SVM struction for the whole 8792 cars and 8968 non-cars data set

NO. Colorspace Orient Pixels perCell Cells per Block HOG Channel Extract Time Classifier Accuracy Train Time
1 RGB 12 8 2 ALL 76.16 Linear SVC 0.9727 30.9131
2 HSV 12 8 2 ALL 78.38 Linear SVC 0.9828 4.8821
3 LUV 12 8 2 ALL 90.82 Linear SVC 0.9735 31.413
4 HLS 12 8 2 ALL 77.9 Linear SVC 0.9831 28.569
5 YUV 12 8 2 ALL 79.44 Linear SVC 0.9862 5.4692
6 YCrCb 12 8 2 ALL 73.53 Linear SVC 0.98 19.823

and we find that HOG features perform the best in YUV Color space on all vehicles and not vehicels data

so next we want to found the best Parameters and we run the HOG+SVM classifier on the smaller data set to get the intuition.

NO. Colorspace Orient Pixels perCell Cells per Block HOG Channel Extract Time Classifier Accuracy Train Time
1 YUV 12 8 2 ALL 9.18 Linear SVC 0.98 0.4347
2 YUV 12 16 2 ALL 5.96 Linear SVC 0.9644 0.8132
3 YUV 12 32 2 ALL 5.36 Linear SVC 0.9311 0.0665

We can find from tables above that 8 pixels per cell is the best perfmance

NO. Colorspace Orient Pixels perCell Cells per Block HOG Channel Extract Time Classifier Accuracy Train Time
1 YUV 12 8 2 ALL 8.69 Linear SVC 0.9778 1.8986
2 YUV 12 8 4 ALL 8.08 Linear SVC 0.9667 1.0709

We can find from tables above that 2 cells per block will be chosen

NO. spatial spatial size hist hist_bins Extract Time Classifier Accuracy Train Time
1 True (32, 32) True 16 9.49 Linear SVC 0.9822 0.6012
2 True (32, 32) True 32 9.33 Linear SVC 0.9933 0.5326
3 True (32, 32) True 64 9.34 Linear SVC 0.9844 0.5998
4 True (32, 32) True 128 9.33 Linear SVC 0.9933 0.5325
5 True (64, 64) True 16 9.33 Linear SVC 0.9844 1.4657
6 True (64, 64) True 32 9.37 Linear SVC 0.98 1.3106
7 True (64, 64) True 64 9.33 Linear SVC 0.9778 1.723
8 True (64, 64) True 128 9.33 Linear SVC 0.98 1.2972
9 True (128, 128) True 16 9.52 Linear SVC 0.9844 14.5779
10 True (128, 128) True 32 10.37 Linear SVC 0.9667 21.6237
11 True (128, 128) True 64 14.92 Linear SVC 0.9733 21.0462
12 True (128, 128) True 128 10.35 Linear SVC 0.9756 10.9654

as the table shows below,the best spaitial size and hist_bins is (32, 32) and (32) repectly.

So the final parameter is that

color_space = 'YUV' # Can be RGB, HSV, LUV, HLS, YUV, YCrCb
orient = 12  # HOG orientations
pix_per_cell = 8 # HOG pixels per cell
cell_per_block = 2 # HOG cells per block
hog_channel = "ALL" # Can be 0, 1, 2, or "ALL"
spatial_size = (32, 32) # Spatial binning dimensions
hist_bins = 32    # Number of histogram bins
spatial_feat = True # Spatial features on or off
hist_feat = False # Histogram features on or off
hog_feat = True # HOG features on or off

3. Describe how (and identify where in your code) you trained a classifier using your selected HOG features (and color features if you used them).

Since HOG + SVM Frame has achived a hug success in pedestrain detection, so in "Train Classifier" section of CarND-Vehicle-Dection.ipynb, I trained a linear SVM with the default classifier parameters and using HOG features with color spatial hist features and achives the accuracy of 99.01%

Sliding Window Search

1. Describe how (and identify where in your code) you implemented a sliding window search. How did you decide what scales to search and how much to overlap windows?

In " Sliding Window Search" section, I write a method find_cars based on the udacity lesson fuction. find_cars is a Hog Sub-sampling Window Search method, it's a combination of HOG feature extraction and a sliding window search to make the process more effcient and time-saving.

the HOG features are extracted for Region of Interest only once, and then features are subsampled according to the size of the window and then fed to the classifier.

Since the car looks smaller when far away and look larger when it's nearby, we should different scale in the different region. The region is show below:

y_start_stop scale
(400, 500) 1.0, 1.3
(410, 500) 1.4
(420, 556) 1.6
(430, 556) 1.8, 2.0
(440, 556) 1.9
(400, 556) 1.3, 2.2
(500, 656) 3.0

The image below shows the 1st row of the table's boxes region:

alt text

The image below shows the 6st row of the table's boxes region:

alt text

2. Show some examples of test images to demonstrate how your pipeline is working. What did you do to optimize the performance of your classifier?

Ultimately I searched on two scales using YUV 3-channel HOG features plus spatially binned color in the feature vector, which provided a nice result. The image below shows the rectangles returned by find_cars drawn onto test image in the final implementation. Here are some example images:

alt text


Video Implementation

1. Provide a link to your final video output. Your pipeline should perform reasonably well on the entire project video (somewhat wobbly or unstable bounding boxes are ok as long as you are identifying the vehicles most of the time with minimal false positives.)

Here's a link to my video result

2. Describe how (and identify where in your code) you implemented some kind of filter for false positives and some method for combining overlapping bounding boxes.

I recorded the positions of positive detections in each frame of the video. From the positive detections I created a heatmap and then thresholded that map to identify vehicle positions. I then used scipy.ndimage.measurements.label() to identify individual blobs in the heatmap. I then assumed each blob corresponded to a vehicle. I constructed bounding boxes to cover the area of each blob detected.

Here's an example result showing the heatmap from a series of frames of video, the result of scipy.ndimage.measurements.label() and the bounding boxes then overlaid on the last frame of video:

Here are six frames and their corresponding heatmaps:

alt text alt text alt text alt text alt text alt text

Here is the output of scipy.ndimage.measurements.label() on the integrated heatmap from all six frames and the resulting bounding boxes:

alt text alt text alt text alt text alt text alt text


Discussion

1. Briefly discuss any problems / issues you faced in your implementation of this project. Where will your pipeline likely fail? What could you do to make it more robust?

It takes lots of time to tune the parameter of HOG features, Finnaly I set a small data sets to find the intuition. I think it's is a good choice to do experiements on smaller data sets.

Particle filter or kalman filter can be added to the future work, Since there are correlations between frame and frame, this will make the work more robust.

SSD or Faster RNN can be a good direction to this project.

carnd-vehicle-detection's People

Contributors

ryan-keenan avatar siatwangmin avatar mvirgo avatar baumanab avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.