Giter VIP home page Giter VIP logo

car_behavioral_cloning's Introduction

Car_behavioral_cloning

CNN netwrok for end-to-end driving on Udacity simulator

Lake Track
Lake Track

Project Description

In this project, I use a neural network to clone car driving behavior. It is a supervised regression problem between the car steering angles and the road images in front of a car.

Those images were taken from three different camera angles (from the center, the left and the right of the car).

The network is based on The NVIDIA model, which has been proven to work in this problem domain.

As image processing is involved, the model is using convolutional layers for automated feature engineering.

Files included

  • model.py The script used to create and train the model.
  • drive.py The script to drive the car.
  • utils.py The script to provide useful functionalities (i.e. image preprocessing and augumentation)
  • model.h5 The model weights.

Run the pretrained model

Start up the Udacity self-driving simulator, choose a scene and press the Autonomous Mode button. Then, run the model as follows:

python drive.py model.h5

To train the model

You'll need the data folder which contains the training images.

python model.py 

This will generate a file model.h5 after 5 epochs

Model Architecture Design

The design of the network is based on the NVIDIA model, which has been used by NVIDIA for the end-to-end self driving test. As such, it is well suited for the project.

It is a deep convolution network which works well with supervised image classification / regression problems. As the NVIDIA model is well documented, I was able to focus how to adjust the training images to produce the best result with some adjustments to the model to avoid overfitting and adding non-linearity to improve the prediction.

I've added the following adjustments to the model.

  • I used Lambda layer to normalized input images to avoid saturation and make gradients work better.
  • I've added an additional dropout layer to avoid overfitting after the convolution layers.
  • I've also included ELU for activation function for every layer except for the output layer to introduce non-linearity.

In the end, the model looks like as follows:

  • Image normalization
  • Convolution: 5x5, filter: 24, strides: 2x2, activation: ELU
  • Convolution: 5x5, filter: 36, strides: 2x2, activation: ELU
  • Convolution: 5x5, filter: 48, strides: 2x2, activation: ELU
  • Convolution: 3x3, filter: 64, strides: 1x1, activation: ELU
  • Convolution: 3x3, filter: 64, strides: 1x1, activation: ELU
  • Fully connected: neurons: 100, activation: ELU
  • Drop out (0.5)
  • Fully connected: neurons: 50, activation: ELU
  • Fully connected: neurons: 10, activation: ELU
  • Fully connected: neurons: 1 (output)

As per the NVIDIA model, the convolution layers are meant to handle feature engineering and the fully connected layer for predicting the steering angle. However, as stated in the NVIDIA document, it is not clear where to draw such a clear distinction. Overall, the model is very functional to clone the given steering behavior.

The below is a model structure output from the Keras which gives more details on the shapes and the number of parameters.

Layer (type) Output Shape Params Connected to
lambda_1 (Lambda) (None, 66, 200, 3) 0 lambda_input_1
convolution2d_1 (Convolution2D) (None, 31, 98, 24) 1824 lambda_1
convolution2d_2 (Convolution2D) (None, 14, 47, 36) 21636 convolution2d_1
convolution2d_3 (Convolution2D) (None, 5, 22, 48) 43248 convolution2d_2
convolution2d_4 (Convolution2D) (None, 3, 20, 64) 27712 convolution2d_3
convolution2d_5 (Convolution2D) (None, 1, 18, 64) 36928 convolution2d_4
flatten_1 (Flatten) (None, 1152) 0 convolution2d_5
dense_1 (Dense) (None, 100) 115300 flatten_1
dropout_1 (Dropout) (None, 1, 18, 64) 0 dense_
dense_2 (Dense) (None, 50) 5050 droupout_1
dense_3 (Dense) (None, 10) 510 dense_2
dense_4 (Dense) (None, 1) 11 dense_3
Total params 252219

Data Preprocessing

Image Sizing

  • the images are cropped so that the model won’t be trained with the sky and the car front parts
  • the images are resized to 66x200 (3 YUV channels) as per NVIDIA model
  • the images are normalized (image data divided by 127.5 and subtracted 1.0). As stated in the Model Architecture section, this is to avoid saturation and make gradients work better)

Model Training

Image Augumentation

For training, I used the following augumentation technique along with Python generator to generate unlimited number of images:

  • Randomly choose right, left or center images.
  • For left image, steering angle is adjusted by +0.15
  • For right image, steering angle is adjusted by -0.15
  • Randomly flip image left/right
  • Randomly translate image horizontally with steering angle adjustment (0.002 per pixel shift)
  • Randomly translate image vertically

Using the left/right images is useful to train the recovery driving scenario. The horizontal translation is useful for difficult curve handling (i.e. the one after the bridge).

Examples of Augmented Images

The following is the example transformations:

Center Image

Center Image

Left Image

Left Image

Right Image

Right Image

Flipped Image

Flipped Image

Translated Image

Translated Image

Pre-processed Image

Pre-processed Image

Training, Validation and Test

I splitted the images into train and validation set in order to measure the performance at every epoch. Testing was done using the simulator.

As for training,

  • I used mean squared error for the loss function to measure how close the model predicts to the given steering angle for each image.
  • I used Adam optimizer for optimization with learning rate of 1.0e-4 which is smaller than the default of 1.0e-3

The Lake Side Track

As there can be unlimited number of images augmented, I set the samples per epoch to 20,000. I tried from 1 to 50 epochs but I found 5 epochs is good enough to produce a well trained model for the lake side track. The batch size of 32 was chosen.

Outcome

The model can drive the course without bumping into the side ways.

Future work

  1. Train the model for Jungle track
  2. Make the model compatible for real life driving

References

car_behavioral_cloning's People

Contributors

khatkeashish avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.