Auto Parking using DQN with the Webots & AirSim simulators

This is a simple implementation of auto-parking using reinforcement learning methods with GPS and Lidar sensors.

We deployed the model in two different simulation environments to validate it.

Airsim Environment

Quick Start

Download the Neighborhood4 environment from the releases (modified from swri-robotics/Neighborhood).
Unzip the environment.
Run the Unreal Engine-generated .exe file in the environment package.
Clone this repository, cd into the Airsim directory, then run the evaluation with:
```
python eval.py
```

This environment can also be controlled manually; press F10 to see the control panel.

Training

Follow the Quick Start instructions above, then train the model with:

python dqn_car.py

Methods

We train the agent with the following reward function:

where the individual reward terms are calculated as follows:

Some of the reward terms are adapted from the MATLAB & Simulink example "Train PPO Agent for Automatic Parking Valet". We also added special rewards (penalties) for failure events: a collision, driving out of the bounded area, or exceeding the total time limit.
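As a rough illustration of how failure penalties can be combined with a shaped reward, here is a minimal sketch. It is not the reward used in this repository: the function name `sketch_reward`, every constant, and the exponential shaping terms are assumptions chosen only to show the structure (terminal penalties first, then a term that grows as the car gets closer to the spot and better aligned with it).

```python
import math

# Hypothetical placeholder values, not the ones used in this repository.
COLLISION_PENALTY = -100.0
OUT_OF_BOUNDS_PENALTY = -100.0
TIMEOUT_PENALTY = -50.0


def sketch_reward(distance, heading_error, collided, out_of_bounds, timed_out):
    """Return (reward, done) for one step.

    distance: distance from the car to the target spot (assumed in meters)
    heading_error: orientation error relative to the spot (assumed in radians)
    """
    # Failure events end the episode and override the shaped reward.
    if collided:
        return COLLISION_PENALTY, True
    if out_of_bounds:
        return OUT_OF_BOUNDS_PENALTY, True
    if timed_out:
        return TIMEOUT_PENALTY, True

    # Shaped terms: largest when the car is on the spot and aligned with it.
    # The coefficients 0.05, 0.5 and 4.0 are illustrative assumptions.
    position_term = math.exp(-0.05 * distance ** 2)
    heading_term = 0.5 * math.exp(-4.0 * heading_error ** 2)
    return position_term + heading_term, False
```

Checking the failure flags before computing the shaped terms keeps the penalties from being diluted by a large positional reward, for example when the car collides right next to the target spot.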

The reward function can be visualized as follows:

We defined 11 discrete actions for the car, illustrated in the figure below:

There are 10 driving directions, each of which is one action, plus a stop action that applies the brake.
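A discrete action space like this is usually implemented as a lookup table from the action index chosen by the DQN to low-level car controls. The sketch below assumes one possible layout (five steering angles, each driven forward or in reverse, plus a stop action); the actual directions used in this repository are the ones in the figure, and the names `STEERING_ANGLES`, `build_action_table`, and `decode_action` are hypothetical.

```python
# Assumed layout: 5 steering angles x {forward, reverse} = 10 directions, plus stop.
# The steering values are normalized placeholders, not taken from this repository.
STEERING_ANGLES = [-1.0, -0.5, 0.0, 0.5, 1.0]


def build_action_table():
    """Build the list of 11 control dictionaries indexed by action id."""
    table = []
    for throttle in (1.0, -1.0):  # forward first, then reverse
        for steering in STEERING_ANGLES:
            table.append({"throttle": throttle, "steering": steering, "brake": 0.0})
    # Final action: stop the car by applying the brake.
    table.append({"throttle": 0.0, "steering": 0.0, "brake": 1.0})
    return table


ACTIONS = build_action_table()


def decode_action(index):
    """Map a discrete action index (0..10) to car controls."""
    return ACTIONS[index]
```

With this layout, indices 0 to 4 drive forward, 5 to 9 drive in reverse, and index 10 is the stop action.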

Results

We obtained a stable auto-parking model based on the relative distance and orientation after about 4000 training episodes, countless failures, and 15 days of debugging and hyperparameter tuning.

Extend

It should be possible to train a more robust model using a CNN and a rear-facing camera on the car. Because the environment is quite simple (so a CNN cannot extract much useful information), we did not try this, but it should be easy to do by setting the DQN policy to "CnnPolicy".

This code is modified from the reinforcement learning example in microsoft/AirSim (AirSim/PythonClient/reinforcement_learning).

Webots Environment

First, download Webots, then install the following Python dependencies:

python==3.8
gym==0.21.0
numpy==1.22.3
stable_baselines3==1.6.0
torch==1.12.0

configure python interpreter in

observation:

position
orientation
distance_sensor_value

Additional sensors, such as a camera or lidar, can easily be added to the car. The observations we chose reflect what is available on a real vehicle.
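For a DQN with a vector input, the three observation groups listed above are typically flattened into one fixed-length vector. The following is a minimal sketch of that step, not the code in car_controller.py: the function name `build_observation` and the shapes in the usage example (2D position, a single yaw angle, two distance sensors) are assumptions.

```python
def build_observation(position, orientation, distance_sensor_values):
    """Flatten the three observation groups into one list of floats.

    Each argument is assumed to be a sequence of numbers; the result keeps
    the order position, orientation, distance sensor values.
    """
    return [float(v) for v in (*position, *orientation, *distance_sensor_values)]


# Example with assumed shapes: (x, y), (yaw,), and two distance sensors.
example_observation = build_observation((1.0, 2.0), (0.5,), (0.3, 0.8))
```

Keeping the order fixed matters because the network learns a meaning for each input index; adding a new sensor changes the vector length and requires retraining.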

reward:

See the compute_reward function in .\Webots\controllers\rl_controller\rl_controller.py

environment and action code:

.\Webots\controllers\rl_controller\car_controller.py

train code:

.\Webots\controllers\rl_controller\rl_controller.py

algorithm:

DQN

training_logs:

.\Webots\controllers\rl_controller\训练记录 (the directory name means "training records")

result:

reference

using Webots in reinforcement learning

Real World Deployment

Under development.
