Python 94.91% Jupyter Notebook 5.09%

anarcho_av---multi-agent-rl-for-av-traffic-clearance's Issues

Reference papers

Hello,
Thank you for sharing your code. Could you upload your paper? I tried to download it but could not find it.

In env.py, the function get_follow_speed_by_id, there is a mistake

At the beginning of the function:
leader_data = traci.vehicle.getLeader(self.emer.ID)

The second line should be passed the agent vehicle's ID, not the emergency vehicle's ID.
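A minimal sketch of the intended fix, assuming the function is given the agent's ID explicitly. FakeVehicleAPI, get_leader_for_agent, and the vehicle IDs are hypothetical names used only so the example runs without SUMO; in the real code the object would be traci.vehicle.

```python
class FakeVehicleAPI:
    """Hypothetical stand-in for traci.vehicle, so the sketch runs without SUMO."""

    def __init__(self, leaders):
        # Maps a vehicle ID to its (leader_id, gap_in_metres) pair.
        self._leaders = leaders

    def getLeader(self, veh_id, dist):
        # Stub behaviour: None when the vehicle has no recorded leader.
        return self._leaders.get(veh_id)


def get_leader_for_agent(vehicle_api, agent_id, lookahead=100.0):
    """Query the leader of the agent vehicle, not of the emergency vehicle."""
    return vehicle_api.getLeader(agent_id, lookahead)


# Assumed scenario: the agent drives directly in front of nothing but behind the EV's target.
api = FakeVehicleAPI({"emer": ("car_7", 42.0), "agent": ("emer", 12.5)})
leader_of_agent = get_leader_for_agent(api, "agent")
leader_of_emer = get_leader_for_agent(api, "emer")
```

Passing self.emer.ID would return the emergency vehicle's leader for every agent, which is why the two lookups above differ.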

no_acc action allows acceleration

The current implementation of the no_acc (no acceleration) action allows acceleration. This happens because SUMO's default behaviour is to apply the maximum possible acceleration, and our implementation of no_acc simply sent nothing to SUMO.

Our edits should:

Make sure no_acc is applied by maintaining the current velocity in the singleAgent.applyAction function.
Make sure the feasibility check in env.get_feasible_actions checks whether maintaining the current velocity is feasible. My current belief is that no_acc will always be feasible, since:

Forward feasibility needs to check whether the current velocity is greater than vsafe (the safe velocity for the next step, retrieved by traci.getFollowSpeed()); if so, the no_acc action should be treated as infeasible.
It does not require |deceleration| > |max_deceleration|, so it is always backward feasible.

Integrate this with the are_we_ok function created by @AlaaHesham1996 and check whether last step's velocity was maintained for our agent.
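The two edits above could be sketched roughly as follows. This is not the repository's code: FakeVehicleAPI, is_no_acc_feasible, and apply_no_acc are hypothetical names, and the stub only mirrors the getSpeed and setSpeed calls of traci.vehicle so the example runs without SUMO.

```python
class FakeVehicleAPI:
    """Hypothetical stand-in for traci.vehicle (getSpeed / setSpeed only)."""

    def __init__(self, speeds):
        self.speeds = dict(speeds)
        self.commands = []  # records every setSpeed call for inspection

    def getSpeed(self, veh_id):
        return self.speeds[veh_id]

    def setSpeed(self, veh_id, speed):
        self.commands.append((veh_id, speed))
        self.speeds[veh_id] = speed


def is_no_acc_feasible(current_speed, v_safe):
    """Forward check: holding the current speed must not exceed the safe speed."""
    return current_speed <= v_safe


def apply_no_acc(vehicle_api, agent_id):
    """Explicitly command the current speed instead of sending nothing."""
    current_speed = vehicle_api.getSpeed(agent_id)
    vehicle_api.setSpeed(agent_id, current_speed)
    return current_speed


api = FakeVehicleAPI({"agent": 8.0})
held_speed = apply_no_acc(api, "agent")
```

With real TraCI, a speed set through setSpeed stays in force until it is changed again, so the action would need to be re-applied or released (setSpeed with -1) on later steps.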

Footnote: One thing we must maintain in our system to avoid breaking it is that there should always be at least one always-feasible action. This guarantees that the agent will always be able to choose an action per loop, as assumed. Currently, the only always-feasible action is decelerate.
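One way to enforce this guarantee is a fallback at action-selection time, sketched below. The action labels "acc", "no_acc", and "dec" and the function choose_action are assumptions for illustration; the repository may name and structure these differently.

```python
def choose_action(preferred, feasible_actions, fallback="dec"):
    """Return the preferred action if feasible, else the always-feasible fallback.

    `fallback` is assumed to be the decelerate action, which the footnote
    describes as the only action that is feasible in every state.
    """
    if preferred in feasible_actions:
        return preferred
    return fallback


# The agent prefers no_acc, but the feasibility check ruled it out.
picked_when_blocked = choose_action("no_acc", {"dec"})
# The agent prefers no_acc and it is feasible.
picked_when_allowed = choose_action("no_acc", {"no_acc", "dec"})
# Even an empty feasible set still yields an action.
picked_when_empty = choose_action("acc", set())
```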

Condition for checking reward on maximum velocity is faulty

The step reward calculation in the Single Agent (Version 1.3) has a flaw:

The reward, in general, is a linear function of the acceleration. However, if the EV (ambulance) was already at maximum velocity and maintained maximum velocity for the next step, we will see its acceleration as zero even though no agent affected its acceleration or slowed it down. Our logic was to give the agent the maximum reward in that case nonetheless (i.e. equivalent to maximum acceleration).

This was, however, translated into code somewhat incorrectly. The condition for the reward is:
if ( abs(amb_last_velocity-self.emer.max_speed) <= 1e-10 ): #i.e. amb_last_velocity==self.emer.max_speed
This only reflects the first part (the ambulance was at maximum velocity), without "and maintained maximum velocity for the next step". The corrected code should look something like:
if ( abs(amb_last_velocity-self.emer.max_speed) <= 1e-10 and abs(self.emer.spd-self.emer.max_speed) <= 1e-10 ):

I did not edit it so that we would get a chance to re-train the agent before updating the code, and see if that affects anything.
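A standalone sketch of the corrected condition, for comparison before re-training. The function names, the step_length parameter, and the normalisation of the reward by max_accel are assumptions; the issue only states that the reward is linear in the acceleration and that holding maximum velocity should earn the maximum reward.

```python
def held_max_speed(last_velocity, current_velocity, max_speed, tol=1e-10):
    """True only if the EV was at max speed AND is still at max speed."""
    was_at_max = abs(last_velocity - max_speed) <= tol
    still_at_max = abs(current_velocity - max_speed) <= tol
    return was_at_max and still_at_max


def step_reward(last_velocity, current_velocity, max_speed, max_accel, step_length=1.0):
    """Assumed reward: acceleration scaled so that max_accel maps to 1.0."""
    if held_max_speed(last_velocity, current_velocity, max_speed):
        # Zero acceleration here is not the agent's fault: give the maximum reward.
        return 1.0
    acceleration = (current_velocity - last_velocity) / step_length
    return acceleration / max_accel


reward_held = step_reward(20.0, 20.0, max_speed=20.0, max_accel=2.0)
reward_slowed = step_reward(20.0, 18.0, max_speed=20.0, max_accel=2.0)
```

The second call is the case the original condition gets wrong: the EV started at maximum velocity but was slowed down, so it should not receive the maximum reward.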

abdulhady-feteiha / anarcho_av---multi-agent-rl-for-av-traffic-clearance
