roahmlab / reachability-based_trajectory_safeguard

We use reachability analysis to ensure the safety of a decision agent acting on a dynamic system in real time. We compute the Forward Reachable Set (FRS) offline and use it online to adjust any potentially unsafe decision that would cause a collision with an obstacle.



Welcome to Reachability-based Trajectory Safeguard

What does this do?

This method uses a dynamics model and reachability computation to ensure the safety of a decision-making agent (a human or an RL agent). We take advantage of parameterized trajectories and "adjust" the parameter selected by the decision-making agent to a nearby parameter that is guaranteed to be safe. Here is a gentle introduction. [Image: Online Operation]
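The "adjust to a safe parameter close by" idea can be illustrated with a toy sketch. This is not the repository's implementation: the candidate grid, the `is_safe` mask, and all variable names are hypothetical stand-ins for whatever the online safety check against the precomputed reachable set produces.

```matlab
% Toy sketch (not the repository's implementation): replace the agent's chosen
% trajectory parameter with the closest candidate that is marked safe.
% "candidates" and "is_safe" are hypothetical stand-ins for the outcome of
% checking the precomputed Forward Reachable Set against obstacles.

k_agent    = [0.9; 0.4];                 % parameter proposed by the agent
candidates = [0.0 0.5 1.0 0.5 1.0;       % each column is one candidate parameter
              0.0 0.0 0.0 0.5 0.5];
is_safe    = logical([1 1 0 1 0]);       % hypothetical safety check result

safe_candidates = candidates(:, is_safe);
if isempty(safe_candidates)
    error('No safe parameter found.');
end

% Squared Euclidean distance from the agent's choice to every safe candidate.
diffs = safe_candidates - repmat(k_agent, 1, size(safe_candidates, 2));
dist2 = sum(diffs.^2, 1);
[~, idx] = min(dist2);
k_safe = safe_candidates(:, idx);        % adjusted, safe parameter

disp(k_safe');
```

In this example the agent's choice `[0.9; 0.4]` is closest to the unsafe candidate `[1.0; 0.5]`, so the sketch returns the nearest safe alternative, `[0.5; 0.5]`.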

Paper:

Please cite our paper as

Shao, Y. S., Chen, C., Kousik, S., & Vasudevan, R. (2020). Reachability-based Trajectory Safeguard (RTS): A Safe and Fast Reinforcement Learning Safety Layer for Continuous Control. arXiv preprint arXiv:2011.08421.

Abstract:

Reinforcement Learning (RL) algorithms have achieved remarkable performance in decision making and control tasks due to their ability to reason about long-term, cumulative reward using trial and error. However, during RL training, applying this trial-and-error approach to real-world robots operating in safety critical environment may lead to collisions. To address this challenge, this paper proposes a Reachability-based Trajectory Safeguard (RTS), which leverages trajectory parameterization and reachability analysis to ensure safety during training and testing. This method ensures an agent with continuous action space can be trained from scratch safely in real-time. By ensuring safety with RTS, this paper demonstrates that the proposed algorithm is not only safe, but can achieve a higher reward in a considerably shorter training time when compared to RTD, RTS with a discrete action space, and a baseline RL algorithm.

Questions and Bugs

Please contact Yifei Shao (syifei) with questions about the Car or Drone examples, and Chao Chen (joecc) with questions about the Cartpole example. All email addresses end with @umich.edu.

Dependencies

Step 1: Install MATLAB 2020a. Its RL toolbox is a bit inflexible, so modify MATLABInstallPath/toolbox/rl/rl/+rl/+env/MATLABEnvironment.m so that the IsDone flag does a little more than it does now: change Line 243 from 'if isdone' to 'if abs(isdone - 1) < 0.1 || abs(isdone - 3) < 0.1 || abs(isdone - 4) < 0.1 || abs(isdone - 5) < 0.1'. Then restart MATLAB.
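To see what the modified condition changes, the sketch below evaluates it for a handful of numeric isdone values. The anonymous function `is_terminal` and the test values are illustrative only; the meaning of each code is not documented here, so none is assumed.

```matlab
% Sketch of the modified termination test from Step 1, wrapped in an
% illustrative anonymous function (the toolbox file uses it inline in an "if").
is_terminal = @(isdone) abs(isdone - 1) < 0.1 || abs(isdone - 3) < 0.1 || ...
                        abs(isdone - 4) < 0.1 || abs(isdone - 5) < 0.1;

codes = 0:5;                                   % illustrative isdone values
ends_episode = arrayfun(is_terminal, codes);   % logical result per code

for i = 1:numel(codes)
    fprintf('isdone = %d -> episode ends: %d\n', codes(i), ends_episode(i));
end
```

The original test, `if isdone`, is true for any nonzero value, so an isdone value of 2 would also end the episode. With the modified condition only values near 1, 3, 4, and 5 do.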

Step 2: Clone all required repositories (listed below) and check out the correct branch for each.

Step 3: Add all of them to the MATLAB path. Remove the rl folder from the path if you are running the car or drone example. You should be good to go!
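Step 3 could be scripted roughly as follows. This is a sketch under assumptions: the folder layout is hypothetical, a temporary directory stands in for the folder that holds the cloned repositories, and the location of the rl folder is assumed rather than taken from the repository.

```matlab
% Sketch of Step 3 with hypothetical folder names. Replace workspace_dir with
% the folder that actually contains the cloned repositories.
workspace_dir = tempname;                        % placeholder location
rl_dir        = fullfile(workspace_dir, 'rl');   % assumed location of the rl folder
sim_dir       = fullfile(workspace_dir, 'simulator');
mkdir(workspace_dir);
mkdir(rl_dir);
mkdir(sim_dir);

% Add the workspace and all of its subfolders to the MATLAB path.
addpath(genpath(workspace_dir));

% For the car or drone example, take the rl folder off the path again.
running_car_or_drone = true;
if running_car_or_drone
    rmpath(genpath(rl_dir));
end

% Check the result by inspecting the current path entries.
on_path     = strsplit(path, pathsep);
rl_on_path  = any(strcmp(on_path, rl_dir));
sim_on_path = any(strcmp(on_path, sim_dir));
fprintf('rl on path: %d, simulator on path: %d\n', rl_on_path, sim_on_path);
```

Note that `addpath` and `rmpath` only affect the current session unless the path is saved.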

Sanity Check: Run run_highway_testing and use the arrow keys on the figure to drive the car around. The safety layer should edit your inputs so that the car never crashes. To run run_highway_eval, please make sure to disable the manual command in run_highway_testing.

All required:

RTD

RTD_tutorial

simulator

CORA (check out commit 484c54e0d7990312741fddde5a9c9309d3e8808c)

zono_RTD_turtlebot_example

MATLAB 2020a with: 'Control System Toolbox', 'Optimization Toolbox', 'Mapping Toolbox', 'Deep Learning Toolbox', 'Symbolic Math Toolbox', 'Statistics and Machine Learning Toolbox', 'Reinforcement Learning Toolbox', 'Parallel Computing Toolbox', 'MATLAB Parallel Server' (probably not a toolbox), 'Polyspace Bug Finder', 'Filter Design HDL Coder', 'Simulink', 'Stateflow'

Drone:

quadrotor_RTD

edits on the repo: Change bounds

How to use this repo?

Level 1 Observe Result:

Use common_evaluation.m to see the training plots of the three examples for the different methods, and to tally the results of the random simulation experiments. To visualize how each agent performs, run run_xxx_eval.m and load different agents from agent&exp.

Level 2 Do evaluation:

Run run_xxx_eval.m to completion and save the experience to assess how well the agent performs.

Level 3 Do training:

Run run_xxx_training with plot_sim_flag turned off so that it automatically uses a parallel pool. With 16 parpool workers, Car training takes about 10 hours, Drone about 2 hours, and Cartpole almost no time.

Level 4 Do offline FRS computation:

Car: Run gen_frs_idea5.m to get the FRS file. You may wish to clean it up using clean_up_FRS.m

Drone: The FRS is computed in the quadrotor_RTD repository, which this repository depends on.

Cartpole: Run gen_cartpole_frs.m. Documentation is under construction.

Different Modes:

In run_***_eval.m, set S.safety_layer as follows: 'RTS' or 'Z' for the proposed method; 'RTS' with S.discrete_flag = 1 for the discrete version of the proposed method; 'NoSafety' or 'N' for no safety layer; 'RTD' or 'R' with HLP = [] for reward-optimizing RTD; 'RTD' with HLP defined for the original RTD.
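The mode settings above can be summarized as a small lookup. This is only a sketch: the mode labels are hypothetical, a plain struct stands in for S (in the repository, S is whatever run_***_eval.m constructs), the default of 0 for S.discrete_flag is an assumption, and the handling of HLP is reduced to a note because where HLP is defined is not stated here.

```matlab
% Sketch only: hypothetical mode labels mapped to the settings listed above.
mode = 'RTS-discrete';   % one of the hypothetical labels handled below

S = struct('safety_layer', '', 'discrete_flag', 0);   % stand-in for S; 0 is an assumed default
hlp_note = 'unchanged';

switch mode
    case 'RTS'
        S.safety_layer = 'RTS';          % 'Z' is listed as an equivalent value
    case 'RTS-discrete'
        S.safety_layer = 'RTS';
        S.discrete_flag = 1;
    case 'NoSafety'
        S.safety_layer = 'NoSafety';     % 'N' is listed as an equivalent value
    case 'RTD-reward'
        S.safety_layer = 'RTD';          % 'R' is listed as an equivalent value
        hlp_note = 'set HLP = []';
    case 'RTD-original'
        S.safety_layer = 'RTD';
        hlp_note = 'define HLP';
    otherwise
        error('Unknown mode: %s', mode);
end

fprintf('safety_layer = %s, discrete_flag = %d, HLP: %s\n', ...
        S.safety_layer, S.discrete_flag, hlp_note);
```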

Common Bugs

Most episodes are very short

Make sure you have modified the isdone check in the RL toolbox (see Step 1) and that the start location is not already in collision.
