Accepted at the Conference on Robot Learning (CoRL) 2021.
Harshit Sikchi, Wenxuan Zhou, David Held
- PyTorch 1.5
- OpenAI Gym
- MuJoCo
- tqdm
- D4RL dataset
- LOOP (core method)
  - Training code (online RL): `train_loop_sac.py`
  - Training code (offline RL): `train_loop_offline.py`
  - Training code (safe RL): `train_loop_safety.py`
  - Policies (online/offline/safety): `policies.py`
  - ARC/H-step lookahead policy: `controllers/`
- Environments: `envs/`
- Configurations: `configs/`
- All experiments are to be run from the root folder.
- Config files in `configs/` specify the hyperparameters for the controllers and dynamics models. To reproduce the results in our paper, keep the other values in the yml files consistent with the hyperparameters given in the paper.
```
python train_loop_sac.py --env=<env_name> --policy=LOOP_SAC_ARC --start_timesteps=<initial exploration steps> --exp_name=<location_to_logs>
```
Environment wrappers with their termination conditions can be found under `envs/`.
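For illustration, such a wrapper might look like the minimal sketch below (the class and environment names are hypothetical, not the repository's code). Keeping the termination rule as an explicit function of the observation lets model-based rollouts apply it to imagined states as well:

```python
class TerminationWrapper:
    """Wrap an environment and override `done` with an explicit rule.

    The termination condition is a plain function of the observation, so
    it can also be evaluated on states imagined by a dynamics model.
    """
    def __init__(self, env, termination_fn):
        self.env = env
        self.termination_fn = termination_fn

    def reset(self):
        return self.env.reset()

    def step(self, action):
        obs, reward, _, info = self.env.step(action)
        done = self.termination_fn(obs)  # e.g. Hopper: torso height too low
        return obs, reward, done, info

# Toy stand-in environment whose observation halves each step.
class ToyEnv:
    def reset(self):
        self.h = 1.0
        return self.h

    def step(self, action):
        self.h *= 0.5
        return self.h, 1.0, False, {}

env = TerminationWrapper(ToyEnv(), termination_fn=lambda obs: obs < 0.2)
obs, done, steps = env.reset(), False, 0
while not done:
    obs, reward, done, info = env.step(0.0)
    steps += 1
```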
Download the CRR-trained models from Link into the root folder.
```
python train_loop_offline.py --env=<env_name> --policy=LOOP_OFFLINE_ARC --exp_name=<location_to_logs> --offline_algo=CRR --prior_type=CRR
```
Currently, only the D4RL MuJoCo locomotion tasks are supported.
```
python train_loop_safety.py --env=<env_name> --policy=safeLOOP_ARC --exp_name=<location_to_logs>
```
Safety environments can be found under `envs/safety_envs.py`.
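Safe-RL environments commonly report a per-step constraint cost alongside the reward, typically through the `info` dict. A minimal sketch of that interface (the class names are illustrative, not the repository's code):

```python
class CostWrapper:
    """Attach a per-step constraint cost to the env's info dict.

    A safety-aware planner can then read info["cost"] and keep the
    expected cumulative cost of a plan under a budget.
    """
    def __init__(self, env, cost_fn):
        self.env = env
        self.cost_fn = cost_fn

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        info["cost"] = self.cost_fn(obs, action)  # 1.0 when violating
        return obs, reward, done, info

# Toy stand-in: the constraint is violated when |action| > 0.5.
class ToyEnv:
    def step(self, action):
        return action, 1.0, False, {}

env = CostWrapper(ToyEnv(), cost_fn=lambda obs, a: float(abs(a) > 0.5))
_, _, _, info = env.step(0.9)
```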
Parts of the code are adapted from the references below:
```
@article{SpinningUp2018,
  author = {Achiam, Joshua},
  title = {{Spinning Up in Deep Reinforcement Learning}},
  year = {2018}
}
```
https://github.com/Xingyu-Lin/mbpo_pytorch