# Reset-Free Reinforcement Learning via Multi-Task Learning

Code for *Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention*. Please see the LICENSE file for details.

Project website: https://sites.google.com/view/mtrf
## Setup

- Clone this repo with pre-populated submodule dependencies:

  ```
  $ git clone --recursive [email protected]:vikashplus/r3l.git
  ```

- Update the submodules:

  ```
  $ cd MTRF
  $ git submodule update --remote
  ```

- Create the conda environment:

  ```
  $ conda env create -f environment.yml
  ```
- Environment creation might complain and ask you to add nvidia-*** to your Python path in `.bashrc`; just follow the instructions given to resolve this.
- Install the remaining dependencies:

  ```
  $ pip install -r requirements.txt
  $ pip install -U git+https://github.com/hartikainen/serializable.git@76516385a3a716ed4a2a9ad877e2d5cbcf18d4e6
  ```

  This repository depends on definitions in this specific `serializable` package.
- Add the `MTRF` repository to your python_path:
  - Option 1: `conda develop MTRF`
  - Option 2: manually add <MTRF_folder_path> to python_path
- Enter the `algorithms` directory and run `pip install -e .` to install `softlearning`.
- Run an example command (see below).
## Example commands

Basket task:

```
$ softlearning run_example_local examples.development --exp-name=replicate_basket_results --algorithm=PhasedSAC --num-samples=1 --trial-gpus=1 --trial-cpus=2 --universe=gym --domain=SawyerDhandInHandDodecahedron --task=BasketPhased-v0 --task-evaluation=BasketPhasedEval-v0 --video-save-frequency=0 --save-training-video-frequency=5 --vision=False --preprocessor-type="None" --checkpoint-frequency=50 --checkpoint-replay-pool=False
```

Bulb task:

```
$ softlearning run_example_local examples.development --exp-name=replicate_bulb_results --algorithm=PhasedSAC --num-samples=1 --trial-gpus=1 --trial-cpus=2 --universe=gym --domain=SawyerDhandInHandDodecahedron --task=BulbPhased-v0 --task-evaluation=BulbPhasedEval-v0 --video-save-frequency=0 --save-training-video-frequency=5 --vision=False --preprocessor-type="None" --checkpoint-frequency=50 --checkpoint-replay-pool=False
```
- Add `export CUDA_VISIBLE_DEVICES="0,1"` in front of the command to specify which GPUs to use.
- Change `--num-samples=X` to run X seeds of the same experiment.
- Change `--trial-gpus=X` to specify X GPUs per trial.
- Find results in `~/ray_results/<universe>/<domain>/<task>/<experiment_name>`.
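Putting the options above together, a hypothetical invocation of the basket experiment pinned to two GPUs and running three seeds might look like the following (only `CUDA_VISIBLE_DEVICES` and `--num-samples` differ from the example command above; the seed count is illustrative):

```shell
# Pin the run to GPUs 0 and 1, launch 3 seeds of the basket
# experiment, and give each trial one GPU and two CPUs.
export CUDA_VISIBLE_DEVICES="0,1"
softlearning run_example_local examples.development \
    --exp-name=replicate_basket_results \
    --algorithm=PhasedSAC \
    --num-samples=3 \
    --trial-gpus=1 \
    --trial-cpus=2 \
    --universe=gym \
    --domain=SawyerDhandInHandDodecahedron \
    --task=BasketPhased-v0 \
    --task-evaluation=BasketPhasedEval-v0 \
    --video-save-frequency=0 \
    --save-training-video-frequency=5 \
    --vision=False \
    --preprocessor-type="None" \
    --checkpoint-frequency=50 \
    --checkpoint-replay-pool=False

# With these values, results for each seed land under:
#   ~/ray_results/gym/SawyerDhandInHandDodecahedron/BasketPhased-v0/replicate_basket_results
```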
## Citation

```
@article{guptaYuZhaoKumar2021reset,
  title={Reset-Free Reinforcement Learning via Multi-Task Learning: Learning Dexterous Manipulation Behaviors without Human Intervention},
  author={Gupta, Abhishek* and Yu, Justin* and Zhao, Tony Z* and Kumar, Vikash* and Rovinsky, Aaron and Xu, Kelvin and Devlin, Thomas and Levine, Sergey},
  journal={International Conference on Robotics and Automation (ICRA)},
  year={2021}
}
```