
Synergy analysis source code [DEPRECATED]

This codebase is deprecated. Please visit the latest version: synergy_analysis

Our special notes

This implementation uses TensorFlow and is based on the Softlearning library (https://github.com/rail-berkeley/softlearning). We customized the Softlearning codebase to run our experiments. Author of the modifications: Chai Jiazheng (e-mail: [email protected])

Getting Started

Prerequisites

The code runs on Linux. The experiments can be run locally using Conda, so you need Conda installed. Our environments also currently require a MuJoCo license.

Step-by-Step Installation

MuJoCo Setup

  1. In the home directory ~, create the folder .mujoco.
  2. Download and unzip MuJoCo 1.50 from the MuJoCo website. We assume that the MuJoCo files are extracted to the default location (~/.mujoco/mjpro150).
  3. Copy your MuJoCo license key (mjkey.txt) to ~/.mujoco/mjpro150/bin.
  4. Add the following line, with the appropriate username, to ~/.bashrc:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/username/.mujoco/mjpro150/bin

  5. source ~/.bashrc
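
Taken together, steps 1-5 can be scripted roughly as follows. This is a sketch: the download URL and the location of your mjkey.txt are assumptions to adapt to your own setup.

# MuJoCo 1.50 setup sketch
mkdir -p ~/.mujoco
cd ~/.mujoco
wget https://www.roboti.us/download/mjpro150_linux.zip   # assumed download URL; see the MuJoCo website
unzip mjpro150_linux.zip                                 # extracts to ~/.mujoco/mjpro150
cp /path/to/your/mjkey.txt ~/.mujoco/mjpro150/bin/       # placeholder path to your license key
echo "export LD_LIBRARY_PATH=\$LD_LIBRARY_PATH:$HOME/.mujoco/mjpro150/bin" >> ~/.bashrc
source ~/.bashrc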

Conda Setup

  1. Install Conda with Python 3. Choose 'yes' for the conda init option at the end of the installation.
  2. source ~/.bashrc
  3. conda install lockfile (required by the experiments)
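
A minimal sketch of steps 1-3, assuming Miniconda (any Conda distribution with Python 3 should work):

# Conda setup sketch
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh   # answer 'yes' to the conda init prompt
source ~/.bashrc
conda install lockfile                   # required by the experiments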

Codebase Setup

  1. git clone this codebase into a folder named synergyDRL
  2. cd synergyDRL
  3. conda env create -f updated_env.yml
  4. conda activate synergy_analysis
  5. cd ..
  6. pip install -e synergyDRL
  7. cd synergyDRL
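
The whole sequence as one script. The repository URL is a placeholder, and the final import check assumes the environment provides tensorflow and mujoco_py:

# Codebase setup sketch
git clone <repository-url> synergyDRL   # placeholder URL
cd synergyDRL
conda env create -f updated_env.yml
conda activate synergy_analysis
cd ..
pip install -e synergyDRL
cd synergyDRL
# optional sanity check (assumed package names)
python -c "import tensorflow, mujoco_py; print('setup OK')"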

Finish

The environment should be ready to run experiments.

To deactivate and remove the conda environment:

conda deactivate
conda remove --name synergy_analysis --all

To run and reproduce our results:

While in the folder synergyDRL, with the virtual environment synergy_analysis activated, run:

  1. ./HC_experiments_all_commands.sh
  2. ./HeavyHC_experiments_all_commands.sh
  3. ./FC_experiments_all_commands.sh
  4. ./summary_graphs_results_production.sh
  5. ./extract_synergy_pattern.sh

Users must run 1), 2), and 3) before 4) and 5). Users are also encouraged to further parallelize the commands in 1), 2), and 3) to speed up the training and action collection of the three agents, for example as sketched below; the full set of experiments takes a very long time.
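
A minimal parallelization sketch, assuming your machine has enough CPU/GPU resources to run the three training scripts concurrently:

# run the three experiment scripts as background jobs
./HC_experiments_all_commands.sh > HC.log 2>&1 &
./HeavyHC_experiments_all_commands.sh > HeavyHC.log 2>&1 &
./FC_experiments_all_commands.sh > FC.log 2>&1 &
wait   # block until all three runs finish
# then produce the summaries and synergy patterns
./summary_graphs_results_production.sh
./extract_synergy_pattern.sh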

After running 1)-5), the results can be found in "experiments_results" inside the synergyDRL folder.

Details from the original Softlearning codebase (extra information, not obligatory)

Training an agent

To train an agent, run:
synergy_analysis run_example_local examples.development \
    --universe=gym \
    --domain=HalfCheetah \
    --task=Energy0-v0 \
    --exp-name=HC_E0_r1 \
    --checkpoint-frequency=100  # Save the checkpoint to resume training later

examples.development.main contains several different environments, and there are more example scripts available in the /examples folder. For more information about the agents and configurations, run the scripts with the --help flag: python ./examples/development/main.py --help

optional arguments:
  -h, --help            show this help message and exit
  --universe {gym}
  --domain {...}
  --task {...}
  --num-samples NUM_SAMPLES
  --resources RESOURCES
                        Resources to allocate to ray process. Passed to
                        `ray.init`.
  --cpus CPUS           Cpus to allocate to ray process. Passed to `ray.init`.
  --gpus GPUS           Gpus to allocate to ray process. Passed to `ray.init`.
  --trial-resources TRIAL_RESOURCES
                        Resources to allocate for each trial. Passed to
                        `tune.run_experiments`.
  --trial-cpus TRIAL_CPUS
                        Resources to allocate for each trial. Passed to
                        `tune.run_experiments`.
  --trial-gpus TRIAL_GPUS
                        Resources to allocate for each trial. Passed to
                        `tune.run_experiments`.
  --trial-extra-cpus TRIAL_EXTRA_CPUS
                        Extra CPUs to reserve in case the trials need to
                        launch additional Ray actors that use CPUs.
  --trial-extra-gpus TRIAL_EXTRA_GPUS
                        Extra GPUs to reserve in case the trials need to
                        launch additional Ray actors that use GPUs.
  --checkpoint-frequency CHECKPOINT_FREQUENCY
                        Save the training checkpoint every this many epochs.
                        If set, takes precedence over
                        variant['run_params']['checkpoint_frequency'].
  --checkpoint-at-end CHECKPOINT_AT_END
                        Whether a checkpoint should be saved at the end of
                        training. If set, takes precedence over
                        variant['run_params']['checkpoint_at_end'].
  --restore RESTORE     Path to checkpoint. Only makes sense to set if running
                        1 trial. Defaults to None.
  --policy {gaussian}
  --env ENV
  --exp-name EXP_NAME
  --log-dir LOG_DIR
  --upload-dir UPLOAD_DIR
                        Optional URI to sync training results to (e.g.
                        s3://<bucket> or gs://<bucket>).
  --confirm-remote [CONFIRM_REMOTE]
                        Whether or not to query yes/no on remote run.
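
For example, a run saved with --checkpoint-frequency can later be resumed by passing the saved checkpoint to --restore (the checkpoint path below is a placeholder):

synergy_analysis run_example_local examples.development \
    --universe=gym \
    --domain=HalfCheetah \
    --task=Energy0-v0 \
    --exp-name=HC_E0_r1 \
    --restore=/path/to/checkpoint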

References

The algorithms are based on the following papers:

Motor Synergy Development in High-performing Deep Reinforcement Learning algorithms.
Jiazheng Chai, Mitsuhiro Hayashibe.

Soft Actor-Critic Algorithms and Applications.
Tuomas Haarnoja*, Aurick Zhou*, Kristian Hartikainen*, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, and Sergey Levine. arXiv preprint, 2018.

Latent Space Policies for Hierarchical Reinforcement Learning.
Tuomas Haarnoja*, Kristian Hartikainen*, Pieter Abbeel, and Sergey Levine. International Conference on Machine Learning (ICML), 2018.

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor.
Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, and Sergey Levine. International Conference on Machine Learning (ICML), 2018.

Composable Deep Reinforcement Learning for Robotic Manipulation.
Tuomas Haarnoja, Vitchyr Pong, Aurick Zhou, Murtaza Dalal, Pieter Abbeel, Sergey Levine. International Conference on Robotics and Automation (ICRA), 2018.

Reinforcement Learning with Deep Energy-Based Policies.
Tuomas Haarnoja*, Haoran Tang*, Pieter Abbeel, Sergey Levine. International Conference on Machine Learning (ICML), 2017.

If Softlearning helps you in your academic research, you are encouraged to cite their paper. Here is an example BibTeX entry:

@techreport{haarnoja2018sacapps,
  title={Soft Actor-Critic Algorithms and Applications},
  author={Tuomas Haarnoja and Aurick Zhou and Kristian Hartikainen and George Tucker and Sehoon Ha and Jie Tan and Vikash Kumar and Henry Zhu and Abhishek Gupta and Pieter Abbeel and Sergey Levine},
  journal={arXiv preprint arXiv:1812.05905},
  year={2018}
}
