LEAP

This is the codebase for Latent Embeddings for Abstracted Planning (LEAP), from the following paper:

Planning with Goal Conditioned Policies
Soroush Nasiriany*, Vitchyr Pong*, Steven Lin, Sergey Levine
Neural Information Processing Systems 2019
Arxiv | Website

This guide contains information about (1) Installation, (2) Experiments, and (3) Setting up Your Own Environments.

Installation

Download Code

  • multiworld (contains environments): git clone -b leap https://github.com/vitchyr/multiworld
  • doodad (for launching experiments): git clone -b leap https://github.com/vitchyr/doodad
  • viskit (for plotting experiments): git clone -b leap https://github.com/vitchyr/viskit
  • Current codebase: git clone https://github.com/snasiriany/leap
    • install dependencies: pip install -r requirements.txt

Add paths

export PYTHONPATH=$PYTHONPATH:/path/to/multiworld/repo
export PYTHONPATH=$PYTHONPATH:/path/to/doodad/repo
export PYTHONPATH=$PYTHONPATH:/path/to/viskit/repo
export PYTHONPATH=$PYTHONPATH:/path/to/leap/repo
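
After exporting the paths, a quick check can confirm that Python can actually find each package. The following is a minimal sketch, not part of the codebase; the top-level package names (multiworld, doodad, viskit, railrl) are assumed from the repository names and paths mentioned in this guide, and find_missing is a hypothetical helper.

```python
import importlib.util

# Top-level package names are assumptions based on the repositories above;
# `railrl` is the package referenced elsewhere in this guide.
REQUIRED_PACKAGES = ("multiworld", "doodad", "viskit", "railrl")


def find_missing(packages=REQUIRED_PACKAGES):
    """Return the names in `packages` that Python cannot locate on sys.path."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not importable (check PYTHONPATH):", ", ".join(missing))
    else:
        print("All packages found.")
```

If a package is reported as not importable, the corresponding PYTHONPATH entry most likely points at the wrong directory.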

Setup Docker Image

You will need to install Docker to run experiments. We have provided a Dockerfile with all relevant packages; use it to build your own Docker image.

Before setting up the docker image, you will need to obtain a MuJoCo license to run experiments with the MuJoCo simulator. Obtain the license file mjkey.txt and save it for reference.

Set up the docker image with the following steps:

cd docker
<add mjkey.txt to current directory>
docker build -t <your-dockerhub-uname>/leap .
docker login --username=<your-dockerhub-uname>
docker push <your-dockerhub-uname>/leap

Setup Config File

You must set up the config file for launching experiments, providing paths to your code and data directories. Inside railrl/config/launcher_config.py, fill in the appropriate paths, using railrl/config/launcher_config_template.py as a reference.

Experiments

All experiment files are located in experiments. Each file conforms to the following structure:

variant = dict(
  # default hyperparam settings for all envs
)

env_params = {
  '<env1>' : {
    # add/override default hyperparam settings for specific env
    # each setting is specified as a dictionary address (key),
    # followed by list of possible options (value).
    # Example in following line:
    # 'rl_variant.algo_kwargs.tdm_kwargs.max_tau': [10, 25, 100],
  },
  '<env2>' : {
    ...
  },
}
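
To make the "dictionary address" convention concrete, here is a minimal sketch of how dotted keys and option lists could be expanded into full variants. It is an illustration only, not the launcher's actual implementation: set_by_address and expand_variants are hypothetical helpers, the default value of max_tau is made up, and combining several option lists as a Cartesian product is an assumption.

```python
import copy
import itertools


def set_by_address(nested, address, value):
    """Set nested['a']['b']['c'] = value for address 'a.b.c', creating dicts as needed."""
    *parents, leaf = address.split(".")
    node = nested
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value


def expand_variants(default_variant, overrides):
    """Yield one full variant per combination of the listed options."""
    addresses = list(overrides)
    for combo in itertools.product(*(overrides[a] for a in addresses)):
        variant = copy.deepcopy(default_variant)  # never mutate the defaults
        for address, value in zip(addresses, combo):
            set_by_address(variant, address, value)
        yield variant


# Hypothetical defaults and overrides, for illustration only.
variant = dict(rl_variant=dict(algo_kwargs=dict(tdm_kwargs=dict(max_tau=15))))
env_params = {
    "pm": {"rl_variant.algo_kwargs.tdm_kwargs.max_tau": [10, 25, 100]},
}

variants = list(expand_variants(variant, env_params["pm"]))
print([v["rl_variant"]["algo_kwargs"]["tdm_kwargs"]["max_tau"] for v in variants])
# prints [10, 25, 100]
```

Each generated variant is a deep copy of the defaults, so the settings for one run cannot leak into another.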

Running Experiments

You will need to follow four sequential stages to train and evaluate LEAP:

Stage 1: Generate VAE Dataset

python vae/generate_vae_dataset.py --env <env-name>

Stage 2: Train VAE

Train the VAE. There are two variants: image-based (for pm and pnr) and state-based (for ant):

python vae/train_vae.py --env <env-name>
python vae/train_vae_state.py --env <env-name>

Before running: locate the corresponding .npy file from the previous stage. The .npy file contains the VAE dataset. Place the path in your config settings for your env inside the script:

'vae_variant.generate_vae_dataset_kwargs.dataset_path': ['your-npy-path-here'],
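
If you are unsure which .npy file Stage 1 produced, a small helper can find the most recently written one. This is a sketch under assumptions: newest_npy is a hypothetical helper, and the directory to search is whatever data directory you configured, since this guide does not state where the dataset is written.

```python
import sys
from pathlib import Path


def newest_npy(search_dir):
    """Return the most recently modified .npy file under search_dir, or None."""
    candidates = list(Path(search_dir).rglob("*.npy"))
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_dataset.py /path/to/your/data/dir
    print(newest_npy(sys.argv[1]))
```

The printed path is what goes into the dataset_path setting shown above.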

Stage 3: Train RL

Train the RL model. There are two variants (as described in the previous stage):

python image/train_tdm.py --env <env-name>
python state/train_tdm_state.py --env <env-name>

Before running: locate the trained VAE model from the previous stage. Place the path in your config settings for your env inside the script. Complete one of the following options:

'rl_variant.vae_base_path': ['your-base-path-here'], # folder of vaes
'rl_variant.vae_path': ['your-path-here'], # one vae

Stage 4: Test RL

Test the RL model. There are two variants (as described in the previous stage):

python image/test_tdm.py --env <env-name>
python state/test_tdm_state.py --env <env-name>

Before running: locate the trained RL model from the previous stage. Place the path in your config settings for your env inside the script. Complete one of the following options:

'rl_variant.ckpt_base_path': ['your-base-path-here'], # folder of RL models
'rl_variant.ckpt': ['your-path-here'], # one RL model
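
The four stages can be summarized per environment as follows. This is a sketch rather than a launcher: stage_commands is a hypothetical helper that only builds the command lines listed above, using the script paths exactly as written in this guide. It does not run them, because each stage needs the path from the previous stage added to its config first.

```python
IMAGE_ENVS = {"pm", "pnr"}  # image-based environments, per Stage 2
STATE_ENVS = {"ant"}        # state-based environment, per Stage 2


def stage_commands(env):
    """Return the four commands, in order, for training and evaluating LEAP on env."""
    if env in IMAGE_ENVS:
        scripts = [
            "vae/generate_vae_dataset.py",
            "vae/train_vae.py",
            "image/train_tdm.py",
            "image/test_tdm.py",
        ]
    elif env in STATE_ENVS:
        scripts = [
            "vae/generate_vae_dataset.py",
            "vae/train_vae_state.py",
            "state/train_tdm_state.py",
            "state/test_tdm_state.py",
        ]
    else:
        raise ValueError(f"unknown env: {env!r}")
    return [["python", script, "--env", env] for script in scripts]


for number, command in enumerate(stage_commands("ant"), start=1):
    print(f"Stage {number}:", " ".join(command))
```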

Experiment Options

See the parse_args function in railrl/misc/exp_util.py for the complete list of options. Some important options:

  • env: the env to run (ant, pnr, pm)
  • label: name for experiment
  • num_seeds: number of seeds to run
  • debug: run with light options for debugging

Plotting Experiment Results

During training, the results will be saved under

LOCAL_LOG_DIR/<env>/<exp_prefix>/<foldername>
  • LOCAL_LOG_DIR is the directory set by railrl.config.launcher_config.LOCAL_LOG_DIR
  • <exp_prefix> is the experiment prefix given to setup_logger.
  • <foldername> is auto-generated and based off of exp_prefix.
  • inside this folder, you should see a file called progress.csv.

Inside the viskit codebase, run:

python viskit/frontend.py LOCAL_LOG_DIR/<env>/<exp_prefix>/

If visualizing VAE results, add --dname='vae_progress.csv' as an option.
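
To inspect results without viskit, progress.csv can also be read directly with the standard library. A minimal sketch: read_progress is a hypothetical helper, and because the column names depend on the experiment, it is best to list them first.

```python
import csv


def read_progress(csv_path):
    """Return progress.csv as a dict mapping column name to a list of string values."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        columns = {name: [] for name in (reader.fieldnames or [])}
        for row in reader:
            for name in columns:
                columns[name].append(row[name])
    return columns


# Example usage (the path is a placeholder):
# columns = read_progress("LOCAL_LOG_DIR/<env>/<exp_prefix>/<foldername>/progress.csv")
# print(sorted(columns))  # list the available column names
```

Values are returned as strings; convert them with float() before plotting.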

Setting up Your Own Environments

You will need to follow the multiworld template when creating your own environments, and you will need to register each new environment. For MuJoCo envs, for example, see multiworld/envs/mujoco/__init__.py for reference.

Credit

Much of the coding infrastructure is based on RLkit, which itself is based on rllab.

leap's Issues

multiworld leap branch sawyer_push_nips environment set_env_state

Hi Soroush,

Sorry to bother you again. I am opening this new issue because I have some questions about the multiworld environment used in your experiments. I am trying to use your push_and_reach environment from the leap branch of the multiworld repo. I found that in sawyer_push_nips.py the set_env_state function has been rewritten, and compared to the original one in the master branch, the line self.sim.forward() is missing.
I think the environment needs sim.forward() so that the state we pass to the function is actually set in the simulation. Therefore, I am wondering why sim.forward() isn't called in the set_env_state function. Or do you call some other function somewhere else that achieves the same effect as sim.forward(), which I failed to notice?

RecursionError: maximum recursion depth exceeded

I'm trying to run the code locally with doodad, but I encounter the above error. The error log is as follows:

File "/home/james/Documents/GitHub/leap/railrl/torch/pytorch_util.py", line 145, in fanin_init
return fanin_init(tensor.data)
[Previous line repeated 983 more times]
File "/home/james/Documents/GitHub/leap/railrl/torch/pytorch_util.py", line 144, in fanin_init
if isinstance(tensor, TorchVariable):
RecursionError: maximum recursion depth exceeded

Trying to increase the max depth doesn't help.

The use of linear dynamics

Hi, I am trying to use your code for the pointmass_uwall task, and thank you for making your code public. I have a question about the linear_loss part of the VAE code. In train_vae.py, use_linear_dynamics is set to True but linearity_weight is set to 0. Therefore, during VAE training, the linear_loss is not actually added to the total loss.

For the pointmass_uwall task, did you change linearity_weight to some nonzero value when pre-training the VAE?

docker build failed: mujoco invalid activation key

I obtained mjkey.txt and I'm sure that it's valid. However, after moving it into the correct directory, the docker build process fails at the step "python -c 'import mujoco_py'" with 'ERROR: Invalid activation key'. I'm confused by this problem.
Is there any alternative method to run experiments without building and using docker? Thank you very much!
