PyTorch implementation of Dense-connection-based Off-policy adversarial Imitation Learning (DOIL).
DOIL trains the imitation policy with the TD3 algorithm and integrates dense connections into both the actor and critic networks. Both TD3 and the dense connections help improve the sample efficiency of GAIL.
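As a rough illustration of the dense-connection idea, here is a D2RL-style actor in which every hidden layer also receives the raw state as input. This is only a sketch: the layer widths, depth, and exact wiring are assumptions, not the repository's actual architecture.

```python
import torch
import torch.nn as nn

class DenseActor(nn.Module):
    """Sketch of a dense-connection (D2RL-style) actor network."""

    def __init__(self, state_dim, action_dim, max_action, hidden=256, depth=4):
        super().__init__()
        layers = [nn.Linear(state_dim, hidden)]
        for _ in range(depth - 1):
            # Dense connection: each hidden layer sees the previous
            # activation concatenated with the raw state.
            layers.append(nn.Linear(hidden + state_dim, hidden))
        self.hidden_layers = nn.ModuleList(layers)
        self.out = nn.Linear(hidden, action_dim)
        self.max_action = max_action

    def forward(self, state):
        x = torch.relu(self.hidden_layers[0](state))
        for layer in self.hidden_layers[1:]:
            x = torch.relu(layer(torch.cat([x, state], dim=-1)))
        # TD3-style bounded action output.
        return self.max_action * torch.tanh(self.out(x))
```

The same concatenation pattern can be applied inside the twin critics, with the (state, action) pair fed into every hidden layer.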
The method is evaluated on MuJoCo continuous control tasks in OpenAI Gym. Networks are trained with PyTorch 1.4 and Python 3.7.
We use the official TD3 code from D2RL to train the expert agent; the trained agent is then used to generate expert trajectories. Expert data for Ant-v2, BipedalWalker-v3, HalfCheetah-v2, Hopper-v2, Reacher-v2, and Walker2d-v2 is available at this Google Drive site.
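Generating expert data amounts to rolling out the trained agent and recording transitions. The helper below is hypothetical (the repository's actual script and data format may differ) and uses the pre-0.26 Gym step API that matches this project's dependencies.

```python
import numpy as np

def collect_trajectories(env, policy, num_episodes=10):
    """Roll out a trained policy and store (state, action, next_state) tuples.

    Hypothetical helper sketching how expert trajectories could be
    generated; `policy` maps a state to an action.
    """
    states, actions, next_states = [], [], []
    for _ in range(num_episodes):
        state, done = env.reset(), False
        while not done:
            action = policy(state)
            # Old Gym API (pre-0.26): step returns (obs, reward, done, info).
            next_state, _, done, _ = env.step(action)
            states.append(state)
            actions.append(action)
            next_states.append(next_state)
            state = next_state
    return np.array(states), np.array(actions), np.array(next_states)
```

Storing next states alongside (state, action) pairs lets the same data serve both the standard and the state-only discriminator settings.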
The ablation experiments can be reproduced by running:
./run_ablation.sh
The main experiments for DOIL can be reproduced by running:
./run_experiments.sh
Experiments with different reward types, or with state-only transitions, can be run by changing the arguments reward_type and states_only, respectively.
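For illustration, an argparse setup for these two flags might look like the sketch below. The entry-point name, default value, and flag semantics are assumptions; only the argument names reward_type and states_only come from this README.

```python
import argparse

# Hypothetical parser fragment; the repository's actual entry point,
# defaults, and accepted values may differ.
parser = argparse.ArgumentParser(description="DOIL training (sketch)")
parser.add_argument("--reward_type", type=str, default="airl",
                    help="Form of the imitation reward derived from the discriminator.")
parser.add_argument("--states_only", action="store_true",
                    help="Train the discriminator on (s, s') pairs instead of (s, a).")

args = parser.parse_args(["--reward_type", "gail", "--states_only"])
```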
If the argument wdail is set to true, a WGAN objective is used to train the discriminator; give it a try!
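For reference, the WGAN-style discriminator (critic) objective can be sketched as below. This is a generic formulation of the WGAN loss, not the repository's exact implementation, and it omits the weight clipping or gradient penalty that stable WGAN training requires in practice.

```python
import torch

def wgan_discriminator_loss(d_expert, d_policy):
    """WGAN-style critic objective for the discriminator (a sketch).

    d_expert / d_policy are the discriminator's raw, unbounded outputs on
    expert and policy samples. The critic maximizes
    E[D(expert)] - E[D(policy)], so we minimize its negation.
    """
    return d_policy.mean() - d_expert.mean()
```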