irl-essential-code's Introduction

Description

This is a GAIL baseline. GAIL belongs to the family of Inverse Reinforcement Learning (IRL) methods.

GANs and GAIL are known to be fragile; even the baseline code written by OpenAI is hard to train. I therefore wrote a PyTorch implementation of GAIL. Because GAIL is so fragile, I had to add some tricks to the code, which are listed in the Trick section below.

My_GAIL_PyTorch

Requirements

  • mujoco-py==2.0.2.13
  • PyTorch==1.7.1
  • See more details in requirement.txt

Trick:

  1. Memory: add a replay buffer to train the generator.
  2. Batch normalization: use batch normalization to transform the state, action, and next state. Note: this trick is used when training the generator network, not the discriminator network.
  3. Reward function: if the generator accuracy is less than 0.5, the generated data and the expert data can no longer be told apart, so the reward is set to the optimal reward. Otherwise, the reward is given by the reward function produced by the discriminator.
  4. Add noise: add noise to the discriminator.
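
Tricks 1 and 3 can be sketched in plain Python as shown here. This is a minimal illustration under stated assumptions, not the code in this repository: the names `ReplayBuffer`, `gail_reward`, and `optimal_reward` are hypothetical, the discriminator output is assumed to be the probability that a sample comes from the expert, the reward form `-log(1 - D)` is one common GAIL choice, and the default of `log 2` for the optimal reward is a placeholder.

```python
import math
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size buffer of (state, action, next_state) transitions (trick 1)."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once it is full.
        self.storage = deque(maxlen=capacity)

    def push(self, state, action, next_state):
        self.storage.append((state, action, next_state))

    def sample(self, batch_size):
        # Never ask for more transitions than are stored.
        size = min(batch_size, len(self.storage))
        return random.sample(list(self.storage), size)

    def __len__(self):
        return len(self.storage)


def gail_reward(d_prob, generator_accuracy, optimal_reward=math.log(2.0), eps=1e-8):
    """Reward for one sample, gated by the accuracy described in trick 3.

    d_prob: discriminator output in (0, 1), assumed to be P(sample is expert).
    generator_accuracy: the accuracy value that trick 3 compares against 0.5.
    optimal_reward: hypothetical constant returned when the accuracy is below 0.5.
    """
    if generator_accuracy < 0.5:
        return optimal_reward
    # Assumed reward form; eps keeps the logarithm finite when d_prob is 1.
    return -math.log(max(1.0 - d_prob, eps))
```

With these definitions, `gail_reward(0.9, 0.4)` returns the constant `optimal_reward`, while `gail_reward(0.9, 0.8)` returns the discriminator-based reward.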

Note:

  1. The key to training GAIL is balancing the performance of the discriminator and the generator. The discriminator must not become too strong; it should wait for the generator.
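
One way to make the discriminator "wait" is to skip its update whenever it is already classifying well. The sketch below assumes this gating approach; the function names and the `max_accuracy` threshold of 0.8 are hypothetical and are not taken from this repository.

```python
def discriminator_accuracy(expert_probs, generated_probs):
    """Fraction of samples the discriminator classifies correctly.

    Both arguments are lists of discriminator outputs in (0, 1), assumed to be
    P(sample is expert). Expert samples count as correct above 0.5, generated
    samples at or below 0.5.
    """
    correct = sum(p > 0.5 for p in expert_probs)
    correct += sum(p <= 0.5 for p in generated_probs)
    return correct / (len(expert_probs) + len(generated_probs))


def should_update_discriminator(expert_probs, generated_probs, max_accuracy=0.8):
    """Return True only while the discriminator is not yet too strong."""
    accuracy = discriminator_accuracy(expert_probs, generated_probs)
    return accuracy < max_accuracy
```

In a training loop, the discriminator step would run only when `should_update_discriminator(...)` is true, while the generator step would run on every iteration.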

Usage

python main.py  --env_name=Hopper-v2

Note: this way you can only change the environment name. The other parameters can only be changed in their YAML files, which are located in "./env_parser/".
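
The split between the command line and the YAML files could look roughly like the sketch below. It only assumes what is stated above (a `--env_name` flag and a `./env_parser/` directory); the helper names and the one-file-per-environment naming scheme `<env_name>.yaml` are hypothetical and may not match the actual files.

```python
import argparse
from pathlib import Path

# Directory named in the usage note.
CONFIG_DIR = Path("./env_parser")


def parse_args(argv=None):
    """Parse the only option exposed on the command line."""
    parser = argparse.ArgumentParser(description="GAIL training entry point (sketch)")
    parser.add_argument("--env_name", required=True, help="Gym environment name")
    return parser.parse_args(argv)


def config_path(env_name):
    """Return the YAML path for an environment.

    Hypothetical naming scheme: one YAML file per environment name.
    """
    return CONFIG_DIR / f"{env_name}.yaml"
```

Every other hyperparameter would then be read from the file returned by `config_path(args.env_name)`.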

Runs

  1. Hopper-v2 (expert return = 3500)

    Figures: image-20210408143157754, image-20210414091849909

  2. HalfCheetah-v2 (expert return = 6000)

    Figure: image-20210409142601820

  3. Ant-v2 (expert return = 5500)

    Figure: image-20210412100841604

  4. Walker2d-v2 (expert return = 4900)

    Figure: image-20210414102348203

  5. InvertedPendulum (expert return = 1000)

    Figure: image-20210413092110327

  6. InvertedDoublePendulum (expert return = 9359)

    Figure: image-20210414101914232

Generate Expert Demonstrations

This package can be used to generate expert demonstrations.

You can also download the expert demonstrations via this link: Expert Demonstration

Reference

[SAC(pytorch-soft-actor-critic-master)]: https://github.com/pranz24/pytorch-soft-actor-critic

The four GAIL implementations are available at:

[gail-pytorch]: https://github.com/hcnoh/gail-pytorch.git

[PyTorch-RL]: https://github.com/Khrylx/PyTorch-RL.git

[imitation]: https://github.com/openai/imitation.git

[GAIL]: https://github.com/JiangengDong/GAIL.git

irl-essential-code's Issues

I have never seen such wonderful GAIL codes before!!!


About the performance

Hello! Thank you for your contribution. I ran into some problems when using this code.
I cannot reproduce the performance shown in the README, for example on HalfCheetah, and I don't know what the problem is. It may be because I could not download the expert trajectory data from the link in the README, so I used the code to generate the expert data instead. I used the default parameters provided in the code without any changes. However, the performance is poor (an average of 400 on HalfCheetah, where the expert scored 11k+). Could you please give me some suggestions for reproducing the performance? Thanks a lot!
