
retro-contest-sonic

A student implementation of the World Models paper with documentation.

Ongoing project.

TODO

CURRENTLY DOING

DONE

  • β-VAE for the Visual model
  • MDN-LSTM for the Memory model
  • CMA-ES for the Controller model
  • Training pipelines for the 3 models
  • Human recordings to generate data
  • MongoDB to store data
  • LSTM and VAE trained "successfully"
  • Multiprocessing of the evaluation of a set of parameters given by the CMA-ES
  • Submit learnt agents

LONG TERM PLAN?

  • Cleaner code, more optimized and documented
  • Game agnostic
  • Continue training / testing better architectures
  • Online training instead of using a database

How to launch the scripts

  • Install the modules listed in requirements.txt, along with PyTorch 0.4 and MongoDB
  • Buy or find the ROMs of Sonic The Hedgehog and install them with gym-retro.

Once you've done that, you will need to train the 3 components:
python train_vae.py
python train_lstm.py --folder=xxx
python train_controller.py --folder=xxx
where xxx is the number of the folder created in saved_models/

While training the VAE and the LSTM, pictures will be saved in the results/ folder.

Once you're done, you can use your best trained controller to play a random level with: python play_best --folder=xxx
Don't forget to change RENDER_TICK in const.py to 1 so you can see what's happening.

Resources

Differences with the official paper

  • No temperature
  • No flipping of the loss sign during training (to encourage exploration)
  • β-VAE instead of VAE
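
The β-VAE swap above amounts to a single change in the objective: the KL term is scaled by a factor β > 1. A minimal sketch of that loss, assuming a diagonal-Gaussian latent parameterized by mu and logvar (the names and the squared-error reconstruction term are illustrative, not taken from this repository's code):

```python
import numpy as np

def beta_vae_loss(recon, target, mu, logvar, beta=4.0):
    """Reconstruction loss plus beta-weighted KL divergence.

    With beta = 1 this reduces to the standard VAE objective;
    beta > 1 pressures the latent toward more disentangled factors.
    """
    # Pixel-wise squared error as a stand-in reconstruction term
    recon_loss = np.sum((recon - target) ** 2)
    # Closed-form KL between N(mu, exp(logvar)) and N(0, 1)
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar))
    return recon_loss + beta * kl
```
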

retro-contest-sonic's People

Contributors: dylandjian, emilwallner

retro-contest-sonic's Issues

How to generate the dataset

Hi, does this repository provide a method for generating the dataset? If so (I noticed files named "jerk.py" and "human.py"), how can I run this code to generate the dataset?

Thanks for your help, looking forward to your response :D

How to get the buttons info?

Hi jian, hope you are doing well. env.py shows the info directly: buttons = ["B", "A", "MODE", "START", "UP", "DOWN", "LEFT", "RIGHT", "C", "Y", "X", "Z"]. Where can we find this information for a new game? Any suggestions would be appreciated.

Controller and CMA-ES : number of parameters.

Hey !

Thanks for the PyTorch code, it is pretty useful. The writeup is great too.

I have two questions regarding CMA-ES and the Controller (which is the policy mapping states to actions).

  1. Regarding the number of parameters in the policy

The goal of CMA-ES is to optimize the policy of the controller, which in your case is the neural network defined here. This network, composed of 2 fully connected layers, has over 1M parameters: (1024*2 + 200) * 512 + 512 * 4 = 1,153,024. Would you expect CMA-ES to work in such a high-dimensional parameter space? In the World Models paper, they justify using CMA-ES by intentionally using a linear policy, which has fewer than 1k parameters. So it seems odd to use an MLP for the policy.
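
The arithmetic above can be checked directly. The layer sizes (1024*2 + 200 inputs, a 512-unit hidden layer, 4 outputs) come from the question itself; the helper name is illustrative:

```python
def mlp_param_count(n_in, n_hidden, n_out):
    # Weight matrices only, no biases, matching the calculation in the question
    return n_in * n_hidden + n_hidden * n_out

total = mlp_param_count(1024 * 2 + 200, 512, 4)
print(total)  # 1153024
```
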

  2. Regarding what is passed as input to CMA-ES

Also, I don't understand why the number of parameters passed to CMA-ES is PARAMS_FC1 + LATENT_VEC + 512. Shouldn't this number be the number of parameters in the policy, i.e. the Controller? Then it should be (PARAMS_FC1 + LATENT_VEC) * 512 + 512 * ACTION_SPACE (as in the calculation above).
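
To make the point concrete: the solution vector CMA-ES samples should have one entry per policy weight, and it has to round-trip back into the layer matrices. A sketch, assuming a two-layer controller with no biases (the constants mirror the names in the question and are assumptions, not this repository's actual code):

```python
import numpy as np

PARAMS_FC1 = 2048    # assumed: 1024 * 2, the VAE-latent part of the input
LATENT_VEC = 200     # assumed: the LSTM-hidden part of the input
HIDDEN = 512
ACTION_SPACE = 4

def unflatten(solution):
    """Split a flat CMA-ES solution vector back into the two weight matrices."""
    n1 = (PARAMS_FC1 + LATENT_VEC) * HIDDEN
    w1 = solution[:n1].reshape(PARAMS_FC1 + LATENT_VEC, HIDDEN)
    w2 = solution[n1:].reshape(HIDDEN, ACTION_SPACE)
    return w1, w2

# Total length the solution vector must have for the reshape to work
n_params = (PARAMS_FC1 + LATENT_VEC) * HIDDEN + HIDDEN * ACTION_SPACE
w1, w2 = unflatten(np.zeros(n_params))
```

Any shorter vector, such as one of length PARAMS_FC1 + LATENT_VEC + 512, cannot be reshaped into both weight matrices.
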

Unable to run train_vae.py

Even after generating the dataset as you described in #2, I get the following error when running python train_vae.py:
[TRAIN] Fetching: 25 new run from the db
[TRAIN] Last id: 0, added runs: 4 added frames: 10031
[TRAIN] current iteration: 10, averaged loss: 32442.459
Traceback (most recent call last):
  File "train_vae.py", line 140, in <module>
    main()
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "train_vae.py", line 136, in main
    train_vae(str(current_time))
  File "train_vae.py", line 102, in train_vae
    traverse_latent_space(vae, frames[0], frames[-1], total_ite)
  File "/home/paperspace/dev/retro-contest-sonic/lib/visu.py", line 24, in traverse_latent_space
    save_image(res, 'results/vae/sample_traverse_{}.png'.format(total_ite))
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torchvision/utils.py", line 104, in save_image
    im.save(filename)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/PIL/Image.py", line 1932, in save
    fp = builtins.open(filename, "w+b")
FileNotFoundError: [Errno 2] No such file or directory: 'results/vae/sample_traverse_20.png'
Also, I had to manually hardcode the current_time variable in train_vae.py, since the script uses the current time as the timestamp while I had generated the retro_contest MongoDB dataset earlier.
