Supervised End-to-end Weight-sharing for StarCraft II

What

This is the source code for the project I built during the StarCraft 2 AI Workshop organised by Niels Justesen and Prof. Sebastian Risi at the IT University of Copenhagen. Many thanks to them and the sponsors for making this workshop possible (header image credits to Niels Justesen). As it turns out, I was fortunate enough to win first prize, yeah! 😃

Disclaimer

Sorry in advance: the code is far from clean and not very well structured. Please keep in mind that the goal of the workshop was to implement something in less than 26 hours, from 1pm on Saturday, January 20th to 3pm on Sunday, January 21st.

Dependencies

The version of each package that I used during this workshop is given in parentheses, but you should be able to run my code with other releases after minor changes. You can read Niels' great post for a step-by-step walkthrough on setting up your environment.

Goal

I had absolutely zero knowledge about StarCraft when I started this workshop, so I could not build much prior knowledge into my agent. To make things a little easier, I used the mini-games as a test bed instead of the complex maps of the full game. Reinforcement Learning for StarCraft turned out to be more complex than anticipated, so my final solution uses Supervised Learning to train the agent. The training dataset is gathered by recording the moves of the scripted agents provided in the DeepMind PySC2 package.

My model has two inputs and two outputs. It takes an image as input (minimap player_relative) together with a one-hot encoded vector representing all the available actions for a given game state. The model is designed to predict the policy, that is, both the next action to take and the (x, y) screen coordinates to click. It is trained end-to-end to perform both classification (next action) and regression (screen coordinates). The weights of the convolutional layers that learn visual features are shared between the two tasks.
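
The two outputs and their joint training signal can be illustrated with a small, framework-free sketch. This is not the model code from this repository: the action names are hypothetical, and both the masking of unavailable actions and the unweighted sum of cross-entropy and mean squared error are assumptions made for illustration. The description above only states that classification and regression are trained together.

```python
import math

# Hypothetical action vocabulary; the real list comes from PySC2.
ACTIONS = ["no_op", "select_army", "move_screen"]


def one_hot_available(available):
    """Encode the set of currently available actions as a 0/1 vector."""
    return [1.0 if name in available else 0.0 for name in ACTIONS]


def masked_softmax(logits, mask):
    """Softmax over action logits, with unavailable actions forced to 0.

    Assumption: the availability vector is used as a mask. The source
    only says it is fed to the model as a second input.
    """
    peak = max(logit for logit, m in zip(logits, mask) if m > 0)
    exps = [math.exp(logit - peak) if m > 0 else 0.0
            for logit, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def joint_loss(action_probs, true_action, pred_xy, true_xy):
    """Cross-entropy on the action plus mean squared error on (x, y)."""
    cross_entropy = -math.log(action_probs[ACTIONS.index(true_action)])
    mse = sum((p - t) ** 2 for p, t in zip(pred_xy, true_xy)) / len(true_xy)
    return cross_entropy + mse  # assumed equal weighting of both terms


mask = one_hot_available({"no_op", "move_screen"})
probs = masked_softmax([0.2, 3.0, 1.1], mask)
loss = joint_loss(probs, "move_screen",
                  pred_xy=(0.40, 0.55), true_xy=(0.50, 0.50))
```

Because both terms are summed into a single loss, gradients from the action head and the coordinate head would both flow back into whatever layers the two heads have in common, which is the practical effect of sharing the convolutional weights.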

Results

MoveToBeacon mini-game

  • Gif demo: beacon_agent_demo.gif
  • Limitations: the agent performs really well in this game but sometimes cannot move to the rightmost positions on the screen. This is probably because the (x, y) coordinates predicted by the model are screen ratios in [0.0, 1.0], which are then transformed to screen coordinates. My guess is that somewhere along the way the value is rounded down to the nearest integer instead of to the closest one.
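
The rounding hypothesis for MoveToBeacon can be made concrete with a short sketch. It does not reproduce the conversion code of this project, which has not been checked here: the 84-pixel screen size and both helper functions are assumptions chosen only to show how truncation differs from rounding near the right edge.

```python
SCREEN_SIZE = 84  # assumed screen resolution in pixels, not taken from the source


def to_pixel_truncating(ratio, size=SCREEN_SIZE):
    """Convert a [0, 1] ratio to a pixel index by dropping the fraction."""
    return int(ratio * (size - 1))


def to_pixel_rounding(ratio, size=SCREEN_SIZE):
    """Convert a [0, 1] ratio to the nearest pixel index, clamped to the screen."""
    pixel = int(round(ratio * (size - 1)))
    return max(0, min(size - 1, pixel))


# A prediction that falls just short of the right edge.
predicted_ratio = 0.995
truncated = to_pixel_truncating(predicted_ratio)  # one column short of the edge
rounded = to_pixel_rounding(predicted_ratio)      # reaches the last column
```

With truncation, the last column is reached only when the predicted ratio is exactly 1.0, which a regression output rarely produces. Rounding to the nearest pixel and clamping would remove that bias if the guess above is correct.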
CollectMineralShards mini-game

  • Gif demo: mineral_agent_demo.gif
  • Limitations: the agent performs poorly in this game. Both characters are attracted to each other instead of actively searching for minerals. My guess is that the model actually predicts the mean position of the mineral cluster instead of the coordinates of the nearest mineral. This could explain why the characters are always sprinting to a position near the centre of the board.
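
The mean-position hypothesis for CollectMineralShards follows from a general property of squared-error regression: when several targets are plausible for similar inputs, the single prediction that minimises the summed squared error is their mean. The sketch below illustrates that property with made-up positions. It does not confirm that this is what happens inside the trained model.

```python
def centroid(points):
    """Mean position, which minimises the summed squared error to all points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def nearest(points, origin):
    """Point closest to `origin` in Euclidean distance."""
    return min(points,
               key=lambda p: (p[0] - origin[0]) ** 2 + (p[1] - origin[1]) ** 2)


def total_squared_error(guess, points):
    """Summed squared distance from one guess to every candidate target."""
    return sum((guess[0] - x) ** 2 + (guess[1] - y) ** 2 for x, y in points)


# Hypothetical mineral positions as screen ratios, plus one unit position.
minerals = [(0.1, 0.2), (0.9, 0.3), (0.5, 0.9), (0.3, 0.6)]
unit = (0.15, 0.25)

mean_target = centroid(minerals)          # roughly the middle of the board
nearest_target = nearest(minerals, unit)  # what the agent should click instead
```

In this toy setup the centroid has a lower total squared error than the nearest mineral, so a model trained with a squared-error objective has an incentive to drift toward the middle of the cluster.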

Usage

Run my pre-trained models in ./bin:

# play the game with trained agent
python3 -m pysc2.bin.agent --map MoveToBeacon --agent TrainedAgent.TrainedAgent
python3 -m pysc2.bin.agent --map CollectMineralShards --agent TrainedAgent.TrainedAgent

I have made my datasets available (dataset_beacon.zip, dataset_mineral.zip) but you can gather your own data as follows:

# generate training data using scripted agents
python3 -m pysc2.bin.agent --map MoveToBeacon --agent ScriptedAgent.ScriptedAgent --max_agent_steps 10000
python3 -m pysc2.bin.agent --map CollectMineralShards --agent ScriptedAgent.ScriptedAgent --max_agent_steps 10000
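
The idea behind the data-gathering step, recording what a scripted agent does so it can be replayed as supervised training data, can be sketched as follows. This is a generic illustration and not the ScriptedAgent from this repository: the `Recorder` class, the toy policy and the pickle storage format are all hypothetical.

```python
import pickle


class Recorder:
    """Collects (observation, action) pairs produced by a scripted policy."""

    def __init__(self, policy):
        self.policy = policy  # any callable: observation -> action
        self.samples = []

    def step(self, observation):
        """Ask the policy for an action and remember the pair."""
        action = self.policy(observation)
        self.samples.append((observation, action))
        return action

    def save(self, path):
        """Write all recorded pairs to disk (assumed format: one pickle file)."""
        with open(path, "wb") as handle:
            pickle.dump(self.samples, handle)


# Toy stand-in for a scripted agent: always pick the brightest cell.
def toy_policy(observation):
    return max(range(len(observation)), key=lambda i: observation[i])


recorder = Recorder(toy_policy)
for frame in ([0, 3, 1], [5, 2, 2], [0, 0, 7]):
    recorder.step(frame)
```

Each recorded pair then serves as one training example: the observation is the model input and the scripted action is the label.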

Train your own model:

# train the models using generated datasets
python3 train.py beacon
python3 train.py mineral

Citation

If this work is useful to your research, please cite it as follows.

@online{beltramelli2017starcraft,
  title={Supervised End-to-end Weight-sharing for StarCraft II},
  author={Beltramelli, Tony},
  url={https://github.com/tonybeltramelli/Supervised-End-to-end-Weight-sharing-for-StarCraft-II},
  year={2017}
}

License

This project and the associated media are distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) License; the source code is distributed under the Apache License 2.0.
