Reinforcement Learning for Maplestory
This repository aims to be a proof of concept for reinforcement learning (RL) in Maplestory, which includes both Maplestory game development and RL algorithm development. It may be hard to follow if you are not familiar with Maplestory.
RPGs ought to be a suitable environment for RL: they have large state and action spaces, so finding the optimal policy is challenging. In practice, however, suitable RPG environments are hard to come by, because implementing the environment itself is difficult. Maplestory was chosen because it is a game we are familiar with.
For a given set of skills, what is the sequence that maximizes overall damage within the time limit?
Reward
Action
State
It would be natural for the state to also include information about the enemy, but we are dealing with a single, non-interactive enemy. Since the goal of the current stage is a proof of concept, we do not go deeper than that for now.
In the current stage, the transition is deterministic. If you use skill
Q-learning is a model-free reinforcement learning algorithm that learns the optimal policy by updating the Q-value table.
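As a rough illustration of tabular Q-learning on this kind of problem, here is a minimal sketch on a toy skill-rotation environment. Everything in it is an assumption made for the example and not taken from this repository: the `SKILLS` table (damage and cooldown values), `TIME_LIMIT`, the state layout `(steps remaining, cooldowns)`, and the helper names `initial_state`, `step`, `q_learning`, and `greedy_rollout`.

```python
import random
from collections import defaultdict

# Hypothetical skills as (damage, cooldown in steps); values are made up.
SKILLS = [(10, 0), (25, 2), (60, 5)]
TIME_LIMIT = 12  # assumed number of steps per episode


def initial_state():
    # Assumed state layout: (steps remaining, remaining cooldown per skill).
    return (TIME_LIMIT, (0,) * len(SKILLS))


def step(state, action):
    """Deterministic transition: returns (next_state, reward, done)."""
    time_left, cooldowns = state
    damage, cooldown = SKILLS[action]
    ready = cooldowns[action] == 0
    # Using a skill that is still on cooldown wastes the step (zero damage).
    reward = damage if ready else 0
    new_cd = list(cooldowns)
    if ready:
        new_cd[action] = cooldown + 1  # +1 offsets the tick applied below
    new_cd = tuple(max(c - 1, 0) for c in new_cd)
    next_state = (time_left - 1, new_cd)
    return next_state, reward, time_left - 1 == 0


def q_learning(episodes=5000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(SKILLS))  # the Q-value table
    for _ in range(episodes):
        state, done = initial_state(), False
        while not done:
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(len(SKILLS))
            else:
                action = max(range(len(SKILLS)), key=lambda a: q[state][a])
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_rollout(q):
    """Follow the greedy policy once; returns (skill sequence, total damage)."""
    state, done, total, sequence = initial_state(), False, 0, []
    while not done:
        action = max(range(len(SKILLS)), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        total += reward
        sequence.append(action)
    return sequence, total


if __name__ == "__main__":
    sequence, total = greedy_rollout(q_learning())
    print("skill sequence:", sequence)
    print("total damage:", total)
```

The table is keyed by the full state tuple, which only works because this toy state space is tiny. A real skill set with many cooldowns and buffs would likely need a more compact state encoding or function approximation.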
Because the transition is deterministic, the expectation over next states in the Bellman optimality equation collapses to a single term, which yields a simplified value iteration update.
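One way to write that simplification is shown below. The notation is assumed for illustration and may differ from the notation used elsewhere in this repository: $f(s, a)$ is the deterministic next state, $r(s, a)$ is the damage dealt, $\gamma$ is the discount factor, and $k$ indexes the iteration.

```latex
% General Bellman optimality equation (stochastic transition):
Q^{*}(s, a) = \mathbb{E}_{s' \sim P(\cdot \mid s, a)}
  \left[ r(s, a) + \gamma \max_{a'} Q^{*}(s', a') \right]

% Deterministic transition s' = f(s, a): the expectation disappears.
Q^{*}(s, a) = r(s, a) + \gamma \max_{a'} Q^{*}\bigl(f(s, a), a'\bigr)

% Resulting value iteration update:
Q_{k+1}(s, a) = r(s, a) + \gamma \max_{a'} Q_{k}\bigl(f(s, a), a'\bigr)
```

With a deterministic transition each update is exact, with no sampling noise to average out, so a learning rate of 1 would reproduce this update in the tabular Q-learning rule.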