Briefly, the program learns how to play this game by using reinforcement learning (experimental).
We won the round as..
Now the program knows one way to win, and it also knows how it lost the game in this goal state. So it will try to win, or to defend itself against our moves.
It can also try to learn new goal states.
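
The idea of remembering goal states from finished rounds can be illustrated with a small value table. This is only a minimal sketch under assumptions, not the program's actual implementation: the class `GoalStateMemory`, its methods, the `learning_rate` and `discount` parameters, and the string state labels are all hypothetical names chosen for illustration. After each round, the final reward (for example -1 for a loss, +1 for a win) is pushed back through the states that were visited, so states leading to a lost goal state score lower next time.

```python
from collections import defaultdict


class GoalStateMemory:
    """Hypothetical value table mapping a board state to a learned score."""

    def __init__(self, learning_rate=0.5, discount=0.9):
        self.values = defaultdict(float)  # unseen states start at 0.0
        self.learning_rate = learning_rate
        self.discount = discount

    def learn_from_game(self, visited_states, reward):
        """Propagate the final reward backwards through the visited states."""
        target = reward
        for state in reversed(visited_states):
            # Move the stored score a fraction of the way toward the target.
            self.values[state] += self.learning_rate * (target - self.values[state])
            # Earlier states receive a discounted share of the later score.
            target = self.discount * self.values[state]

    def best_next_state(self, candidate_states):
        """Pick the reachable state with the highest learned score."""
        return max(candidate_states, key=lambda s: self.values[s])


memory = GoalStateMemory()
# The program lost a round that ended in the state labelled "lost".
memory.learn_from_game(["start", "middle", "lost"], reward=-1.0)
# Next time it prefers a state it has not yet learned to be bad.
choice = memory.best_next_state(["lost", "other"])
```

With this kind of table, "defending" simply means avoiding moves whose resulting states have low scores, and new goal states are learned as more rounds are played.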