zhuoranyang / reinforcement-learning-algorithms Goto Github PK
View Code? Open in Web Editor NEWThis project forked from tarek-ullah/reinforcement-learning-algorithms
These implementatios shows Convergence and performance of policy and value iteration algorithms, how the convergence of these algorithms to the optimal value function depends on the number of iterations used. Furthermore, I have implemented on-policy SARSA and off-policy Q-learning algorithms and showed how the performance of these algorithms depends on the exploration-exploitation tradeoff, and on learning rates. My experiments were evaluted on benchmark reinforcement learning tasks such as a smallworld, gridworld and a cliffworld MDP to analyze the performance of our algorithms.