ashutoshtiwari13 / simple-random-search

A simple random search technique that provides a competitive approach to reinforcement learning for locomotion tasks on MuJoCo bodies such as Humanoid and Half-Cheetah.

License: MIT License

Python 100.00%
ars numpy random-search reinforcement-learning-algorithms reinforcement-learning pybullet mujuco humanoid-walking half-cheetah

simple-random-search's Introduction

Augmented Random Search using Numpy

The project aims to build a simple AI algorithm that surpasses many existing algorithms on humanoid and MuJoCo (Multi-Joint dynamics with Contact) locomotion tasks. It implements Augmented Random Search (ARS) and uses it to train a Half-Cheetah (MuJoCo) to walk and run across a field.

Motivation

Link to Google DeepMind's video

Existing methods

  • Asynchronous Actor-Critic Agents
  • Deep Learning
  • Deep Reinforcement Learning

How is it different

  • Other AI systems explore after each action (in action space); ARS explores at the end of each episode (in policy space).
  • ARS is a shallow learning technique: it uses a single perceptron rather than the stacked layers used by deep learning systems.
  • ARS drops gradient descent for weight adjustment and uses the method of finite differences instead.
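The finite-difference idea can be sketched on a toy problem. This is a minimal illustration, not the code in ars.py: the reward function, the `target` vector, and all hyperparameter values are assumptions chosen so the example runs on its own.

```python
import numpy as np

def finite_difference_step(theta, reward_fn, rng, n_directions=8, noise=0.1, lr=0.5):
    """Probe theta +/- noise * delta and move toward the higher-reward side."""
    step = np.zeros_like(theta)
    for _ in range(n_directions):
        delta = rng.standard_normal(theta.shape)   # random direction in policy space
        r_plus = reward_fn(theta + noise * delta)  # reward with positive perturbation
        r_minus = reward_fn(theta - noise * delta) # reward with negative perturbation
        step += (r_plus - r_minus) * delta         # no gradient is ever computed
    return theta + lr / n_directions * step

# Toy stand-in for an episode reward (assumption): highest when theta == target.
target = np.array([1.0, -2.0, 0.5])

def toy_reward(theta):
    return -float(np.sum((theta - target) ** 2))

rng = np.random.default_rng(0)
theta = np.zeros(3)
for _ in range(200):
    theta = finite_difference_step(theta, toy_reward, rng)
```

Each reward here plays the role of one full episode, so exploration happens once per episode on the weights rather than once per action.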

Implementation

Components

  • Perceptrons
  • Reward mechanism and weight updates
  • Method of finite Differences to find the best possible direction of movement
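A rough sketch of how the first two components might fit together is shown below. `LinearPolicy`, `ToyEnv`, and `episode_reward` are hypothetical names introduced for illustration; the toy environment is a one-dimensional stand-in and not the PyBullet Half-Cheetah, and the noise value is an assumption.

```python
import numpy as np

class LinearPolicy:
    """Single perceptron layer: action = weights @ observation."""

    def __init__(self, obs_dim, act_dim, noise=0.03):
        self.weights = np.zeros((act_dim, obs_dim))
        self.noise = noise  # exploration scale (assumed value)

    def act(self, obs, delta=None, sign=0):
        # sign=+1 or -1 evaluates perturbed weights; delta=None uses the current ones.
        if delta is None:
            return self.weights @ obs
        return (self.weights + sign * self.noise * delta) @ obs


class ToyEnv:
    """Hypothetical 1-D stand-in for a locomotion task (not PyBullet)."""

    def reset(self):
        self.x = np.array([1.0])
        return self.x.copy()

    def step(self, action):
        self.x = self.x + 0.1 * np.clip(action, -1.0, 1.0)
        return self.x.copy(), -float(abs(self.x[0]))  # reward: stay near 0


def episode_reward(env, policy, horizon=50, delta=None, sign=0):
    """Reward mechanism: sum of rewards over one full episode."""
    obs = env.reset()
    total = 0.0
    for _ in range(horizon):
        obs, reward = env.step(policy.act(obs, delta, sign))
        total += reward
    return total


policy = LinearPolicy(obs_dim=1, act_dim=1)
delta = np.array([[-1.0]])  # one candidate direction
r_plus = episode_reward(ToyEnv(), policy, delta=delta, sign=+1)
r_minus = episode_reward(ToyEnv(), policy, delta=delta, sign=-1)
```

The pair `r_plus`, `r_minus` is what a finite-difference update would consume: the sign of their difference says which way along `delta` the weights should move.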

Algorithm

  • Scaling the update step by standard deviation of Rewards.
  • Online normalization of states (observations).
  • Choosing better directions for faster learning.
  • Discarding directions that yield lowest rewards.
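The steps above can be sketched roughly as follows. This is an illustrative reading of the ARS update, not the project's own code: `RunningNormalizer`, `ars_update`, `top_b`, and `lr` are assumed names, and the normalizer is applied to observations, as in the ARS paper.

```python
import numpy as np

class RunningNormalizer:
    """Online mean/variance of observations (Welford's algorithm)."""

    def __init__(self, dim):
        self.n = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)  # running sum of squared deviations

    def observe(self, x):
        self.n += 1
        d = x - self.mean
        self.mean = self.mean + d / self.n
        self.m2 = self.m2 + d * (x - self.mean)

    def normalize(self, x):
        var = self.m2 / self.n if self.n > 1 else np.ones_like(self.mean)
        return (x - self.mean) / np.sqrt(np.maximum(var, 1e-8))


def ars_update(weights, deltas, r_plus, r_minus, top_b, lr):
    """One update from per-direction rewards r_plus[k], r_minus[k]."""
    r_plus = np.asarray(r_plus, dtype=float)
    r_minus = np.asarray(r_minus, dtype=float)
    # Keep the top_b directions ranked by max(r_plus, r_minus); discard the rest.
    order = np.argsort(np.maximum(r_plus, r_minus))[::-1][:top_b]
    # Scale the step by the standard deviation of the rewards actually used.
    sigma_r = np.concatenate([r_plus[order], r_minus[order]]).std()
    step = sum((r_plus[k] - r_minus[k]) * deltas[k] for k in order)
    return weights + lr / (len(order) * max(sigma_r, 1e-8)) * step


# Usage with made-up rewards for three directions; the worst one is dropped.
deltas = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]]), np.array([[1.0, 1.0]])]
new_weights = ars_update(np.zeros((1, 2)), deltas,
                         r_plus=[10, 1, 0], r_minus=[0, 1, -5], top_b=2, lr=1.0)
```

Dividing by the reward standard deviation keeps the step size stable as rewards grow during training, and dropping low-reward directions keeps noisy probes from diluting the update.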

Algorithm Overview

[Image: algorithm overview]

Installation

  • Fork and clone the repository using git clone https://github.com/ashutoshtiwari13/Simple-Random-Search.git
  • Run pip install -r requirements.txt
  • See Simulation.txt for setting up the PyBullet simulation environment
  • Use the Spyder IDE from Anaconda, or any IDE of your choice
  • Use Python 3.6 or later
  • Run the command python ars.py

Results

Reference MuJoCo

[Image: reference MuJoCo]

Series of Rewards

Rewards start as low as -900 and climb to around +900 in roughly 1000 steps.

[Images: series of rewards]

Simulation Images

[Image: simulation]

Further reading

  • Ben Recht's Blog
  • Reference paper - Link
  • Research paper used - Link

Happy coding 😊 ❤️ ✔️
