ppo's Introduction

Introduction

This repository contains an implementation of PPO (proximal policy optimization). The initial plan was to explore how CQL (conservative Q-learning) affects PPO, but with a deeper understanding of both, it makes more sense to explore CQL applied to the soft actor-critic (SAC) algorithm. The primary reason is that PPO is an on-policy algorithm that was not designed for offline RL, while SAC is off-policy and learns a Q-function, which is the component that CQL regularizes. CQL is specifically a tool for improving generalization in offline RL.

See https://github.com/gth828r/sac-cql for exploration of CQL applied to SAC.
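To illustrate why CQL pairs naturally with a Q-learning critic, here is a minimal sketch of the simplest form of the CQL regularizer for a single state with discrete actions: the log-sum-exp of the Q-values over all actions minus the Q-value of the action found in the offline dataset. The function name `cql_penalty` and the list-based interface are hypothetical and are not taken from this repository or from sac-cql.

```python
import math


def cql_penalty(q_values, dataset_action):
    """Return logsumexp_a Q(s, a) - Q(s, a_data) for one state.

    q_values: list of Q(s, a) for every discrete action a (assumed interface).
    dataset_action: index of the action recorded in the offline dataset.
    """
    # Subtract the max before exponentiating for numerical stability.
    q_max = max(q_values)
    logsumexp = q_max + math.log(sum(math.exp(q - q_max) for q in q_values))
    # The penalty is small when the dataset action has the highest Q-value
    # and grows when actions outside the dataset are valued more highly.
    return logsumexp - q_values[dataset_action]


if __name__ == "__main__":
    print(cql_penalty([1.0, 5.0, 0.5], dataset_action=0))
    print(cql_penalty([1.0, 5.0, 0.5], dataset_action=1))
```

In training, a term like this would typically be scaled by a weight and added to the critic loss. PPO has no Q-function over actions to which the term could be attached directly.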

ppo's People

Contributors

gth828r

ppo's Issues

Make platform-specific encapsulation of constants

To improve portability, we should introduce a mechanism to inspect the underlying hardware (e.g. GPU, TPU, CPU-only) and set values based on that. We can use a factory to return a set of pre-defined constants for hardware we recognize, plus a default set for any non-CPU hardware we do not recognize.

Choose an environment to test with

We need a simulation environment to test with, such as one from OpenAI Gym. Choose one to explore the capabilities of PPO combined with CQL.

Experiment with CQL

Play around with our PPO implementation both with and without CQL. Evaluate what impact it has.
