
drl-agents's Introduction






RL Landscape


Image: https://planspace.org/20170830-berkeley_deep_rl_bootcamp/img/annotated.jpg


reinforcement-learning

Source: eleurent/phd-bibliography


RL Agents Implementation


  • Value Optimization
    • [QR-DQN]
    • [DQN] - [Slides] [Code] [rainbow]
    • [Bootstrapped DQN]
    • [DDQN]
    • [NEC]
    • [MMC]
    • [N-step Q Learning]
    • [PAL]
    • [Categorical DQN]
    • [NAF]
  • Policy Optimization
    • [Policy Gradient]
    • [Actor Critic]
      • [DDPG] [Code]
        • [HAC DDPG]
        • [DDPG with HER]
      • [Clipped PPO]
      • [PPO]
  • [DFP]
  • Imitation
    • [Behavioural cloning]
    • [Inverse Reinforcement Learning] [Code] [irl-imitation-code]
    • [Generative Adversarial Imitation Learning]
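For orientation, the value-optimization entries above (DQN, DDQN) differ mainly in how the bootstrap target is formed. The block below writes out the commonly cited textbook forms. The symbols are assumptions introduced here, not taken from the linked implementations: \theta for the online network parameters, \theta^- for the target network parameters, \gamma for the discount factor.

```latex
% DQN target, using a separate target network with parameters \theta^-
y^{\text{DQN}} = r + \gamma \max_{a'} Q(s', a'; \theta^-)

% Double DQN target: the online network (\theta) selects the action,
% the target network (\theta^-) evaluates it
y^{\text{DDQN}} = r + \gamma \, Q\bigl(s', \arg\max_{a'} Q(s', a'; \theta); \theta^-\bigr)

% Both minimise the squared TD error for non-terminal transitions (s, a, r, s')
L(\theta) = \mathbb{E}\bigl[(y - Q(s, a; \theta))^2\bigr]
```

The linked code may use variants of these targets (for example n-step returns or distributional losses), so treat this as a reference point rather than a description of any specific agent.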

Value Optimization Agents


Policy Optimization Agents


General Agents


Imitation Learning Agents


  • Behavioral Cloning (BC) (code)

Hierarchical Reinforcement Learning Agents


Memory Types


Exploration Techniques



RL History


  • Temporal difference (TD) learning (1988)
  • Q‐learning (1989)
  • BayesRL (2002)
  • RMAX (2002)
  • CBPI (2002)
  • PEGASUS (2002)
  • Least‐Squares Policy Iteration (2003)
  • Fitted Q‐Iteration (2005)
  • GTD (2009)
  • UCRL (2010)
  • REPS (2010)
  • DQN (2013) - DeepMind


RL Environments


  • [Acrobot]
  • [Bike]
  • [Blackjack]
  • [Cartpole]
  • [ContextBandit]
  • [Continuous Chain]
  • [Corridor]
  • [Discrete Chain]
  • [Discretiser (for continuous environments)]
  • [Double Loop]
  • [Environment]
  • [Gridworld]
  • [Inventory management]
  • [Linear context bandit]
  • [Linear dynamic quadratic]
  • [Mountaincar (2d and 3d)]
  • [POMDP Maze]
  • [Optimistic Task]
  • [Puddleworld]
  • [Random MDPs]
  • [Riverswim]
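To give a feel for how small tabular environments like the ones listed above are typically used, here is a minimal sketch of a chain environment paired with tabular Q-learning. Everything in it is hypothetical and written for illustration: the `DiscreteChain` class, its `reset`/`step` interface, the reward scheme, and the `q_learning` and `greedy_policy` helpers are assumptions, not the code behind the links.

```python
import random


class DiscreteChain:
    """Hypothetical N-state chain (illustrative only).

    Action 1 moves one state to the right; action 0 jumps back to the start.
    Reaching the last state pays a reward of 1.0 and ends the episode.
    """

    def __init__(self, n_states=5):
        self.n_states = n_states
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if action == 1:
            self.state += 1
        else:
            self.state = 0
        done = self.state == self.n_states - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def greedy_action(q_row):
    # Ties are broken in favour of action 1 (move right).
    return 1 if q_row[1] >= q_row[0] else 0


def q_learning(env, episodes=500, alpha=0.1, gamma=0.95,
               epsilon=0.1, max_steps=100, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(env.n_states)]
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(2)          # explore
            else:
                a = greedy_action(q[s])       # exploit
            s2, r, done = env.step(a)
            # Bootstrap from the next state unless the episode has ended.
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q


def greedy_policy(q):
    """Greedy action for every state of the learned table."""
    return [greedy_action(row) for row in q]
```

Calling `greedy_policy(q_learning(DiscreteChain()))` returns one action per state. In this toy setup the learned policy should prefer moving right, since that is the only way to reach the rewarding end of the chain.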

RL Mechanisms


  • [Attention and Memory]
  • [Unsupervised learning]
    • [GANs]
    • [GQN]
    • [UNREAL]
  • [Hierarchical RL]
    • [FuNs]
    • [Option-Critic]
    • [STRAW]
    • [h-DQN]
    • [Stochastic Neural Networks]
  • [Multi-agent RL]
  • [Relational RL]
  • [Learning to Learn, a.k.a. Meta-Learning]
    • [Few/One/Zero-shot Learning]
      • [MAML]
    • [Transfer and Multi-Task Learning]
    • [Learning to Optimize]
    • [Learning to Reinforcement Learn]
    • [Learning Combinatorial Optimization]
    • [AutoML]

RL Games


  • Chinook (1997; 2007) for checkers
  • Deep Blue (2002) for chess
  • Logistello (1999) for Othello
  • TD-Gammon (1994) for backgammon
  • GIB (2001) for contract bridge
  • MoHex (2017) for Hex
  • DQN (2016; 2018) for Atari 2600 games
  • AlphaGo (2016a) and AlphaGo Zero (2017) for Go
  • Alpha Zero (2017) for chess, shogi, and Go
  • Cepheus (2015), DeepStack (2017), and Libratus (2017a;b) for heads-up Texas Hold’em poker
  • Jaderberg et al. (2018) for Quake III Arena Capture the Flag
  • OpenAI Five for Dota 2 at 5v5 (https://openai.com/five/)
  • Zambaldi et al. (2018), Sun et al. (2018), and Pang et al. (2018) for StarCraft II


  • [Board Games]
    • [Computer Go]
    • [AlphaGo: Training pipeline with MCTS]
    • [AlphaGo Zero]
    • [Alpha Zero]
  • [Card Games]
    • [DeepStack]
  • [Video Games]
    • [Atari 2600 games]
    • [StarCraft]
    • [StarCraft II mini-games]
    • [Quake III Arena]
    • [Minecraft]
    • [Super Smash Bros]
    • [Doom]
    • [ViZDoom]

DRL applied to Robotics


  • [Sim-to-Real]
    • [MuJoCo]
  • [Imitation Learning]
  • [Value-based Learning]
  • [Policy-based Learning]
  • [Model-based Learning]
  • [Autonomous Driving Vehicles]

DRL applied to NLP


  • [Sequence Generation]
  • [Machine Translation]
  • [Dialogue Systems]

DRL applied to Vision


  • [Recognition]
  • [Motion Analysis]
  • [Scene Understanding]
  • [Vision + NLP]
  • [Visual Control]
  • [Interactive Perception]




Maintainer

Gopala KR / @gopala-kr

