Lookahead-bounded Q-learning

Authors: Ibrahim El Shar and Daniel Jiang

This is the source code for our paper "Lookahead-bounded Q-learning" published at ICML 2020.

Description

We propose a new provably convergent variant of Q-learning that leverages upper and lower bounds derived using information relaxation techniques to improve the performance of standard Q-learning.

[Figure: Illustration of the LBQL algorithm at iteration n]
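To make the role of the bounds concrete, here is a minimal sketch of a tabular Q-learning step whose result is then clipped to lie between a lower and an upper bound. It is not the code in this repo: the function name `bounded_q_update`, the list-of-lists tables, and the default step size and discount factor are assumptions for illustration, and the bounds are simply given as inputs rather than derived with information relaxation.

```python
def bounded_q_update(Q, lower, upper, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One Q-learning step for (s, a), then clip the result to [lower, upper].

    Q, lower and upper are lists of lists indexed as table[state][action].
    All names and defaults here are illustrative assumptions.
    """
    # Standard Q-learning target: reward plus discounted best next value.
    target = r + gamma * max(Q[s_next])
    q_new = Q[s][a] + alpha * (target - Q[s][a])
    # Keep the new estimate between the two bounds for this (s, a) pair.
    Q[s][a] = min(max(q_new, lower[s][a]), upper[s][a])
    return Q[s][a]


# Toy usage: two states, two actions, bounds chosen by hand.
Q = [[0.0, 0.0], [0.0, 0.0]]
lower = [[-1.0, -1.0], [-1.0, -1.0]]
upper = [[0.5, 0.5], [0.5, 0.5]]
value = bounded_q_update(Q, lower, upper, s=0, a=1, r=10.0, s_next=1)
```

In this toy call the unclipped update would overshoot the upper bound, so the stored value is held at the bound instead.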

Citation

You can use the following BibTeX entry to cite our paper:

  @inproceedings{elshar2020lookahead,
    title = {Lookahead-bounded Q-learning},
    author = {Ibrahim El Shar and Daniel Jiang},
    booktitle = {Proceedings of the 37th International Conference on Machine Learning (ICML)},
    year = {2020},
    address = {Vienna, Austria}
  }

Installation

The code was tested with Python 3.6.

Clone this repo:

	git clone https://github.com/ibrahim-elshar/LBQL_ICML2020.git

Set up a working Python environment, e.g. using Anaconda.

Install the packages listed in requirements.txt.

Instructions

There are five environments, each in its own folder inside src: Windy Gridworld (WG), Stormy Gridworld (SG), Repositioning for a 2-station car-sharing platform (2-CS-R), Pricing for a 2-station car-sharing platform (2-CS), and Pricing for a 4-station car-sharing platform (4-CS).

Each folder contains:

  • An environment file, e.g. carsharing.py for 2-CS. Running this file produces the optimal Q-values file Qstar.pkl, if applicable.
  • agents.py, which contains the code for the QL, Double-QL, SQL, BCQL and LBQL algorithms.
  • run.py, which re-runs the experiments for the environment and reproduces the performance and relative error plots.
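As a rough illustration of how a saved Qstar.pkl could be used, the sketch below loads a pickle file and computes a relative error against it. The layout of the real Qstar.pkl and the exact error metric used in run.py are not documented here, so the nested-list table, the helper names `load_pickle` and `relative_error`, and the 2-norm ratio are all assumptions. The example writes its own toy file to a temporary folder rather than reading a file from this repo.

```python
import math
import os
import pickle
import tempfile


def load_pickle(path):
    """Load and return whatever object was pickled at `path`."""
    with open(path, "rb") as f:
        return pickle.load(f)


def relative_error(Q, Q_star):
    """Return ||Q - Q*||_2 / ||Q*||_2 over all (state, action) entries.

    Assumes both tables are equally shaped lists of lists (an assumption,
    not the documented format of Qstar.pkl).
    """
    diff = sum((q - qs) ** 2
               for row, row_s in zip(Q, Q_star)
               for q, qs in zip(row, row_s))
    norm = sum(qs ** 2 for row_s in Q_star for qs in row_s)
    return math.sqrt(diff / norm)


# Toy usage: write a stand-in "optimal" table, load it back, compare.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "Qstar.pkl")
    with open(path, "wb") as f:
        pickle.dump([[3.0, 4.0]], f)
    Q_star = load_pickle(path)

err = relative_error([[3.0, 0.0]], Q_star)
```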

Hyperparameters for an algorithm can be set by changing the default parameters of the corresponding class in agents.py.
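For readers unfamiliar with the pattern, the sketch below shows what a class with default hyperparameters can look like. `ToyAgent`, its parameter names, and its default values are hypothetical and are not taken from agents.py; check the actual class definitions there before editing anything.

```python
class ToyAgent:
    """Hypothetical stand-in for an agent class; not copied from agents.py."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability


# Relies on the defaults written in the class definition.
default_agent = ToyAgent()

# Overrides one value at construction time, leaving the others unchanged.
tuned_agent = ToyAgent(alpha=0.05)
```

Editing a default in the class definition changes every run that does not pass that argument explicitly, which is the approach this README describes.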

Running python agents.py produces the LBQL vs. QL bounds plots.

To rerun the experiments for an environment, cd to the environment folder first, then run:

$ python run.py
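If you want to rerun several environments in one go, the manual cd-then-run step could be scripted along the following lines. This is a sketch, not a script shipped with the repo: the helper names `find_run_folders` and `run_all` are made up, and it only assumes that each environment folder directly contains a run.py.

```python
import os
import subprocess
import sys


def find_run_folders(src_root):
    """Return the sorted subfolders of src_root that contain a run.py file."""
    folders = []
    for name in sorted(os.listdir(src_root)):
        folder = os.path.join(src_root, name)
        if os.path.isfile(os.path.join(folder, "run.py")):
            folders.append(folder)
    return folders


def run_all(src_root):
    """Run `python run.py` inside each environment folder, one at a time."""
    for folder in find_run_folders(src_root):
        # cwd=folder mirrors the manual `cd` into the environment folder.
        subprocess.run([sys.executable, "run.py"], cwd=folder, check=True)
```

Calling `run_all("src")` from the repository root would then rerun each environment in turn, stopping at the first run.py that exits with an error.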
