Ranked Reward Reinforcement Learning (Bin Packing Problem)

This repository contains experimental code for the Bin Packing Problem using PPO and ranked reward MCTS methods. The bin packing problem is a combinatorial optimization problem in which a set of items of varied sizes must be packed into a container. There are multiple versions of the problem. This repository considers the offline version, where the container has a fixed size and all items to be packed are visible throughout the episode. The goal is to fit as many items as possible into the container.

Currently a 2D problem is implemented (which can be extended). The state space consists of the features of all items {id, size, and placement (if already placed)} and the set of feasible actions. Actions are constrained: an item cannot be placed if it is not supported from below or if it extends beyond the container. The available actions depend on the items still to be placed and on the valid placements in the container. The problem sets are generated using a method similar to the one described in the ranked reward paper.

The action space is discrete. At each step, the agent selects an item and its placement in the container. The reward is +1 when all items have been fitted into the container, -1 when no action is feasible and items remain to be placed, and 0 otherwise. An episode ends when all items are placed or when no more items can be placed.
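The constraints and reward described above can be sketched in a few lines. This is only an illustrative sketch, not the repository's environment code: the occupancy-grid representation, the names `is_feasible`, `feasible_actions`, and `step`, and the rule that an item must be fully supported along its bottom edge are all assumptions made for the example.

```python
from typing import List, Tuple

import numpy as np

Item = Tuple[int, int]         # (width, height); hypothetical representation
Action = Tuple[int, int, int]  # (item index, x, y) of the item's bottom-left cell


def is_feasible(grid: np.ndarray, item: Item, x: int, y: int) -> bool:
    """Check bounds, overlap, and support for placing `item` at (x, y).

    `grid` is a boolean occupancy grid indexed as grid[row, col], row 0 = floor.
    """
    height, width = grid.shape
    w, h = item
    if x < 0 or y < 0 or x + w > width or y + h > height:
        return False  # extends beyond the container
    if grid[y:y + h, x:x + w].any():
        return False  # overlaps an already placed item
    if y == 0:
        return True   # rests on the container floor
    # Assumed support rule: every cell directly below the item is occupied.
    return bool(grid[y - 1, x:x + w].all())


def feasible_actions(grid: np.ndarray, items: List[Item],
                     placed: List[bool]) -> List[Action]:
    """Enumerate every (item, x, y) placement that is currently allowed."""
    height, width = grid.shape
    actions = []
    for i, item in enumerate(items):
        if placed[i]:
            continue
        for y in range(height):
            for x in range(width):
                if is_feasible(grid, item, x, y):
                    actions.append((i, x, y))
    return actions


def step(grid: np.ndarray, items: List[Item], placed: List[bool],
         action: Action) -> Tuple[int, bool]:
    """Apply a feasible action in place and return (reward, done)."""
    i, x, y = action
    w, h = items[i]
    grid[y:y + h, x:x + w] = True
    placed[i] = True
    if all(placed):
        return 1, True    # every item fits
    if not feasible_actions(grid, items, placed):
        return -1, True   # items remain but nothing can be placed
    return 0, False
```

`step` assumes the action was taken from `feasible_actions`; it does not re-validate it. Enumerating every grid cell is the simplest way to build the discrete action set, at the cost of scaling with the container area times the number of remaining items.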

The policy and value network architecture is based on the Ranked Reward paper. Because the paper does not specify the exact architecture, some assumptions were made about it. Currently there are two implementations:

  1. using a PPO agent
  2. using the ranked reward method

Both methods have a similar policy and value network architecture, but the ranked reward method also uses MCTS.
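For readers unfamiliar with ranked rewards, the general idea is to compare each episode's score against a percentile of the agent's own recent scores and to train on the resulting binary signal. The sketch below illustrates that idea only. The class name `RankedRewardBuffer`, the buffer capacity, the percentile, the random tie-breaking, and the notion of a numeric per-episode score are assumptions chosen for the example, not values or code taken from this repository.

```python
import random
from collections import deque

import numpy as np


class RankedRewardBuffer:
    """Keeps recent episode scores and ranks new ones against a percentile."""

    def __init__(self, capacity: int = 250, percentile: float = 75.0):
        # Both defaults are illustrative assumptions.
        self.scores = deque(maxlen=capacity)
        self.percentile = percentile

    def add(self, score: float) -> None:
        self.scores.append(score)

    def threshold(self) -> float:
        """Percentile of the stored scores; requires a non-empty buffer."""
        return float(np.percentile(list(self.scores), self.percentile))

    def rank(self, score: float, rng=random) -> int:
        """Map a score to +1 or -1 relative to the current threshold."""
        if not self.scores:
            return rng.choice([1, -1])  # nothing to compare against yet
        t = self.threshold()
        if score > t:
            return 1
        if score < t:
            return -1
        return rng.choice([1, -1])      # tie: break randomly (assumed rule)


buffer = RankedRewardBuffer(capacity=100, percentile=75.0)
for s in range(1, 11):
    buffer.add(float(s))
ranked = buffer.rank(9.0)  # compared against the 75th percentile of 1..10
```

Because the threshold moves with the agent's own recent performance, an episode only earns +1 when it does better than most of the agent's recent attempts.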

Installation

git clone https://github.com/shiveshkhaitan/ranked_reward_rl
cd ranked_reward_rl
conda env create -f environment.yml
conda activate ranked_reward

Example placement using Ranked Reward
