Ranked Reward Reinforcement Learning (Bin Packing Problem)

This contains experimental code for the Bin Packing Problem with PPO and ranked reward MCTS methods. Bin packing problem is a combinatorial optimization problem consisting of a set of bins of varied sizes to be filled into a container. There are multiple versions of the bin packing problem. This repository considers the offline problem where we have a fixed size container and all bins to be fit into the container are visible throughout. The goal is to fit the maximum number of bins possible into the container. Currently it implements a 2D problem (which can be extended) where the state space consists of the features of all items {id, size and placement(if already placed)} and a set of feasible actions. There are constraints on the actions as the items cannot be placed when not supported below or go beyond the container. The actions available are based on the left over items to be filled and valid placements in the container. The problem sets are generated using a method similar to that mentioned in the ranked reward paper. This considers a discrete action space. At each step, the agent selects an item and its placement in the container. The reward is +1 for fitting all the items in the container, -1 when no actions are feasible and there are items remaining to be filled and 0 otherwise. An episode either ends when all items are placed or no more items can be placed.

The policy and value network architecture is based on Ranked Reward. There are assumptions made about the architecture as the exact architecture is not mentioned in the paper. Currently there are two implementations:

using a PPO agent
using the ranked reward method

Both the methods have a similar policy and value network architecture but the ranked reward method uses MCTS.

Installation

git clone https://github.com/shiveshkhaitan/ranked_reward_rl
cd ranked_reward_rl
conda env create -f environment.yml
conda activate ranked_reward

shiveshkhaitan / ranked_reward_rl Goto Github PK

ranked_reward_rl's Introduction

Ranked Reward Reinforcement Learning (Bin Packing Problem)

Installation

Example placement using Ranked Reward

ranked_reward_rl's People

Contributors

Stargazers

Watchers

Forkers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent