Adjustable Robust Reinforcement Learning for Online 3D Bin Packing

Introduction

This is the official PyTorch implementation of the paper "Adjustable Robust Reinforcement Learning for Online 3D Bin Packing". The paper introduces the AR2L framework, which accounts for both the average performance and the worst-case performance of a packing policy. A policy trained with this framework is more robust while still maintaining acceptable performance in nominal cases. Each training iteration alternates between training the packing policy, the permutation-based attacker, and the mixture-dynamics model. All three are trained with the PPO algorithm, and the packing policy is built on the PCT algorithm. A video demonstration is available via the YouTube link.
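The alternating schedule described above can be pictured with a minimal sketch. It is not the repository's training code: the `Learner` class, the `ppo_update` placeholder, and the update order within an iteration are all assumptions made for illustration, and no real PPO step is performed.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Learner:
    """Hypothetical stand-in for one trainable component."""
    name: str
    updates: int = 0

    def ppo_update(self) -> None:
        # Placeholder: a real step would collect rollouts and run PPO here.
        self.updates += 1


def run_alternating_training(num_iterations: int) -> List[Tuple[int, str]]:
    """Update each component once per iteration and record the order."""
    learners = [
        Learner("packing_policy"),
        Learner("permutation_attacker"),
        Learner("mixture_dynamics"),
    ]
    schedule = []
    for iteration in range(num_iterations):
        # The order of the three updates is an assumption of this sketch.
        for learner in learners:
            learner.ppo_update()
            schedule.append((iteration, learner.name))
    return schedule


if __name__ == "__main__":
    for step in run_alternating_training(2):
        print(step)
```

The sketch only shows the control flow: every iteration touches all three components once before the next iteration begins.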

Dependencies

Before training, install the required packages:

pip install -r requirements.txt
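After installation, a quick check can confirm that the key packages are importable. The sketch below is a hedged example: the helper name `missing_packages` is hypothetical, and `torch` is checked only because this is a PyTorch implementation; the full contents of requirements.txt are not assumed here.

```python
import importlib.util
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == "__main__":
    # "torch" is the only package name taken from this README.
    missing = missing_packages(["torch"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
```

An empty result means every listed module can be located; it does not verify that the installed versions match requirements.txt.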

Training

The packing policy can observe a varying number of next boxes (NNB), and its robustness can be adjusted by tuning the hyperparameter alpha. Each training script takes NNB as its first argument and alpha as its second, as in the configurations below.

Environment: discrete, NNB=5, alpha=1.0

bash scripts train_disc.sh 5 1.0

Environment: discrete, NNB=10, alpha=1.2

bash scripts train_disc.sh 10 1.2

Environment: discrete, NNB=15, alpha=1.3

bash scripts train_disc.sh 15 1.3

Environment: discrete, NNB=20, alpha=1.0

bash scripts train_disc.sh 20 1.0

Environment: continuous, NNB=5, alpha=1.0

bash scripts train_cont.sh 5 1.0

Environment: continuous, NNB=10, alpha=1.0

bash scripts train_cont.sh 10 1.0

Environment: continuous, NNB=15, alpha=1.0

bash scripts train_cont.sh 15 1.0

Environment: continuous, NNB=20, alpha=1.0

bash scripts train_cont.sh 20 1.0

Validation

To select an effective AR2L packing policy, you can evaluate various packing policies with and without the permutation-based attacker.

bash val_disc.sh [NNB] [path to the parent directory where all the models are saved] load_adv

example: bash val_disc.sh 5 ./logs/experiment/timeStr load_adv

bash val_disc.sh [NNB] [path to the parent directory where all the models are saved] not_load_adv

example: bash val_disc.sh 5 ./logs/experiment/timeStr not_load_adv

After validation, sum the space utilization under the nominal dynamics (not_load_adv) and the space utilization under the worst-case dynamics (load_adv) for each model, then choose the model with the highest total.
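The selection rule can be sketched in a few lines of Python. This is an illustrative helper, not part of the repository: the function name `select_best_model`, the model names, and the utilization values are made-up placeholders, and you would fill the two dictionaries with the numbers reported by your own validation runs.

```python
from typing import Dict, Tuple


def select_best_model(
    nominal: Dict[str, float], worst_case: Dict[str, float]
) -> Tuple[str, float]:
    """Return the model with the highest nominal + worst-case utilization."""
    common = nominal.keys() & worst_case.keys()
    if not common:
        raise ValueError("no model has both nominal and worst-case results")
    scores = {name: nominal[name] + worst_case[name] for name in common}
    # Sorting first makes ties resolve to the alphabetically first name.
    best = max(sorted(scores), key=scores.get)
    return best, scores[best]


if __name__ == "__main__":
    # Placeholder values for illustration only, not real results.
    nominal = {"model_a": 0.70, "model_b": 0.72}      # not_load_adv runs
    worst_case = {"model_a": 0.60, "model_b": 0.55}   # load_adv runs
    name, total = select_best_model(nominal, worst_case)
    print(name, round(total, 4))
```

Models that appear in only one of the two dictionaries are ignored, since the sum needs both a nominal and a worst-case measurement.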

Evaluation

You can evaluate the selected packing policy in various settings.

bash eval_disc.sh [NNB] [path to the BPP model] [path to the adv model] load_adv

example: bash eval_disc.sh 5 ./logs/experiment/timeStr/BPP-subtimeStr.pt ./logs/experiment/timeStr/Adv-subtimeStr.pt load_adv

bash eval_disc.sh [NNB] [path to the BPP model] [path to the adv model] not_load_adv

example: bash eval_disc.sh 5 ./logs/experiment/timeStr/BPP-subtimeStr.pt ./logs/experiment/timeStr/Adv-subtimeStr.pt not_load_adv

Acknowledgement

We thank the anonymous reviewers, (S)ACs, and PCs of NeurIPS 2023 for their insightful comments, which helped improve our paper, and for their service to the community. We also thank the authors of PCT for providing their highly valuable implementation, and the authors of the PPO PyTorch implementation.

Citation

@inproceedings{
pan2023adjustable,
title={Adjustable Robust Reinforcement Learning for Online 3D Bin Packing},
author={Yuxin Pan and Yize Chen and Fangzhen Lin},
booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
year={2023},
url={https://openreview.net/forum?id=1mdTYi1jAW}
}

License

This source code is provided solely for academic use. Please refrain from using it for commercial purposes without obtaining proper authorization from the author.
