RLTF: Reinforcement Learning from Unit Test Feedback

This is the official code for the paper RLTF: Reinforcement Learning from Unit Test Feedback.

Installation

The code depends on the packages listed in requirements.txt. Install them by following each library's own instructions, or run:

pip install -r requirements.txt

Datasets

  • APPS: Please follow the downloading and preprocessing instructions provided here.
  • MBPP: The dataset is available here.

Download and unzip all files into the data folder.

Models

Coming soon.

Processes

Supervised Finetune

  • CodeT5: sh script/train_actor_deepspeed.sh
  • CodeGEN: sh script/train_actor_codegen_deepspeed.sh

Generating Programs Online

  • CodeT5: python script/generate_online_parallel.py
  • CodeGEN: python script/generate_codegen_online_parallel.py

Online RL Finetune

Run the online generation for a short period first. Once it has accumulated enough samples, start the online RL finetuning:

  • CodeT5: sh script/train_actor_rl_online_v1_deepspeed.sh
  • CodeGEN: sh script/train_actor_rl_codegen_online_v1_deepspeed.sh

Generate Program, Run Unit Test, Compute pass@k

Generate Program:

  • CodeT5: python script/generate_parallel.py
  • CodeGEN: python script/generate_parallel_codegen.py

Run Unit Test:

  • sh script/run_unit_tests.sh
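To illustrate what running a unit test on a generated program involves, here is a minimal sketch. It is not the code behind script/run_unit_tests.sh. It assumes a standard-input style test case (input text fed on stdin, expected text compared against stdout), and the function name `passes_test` and the 4-second timeout are hypothetical choices made for this example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_test(source: str, stdin_text: str, expected_stdout: str,
                timeout: float = 4.0) -> bool:
    """Run one candidate program on one test case and report pass/fail.

    Hypothetical helper for illustration only: the program is written to a
    temporary file and executed in a separate Python process.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.py"
        path.write_text(source)
        try:
            proc = subprocess.run(
                [sys.executable, str(path)],
                input=stdin_text,
                capture_output=True,
                text=True,
                timeout=timeout,  # assumed limit; treat a hang as a failure
            )
        except subprocess.TimeoutExpired:
            return False
    # A non-zero exit code (runtime or syntax error) counts as a failure.
    if proc.returncode != 0:
        return False
    # Compare outputs while ignoring leading/trailing whitespace.
    return proc.stdout.strip() == expected_stdout.strip()


if __name__ == "__main__":
    print(passes_test("print(int(input()) * 2)", "21\n", "42\n"))
```

Generated programs are untrusted, so a real evaluation would normally run them inside a sandbox with resource limits rather than directly on the host.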

Compute pass@k:

  • python compute_pass_at_k_metric.py
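For reference, pass@k is commonly computed with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n programs are sampled for a problem and c of them pass all unit tests. The sketch below implements that formula. Whether compute_pass_at_k_metric.py uses exactly this estimator is an assumption, and the function names `pass_at_k` and `mean_pass_at_k` are hypothetical.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of sampled programs, c: number that pass all tests,
    k: budget of attempts. Hypothetical helper for illustration.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when fewer than k samples fail, giving pass@k = 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over problems, given (n, c) pairs per problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    if not scores:
        raise ValueError("results must contain at least one problem")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two problems, 10 samples each: 3 passing and 0 passing.
    print(mean_pass_at_k([(10, 3), (10, 0)], k=1))
```

With k = 1 the estimator reduces to c / n, the fraction of sampled programs that pass.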

Citation

If you find the paper or the source code useful for your projects, please cite the following BibTeX entry:

@misc{liu2023rltf,
      title={RLTF: Reinforcement Learning from Unit Test Feedback}, 
      author={Jiate Liu and Yiqin Zhu and Kaiwen Xiao and Qiang Fu and Xiao Han and Wei Yang and Deheng Ye},
      year={2023},
      eprint={2307.04349},
      archivePrefix={arXiv},
      primaryClass={cs.AI}
}

License

The code is released under BSD 3-Clause - see LICENSE.txt for details.

This code builds on other open source projects, including CodeRL, APPS, and transformers. We thank the original contributors of these works for open-sourcing their valuable source code.
