This project forked from xlang-ai/text2reward

[ICLR 2024] Code for the paper "Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning"

Home Page: https://text-to-reward.github.io/

Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning


Code for the paper Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning. See our project page for more demonstrations and up-to-date related resources.

Updates

  • 2023-10-09: We released our code.
  • 2023-09-20: We released the paper and website for Text2Reward.

Dependencies

To set up the environment, run the following commands in a shell:

# set up conda
conda create -n text2reward python=3.7
conda activate text2reward
# set up ManiSkill2 environment
cd ManiSkill2
pip install -e .
pip install stable-baselines3==1.8.0 wandb tensorboard
cd ..
cd run_maniskill
bash download_data.sh
# set up MetaWorld environment
cd ..
cd Metaworld
pip install -e .
# set up code generation
pip install langchain chromadb==0.4.0

Troubleshooting

  1. If you have not installed MuJoCo yet, follow the instructions here to install it. Then run the following commands to confirm that the installation succeeded:
$ python3
>>> import mujoco_py
  2. If you encounter any of the following errors when running ManiSkill2, please refer to the documentation here.
    • RuntimeError: vk::Instance::enumeratePhysicalDevices: ErrorInitializationFailed
    • Some required Vulkan extension is not present. You may not use the renderer to render, however, CPU resources will be still available.
    • Segmentation fault (core dumped)
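
The import check in item 1 can be extended to the other packages installed in the Dependencies section. The following is a minimal sketch and not part of the repository. The module names listed in it (mani_skill2, metaworld, stable_baselines3, and so on) are assumptions about how each package is imported and may differ across versions.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages installed in the Dependencies section.
REQUIRED = [
    "mujoco_py",
    "mani_skill2",
    "metaworld",
    "stable_baselines3",
    "langchain",
    "chromadb",
]

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only looks the modules up without importing them, so it does not trigger the Vulkan or MuJoCo errors listed above; it only tells you whether an installation step was skipped.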

Usage

Reproduce

To reproduce our experimental results, run the following scripts:

ManiSkill2:

bash run_oracle.sh
bash run_zero_shot.sh
bash run_few_shot.sh

It's normal to encounter the following warnings:

[svulkan2] [error] GLFW error: X11: The DISPLAY environment variable is missing
[svulkan2] [warning] Continue without GLFW.

MetaWorld:

bash run_oracle.sh
bash run_zero_shot.sh

Generate new reward code

First, add the following environment variable to your .bashrc (or .zshrc, etc.):

export PYTHONPATH=$PYTHONPATH:~/path/to/text2reward

Then navigate to the directory text2reward/code_generation/single_flow and run the following scripts:

# generate reward code for ManiSkill2
bash run_maniskill_zeroshot.sh
bash run_maniskill_fewshot.sh
# generate reward code for MetaWorld
bash run_metaworld_zeroshot.sh
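
Before running the generation scripts, it can help to confirm that the PYTHONPATH entry from the export line is visible in the current shell. The following is a small sketch, not part of the repository. The helper name on_pythonpath is hypothetical, and ~/path/to/text2reward is the placeholder from the export line, to be replaced with your own checkout location.

```python
import os


def on_pythonpath(repo_root, pythonpath):
    """Return True if `repo_root` is one of the entries in `pythonpath`."""
    target = os.path.realpath(os.path.expanduser(repo_root))
    # Skip empty entries such as the one left by a leading ':'.
    entries = [p for p in pythonpath.split(os.pathsep) if p]
    return any(
        os.path.realpath(os.path.expanduser(p)) == target for p in entries
    )


if __name__ == "__main__":
    # Placeholder path from the export line; replace with your checkout.
    repo = "~/path/to/text2reward"
    print(on_pythonpath(repo, os.environ.get("PYTHONPATH", "")))
```

If this prints False in a new terminal, the shell configuration file was probably not reloaded after the export line was added.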

Run new experiment

By default, the run_oracle.sh scripts above use the expert-written rewards provided by the environment, while the run_zero_shot.sh and run_few_shot.sh scripts use the generated rewards from our experiments. To run a new experiment with your own reward, follow the same bash scripts and change the --reward_path parameter to the path of your reward file.
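
Because generated reward code may contain syntax errors, it can be worth checking a reward file before passing it to --reward_path. The following is a minimal sketch, not part of the repository. It assumes the reward file is plain Python source, and the helper name reward_functions is hypothetical. It does not check that the file matches the interface the training scripts expect, which is not described here.

```python
import ast
from pathlib import Path


def reward_functions(reward_path):
    """Parse a reward file and return the names of the functions it defines.

    Raises FileNotFoundError if the file is missing and SyntaxError if the
    file is not valid Python.
    """
    source = Path(reward_path).read_text(encoding="utf-8")
    tree = ast.parse(source, filename=str(reward_path))
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]


if __name__ == "__main__":
    import sys

    # Usage: python check_reward.py path/to/your_reward.py
    print(reward_functions(sys.argv[1]))
```

The file is only parsed, never executed, so the check is safe to run on untrusted generated code.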

Citation

If you find our work helpful, please cite us:

@article{text2reward,
  title={Text2Reward: Automated Dense Reward Function Generation for Reinforcement Learning},
  author={Xie, Tianbao and Zhao, Siheng and Wu, Chen Henry and Liu, Yitao and Luo, Qian and Zhong, Victor and Yang, Yanchao and Yu, Tao},
  journal={arXiv preprint arXiv:2309.11489},
  year={2023}
}
