Auto-Instruct

This is the repository for Auto-Instruct, an automatic method for generating and selecting instructions to prompt large language models (LLMs). Our method leverages the inherent generative ability of LLMs to produce diverse candidate instructions for a given task, then ranks them with a scoring model trained on 575 existing NLP tasks. In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions. For more details, please refer to our paper "Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models" in Findings of EMNLP 2023.

[Figure: the Auto-Instruct pipeline]
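
As a rough illustration of the generation stage, the sketch below prompts an LLM with a meta-prompt to sample diverse candidate instructions. The meta-prompt wording and the use of gpt-3.5-turbo (a stand-in for the paper's text-davinci-003) are illustrative assumptions, not this repository's actual code:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

meta_prompt = (
    "You are given an example of an NLP task.\n"
    "Write one clear instruction telling a model how to perform this task.\n\n"
    "Example input: 'The movie was fantastic!' -> Example output: 'positive'\n\n"
    "Instruction:"
)

# Sample several candidates at a nonzero temperature to encourage diversity.
candidates = [
    client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for the paper's text-davinci-003
        messages=[{"role": "user", "content": meta_prompt}],
        temperature=0.9,
    ).choices[0].message.content
    for _ in range(3)
]
print(candidates)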

The repository includes the following contents:

  • data: the training / testing data files, meta-prompts, downstream prompts, and generated instructions.
  • GPT-3/optimization: the source code for data processing, model training, and model evaluation.
    • instruction_generation_templates: templates for creating the meta-prompts of each task (used for instruction generation)
    • instruction_generation: scripts for instruction generation
    • instruction_labeling: scripts for labeling the instructions for training / testing, as well as dataset pre-processing
    • run.py: entry point for model training / testing; see GPT-3/optimization/README.md (a hedged ranking sketch follows this list)
    • evaluation: evaluation scripts for the ranking model
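
For intuition about what the ranking model does at test time, here is a minimal sketch that scores each candidate instruction with a FlanT5-style model and keeps the argmax. The model name, prompt format, and the use of a "yes"-token logit as the score are illustrative assumptions; the released checkpoints and run.py define the actual interface:

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
ranker = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large")

def score_instruction(instruction, example_input):
    # Score one candidate instruction on one task example. The score here
    # is the logit of the token "yes" at the first decoding step -- an
    # illustrative stand-in, not necessarily the released model's head.
    prompt = (f"Instruction: {instruction}\n"
              f"Input: {example_input}\n"
              "Is this instruction helpful for the task?")
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    start = torch.tensor([[ranker.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = ranker(**inputs, decoder_input_ids=start).logits
    yes_id = tokenizer("yes", add_special_tokens=False).input_ids[0]
    return logits[0, -1, yes_id].item()

candidates = [
    "Summarize the passage in one sentence.",
    "Write a short summary of the text below.",
]
best = max(candidates, key=lambda c: score_instruction(c, "The quick brown fox ..."))
print("Selected instruction:", best)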

Environment

pip install -r requirements.txt

Checkpoints

Checkpoints of instruction ranking models:

  • Trained on instructions generated by text-davinci-003 under the few-shot setting: checkpoint
  • Trained on instructions generated by text-davinci-003 under the zero-shot setting: checkpoint
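
To use a downloaded checkpoint, something like the following should work, assuming the checkpoints are saved as standard Hugging Face FlanT5 directories (the format and path below are assumptions; run.py defines the actual loading logic):

from transformers import T5ForConditionalGeneration, T5Tokenizer

# "path/to/checkpoint" is a placeholder for the unpacked download.
ranker = T5ForConditionalGeneration.from_pretrained("path/to/checkpoint")
tokenizer = T5Tokenizer.from_pretrained("path/to/checkpoint")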

Citation

If you find our work useful, please kindly cite our paper:

@inproceedings{Auto-Instruct,
  author    = {Zhihan Zhang and
               Shuohang Wang and
               Wenhao Yu and
               Yichong Xu and
               Dan Iter and
               Qingkai Zeng and
               Yang Liu and
               Chenguang Zhu and
               Meng Jiang},
  title     = {Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models},
  booktitle = {Findings of the 2023 Conference on Empirical Methods in Natural Language Processing, {EMNLP} 2023, Singapore, December 6-10, 2023},
  publisher = {Association for Computational Linguistics},
  year      = {2023},
  url       = {https://doi.org/10.48550/arXiv.2310.13127}
}


Issues

gpt-utils.py

Hello, how can I use my own OpenAI API key in gpt-utils.py? I always get HTTPError: 500 in the result.
When I replace openai.api in the "Authorization: Bearer" line of the script with my own key, the response still says "You didn't provide an API key. You need to provide your API key in an Authorization header".
How can I deal with this?
Thank you very much!
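
For reference, a minimal sketch of a correctly formed request, assuming gpt-utils.py calls the OpenAI HTTP API directly via requests (the endpoint and payload below are illustrative; check the script for the ones it actually uses):

import requests

API_KEY = "sk-..."  # your own key; keep it out of committed code

response = requests.post(
    "https://api.openai.com/v1/completions",
    headers={
        # The "Bearer " prefix before the key is required, with a space.
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={"model": "text-davinci-003", "prompt": "Hello", "max_tokens": 16},
)
print(response.status_code, response.json())

If the error persists, the header may be overwritten elsewhere in the script, or the key string may contain stray quotes or whitespace.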

Instruction generation open-sourced.

Hello, I am very interested in your work. Regarding the instruction generation part in Section 4.1, could the instructions generated by the black-box LLM be made public? I would be very grateful for a reply to my question.

Training Data for The Ranking FlanT5

Hello,

Thank you very much, and congratulations on your well-written code and paper.
I have a few questions regarding the training data of FlanT5:

  1. As I understand it, you classified the tasks using the 'category' field in each 'task_metadata.json' file, found at 'data/niv2_english/TASK_NAME/task_metadata.json', resulting in 575 tasks for training. Is this correct?
  2. For each task, you sampled 400 examples (pairs of candidate instructions, x, y, plus ROUGE-L), following the filters outlined in Appendix A. You mention a total of 122K training examples, but 400 examples for each of the 575 tasks would amount to 400 × 575 = 230K. Are there tasks with significantly fewer than 400 examples? (A counting sketch follows below.)
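
A quick way to check this yourself, sketched under the assumption that each task directory holds a per-task training file (the file name "train_examples.json" is hypothetical; substitute the one this repository actually uses):

import json
from pathlib import Path

root = Path("data/niv2_english")
total, short_tasks = 0, []
for meta_path in sorted(root.glob("*/task_metadata.json")):
    # "train_examples.json" is a hypothetical file name; substitute the
    # actual per-task training file used by this repository.
    examples_path = meta_path.parent / "train_examples.json"
    if not examples_path.exists():
        continue
    examples = json.loads(examples_path.read_text())
    total += len(examples)
    if len(examples) < 400:
        short_tasks.append((meta_path.parent.name, len(examples)))

print(f"total examples: {total}")  # reported: ~122K, vs. 400 * 575 = 230K
print(f"tasks with fewer than 400 examples: {len(short_tasks)}")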

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.