Auto-Instruct

This is the repository for Auto-Instruct, an automatic method for generating and selecting instructions to prompt large language models (LLMs). Our method leverages the inherent generative ability of LLMs to produce diverse candidate instructions for a given task, then ranks them with a scoring model trained on 575 existing NLP tasks. In experiments on 118 out-of-domain tasks, Auto-Instruct surpasses both human-written instructions and existing baselines of LLM-generated instructions. For more details, please refer to our paper "Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models" in Findings of EMNLP 2023.

[Figure: the Auto-Instruct pipeline]
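
As a rough illustration of the generation stage, the sketch below prompts an LLM with a meta-prompt to sample diverse candidate instructions. The meta-prompt wording and the use of gpt-3.5-turbo (a stand-in for the paper's text-davinci-003) are illustrative assumptions, not this repository's actual code:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

meta_prompt = (
    "You are given an example of an NLP task.\n"
    "Write one clear instruction telling a model how to perform this task.\n\n"
    "Example input: 'The movie was fantastic!' -> Example output: 'positive'\n\n"
    "Instruction:"
)

# Sample several candidates at a nonzero temperature to encourage diversity.
candidates = [
    client.chat.completions.create(
        model="gpt-3.5-turbo",  # stand-in for the paper's text-davinci-003
        messages=[{"role": "user", "content": meta_prompt}],
        temperature=0.9,
    ).choices[0].message.content
    for _ in range(3)
]
print(candidates)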

The repository includes the following contents:

  • data: the training / testing data files, meta-prompts, downstream prompts, and generated instructions.
  • GPT-3/optimization: the source code for data processing, model training, and model evaluation.
    • instruction_generation_templates: templates for creating the meta-prompts of each task (used for instruction generation)
    • instruction_generation: scripts for instruction generation
    • instruction_labeling: scripts for labeling the instructions for training / testing, as well as dataset pre-processing
    • run.py: entry point for model training / testing; see GPT-3/optimization/README.md (a hedged ranking sketch follows this list)
    • evaluation: evaluation scripts for the ranking model
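
For intuition about what the ranking model does at test time, here is a minimal sketch that scores each candidate instruction with a FlanT5-style model and keeps the argmax. The model name, prompt format, and the use of a "yes"-token logit as the score are illustrative assumptions; the released checkpoints and run.py define the actual interface:

import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")
ranker = T5ForConditionalGeneration.from_pretrained("google/flan-t5-large")

def score_instruction(instruction, example_input):
    # Score one candidate instruction on one task example. The score here
    # is the logit of the token "yes" at the first decoding step -- an
    # illustrative stand-in, not necessarily the released model's head.
    prompt = (f"Instruction: {instruction}\n"
              f"Input: {example_input}\n"
              "Is this instruction helpful for the task?")
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    start = torch.tensor([[ranker.config.decoder_start_token_id]])
    with torch.no_grad():
        logits = ranker(**inputs, decoder_input_ids=start).logits
    yes_id = tokenizer("yes", add_special_tokens=False).input_ids[0]
    return logits[0, -1, yes_id].item()

candidates = [
    "Summarize the passage in one sentence.",
    "Write a short summary of the text below.",
]
best = max(candidates, key=lambda c: score_instruction(c, "The quick brown fox ..."))
print("Selected instruction:", best)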

Environment

pip install -r requirements.txt

Checkpoints

Checkpoints of instruction ranking models:

  • Trained on instructions generated by text-davinci-003 under the few-shot setting: checkpoint
  • Trained on instructions generated by text-davinci-003 under the zero-shot setting: checkpoint
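
To use a downloaded checkpoint, something like the following should work, assuming the checkpoints are saved as standard Hugging Face FlanT5 directories (the format and path below are assumptions; run.py defines the actual loading logic):

from transformers import T5ForConditionalGeneration, T5Tokenizer

# "path/to/checkpoint" is a placeholder for the unpacked download.
ranker = T5ForConditionalGeneration.from_pretrained("path/to/checkpoint")
tokenizer = T5Tokenizer.from_pretrained("path/to/checkpoint")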

Citation

If you find our work useful, please kindly cite our paper:

@inproceedings{Auto-Instruct,
  author    = {Zhihan Zhang and
               Shuohang Wang and
               Wenhao Yu and
               Yichong Xu and
               Dan Iter and
               Qingkai Zeng and
               Yang Liu and
               Chenguang Zhu and
               Meng Jiang},
  title     = {Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models},
  booktitle = {Findings of the 2023 Conference on Empirical Methods in Natural Language Processing, {EMNLP} 2023, Singapore, December 6-10, 2023},
  publisher = {Association for Computational Linguistics},
  year      = {2023},
  url       = {https://doi.org/10.48550/arXiv.2310.13127}
}


Issues

gpt-utils.py

Hello, how can I use my own OpenAI API key in gpt-utils.py? I always get HTTPError: 500 in the result.
When I replace openai.api in the "Authorization: Bearer" line of the script with my own key, the response still says "You didn't provide an API key. You need to provide your API key in an Authorization header".
How can I deal with this?
Thank you very much!
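
For reference, a minimal sketch of a correctly formed request, assuming gpt-utils.py calls the OpenAI HTTP API directly via requests (the endpoint and payload below are illustrative; check the script for the ones it actually uses):

import requests

API_KEY = "sk-..."  # your own key; keep it out of committed code

response = requests.post(
    "https://api.openai.com/v1/completions",
    headers={
        # The "Bearer " prefix before the key is required, with a space.
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={"model": "text-davinci-003", "prompt": "Hello", "max_tokens": 16},
)
print(response.status_code, response.json())

If the error persists, the header may be overwritten elsewhere in the script, or the key string may contain stray quotes or whitespace.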

Instruction generation open-sourced.

Hello, I am very interested in your work. Regarding the instruction generation part in Section 4.1, could the instructions generated by the black-box LLM be made public? I would be very grateful for a reply to my question.

Training Data for The Ranking FlanT5

Hello,

Thank you very much, and congratulations on your well-written code and paper.
I have a few questions regarding the training data of FlanT5:

  1. As I understand it, you classified the tasks using the 'category' field in each 'task_metadata.json' file, found at 'data/niv2_english/TASK_NAME/task_metadata.json', resulting in 575 tasks for training. Is this correct?
  2. For each task, you sampled 400 examples (pairs of candidate instructions, x, y, plus ROUGE-L), following the filters outlined in Appendix A. You mention a total of 122K training examples, but 400 examples for each of the 575 tasks would amount to 400 × 575 = 230K. Are there tasks with significantly fewer than 400 examples? (A counting sketch follows below.)
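
A quick way to check this yourself, sketched under the assumption that each task directory holds a per-task training file (the file name "train_examples.json" is hypothetical; substitute the one this repository actually uses):

import json
from pathlib import Path

root = Path("data/niv2_english")
total, short_tasks = 0, []
for meta_path in sorted(root.glob("*/task_metadata.json")):
    # "train_examples.json" is a hypothetical file name; substitute the
    # actual per-task training file used by this repository.
    examples_path = meta_path.parent / "train_examples.json"
    if not examples_path.exists():
        continue
    examples = json.loads(examples_path.read_text())
    total += len(examples)
    if len(examples) < 400:
        short_tasks.append((meta_path.parent.name, len(examples)))

print(f"total examples: {total}")  # reported: ~122K, vs. 400 * 575 = 230K
print(f"tasks with fewer than 400 examples: {len(short_tasks)}")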

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.