
Rainier: Reinforced Knowledge Introspector

This repo hosts the code for the paper, Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering, presented at EMNLP 2022.

Resources

Model: Our Rainier model is now on the Hugging Face model hub! [policy] [value]

Usage: Please see Rainier's Hugging Face model card

Knowledge: We release the commonsense datasets augmented with Rainier-generated knowledge. You can download the knowledge_rainier.json file from our Google Drive folder.
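If you just want a quick look at the released knowledge, here is a minimal Python sketch for inspecting the file (assuming only that it parses as ordinary JSON; see the file itself for the exact schema):

import json

with open('knowledge_rainier.json') as f:
    knowledge = json.load(f)
# Inspect the top-level structure and how many entries it holds
print(type(knowledge), len(knowledge))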

Setup

Create and activate the Conda environment:

conda env create -f environment.yml
conda activate rainier

Install gsutil.
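One way to do this (an assumption on our part; any install method that puts gsutil on your PATH works) is via pip:

pip install gsutil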

Download model

Download the Rainier model: go to /model/ and run

gdown 1qmxFTENNITA16_54dkqR6pHMDofa3Jee

Alternatively, you can download the rainier-large.pth file from our Google Drive folder and put it under /model/

Download data

Download the UQA data: go to /data/ and run

python download_uqa.py

Download the non-UQA data: go to /data/ and run

gdown 1vfJQnqeRzr9MXPQmtbrAsQUuWZD1bZqF

Alternatively, you can download the non-uqa.zip file from our Google Drive folder, put it under /data/, and unzip it. Make sure the 4 individual folders end up directly under /data/

Running inference

Running inference requires a GPU with at least 22 GB of memory. If that exceeds what you have available, consider parallelizing across multiple GPUs or using a smaller --batch_size.

To run inference with the default setting, go to the /rainier/ directory and run

python main.py --mode eval

This will evaluate the dev split of all seen and unseen datasets, with Rainier-large as the knowledge introspector and UnifiedQA-large as the QA model. You can view the output knowledge in /model/knowledge/ and the inference results in /model/inference/.

Some flags you can set (see the full list in args.py); an example invocation follows the list:

--eval_split [dev|test]     The dataset split you want to evaluate. Some test data does not have gold labels, so we provide utility scripts to convert the inference results to leaderboard submission files.
--eval_tasks [task-list]    Please choose a subset from the full list (which is also the default value): obqa,arc_e,arc_h,ai2sci_e,ai2sci_m,csqa,qasc,piqa,siqa,wg,numersense,riddlesense,quartz,hellaswag. Write your choice as a comma-separated list.
--eval_baseline             Additionally evaluate the no-knowledge baseline.
--eval_ckpt [path]          The path to the Rainier model ckpt. The default value is ../model/rainier-large.pth
--load_from_ckpt [path]     This loads the Rainier model from a raw training ckpt file, and overrides the --eval_ckpt parameter.
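For example, a hypothetical invocation (run from /rainier/) that evaluates only CSQA and OBQA on the dev split, also scores the no-knowledge baseline, and lowers the batch size to fit a smaller GPU (the value 8 is purely illustrative):

python main.py --mode eval --eval_split dev --eval_tasks csqa,obqa --eval_baseline --batch_size 8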

Training the Rainier model

The Rainier model is trained in two stages.

Stage I: Imitation Learning

We trained this stage using 1x RTX 6000 GPU with 24 GB of memory.

If you would like to skip this training stage, you can download a copy of our ckpt: go to /model/ and run

gdown 1PeL3E7UreVIHKOkLNSyzgyAYoab-MA5N

Alternatively, you can download the rainier-large_stageI.pth file from our Google Drive folder and put it under /model/

First, generate silver knowledge from GPT-3.

If you would like to use our pre-generated data, you can download a copy of our pre-generated knowledge: go to /data/ and run

gdown 1V6Za8BfEwWa4xRgXcVEFhS8tWepHZPAw

Alternatively, you can download the knowledge_gkp.zip file from our Google Drive folder, unzip it, and put it under /data/

Otherwise, you can generate the knowledge yourself by going to the /rainier/ directory and running

sh generate_knowledge_gkp.sh

Remember to set the OPENAI_API_KEY envvar beforehand, and be ready to spend a lot of money ;)
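For example, with a placeholder value (substitute your own key):

export OPENAI_API_KEY=<your-api-key>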

Then, you can start Stage I training by going to the /rainier/ directory and running

python imitation.py 

This will train on all seen datasets, using silver knowledge as supervision. You can track the training in Tensorboard. The best model ckpt will be saved under /runs/imitation/. Before proceeding to the next stage, make sure to run

python extract_model_from_ckpt_stageI.py ../runs/imitation/[path-to-best].ckpt

This extracts the model state dict and puts it at /model/rainier-large_stageI.pth
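If you want a quick sanity check that the extraction worked, here is a minimal sketch (assuming, as the script name suggests, that the extracted file is a plain PyTorch state dict):

import torch

state_dict = torch.load('../model/rainier-large_stageI.pth', map_location='cpu')
# A state dict maps parameter names to tensors; print how many there are
print(len(state_dict))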

Stage II: Reinforcement Learning

We trained this stage using 8x RTX 6000 GPUs, each with 24 GB of memory.

To train Stage II with the default setting, go to the /rainier/ directory and run

python main.py --mode train

This will train Rainier on all seen datasets, with UnifiedQA-large as the QA model. You can track the training in Tensorboard, and view the (dev set) output knowledge in /runs/[path-to-save-dir]/knowledge/ and the inference results in /runs/[path-to-save-dir]/inference/.
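To follow training, one way to launch Tensorboard (assuming it is installed in your environment and that logs land under /runs/ as described above) is, from the /rainier/ directory:

tensorboard --logdir ../runs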

Some flags you can set (see the full list in args.py); an example invocation follows the list:

--train_tasks [task-list]   Please choose a subset from the full list (which is also the default value): obqa,arc_e,arc_h,ai2sci_e,ai2sci_m,csqa,qasc,piqa,siqa,wg. Write your choice as a comma-separated list.
--eval_baseline             Additionally evaluate the no-knowledge baseline.
--model_ckpt [path]         The path to the Stage I model ckpt. The default value is ../model/rainier-large_stageI.pth
--load_from_ckpt [path]     This resumes training from an existing ckpt.
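For example, a hypothetical invocation (run from /rainier/) that trains only on CSQA, OBQA and QASC, starting from the default Stage I checkpoint, and also evaluates the no-knowledge baseline:

python main.py --mode train --train_tasks csqa,obqa,qasc --eval_baseline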

After the training, make sure to run

python extract_model_from_ckpt_stageII.py --load_from_ckpt ../runs/[path-to-best].pth

so that you can use the trained Rainier model for inference.

Citation

If you find this repo useful, please cite our paper:

@article{Liu2022RainierRK,
  title={Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering},
  author={Jiacheng Liu and Skyler Hallinan and Ximing Lu and Pengfei He and Sean Welleck and Hannaneh Hajishirzi and Yejin Choi},
  journal={ArXiv},
  year={2022},
  volume={abs/2210.03078},
  url={https://api.semanticscholar.org/CorpusID:252735191}
}
