Prompt Waywardness

This repository contains the original implementation of "PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of Continuous Prompts" by Daniel Khashabi, Xinxi Lyu, Sewon Min, Lianhui Qin, Kyle Richardson, Sameer Singh, Sean Welleck, Hannaneh Hajishirzi, Tushar Khot, Ashish Sabharwal, Yejin Choi.

This code provides commands to run the models and reproduce the numbers reported in the paper. The code is taken and modified from the Channel LM Prompting repo.

Please leave issues for any questions about the paper or the code.

If you find our code or paper useful, please cite the paper:

@inproceedings{khashabi2021waywardness,
  title={{PROMPT WAYWARDNESS: The Curious Case of Discretized Interpretation of Continuous Prompts}},
  author = {Khashabi, Daniel and Lyu, Xinxi and Min, Sewon and Qin, Lianhui and Richardson, Kyle and Singh, Sameer and Welleck, Sean and Hajishirzi, Hannaneh and Khot, Tushar and Sabharwal, Ashish and Choi, Yejin},
  booktitle={Proceedings of NAACL},
  year={2022}
}

Content

  1. Installation
  2. Download and Preprocess Data
  3. Default Commands
  4. Reproducing Main Results (Section 4.2 of the paper)
  5. Reproducing Analysis (Section 4.3 of the paper)

You can run the channel model and the direct model for each of these methods. Please see Section 3 of the paper for more details about these formulations.

Installation

$ conda create -n waywardness python=3.8
$ conda activate waywardness
$ conda install pytorch=1.7.1 -c pytorch
$ pip install transformers==4.3.0
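
After installing, it can help to confirm that the pinned versions are the ones active in the environment. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes the packages register under the distribution names "torch" and "transformers".

```python
from importlib import metadata  # standard library in Python 3.8+

# Versions pinned by the install commands above. The distribution names
# ("torch", "transformers") are assumptions about how the packages register.
EXPECTED = {"torch": "1.7.1", "transformers": "4.3.0"}


def installed_version(name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED):
    """Map each package name to a tuple (expected, found, matches)."""
    report = {}
    for name, want in expected.items():
        found = installed_version(name)
        # A prefix match tolerates build suffixes such as "1.7.1+cu110".
        matches = found is not None and found.startswith(want)
        report[name] = (want, found, matches)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in check_environment().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {found} [{status}]")
```

Run it inside the activated waywardness environment; a MISMATCH line suggests that a different environment or a different version is being picked up.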

Download and Preprocess Data

We use (and modify) the data and the preprocessing script from Gao et al. ACL 2021 (paper, code) and Zhang et al. NeurIPS 2015 (paper, data).

To download the k-shot data (already preprocessed): Download the data (65.6MB) from this link. Please place data-processed.zip under the same directory as the code and unzip it.

To download the original data and preprocess yourself: Download the data (14MB) from this link. Please place data-processed.zip under the same directory as the code and unzip it.

Then, run python3 generative_k_shot_data.py, and you are done!

Optionally, you can specify arguments such as

  • --data_dir: directory for the original data (default is data/original).
  • --output_dir: directory for the preprocessed data (default is data).

To check the data: You can see the list of five datasets used in the paper by running ls data/k-shot. Each dataset consists of five different splits based on five different seeds (test sets are the same).
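
As a programmatic alternative to ls, the sketch below lists each dataset directory under data/k-shot together with its split subdirectories. It is not part of the repository: list_splits is a hypothetical helper, and it assumes only that each dataset is a directory whose splits are subdirectories, without relying on any particular naming scheme.

```python
from pathlib import Path


def list_splits(root="data/k-shot"):
    """Return {dataset_name: [split_dir_names]} for every dataset under root."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(
            f"{root} not found; download or preprocess the data first"
        )
    return {
        dataset.name: sorted(p.name for p in dataset.iterdir() if p.is_dir())
        for dataset in sorted(root.iterdir())
        if dataset.is_dir()
    }
```

Calling list_splits() from the code directory should then show five datasets with five splits each, if the layout matches the description above.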

We also used sentences sampled from The PILE, stored under the prompts directory. Please make sure to cite their paper when you use this data.

Default Commands

python3 main.py \
    --task {SST-2|sst-5|agnews|trec|subj} \
    --prompt_group {NI|PILE} \
    --split test \
    --data_dir data \
    --out_dir out \
    --method direct \
    --prompt_tune \
    --do_train \
    --gamma {0.01|0}

Useful notes:

  • You can adjust --batch_size if you run into an out-of-memory (OOM) issue (default is 8).
  • To train with an individual prompt, you can replace --prompt_group with --prompt_task.
  • Once you train the model, you can specify --do_check to load the existing checkpoint without retraining the model.
  • Please note that GPU parallelization is not implemented for inference.
  • To save a log file, please specify --log_file.
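
For scripted runs, the default command can be assembled as an argument list. This is a minimal sketch, not part of the repository: build_command is a hypothetical helper, and it uses only the flags shown in the default command above. It prints the command rather than executing it.

```python
import shlex


def build_command(task, gamma, prompt_group="NI", method="direct",
                  data_dir="data", out_dir="out", split="test", extra=()):
    """Assemble the default main.py invocation as an argv list."""
    return [
        "python3", "main.py",
        "--task", task,
        "--prompt_group", prompt_group,
        "--split", split,
        "--data_dir", data_dir,
        "--out_dir", out_dir,
        "--method", method,
        "--prompt_tune",
        "--do_train",
        "--gamma", str(gamma),
        *extra,  # optional extra flags, e.g. ("--batch_size", "4")
    ]


if __name__ == "__main__":
    cmd = build_command("SST-2", 0.01, extra=("--batch_size", "4"))
    # shlex.join (Python 3.8+) produces a shell-safe string for display.
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) from the code directory to launch the run.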

Reproducing Main Results

This section is for reproducing the results of the main experiments in Section 4.2 of the paper.

Run the default commands.

Reproducing Analysis

This section is for reproducing the results of the analysis experiments in Section 4.3 of the paper.

Effect of gamma

Run the default commands, but fix --prompt_group NI and vary --gamma {0|0.0001|0.0005|0.001|0.003|0.005|0.01|0.03}.
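
The sweep over gamma can be enumerated with a short script. This is a sketch under assumptions rather than the authors' tooling: gamma_sweep is a hypothetical helper, the task and gamma lists are copied from the commands above, and every run shares --out_dir out, so check how main.py names its checkpoints before launching runs back to back.

```python
from itertools import product

# Task names and gamma values as listed in this README.
TASKS = ["SST-2", "sst-5", "agnews", "trec", "subj"]
GAMMAS = ["0", "0.0001", "0.0005", "0.001", "0.003", "0.005", "0.01", "0.03"]

# Flags held fixed for this analysis (default command with --prompt_group NI).
BASE = [
    "python3", "main.py",
    "--prompt_group", "NI",
    "--split", "test",
    "--data_dir", "data",
    "--out_dir", "out",
    "--method", "direct",
    "--prompt_tune",
    "--do_train",
]


def gamma_sweep(tasks=TASKS, gammas=GAMMAS):
    """Return one argv list per (task, gamma) pair."""
    return [BASE + ["--task", t, "--gamma", g] for t, g in product(tasks, gammas)]


if __name__ == "__main__":
    for cmd in gamma_sweep():
        print(" ".join(cmd))
```

Gamma values are kept as strings so they reach the command line exactly as written above.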

Effect of prompt length

Run the default commands, but fix --prompt_group PILE and vary --pile_len {4|7|14|28|56}.

Effect of model size

Run the default commands, but fix --prompt_group PILE --gamma 0.01,0.005,0.003 and vary --gpt2 gpt2-{small|medium|large|xl}.

Projection onto true task definitions

Run the default commands, but fix --prompt_group TRUE.
