DART

Implementation of the ICLR 2022 paper "Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners".

  • ❗NOTE: The code has been reorganized and we also provide a paper-list at PromptKG.

Environment

  • [email protected]
  • Use pip install -r requirements.txt to install dependencies.
  • A wandb account is required to search for the best hyper-parameter combinations.

Data source

  • 16-shot GLUE dataset from LM-BFF.
  • The generated data consists of 5 random splits (13/21/42/87/100) per task, each with 16 samples.
    • The generation process follows LM-BFF here.
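
The split generation described above can be illustrated with a minimal sketch. It is not the LM-BFF script itself: the helper name `sample_k_shot`, the toy data, and the use of each split id as the random seed are assumptions made for illustration, as is the choice to draw the 16 samples per label.

```python
import random
from collections import defaultdict

# Split ids listed in this README; treating them as RNG seeds is an assumption.
SPLIT_SEEDS = [13, 21, 42, 87, 100]


def sample_k_shot(examples, k, seed):
    """Draw k examples per label with a seeded RNG (hypothetical helper)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    subset = []
    # Iterate labels in sorted order so the result depends only on the seed.
    for label in sorted(by_label):
        subset.extend(rng.sample(by_label[label], k))
    return subset


# Toy binary-classification data standing in for a GLUE task.
toy_examples = [(f"sentence {i}", i % 2) for i in range(100)]

# One few-shot subset per split id.
splits = {seed: sample_k_shot(toy_examples, k=16, seed=seed) for seed in SPLIT_SEEDS}
```

Because each subset is derived from a fixed seed, rerunning the sketch reproduces the same five subsets, which is the property that makes per-split results comparable.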

How to run

  • To train / test on a data split from a single task with specific parameters, use run.py.
    • For customized training and evaluation, start from the sample configuration file config/sample.yml and modify it as needed.
$ python run.py -h  
usage: run.py [-h] [--config CONFIG] [--do_train] [--do_test]

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG, -c CONFIG
                        Configuration file storing all parameters
  --do_train
  --do_test
  • To search for optimal hyper-parameters for each task and reproduce our results, use sweep.py:
    • Refer to the WandB documentation for more details.
    • ❗NOTE: following LM-BFF, we search for an optimal set of hyper-parameters separately on each data split.
$ python sweep.py -h
usage: sweep.py [-h] [--project_name PROJECT_NAME] --task_name TASK_NAME
                [--data_split {13,21,42,87,100}]
                [--pretrain_model PRETRAIN_MODEL] [--pet_method {pet,diffpet}]
                [--random_seed RANDOM_SEED] [--max_run MAX_RUN]

optional arguments:
  -h, --help            show this help message and exit
  --project_name PROJECT_NAME
                        project name for sweep
  --task_name TASK_NAME
  --data_split {13,21,42,87,100}
                        few-shot split-id for GLUE dataset
  --pretrain_model PRETRAIN_MODEL
                        name or path for pretrained model
  --pet_method {pet,diffpet}
                        prompt encoding method
  --random_seed RANDOM_SEED
                        random seed for training
  --max_run MAX_RUN     maximum tries for sweep
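
Since hyper-parameters are searched separately on each data split, a sweep for one task means one sweep.py invocation per split. The sketch below only builds the command lines from the flags shown in the help output above; the helper name `build_sweep_command`, the task name `SST-2`, and the `max_run` value are placeholders chosen for illustration, not values confirmed by the repository.

```python
import shlex

# Split ids accepted by --data_split according to the help text above.
DATA_SPLITS = [13, 21, 42, 87, 100]
# Choices accepted by --pet_method according to the help text above.
PET_METHODS = ("pet", "diffpet")


def build_sweep_command(task_name, data_split, pet_method="diffpet", max_run=None):
    """Return the argv list for one sweep.py call (hypothetical helper)."""
    if data_split not in DATA_SPLITS:
        raise ValueError(f"unknown data split: {data_split}")
    if pet_method not in PET_METHODS:
        raise ValueError(f"unknown pet method: {pet_method}")
    cmd = [
        "python", "sweep.py",
        "--task_name", task_name,
        "--data_split", str(data_split),
        "--pet_method", pet_method,
    ]
    if max_run is not None:
        cmd += ["--max_run", str(max_run)]
    return cmd


# "SST-2" is a placeholder task name; check the repository for accepted values.
commands = [build_sweep_command("SST-2", split, max_run=20) for split in DATA_SPLITS]

for cmd in commands:
    # Print the commands instead of executing them; pass each list to
    # subprocess.run(cmd, check=True) to launch the sweeps for real.
    print(shlex.join(cmd))
```

Building the argument list first and printing it keeps the sketch side-effect free, so the commands can be reviewed before any WandB sweep is started.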

How to Cite

@inproceedings{
zhang2022differentiable,
title={Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners},
author={Ningyu Zhang and Luoqiu Li and Xiang Chen and Shumin Deng and Zhen Bi and Chuanqi Tan and Fei Huang and Huajun Chen},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=ek9a0qIafW}
}
