
negated-prompts-for-llms

Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts


We aim to answer four main questions in this work. (1) How does scaling the size of LMs affect their ability to understand the concept of negation? (2) Are LMs explicitly trained to follow instructions (T0, InstructGPT) better at understanding negated instructions? (3) Can in-context learning or fine-tuning help mitigate this problem? (4) How do the existing approaches compare to actual human capabilities in understanding negations, and how large is the performance gap that we should focus on closing?

The answers can be found in our case study paper, Can Large Language Models Truly Understand Prompts? A Case Study with Negated Prompts. Come check it out! :)

Dependencies

You can use pip install -r requirements.txt to install the required libraries.

OpenAI Beta

To use GPT-3 you must use the OpenAI Beta, which is limited access. You can apply for access here. Once you have access, you will need to point score.py to your API key, either with the --key argument or by putting your key in api.key, which is the default path.
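A minimal sketch of setting up the key file (the `sk-your-api-key` value below is a placeholder, not a real key):

```shell
# Store a placeholder key at the default path that score.py reads;
# replace the value with your real OpenAI API key.
echo "sk-your-api-key" > api.key
# Confirm the file exists and is non-empty before running the scorer
test -s api.key && echo "key file ready"
```

Alternatively, pass the path explicitly with `--key /path/to/api.key` when invoking score.py.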

Running Scorers

Once you have a dataset downloaded, running all the zero-shot scoring strategies at once is as simple as:

CUDA_VISIBLE_DEVICES=[gpu device ids] python score.py --dataset [huggingface dataset name] --dataset_config [huggingface dataset config] --promptsource --sample [num of samples] --batch [num of samples in a batch] --prompt_name [prompt name from promptsource] --model [model name]

For example, running inference of OPT-125m on the ARC-Easy dataset can be done as follows:

CUDA_VISIBLE_DEVICES=0 python score.py --dataset ai2_arc --dataset_config ARC-Easy --promptsource  --use_csv --sample 300 --batch 8 --prompt_name "q&a" --model opt-125m
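To sweep the same settings over several model sizes, a simple shell loop works. This sketch prints the commands as a dry run (remove the `echo` to actually execute each one); it assumes score.py accepts the other OPT sizes under the same naming scheme as opt-125m:

```shell
# Print the score.py invocation for several OPT sizes (dry run).
# Remove `echo` to actually execute each command.
for model in opt-125m opt-350m opt-1.3b; do
  echo CUDA_VISIBLE_DEVICES=0 python score.py \
    --dataset ai2_arc --dataset_config ARC-Easy --promptsource --use_csv \
    --sample 300 --batch 8 --prompt_name "q&a" --model "$model"
done
```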

If there is any confusion about --dataset and --dataset_config, simply look in score.py to see how dataset selection works. --model is the name of an OPT, T0, GPT-2, or GPT-3 model, e.g. xl, davinci, etc.; check score.py for the full list of supported LMs. To speed things up, you can use a larger --batch if you have enough GPU memory. For the full list of --dataset, --dataset_config, and --prompt_name values used for the paper, refer to the run_configs.txt file.

Other

To use a dataset other than the 9 datasets used in the paper, remove the --use_csv flag for the run and the code will automatically load the dataset from the Hugging Face Hub.

Contributors

joeljang, seonghyeonye

Issues

What about reversing the order of instructions and questions?

Thank you for sharing this interesting work.
As I looked at the prompts used in this work, I noticed that in almost all cases the negations simply replace the word 'correct' with 'incorrect'. I wonder whether you also tested other negation instructions, and whether the results would still be the same?
For example, switching instructions and questions so that the instructions are directly followed by the answers:

Original:

Generate the incorrect answer to the following question. Question: Astronauts weigh more on Earth than they do on the moon because Answer is

Inverse:

Question: Astronauts weigh more on Earth than they do on the moon because what?
Generate an incorrect answer to the above question. The answer is

For an autoregressive language model, the current prompt ending "Astronauts weigh more on Earth than they do on the moon because" seems a little misleading to me: the model may tend to simply complete the sentence based on its knowledge, regardless of the instruction. Maybe (for now I don't have an OpenAI API key to verify this) a different prompt structure would vary the results?

Any discussion by anyone is welcome.
