Giter VIP home page Giter VIP logo

gpt-2's Introduction

gpt-2

Code and samples from the paper "Language Models are Unsupervised Multitask Learners".

For now, we have only released a smaller (117M parameter) version of GPT-2.

See more details in our blog post.

Installation

Git clone this repository, and cd into directory for remaining commands

git clone https://github.com/openai/gpt-2.git && cd gpt-2

Then, follow instructions for either native or Docker installation.

Native Installation

Download the model data

sh download_model.sh 117M

The remaining steps can optionally be done in a virtual environment using tools such as virtualenv or conda.

Install tensorflow 1.12 (with GPU support, if you have a GPU and want everything to run faster)

pip3 install tensorflow==1.12.0

or

pip3 install tensorflow-gpu==1.12.0

Install other python packages:

pip3 install -r requirements.txt

Docker Installation

Build the Dockerfile and tag the created image as gpt-2:

docker build --tag gpt-2 -f Dockerfile.gpu . # or Dockerfile.cpu

Start an interactive bash session from the gpt-2 docker image.

You can opt to use the --runtime=nvidia flag if you have access to a NVIDIA GPU and a valid install of nvidia-docker 2.0.

docker run --runtime=nvidia -it gpt-2 bash

Usage

WARNING: Samples are unfiltered and may contain offensive content.

Some of the examples below may include Unicode text characters. Set the environment variable:

export PYTHONIOENCODING=UTF-8

to override the standard stream settings in UTF-8 mode.

Unconditional sample generation

To generate unconditional samples from the small model:

python3 src/generate_unconditional_samples.py | tee /tmp/samples

There are various flags for controlling the samples:

python3 src/generate_unconditional_samples.py --top_k 40 --temperature 0.7 | tee /tmp/samples

To check flag descriptions, use:

python3 src/generate_unconditional_samples.py -- --help

Conditional sample generation

To give the model custom prompts, you can use:

python3 src/interactive_conditional_samples.py --top_k 40

To check flag descriptions, use:

python3 src/interactive_conditional_samples.py -- --help

GPT-2 samples

WARNING: Samples are unfiltered and may contain offensive content.

While we have not yet released GPT-2 itself, you can see some samples from it in the gpt-2-samples folder. We show unconditional samples with default settings (temperature 1 and no truncation), with temperature 0.7, and with truncation with top_k 40. We show conditional samples, with contexts drawn from WebText's test set, with default settings (temperature 1 and no truncation), with temperature 0.7, and with truncation with top_k 40.

Citation

Please use the following bibtex entry:

@article{radford2019language,
  title={Language Models are Unsupervised Multitask Learners},
  author={Radford, Alec and Wu, Jeff and Child, Rewon and Luan, David and Amodei, Dario and Sutskever, Ilya},
  year={2019}
}

Future work

We may release code for evaluating the models on various benchmarks.

We are still considering release of the larger models.

License

MIT

gpt-2's People

Contributors

wuthefwasthat avatar imgntn avatar natemurthy avatar github30 avatar armaanbhullar avatar madisonmay avatar mrene avatar minimaxir avatar

Watchers

xingqiangchen avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.