

gpt-2-pytorch's People

Contributors

graykode, raveenb


gpt-2-pytorch's Issues

Help Increasing the amount of training/fine-tuning text to about 10k words

Hello,
I am trying to train/fine-tune the GPT-2 model using your wrapper. I have successfully trained it on a text file, but when I try to train on a larger corpus of around 10,000 words on a specific topic/domain and generate 500-1000 words of output, I keep getting a strange error.
How do I increase the amount of training/fine-tuning text from the current limit to about 10,000 words?
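A likely cause (an assumption, since the exact error message isn't shown) is GPT-2's 1024-token context limit: encoding the whole corpus as a single sequence produces an input longer than the model can accept. A minimal sketch of one workaround, chunking the tokenized corpus into fixed-length blocks before training; the encode function can be whatever the repo's GPT2/encoder.py returns, or any tokenizer mapping a string to token ids:

    import torch

    BLOCK_SIZE = 1024  # GPT-2's maximum context length

    def make_training_blocks(text, encode, block_size=BLOCK_SIZE):
        # Encode the whole corpus, then split it into fixed-length blocks so
        # no single training example exceeds the model's context window.
        ids = encode(text)
        n_blocks = len(ids) // block_size
        ids = ids[: n_blocks * block_size]  # drop the trailing remainder
        return torch.tensor(ids, dtype=torch.long).view(n_blocks, block_size)

    # Usage (the file name is hypothetical):
    # blocks = make_training_blocks(open("corpus.txt").read(), encoder.encode)

Each row of the returned tensor can then be used as one training example.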

Discrepancy in Parameter Size of Smallest Model

I have been using the GPT-2 implementation from your repository and noticed that the size of the smallest GPT-2 model in the repository differs from the smallest model mentioned in the original GPT-2 paper.
Specifically, the smallest model in the repository has about 124M parameters, while the smallest model in the original paper is listed as 117M.

I am curious to know why this difference exists.
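For what it's worth, OpenAI later noted that the parameter counts originally published with the paper were miscounted, which is why the smallest checkpoint is now usually referred to as 124M. The count can be checked directly; a minimal sketch, assuming the Hugging Face transformers package (not part of this repo) as a convenient way to load the same smallest checkpoint:

    from transformers import GPT2LMHeadModel

    model = GPT2LMHeadModel.from_pretrained("gpt2")  # smallest released checkpoint
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{n_params:,} parameters")  # roughly 124M, not 117M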

GPT-2 implementation problem

"Hi, I am reading the GPT-2 paper and encountering a problem with the following phrase related to implementation:

'A modified initialization method is used to account for the accumulation on the residual path with model depth. We scale the weights of residual layers at initialization by a factor of 1/√N, where N is the number of residual layers.'

My problem is that we normalize after accumulation (addition, then layer normalization). So why do we still need to scale the weights? Isn't the normalization already there to reduce the impact of accumulation?
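One common reading: layer normalization rescales the summed activations, but at standard initialization each additional residual layer still adds another unit of variance to the stream before normalization, so the relative contribution of any one layer shrinks and gradients early in training behave worse with depth; scaling the residual projections by 1/sqrt(N) keeps the magnitude of the sum roughly independent of depth. A minimal sketch of one way to implement it (an illustration, not necessarily what this repo does; it assumes nn.Linear projections whose names end in "c_proj", whereas this codebase wraps them in a Conv1D module):

    import math
    import torch.nn as nn

    def scale_residual_init(model, n_layer, base_std=0.02):
        # Every projection starts from N(0, base_std); projections that write
        # into the residual stream (names ending in "c_proj") are additionally
        # scaled by 1/sqrt(N), with N = 2 * n_layer residual layers
        # (one attention and one MLP residual per block).
        for name, module in model.named_modules():
            if isinstance(module, nn.Linear):
                std = base_std
                if name.endswith("c_proj"):
                    std *= 1.0 / math.sqrt(2 * n_layer)
                nn.init.normal_(module.weight, mean=0.0, std=std)
                if module.bias is not None:
                    nn.init.zeros_(module.bias)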

Cannot recognize <|endoftext|>

Thank you for this project! It is very helpful for understanding how GPT-2 synthesizes text.

I also noticed that GPT2/encoder.py does not implement recognition of special tokens the way the Hugging Face tokenizer does.

The relevant source code in Hugging Face's repo is at https://github.com/huggingface/transformers/blob/c836f77266be9ace47bff472f63caf71c0d11333/src/transformers/tokenization_utils.py#L516-L520

I understand that it is not critical, because only one special token, <|endoftext|>, is in use (see wangkuiyi/huggingface-tokenizer-in-cxx#11).

So, just saying.
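A minimal sketch of one way to add this (an illustration, not the Hugging Face implementation): split the input on the special token first, BPE-encode the ordinary pieces, and splice the token's single id back in. The id 50256 is the position of <|endoftext|> in the standard GPT-2 vocabulary.

    ENDOFTEXT = "<|endoftext|>"
    ENDOFTEXT_ID = 50256  # id of <|endoftext|> in the GPT-2 vocabulary

    def encode_with_special(text, encode):
        # `encode` is an ordinary BPE encode function (e.g. the one in
        # GPT2/encoder.py) that would otherwise break the marker into
        # several sub-tokens.
        ids = []
        for i, piece in enumerate(text.split(ENDOFTEXT)):
            if i > 0:
                ids.append(ENDOFTEXT_ID)
            if piece:
                ids.extend(encode(piece))
        return ids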

Invalid Syntax

I installed Python 2 and followed the instructions in the README, but I'm getting an 'Invalid Syntax' error at the end quote of the following command. I have retyped the command in case of a copy/paste artifact and I get the same error.

main.py --text "It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith, his chin nuzzled into his breast in an effort to escape the vile wind, slipped quickly through the glass doors of Victory Mansions, though not quickly enough to prevent a swirl of gritty dust from entering along with him."
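A likely cause (an assumption, since the full error isn't shown): an 'Invalid Syntax' message usually means the command was pasted into the Python interpreter prompt rather than a shell, or that it was run under Python 2 while the code targets Python 3. The intended invocation is run from a shell, along the lines of:

    python3 main.py --text "It was a bright cold day in April, and the clocks were striking thirteen. ..."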

training

Is there any way to train GPT-2 using my own text corpus?
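The code here focuses on text generation, so a training loop has to be added separately. Below is a hedged sketch of plain next-token fine-tuning, assuming `blocks` is a (num_blocks, block_size) LongTensor of token ids (e.g. built with the chunking helper sketched above) and that the model maps a batch of ids to (batch, seq, vocab) logits; if the forward pass also returns cached key/values, as this repo's does, take the first element of the returned tuple.

    import torch
    import torch.nn.functional as F
    from torch.utils.data import DataLoader

    def finetune(model, blocks, epochs=1, lr=3e-5, batch_size=2, device="cpu"):
        # Standard causal language-model objective: predict token t+1 from
        # tokens up to t, averaged over every position in the block.
        model.to(device).train()
        opt = torch.optim.AdamW(model.parameters(), lr=lr)
        for _ in range(epochs):
            for batch in DataLoader(blocks, batch_size=batch_size, shuffle=True):
                batch = batch.to(device)
                inputs, targets = batch[:, :-1], batch[:, 1:]
                logits = model(inputs)
                loss = F.cross_entropy(
                    logits.reshape(-1, logits.size(-1)), targets.reshape(-1)
                )
                opt.zero_grad()
                loss.backward()
                opt.step()
        return model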

Missing requirements

It needs these packages as well, so I guess they should go into requirements.txt:

torch
tqdm
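
For anyone patching this locally, the change is just two extra lines appended to requirements.txt (left unpinned here; suitable version constraints are up to the maintainer):

    # requirements.txt -- existing entries unchanged, plus:
    torch
    tqdm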

Can we use transfer learning on GPT2?

Hi, I am new to this field. Can we do transfer learning with a new dataset that contains domain-specific content (food, electronics, and so on) and train the model?
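A common transfer-learning recipe (a sketch, not something this repo provides out of the box) is to start from the pretrained weights and update only the top of the network, freezing everything else. The sketch below assumes the transformer blocks live at model.transformer.h, as in this repo's GPT2LMHeadModel; adjust the attribute path otherwise.

    def freeze_lower_layers(model, n_trainable_blocks=1):
        # Freeze every parameter, then unfreeze only the last few transformer
        # blocks; return the trainable parameters for the optimizer.
        for p in model.parameters():
            p.requires_grad = False
        for block in model.transformer.h[-n_trainable_blocks:]:
            for p in block.parameters():
                p.requires_grad = True
        return [p for p in model.parameters() if p.requires_grad]

    # optimizer = torch.optim.AdamW(freeze_lower_layers(model), lr=3e-5)

Training then proceeds as ordinary fine-tuning on the domain corpus, just with far fewer parameters being updated.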

Use my finetuned model?

I would very much like to know how I can use my own fine-tuned model, trained using Colab, to generate text. I have a bunch of checkpoints but am uncertain how to proceed from here and (re)produce a .bin file.
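What main.py loads is just a plain state_dict saved with torch.save (the stock gpt2-pytorch_model.bin is exactly that), so one approach is to unwrap your checkpoint and re-save it. A hedged sketch; the file name and the 'model_state_dict' key are guesses about how the Colab checkpoint was written, so inspect yours first:

    import torch

    ckpt = torch.load("checkpoint-last.pt", map_location="cpu")  # hypothetical name

    # Training checkpoints often bundle optimizer state and metadata;
    # keep only the model weights.
    if isinstance(ckpt, dict) and "model_state_dict" in ckpt:
        state_dict = ckpt["model_state_dict"]
    else:
        state_dict = ckpt

    torch.save(state_dict, "gpt2-pytorch_model.bin")
    # main.py can then load this file in place of the downloaded checkpoint,
    # provided the parameter names match what the repo's weight-loading
    # helper expects.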
