graykode / gpt-2-pytorch

Simple Text-Generator with OpenAI GPT-2 PyTorch Implementation
License: MIT License
Thank you very much.
Hello,
I am trying to train/fine-tune the GPT-2 model using your wrapper. I can train it successfully on a small text file, but when I try a larger corpus of about 10,000 words on a specific topic/domain and generate 500 to 1000 words, I keep getting a strange error.
How do I increase the amount of training/fine-tuning text from the current limit to about 10,000 words?
Very interesting
I have a text file that pulls today's headlines. I'd like to feed that into the model as the prompt.
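One way to do this is to read the file and pass its contents as `--text`, the same flag the README's examples use. A minimal sketch in Python; the file name `headlines.txt` and its contents here are hypothetical:

```python
import subprocess

# Hypothetical input: a file with today's headlines, one per line
with open("headlines.txt", "w") as f:
    f.write("Markets rally as rates hold steady\n")

with open("headlines.txt") as f:
    prompt = " ".join(line.strip() for line in f if line.strip())

# Same CLI invocation as the README, with the file contents as the prompt
cmd = ["python", "main.py", "--text", prompt, "--length", "100"]
# subprocess.run(cmd)  # uncomment when running inside the repo directory
```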
I have been using an implementation of GPT-2 from your repository and noticed that the size of the smallest GPT-2 model available in the repository differs from the smallest model mentioned in the original paper of GPT-2.
Specifically, the parameter count of the smallest model in the repository is about 124M, but the smallest model in the original paper is listed at 117M.
I am curious to know why there is this difference.
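For what it's worth, the released small checkpoint really does contain about 124M parameters; a direct tally from its configuration (12 layers, d_model 768, vocab 50257, context 1024, token embedding shared with the output head) reproduces the figure:

```python
# Parameter tally for GPT-2 small: 12 layers, d_model=768,
# vocab=50257, context=1024 (token embedding tied to the output head)
d, n_layer, vocab, ctx = 768, 12, 50257, 1024

embeddings = vocab * d + ctx * d          # wte + wpe
per_layer = (
    d * 3 * d + 3 * d                     # c_attn weight + bias
    + d * d + d                           # attention c_proj
    + d * 4 * d + 4 * d                   # MLP c_fc
    + 4 * d * d + d                       # MLP c_proj
    + 2 * 2 * d                           # two LayerNorms (gain + bias)
)
total = embeddings + n_layer * per_layer + 2 * d  # plus the final LayerNorm
print(total)  # 124,439,808, i.e. about 124M
```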
Doesn't run. There are lots of missing dependencies that should be in requirements.txt.
"Hi, I am reading the GPT-2 paper and encountering a problem with the following phrase related to implementation:
'A modified initialization method is used to account for the accumulation on the residual path with model depth. We scale the weights of residual layers at initialization by a factor of 1/√N, where N is the number of residual layers.'
My problem is that we normalize after accumulation (addition then normalization). So, why do we need to scale weights? Aren't we doing this to reduce the impact of accumulation?"
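One detail worth noting: GPT-2 uses pre-LayerNorm blocks, so normalization is applied at the input of each sub-block rather than after the residual addition, and the residual stream itself is what accumulates with depth. Each block adds two residual branches (attention and MLP projections), so N = 2 × n_layer. A minimal sketch of the initialization, using a stand-in module (the 0.02 base std is GPT-2's; the layer here is not from this repo):

```python
import math
import torch.nn as nn

n_layer = 12                      # GPT-2 small
n_residual = 2 * n_layer          # two residual additions per block (attn + MLP)
scale = 1.0 / math.sqrt(n_residual)

# Stand-in for a residual-path output projection (attention/MLP c_proj)
proj = nn.Linear(768, 768)
nn.init.normal_(proj.weight, mean=0.0, std=0.02 * scale)
nn.init.zeros_(proj.bias)
```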
I have pulled the code from branch train. Is there a way to train or fine tune the GPT-2 model with data parallelism on multiple GPUs? Thanks for your help.
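I haven't tried this with the train branch, but the standard PyTorch pattern is to wrap the model in `nn.DataParallel` (or `DistributedDataParallel`, which is preferred for serious multi-GPU training). A minimal sketch with a stand-in module in place of the GPT-2 model:

```python
import torch
import torch.nn as nn

model = nn.Linear(768, 768)  # stand-in for the GPT-2 model
if torch.cuda.device_count() > 1:
    # Each forward pass scatters the batch across GPUs and gathers outputs
    model = nn.DataParallel(model).cuda()

x = torch.randn(8, 768)
out = model(x)
```

Note that `DataParallel` splits each batch along dimension 0, so the effective per-GPU batch size is `batch_size / num_gpus`.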
Thank you for this project! It is very helpful for understanding how GPT-2 synthesizes text.
I also noticed that GPT2/encoder.py does not implement recognition of special tokens the way the HuggingFace tokenizer does.
The part of source code in HuggingFace's repo is at https://github.com/huggingface/transformers/blob/c836f77266be9ace47bff472f63caf71c0d11333/src/transformers/tokenization_utils.py#L516-L520
I understand that it is not critical, because there is only one special token, <|endoftext|>, in use (see wangkuiyi/huggingface-tokenizer-in-cxx#11).
So, just saying.
I installed Python 2 and followed the instructions in the README, but I'm getting an 'invalid syntax' error on the end quote of the following command. I have retyped the command in case of a copy/paste artifact, and I get the same error.
main.py --text "It was a bright cold day in April, and the clocks were striking thirteen. Winston Smith, his chin nuzzled into his breast in an effort to escape the vile wind, slipped quickly through the glass doors of Victory Mansions, though not quickly enough to prevent a swirl of gritty dust from entering along with him."
Is there any way to train GPT-2 using my own text corpus?
It needs these packages as well, so I guess they need to go into requirements.txt:
torch
tqdm
Hi,
I was wondering how to integrate the larger released OpenAI models with this code base.
Many thanks,
Vince.
Hi, I am new to this field. Can we do transfer learning with a new dataset that contains domain-specific content, like food, electronics, and so on, and then train the model?
python main.py --text "Babies cry because" --length 25 --nsamples 5 --batch_size 5
Any help would be greatly appreciated!
I would very much like to know how I can use my own fine-tuned model that I trained using Colab to generate text. I have a bunch of checkpoints but I am uncertain how to proceed from here and (re)produce a bin file.
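A sketch of one way to do this, assuming the Colab checkpoint was saved with `torch.save` and wraps the weights under a key like `model_state_dict` (both that key and the file names are assumptions; the README has the generator load `gpt2-pytorch_model.bin`):

```python
import torch
import torch.nn as nn

# Stand-in for a Colab training checkpoint (the keys here are assumed)
model = nn.Linear(4, 4)
torch.save({"model_state_dict": model.state_dict(), "step": 1000}, "checkpoint.pt")

# Unwrap the raw state dict and save it as the .bin file the generator loads
ckpt = torch.load("checkpoint.pt", map_location="cpu")
state_dict = ckpt.get("model_state_dict", ckpt)
torch.save(state_dict, "gpt2-pytorch_model.bin")
```

If the checkpoint already is a bare state dict, the `.get(...)` fallback passes it through unchanged.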