Implementation of a generative language model based on the GPT (Generative Pre-trained Transformer) architecture. The model learns to generate text in the style of its training data, using a decoder-only transformer built around self-attention.
justmogen / ngpt
Building a Generatively Pretrained Transformer (GPT), following the paper "Attention is All You Need".
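
Since the repository code itself isn't shown here, below is a minimal sketch of the core building block such a model relies on: a single head of masked (causal) self-attention, written in PyTorch. The class and parameter names (`CausalSelfAttentionHead`, `embed_dim`, `head_size`, `block_size`) are illustrative and not taken from this repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttentionHead(nn.Module):
    """One head of masked (causal) self-attention, as used in decoder-only GPTs."""

    def __init__(self, embed_dim: int, head_size: int, block_size: int):
        super().__init__()
        self.key = nn.Linear(embed_dim, head_size, bias=False)
        self.query = nn.Linear(embed_dim, head_size, bias=False)
        self.value = nn.Linear(embed_dim, head_size, bias=False)
        # Lower-triangular mask so each position attends only to earlier positions.
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape            # batch, sequence length, embedding dim
        k = self.key(x)              # (B, T, head_size)
        q = self.query(x)            # (B, T, head_size)
        # Scaled dot-product attention scores.
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5   # (B, T, T)
        # Mask out future positions, then normalize into attention weights.
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)
        v = self.value(x)            # (B, T, head_size)
        return wei @ v               # (B, T, head_size)

# Usage: a batch of 4 sequences, 8 tokens each, 32-dim embeddings.
x = torch.randn(4, 8, 32)
head = CausalSelfAttentionHead(embed_dim=32, head_size=16, block_size=8)
out = head(x)                        # (4, 8, 16)
```

In a full GPT, several such heads run in parallel (multi-head attention), and their concatenated outputs are interleaved with feed-forward layers, residual connections, and layer norm into a stack of transformer blocks.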