This project implements a miniature version of a Generative Pre-trained Transformer (GPT) language model. It uses the decoder side of the Transformer architecture with masked self-attention to predict the next token in a sequence.
The model is trained on a provided text dataset and can generate new text that follows the patterns it has learned. It is implemented in PyTorch and designed to be lightweight for educational purposes.
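As a rough illustration of the mechanism described above, here is a minimal sketch of one head of masked (causal) self-attention in PyTorch. The class name, argument names, and hyperparameters are illustrative assumptions, not code taken from this repository:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttentionHead(nn.Module):
    """One head of masked self-attention, as used in a GPT-style decoder.
    (Illustrative sketch; names and sizes are assumptions.)"""
    def __init__(self, embed_dim, head_dim, block_size):
        super().__init__()
        self.key = nn.Linear(embed_dim, head_dim, bias=False)
        self.query = nn.Linear(embed_dim, head_dim, bias=False)
        self.value = nn.Linear(embed_dim, head_dim, bias=False)
        # Lower-triangular mask so each position attends only to earlier ones.
        self.register_buffer("tril", torch.tril(torch.ones(block_size, block_size)))

    def forward(self, x):
        B, T, C = x.shape                                   # batch, time, embedding dim
        k = self.key(x)                                     # (B, T, head_dim)
        q = self.query(x)                                   # (B, T, head_dim)
        # Scaled dot-product attention scores.
        wei = q @ k.transpose(-2, -1) * k.shape[-1] ** -0.5 # (B, T, T)
        wei = wei.masked_fill(self.tril[:T, :T] == 0, float("-inf"))
        wei = F.softmax(wei, dim=-1)
        v = self.value(x)                                   # (B, T, head_dim)
        return wei @ v                                      # (B, T, head_dim)
```

The lower-triangular mask is what makes the model autoregressive: a position can only attend to itself and earlier positions, so the model can be trained to predict the next token.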
The part highlighted in green in the figure is the architecture of the Mini GPT Language Model, which is inspired by the traditional Transformer architecture.
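For context, a decoder block in this style of model typically stacks multi-head self-attention and a position-wise feed-forward network, each wrapped in a residual connection with layer normalization. The sketch below builds on the `CausalSelfAttentionHead` above and is likewise an assumption about the general architecture, not this repository's exact code:

```python
class TransformerBlock(nn.Module):
    """One decoder block: multi-head self-attention plus a feed-forward
    network, each with a residual connection and pre-layer norm.
    (Illustrative sketch; assumes embed_dim is divisible by num_heads.)"""
    def __init__(self, embed_dim, num_heads, block_size):
        super().__init__()
        head_dim = embed_dim // num_heads
        self.heads = nn.ModuleList(
            [CausalSelfAttentionHead(embed_dim, head_dim, block_size)
             for _ in range(num_heads)]
        )
        self.proj = nn.Linear(embed_dim, embed_dim)
        self.ffwd = nn.Sequential(
            nn.Linear(embed_dim, 4 * embed_dim),
            nn.ReLU(),
            nn.Linear(4 * embed_dim, embed_dim),
        )
        self.ln1 = nn.LayerNorm(embed_dim)
        self.ln2 = nn.LayerNorm(embed_dim)

    def forward(self, x):
        # Run all heads on the normalized input, concatenate back to embed_dim.
        normed = self.ln1(x)
        attn = torch.cat([h(normed) for h in self.heads], dim=-1)
        x = x + self.proj(attn)            # residual connection around attention
        x = x + self.ffwd(self.ln2(x))     # residual connection around feed-forward
        return x
```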
- Python 3.x
- PyTorch
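Assuming a standard Python environment, the PyTorch dependency can typically be installed with pip (see the official PyTorch site for platform-specific instructions):

```bash
pip install torch
```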