- A simple from-scratch implementation of the LLaMA model, which has two major distinguishing components: rotary positional embeddings (RoPE) and the SwiGLU activation.
- The dataset used in this implementation is TinyShakespeare.
- This implementation is a replication of this repo: https://github.com/bkitano/llama-from-scratch. All credit goes to the original author.
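
For readers new to the two components named above, here is a minimal pure-Python sketch of what each one computes. It is an illustration only, not the code from this repo or the referenced one: the function names (`rotate_pairs`, `swiglu`, `matvec`), the adjacent-pair rotation layout, the base of 10000, and the fixed Swish parameter (beta = 1, i.e. SiLU) are all assumptions made for the example.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [dot(row, x) for row in w]


def rotate_pairs(x, position, base=10000.0):
    """Rotary embedding sketch: rotate each adjacent pair (x[2i], x[2i+1])
    by an angle that grows with the token position and shrinks with i.
    Assumes len(x) is even; the pairing layout is an assumption."""
    d = len(x)
    assert d % 2 == 0, "embedding dimension must be even"
    out = []
    for i in range(d // 2):
        theta = position * base ** (-2 * i / d)
        a, b = x[2 * i], x[2 * i + 1]
        out.append(a * math.cos(theta) - b * math.sin(theta))
        out.append(a * math.sin(theta) + b * math.cos(theta))
    return out


def silu(z):
    """Swish with beta fixed at 1: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def swiglu(x, w_gate, w_up):
    """SwiGLU sketch: a Swish-activated gate multiplied elementwise
    with a second linear projection of the same input (biases omitted)."""
    gate = matvec(w_gate, x)
    up = matvec(w_up, x)
    return [silu(g) * u for g, u in zip(gate, up)]


if __name__ == "__main__":
    q = [1.0, 2.0, 3.0, 4.0]
    k = [0.5, -1.0, 2.0, 0.0]
    # The attention score depends only on the relative offset (2 in both cases).
    print(dot(rotate_pairs(q, 3), rotate_pairs(k, 1)))
    print(dot(rotate_pairs(q, 5), rotate_pairs(k, 3)))
    # Hypothetical 2x2 weights, chosen only for the demo.
    print(swiglu([1.0, -1.0], [[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, -1.0]]))
```

The two printed attention scores should agree up to floating-point error, which is the property that makes rotary embeddings encode relative position. A real implementation would typically do the same arithmetic on batched tensors (for example in PyTorch) rather than on Python lists.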