This repository holds the code and resources for my personal project: building a basic Large Language Model (LLM) from scratch! This is an exploration of machine learning, NLP, and the inner workings of LLMs.
Attention!
The repository currently contains:
- A basic LLM implementation without attention mechanisms. (Fancy!)
- Code files with detailed annotations (thanks to ChatGPT and Bard!)
- Links to relevant resources (Keep learning!)
  - Attention Is All You Need: https://arxiv.org/pdf/1706.03762.pdf (The OG paper on attention!)
  - A Survey of LLMs: https://arxiv.org/pdf/2303.18223.pdf (Get the scoop on all things LLM!)
  - QLoRA: Efficient Finetuning of Quantized LLMs: https://arxiv.org/pdf/2305.14314.pdf (Cutting-edge optimization techniques! ⚡️)
  - https://www.youtube.com/watch?v=UU1WVnMk4E8 (Visual learning FTW!)
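To give a flavor of what an attention-free language model can look like, here is a minimal sketch of a character-level bigram model: it counts which character tends to follow which, then samples text from those counts. This is an illustrative toy (function names like `train_bigram` and `generate` are hypothetical and not taken from this repo's code), not the actual implementation in this repository.

```python
import random
from collections import defaultdict

def train_bigram(text):
    """Count how often each character follows each other character."""
    counts = defaultdict(lambda: defaultdict(int))
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Sample `length` characters, each drawn in proportion to bigram counts."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        choices = counts.get(out[-1], {})
        if not choices:  # no known continuation for this character
            break
        chars = list(choices)
        weights = [choices[c] for c in chars]
        out.append(rng.choices(chars, weights=weights)[0])
    return "".join(out)

counts = train_bigram("abab")
print(generate(counts, "a", 4))  # → "ababa" (only one continuation exists per char)
```

No attention, no neural network: each next character depends only on the single previous one. That narrow context window is exactly the limitation that attention mechanisms (per the paper linked above) were designed to overcome.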