65M-parameter LM, speed ~10 s/batch
pipeline:
- download dataset
- train tokenizer
- train LM
- inference
run pipeline:
python main.py
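
The four stages listed above could be chained roughly as follows. This is only a sketch: every function name and return value here is a hypothetical placeholder chosen for illustration, and the real main.py may be organized differently.

```python
# Hypothetical sketch of the pipeline order; none of these names
# are taken from the repository.

def download_dataset() -> str:
    # Placeholder: a real stage would fetch and store the corpus.
    return "dataset"


def train_tokenizer(dataset: str) -> str:
    # Placeholder: a real stage would fit a tokenizer on the corpus.
    return f"tokenizer({dataset})"


def train_lm(dataset: str, tokenizer: str) -> str:
    # Placeholder: a real stage would train the language model.
    return f"lm({dataset}, {tokenizer})"


def run_inference(model: str, tokenizer: str, prompt: str) -> str:
    # Placeholder: a real stage would generate text from the prompt.
    return f"{model} <- {prompt}"


def run_pipeline(prompt: str = "hello") -> str:
    # Each stage consumes the output of the previous ones,
    # in the same order as the list above.
    dataset = download_dataset()
    tokenizer = train_tokenizer(dataset)
    model = train_lm(dataset, tokenizer)
    return run_inference(model, tokenizer, prompt)


if __name__ == "__main__":
    print(run_pipeline())
```

Keeping each stage a separate function makes it possible to rerun a single step (for example, inference only) without repeating the earlier ones.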
Useful commands:
poetry install
make lint # runs the linters
make tox # runs tests against the specified Python versions
make requirements # exports Poetry dependencies to requirements.txt
make clean # deletes leftover generated files
todo:
- NewGelu activation layer
- Implement Sophia optimizer
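
For the NewGelu item, a minimal sketch is given below. It assumes "NewGelu" refers to the tanh approximation of GELU used in GPT-style models, and it is written with plain Python floats rather than as a framework layer. The function names `new_gelu` and `new_gelu_list` are hypothetical.

```python
import math
from typing import List


def new_gelu(x: float) -> float:
    # Tanh approximation of GELU:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


def new_gelu_list(values: List[float]) -> List[float]:
    # Element-wise application, standing in for a tensor operation.
    return [new_gelu(v) for v in values]


if __name__ == "__main__":
    print(new_gelu_list([-2.0, 0.0, 1.0, 2.0]))
```

In an actual model this would normally be wrapped in the framework's module class and applied to tensors; the arithmetic stays the same.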