sulrash / minllmtrain
Minimal yet high-performance code for pretraining LLMs. Attempts to implement some SOTA features. Implements training through DeepSpeed, Megatron-LM, and FSDP. WIP.
License: MIT License