How to resume training (dl-art-school, OPEN)

PS-AI commented on September 22, 2024

How to resume training

from dl-art-school.

Comments (4)

152334H commented on September 22, 2024

Hi, you should comment out pretrain_model_gpt and set resume_state to the path of your latest training state file, e.g.:

path:
  #pretrain_model_gpt: '../experiments/autoregressive.pth'
  strict_load: true
  resume_state: ../experiments/MyExperimentName/training_state/600.state   # <-- Set this to resume from a previous training state.

I will add this as documentation to the README.
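
The two edits above can also be scripted. The following is only a rough sketch, not part of dl-art-school: the helper names `latest_state` and `enable_resume` are hypothetical, and it assumes the state files are named by step number (like `600.state`) and that the config uses the `pretrain_model_gpt` and `resume_state` keys exactly as shown above. It picks the highest-numbered state file and rewrites the config text line by line, without a YAML parser.

```python
import re
from pathlib import Path
from typing import Optional


def latest_state(training_state_dir: str) -> Optional[Path]:
    """Return the .state file with the highest numeric step, or None if there is none.

    Assumes files are named <step>.state, e.g. 600.state (hypothetical helper).
    """
    candidates = [
        p for p in Path(training_state_dir).glob("*.state") if p.stem.isdigit()
    ]
    # Compare as integers: a plain string sort would rank "600" above "1000".
    return max(candidates, key=lambda p: int(p.stem), default=None)


def enable_resume(config_text: str, state_path: str) -> str:
    """Comment out pretrain_model_gpt and point resume_state at state_path."""
    out = []
    for line in config_text.splitlines():
        stripped = line.lstrip()
        indent = line[: len(line) - len(stripped)]
        if stripped.startswith("pretrain_model_gpt:"):
            # Disable loading of the pretrained weights.
            line = f"{indent}#{stripped}"
        elif re.match(r"#?\s*resume_state:", stripped):
            # Replace an existing (possibly commented-out) resume_state entry.
            line = f"{indent}resume_state: {state_path}"
        out.append(line)
    return "\n".join(out) + "\n"
```

To use it, pass the experiment's training_state directory to `latest_state`, then feed the config file's text and the returned path to `enable_resume` and write the result back. Any trailing comment on the resume_state line is dropped by this sketch.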

152334H commented on September 22, 2024

Not exactly sure what the model loading differences are; will look into it.

Jxspa commented on September 22, 2024

Anyone else struggling to resume training? I can train fine with an 8GB 1080, but it seems impossible to resume. OOM no matter what I try.

FurkanGozukara commented on September 22, 2024

Hi, you should comment out pretrain_model_gpt and set resume_state to the path of your latest training state file, e.g.:

path:
  #pretrain_model_gpt: '../experiments/autoregressive.pth'
  strict_load: true
  resume_state: ../experiments/MyExperimentName/training_state/600.state   # <-- Set this to resume from a previous training state.

I will add this as documentation to the README.