Comments (3)

TingTingin commented on August 15, 2024

Fixed it. The problem was I was getting this error:

attempting to unscale fp16 gradients

At first I "fixed" it by changing this:

                accelerator.clip_grad_norm_(params_to_clip, args.max_grad_norm)
                optimizer.step()
                lr_scheduler.step()
                optimizer.zero_grad()

to

                # accelerator.clip_grad_norm_(params_to_clip, args.max_grad_norm)
                # optimizer.step()
                lr_scheduler.step()
                optimizer.zero_grad()

This apparently disabled some critical part of the process (with optimizer.step() commented out, the weights never update), so it was not a real fix. The actual fix was editing

"C:\Users\{your username}\anaconda3\envs\ST\Lib\site-packages\torch\cuda\amp\grad_scaler.py"

and changing

        with torch.no_grad():
            for group in optimizer.param_groups:
                for param in group["params"]:
                    if param.grad is None:
                        continue

to

        with torch.no_grad():
            for group in optimizer.param_groups:
                for param in group["params"]:
                    if param.grad is None:
                        continue
                    allow_fp16 = True  # shadow the allow_fp16 argument so the fp16 check below never raises
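
For context, in the torch builds I've looked at, the check that raises the error sits right below the patched lines, roughly like this (paraphrased from grad_scaler.py, not a verbatim quote):

        with torch.no_grad():
            for group in optimizer.param_groups:
                for param in group["params"]:
                    if param.grad is None:
                        continue
                    if (not allow_fp16) and param.grad.dtype == torch.float16:
                        raise ValueError("Attempting to unscale FP16 gradients.")

Assigning allow_fp16 = True inside the loop shadows the function's allow_fp16 argument just before that check runs, so the ValueError is never raised. If your install lives somewhere else, this one-liner prints the actual file path:

    python -c "import torch; print(torch.cuda.amp.grad_scaler.__file__)"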

Apparently there's an issue when using mixed precision, and you need to explicitly enable this. I'm not sure of a better solution that could go in the main train script's code, as opposed to editing torch's files directly.
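
One train-script-level alternative that avoids patching torch (a sketch only; it assumes the script exposes unet, accelerator, and args objects, which may not match the repo's actual names) is to keep the parameters being optimized in fp32 so GradScaler never has to unscale fp16 gradients, while the frozen weights stay in fp16 to save memory:

    import torch

    # Sketch, not the repo's code: cast the frozen weights to fp16 to save memory...
    weight_dtype = torch.float16 if args.mixed_precision == "fp16" else torch.float32
    unet.to(accelerator.device, dtype=weight_dtype)

    # ...but keep the trainable parameters in fp32 so GradScaler can unscale their grads.
    params_to_optimize = [p for p in unet.parameters() if p.requires_grad]
    for p in params_to_optimize:
        p.data = p.data.to(torch.float32)

    optimizer = torch.optim.AdamW(params_to_optimize, lr=args.learning_rate)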

TingTingin commented on August 15, 2024

Also, as a side note, since it seems that the gradient clipping is there to focus the training:

Is it possible to use this method to finetune the entire model with this repo? Only asking as I can't finetune on an 8GB GPU with any other method, and it would be interesting if this could technically fine-tune the entire model.

nupurkmr9 commented on August 15, 2024

Hi,
Thanks a lot for pointing out the error with mixed precision training. I will look into it more.

Regarding enabling full fine-tuning in the same code: it should be possible by adding another type to the --freeze_model flag and enabling all params to have gradients in the create_custom_diffusion function. Also, in the case of full fine-tuning, calling save_progress during training and load_model during inference is not required.
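
A minimal sketch of what that change might look like (the flag value "all" and the internals of create_custom_diffusion here are illustrative assumptions, not the repo's actual code):

    def create_custom_diffusion(unet, freeze_model):
        for name, params in unet.named_parameters():
            if freeze_model == "all":
                # hypothetical new option: full fine-tune, every parameter trainable
                params.requires_grad = True
            elif freeze_model == "crossattn_kv":
                # only the cross-attention key/value projections are trained
                params.requires_grad = ("attn2.to_k" in name) or ("attn2.to_v" in name)
            else:
                params.requires_grad = False
        return unet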
Let me know if you need more details. I will see if I can update the code to enable this as well.

Thanks.
