
sliders's People

Contributors

rohitgandikota

sliders's Issues

Apply in common UIs

Is there a way to load this into A1111 or ComfyUI, or would this need some special plugin to work? I tried to load it as an embedding or a hypernetwork, but neither worked. Thanks!

Question about the prompts file for text sliders

To train the age text slider, is the prompt.yaml with --attributes=""

- target: "male person" # what word for erasing the positive concept from
  positive: "male person, very old" # concept to erase
  unconditional: "male person, very young" # word to take the difference from the positive concept
  neutral: "male person" # starting point for conditioning the target
  action: "enhance" # erase or enhance
  guidance_scale: 4
  resolution: 512
  dynamic_resolution: false
  batch_size: 1
- target: "female person" # what word for erasing the positive concept from
  positive: "female person, very old" # concept to erase
  unconditional: "female person, very young" # word to take the difference from the positive concept
  neutral: "female person" # starting point for conditioning the target
  action: "enhance" # erase or enhance
  guidance_scale: 4
  resolution: 512
  dynamic_resolution: false
  batch_size: 1

equivalent to the following prompt with --attributes="male, female" ?

 - target: "person" # what word for erasing the positive concept from
  positive: "person, very old" # concept to erase
  unconditional: "person, very young" # word to take the difference from the positive concept
  neutral: "person" # starting point for conditioning the target
  action: "enhance" # erase or enhance
  guidance_scale: 4
  resolution: 512
  dynamic_resolution: false
  batch_size: 1

BTW, it seems that the erasing/erase wording in the comments should be enhancing/enhance in this prompt.yaml.

meaning of the stylegan_latent sliders

Hi! Great work!

I was wondering what kind of attributes the stylegan_latent1 and stylegan_latent2 sliders edit. Are those the same as described in the paper, i.e. the cheekbone structure and inter-ocular distance?

NaN loss at fp16

Attempting to train on a Colab T4, which requires fp16 precision.

All scripts report a NaN loss after the first network update, regardless of the hyperparameters I choose.

Any advice for future experimentation?
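
Not a fix for the script itself, but for reference: a common fp16 training pattern that avoids most NaN issues is to keep the parameters in fp32 and use autocast with dynamic loss scaling. A generic, self-contained sketch (the model, data, and loss are placeholders, not the slider training code):

import torch

model = torch.nn.Linear(8, 8).cuda()                 # placeholder for the trainable LoRA params
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = torch.cuda.amp.GradScaler()                 # dynamic loss scaling guards against fp16 under/overflow

for step in range(10):
    x = torch.randn(4, 8, device="cuda")
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast(dtype=torch.float16):
        pred = model(x)
    loss = pred.float().pow(2).mean()                 # compute the loss itself in fp32
    scaler.scale(loss).backward()                     # scale, backprop, unscale inside step()
    scaler.step(optimizer)
    scaler.update()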

Question about your training code: where is the 'network' variable used during training?

Hello, I'm currently delving into the training code to tailor it to my specific needs. However, I'm puzzled about how the LoRA slider is integrated into the main UNet. I noticed that the LoRA adapter is set up in the 'network' variable, yet it seems to be used solely as a context manager in this section of the code, while the training loop still calls the original UNet:

https://github.com/rohitgandikota/sliders/blob/main/trainscripts/textsliders/train_lora_xl.py#L205C1-L227C18

Is it some complex Python syntax that I'm not yet familiar with?
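
It is less exotic Python syntax than it looks: the context-manager pattern can still affect the UNet because the network typically activates (or hot-swaps) the low-rank branches on the UNet's own modules while inside the "with network:" block, so calling the original unet still routes through, and trains, the LoRA parameters. A minimal illustrative sketch of that pattern, with made-up names rather than the repo's exact implementation:

import torch

class LoRAModuleSketch(torch.nn.Module):
    # wraps one Linear layer with a low-rank update; illustrative only
    def __init__(self, org_module: torch.nn.Linear, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.org_module = org_module
        self.org_forward = org_module.forward        # keep a handle to the original forward
        self.lora_down = torch.nn.Linear(org_module.in_features, rank, bias=False)
        self.lora_up = torch.nn.Linear(rank, org_module.out_features, bias=False)
        self.scale = alpha / rank
        self.multiplier = 0.0                        # 0 means the LoRA branch is inactive

    def forward(self, x):
        return self.org_forward(x) + self.multiplier * self.scale * self.lora_up(self.lora_down(x))

class LoRANetworkSketch:
    # while inside "with network:", the wrapped modules route through the LoRA forward,
    # so the training loop can keep calling the original UNet
    def __init__(self, loras):
        self.loras = loras

    def __enter__(self):
        for lora in self.loras:
            lora.multiplier = 1.0
            lora.org_module.forward = lora.forward       # hot-swap the module's forward

    def __exit__(self, exc_type, exc_value, traceback):
        for lora in self.loras:
            lora.multiplier = 0.0
            lora.org_module.forward = lora.org_forward   # restore the original forward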

Visual Concept Sliders training is not working properly.

I am following the README and trying to train the visual concept slider (eye).

But when I try to generate an image after the training is done, it generates a mess. Am I missing something?

image

Here are the command I used and the image pair I prepared.

python trainscripts/imagesliders/train_lora-scale.py --name "eyeslider" --rank 4 --alpha 1 --config_file "trainscripts/imagesliders/data/config_2.yaml" --folder_main "datasets/eyesize/" --folders "bigsize, smallsize" --scales "1, -1"

bigsize
image
smallsize
image

Practical to apply it to another LoRA model (when SDXL already has a LoRA adapter attached)?

Hello! I appreciate your excellent work. Before I delve into experimenting with the code, I'm interested in understanding the practicality of applying a concept slider to an existing LoRA model. Specifically, I'm referring to a scenario where the SDXL model already has a LoRA adapter integrated. I've previously trained a LoRA model using the diffusers LoRA training script. Is it possible to use this as a secondary adapter for semantic control on top of my existing model?

How can I use the repair slider in demo_image_editing.ipynb?

I got this error:

Error(s) in loading state_dict for LoRANetwork:
Missing key(s) in state_dict: "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_q.alpha", "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_q.lora_down.weight",

when I'm trying to use repair.pt in demo_image_editing.ipynb.

Can I use LoRA sliders with PEFT?

Hi, thanks for your work, it looks promising!

I've gone through your inference code: you are using a custom LoRANetwork class, but the diffusers library already has everything needed for loading and running inference with LoRA (including the PEFT integration).

Can I load LoRAs trained by your training script directly via the diffusers/PEFT API, e.g. pipe.load_lora_weights?

Thanks!
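
Whether the checkpoints saved by this repo's trainer use the state_dict key layout diffusers expects is the open question; the snippet below is only a sketch of the standard PEFT-backed entry point (it needs a recent diffusers with peft installed, and the file path and adapter name are made up). The slider weights may need key remapping before this works:

import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Standard diffusers/PEFT loading; whether it accepts a slider checkpoint as-is
# depends on how its state_dict keys are named.
pipe.load_lora_weights("path/to/ageslider_xl.safetensors", adapter_name="slider")
pipe.set_adapters(["slider"], adapter_weights=[2.0])   # the LoRA weight plays the role of the slider scale

image = pipe("a photo of a person", num_inference_steps=30).images[0]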

Questions Regarding null text inversion Method, Visual Slider Training in prompts.yaml, and Loss Function Usage

Hello,

I have a few questions that I'm hoping to get some clarity on:

  • Does the null text inversion method only support the original SD versions v1.x, v2.x, and XL? I've noticed that it performs poorly on my custom-fused SD models. Could you shed some light on this?

  • Regarding the training of the visual slider in prompts.yaml, I tried both erase and enhance actions, but the post-training effects seem identical. I'm curious as to why this is happening.

  • In exploring your code, I understand that the difference between erase and enhance lies in the PromptEmbedsPair class's loss function in prompt_util.py. However, I couldn't locate where this loss function is actually called in the code. Could you please guide me to the relevant part? (A rough sketch of the objective follows at the end of this issue.)

I appreciate your time and assistance in addressing these queries.

Best regards!
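
For orientation only (this is not the repo's actual code): in this family of methods the erase vs. enhance distinction typically amounts to a sign flip on the guided offset that the target prediction is regressed toward, which would also explain why the two can look similar if the offset is small. A minimal sketch with illustrative names and signature:

import torch
import torch.nn.functional as F

def slider_loss(target_pred: torch.Tensor,
                neutral_pred: torch.Tensor,
                positive_pred: torch.Tensor,
                unconditional_pred: torch.Tensor,
                guidance_scale: float,
                action: str) -> torch.Tensor:
    # the offset points from the unconditional concept toward the positive concept
    offset = guidance_scale * (positive_pred - unconditional_pred)
    if action == "enhance":
        goal = neutral_pred + offset   # push the target prediction toward the positive direction
    else:                              # "erase"
        goal = neutral_pred - offset   # push it away instead
    return F.mse_loss(target_pred, goal)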

channels issue

Hi. I tried to train an image slider, but the training failed to start with the error below. Also, the training model was left unchanged from SD1.4. The training environment was RunPod (Linux). What am I doing wrong?

0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/workspace/sliders/trainscripts/imagesliders/train_lora-scale.py", line 501, in
main(args)
File "/workspace/sliders/trainscripts/imagesliders/train_lora-scale.py", line 419, in main
train(config=config, prompts=prompts, device=device, folder_main = args.folder_main, folders = folders, scales = scales)
File "/workspace/sliders/trainscripts/imagesliders/train_lora-scale.py", line 225, in train
denoised_latents_low, low_noise = train_util.get_noisy_image(
File "/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/workspace/sliders/trainscripts/imagesliders/train_util.py", line 221, in get_noisy_image
init_latents = vae.encode(image).latent_dist.sample(None)
File "/venv/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/venv/lib/python3.10/site-packages/diffusers/models/autoencoder_kl.py", line 242, in encode
h = self.encoder(x)
File "/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/venv/lib/python3.10/site-packages/diffusers/models/vae.py", line 111, in forward
sample = self.conv_in(sample)
File "/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 463, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [128, 3, 3, 3], expected input[1, 4, 256, 256] to have 3 channels, but got 4 channels instead
root@64ea1ab8d463:/workspace/sliders#
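
The shape in the error (4 channels) matches an RGBA image: the VAE encoder expects 3-channel RGB input, so a PNG saved with an alpha channel would fail exactly like this. Converting the training images before they reach the VAE is the usual first check; a tiny illustrative snippet (the path is made up):

from PIL import Image

# A PNG saved with transparency loads as RGBA (4 channels); the VAE wants RGB (3).
img = Image.open("datasets/eyesize/bigsize/0.png").convert("RGB")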

Big-eye dataset for Visual Slider

Hi Rohit, thanks for the amazing work. It's really fantastic.
I'm trying Visual Sliders on my private dataset, but the results are not satisfactory.
So I would like to reproduce some of the brilliant models mentioned in the paper, like Eye-size.
The paper says such a model was trained on the large-eyes Ostris dataset, but I'm afraid I cannot find any dataset at the provided link.
May I ask for the dataset?

One more question: since you mentioned the visual sliders only need ~4-6 image pairs, is this the dataset size for all visual sliders?

Best,
Michael

XL-sliders-inference

The XL-sliders-inference notebook depends on annotations in the filename:

            if 'full' in lora_weight:
                train_method = 'full'
            elif 'noxattn' in lora_weight:
                train_method = 'noxattn'
            else:
                train_method = 'noxattn'

            #train_method = 'full'

            network_type = "c3lier"
            if train_method == 'xattn':
                network_type = 'lierla'

            #network_type = 'lierla'

            modules = DEFAULT_TARGET_REPLACE
            if network_type == "c3lier":
                modules += UNET_TARGET_REPLACE_MODULE_CONV
            import os
            model_name = lora_weight

            name = os.path.basename(model_name)
            rank = 1
            alpha = 4
            if 'rank4' in lora_weight:
                rank = 4
            if 'rank8' in lora_weight:
                rank = 8
            if 'alpha1' in lora_weight:
                alpha = 1.0
                

but the pretrained models

https://sliders.baulab.info/weights/xl_sliders/

don't have these annotations
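
A possible workaround until the published weights carry annotations: set the values by hand in the notebook cell above instead of deriving them from the filename. The numbers below are only examples; the published sliders may have been trained with different settings, so they need to match whatever was actually used.

# override the filename-based guesses with the known training settings
train_method = 'noxattn'   # or 'full' / 'xattn', whichever the slider was trained with
rank = 4                   # example values only
alpha = 1.0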

Crash when using `euler_a` sampler

Hi,

The training script does not run if you change the config train.noise_scheduler to euler_a:

  File "T:\code\python\sliders\trainscripts\textsliders\train_lora.py", line 433, in <module>
    main(args)
  File "T:\code\python\sliders\trainscripts\textsliders\train_lora.py", line 378, in main
    train(config=config, prompts=prompts, device=device)
  File "T:\code\python\sliders\trainscripts\textsliders\train_lora.py", line 189, in train
    latents = train_util.get_initial_latents(
  File "T:\code\python\sliders\trainscripts\textsliders\train_util.py", line 55, in get_initial_latents
    latents = noise * scheduler.init_noise_sigma
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
Press any key to continue . . .

Updating get_initial_latents() in train_util.py as shown appears to fix it, although I'm not sure if I broke DDIM or something else in the process:

def get_initial_latents(
    scheduler: SchedulerMixin,
    n_imgs: int,
    height: int,
    width: int,
    n_prompts: int,
    generator=None,
) -> torch.Tensor:
    noise = get_random_noise(n_imgs, height, width, generator=generator).repeat(
        n_prompts, 1, 1, 1
    ).to("cuda")

    init_noise_sigma = torch.tensor(scheduler.init_noise_sigma).to("cuda")
    latents = noise * init_noise_sigma

    return latents

Long training time?

Compared to training a LoRA via, for example, the kohya scripts, it takes a great deal longer to train a slider LoRA. Considering that the results are not always that great, because concepts get confused and retraining with additional --attributes is needed, would it be possible to get an idea of whether the training is going anywhere, so we can adjust earlier or abort entirely?

Or could training speeds perhaps be increased somehow? Maybe LCM would be an option, as it seems actual images are generated during training, which is a lot faster with LCM.

wandb logging

Hello, it seems the wandb logging is not working properly; losses are only logged to the console when verbose is true.

Cannot import lora

from lora import LoRANetwork, DEFAULT_TARGET_REPLACE, UNET_TARGET_REPLACE_MODULE_CONV

Where is the lora file? I cannot seem to import it.
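
For reference, lora.py appears to live alongside the training scripts (trainscripts/textsliders/ in this repo), so the import works when running from that directory; otherwise the folder can be added to the path first. A small sketch, assuming the repo root is the working directory:

import sys

sys.path.append("trainscripts/textsliders")   # folder containing lora.py

from lora import LoRANetwork, DEFAULT_TARGET_REPLACE, UNET_TARGET_REPLACE_MODULE_CONV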

Hope to add xformers

Thank you for the great work!
I noticed your earlier paper, ESD, back in April, and I used LECO like sliders by adjusting the CFG.
Here is an August article about LECO that I wrote:
https://civitai.com/articles/1766

I think sliders are faster and more effective than normal fine-tuning.
But even with LECO, training SDXL uses about 20 GB of VRAM when using xformers, so I think this repo needs xformers support to save VRAM.

Questions about training/inference implementation details?

  1. In the requirements.txt, the version of diffusers is 0.20.2, but it appears that the __call__ method in the https://github.com/rohitgandikota/sliders/blob/main/eval-scripts/generate_images_xl.py#L39 has been modified based on diffusers version 0.21.0 and above. The original implementation of diffusers is here https://github.com/huggingface/diffusers/blob/v0.21.0/src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py#L544. What is the rationale behind this?
  2. I found that the implementation of LoRA is adapted from https://github.com/p1atdev/LECO/blob/main/lora.py, and it uses the 'c3lier' type of LoRA. What is the reason for this?
  3. I see that most of the results are based on XL. How is the performance on SD1, and do you have any pre-trained models for it?
  4. The SDEdit technique was discussed in #2.

Image editing with the XL model

Hi! I'm so happy to use the image editing code.

I have a question about image editing with the XL model: how can I do that?

Why is the performance on dogs bad?

I only edited prompt.yaml and trained using the official command:

python trainscripts/textsliders/train_lora.py --attributes 'male, female' --name 'ageslider' --rank 4 --alpha 1 --config_file 'trainscripts/textsliders/data/config.yaml'

prompt.yaml:

- target: "dog" # what word for erasing the positive concept from
  positive: "dog, very old" # concept to erase
  unconditional: "dog, very young" # word to take the difference from the positive concept
  neutral: "dog" # starting point for conditioning the target
  action: "enhance" # erase or enhance
  guidance_scale: 4
  resolution: 512 
  dynamic_resolution: false
  batch_size: 1

The result I got when running demo_image_editing.ipynb (only modifying scales to [-1, 0, 2, 4, 6, 8, 10]):

output

Compare with LEDITS

It seems that the essence of your work is the same as LEDITS. Can you compare with them, please?

TypeError: issubclass() arg 1 must be a class

I just quick test and it give me this error

(sliders) F:\slider lora\sliders>python trainscripts/textsliders/train_lora.py --attributes 'male, female' --name 'ageslider' --rank 4 --alpha 1 --config_file 'trainscripts/textsliders/data/config.yaml'
Traceback (most recent call last):
File "F:\slider lora\sliders\trainscripts\textsliders\train_lora.py", line 18, in <module>
import prompt_util
File "F:\slider lora\sliders\trainscripts\textsliders\prompt_util.py", line 44, in <module>
class PromptSettings(BaseModel): # yaml のやつ
File "pydantic\main.py", line 198, in pydantic.main.ModelMetaclass.__new__
File "pydantic\fields.py", line 506, in pydantic.fields.ModelField.infer
File "pydantic\fields.py", line 436, in pydantic.fields.ModelField.__init__
File "pydantic\fields.py", line 552, in pydantic.fields.ModelField.prepare
File "pydantic\fields.py", line 668, in pydantic.fields.ModelField._type_analysis
File "C:\Users\Pond\anaconda3\envs\sliders\lib\typing.py", line 852, in __subclasscheck__
return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class

chardet required

Just to let you know that, using conda, I received this message:

Traceback (most recent call last):
File "C:\Users\AdminI9\anaconda3\envs\sliders\lib\site-packages\requests\compat.py", line 11, in <module>
import chardet
ModuleNotFoundError: No module named 'chardet'

Fixed with 'pip install chardet'.

Eventually, chardet should be included in requirements.txt.

ETA: forgot to mention -> Windows 10

Datasets for the trained models

Will it be possible to share some of the images you used to train these models?
I discovered that some models work well and some don't on certain images. I'm wondering whether that's due to the images used for training, or the diffusion model struggling to figure out the concept.

Issues with sliders training

Hi!

I'm trying to run the training but I have this problem:
(screenshot)

Do I have to modify something in config.xml?

(screenshot)

It's weird because I got it working on another PC; I'm running Windows. Could you help me? Thanks!

Notebook not found

When I click on the colab notebook button, the colab server gives the following, even after authenticating with GH:

(screenshot, 2023-11-24)

Evaluating pretrained models using the notebook SD1-sliders-inference.ipynb

Dear authors,

Thank you very much for sharing this code. However, I face a problem when evaluating your method with the provided file "SD1-sliders-inference.ipynb". (Could you please check the error below? I only pasted part of it.)

I first trained your method with "python trainscripts/textsliders/train_lora.py --attributes 'male, female' --name 'ageslider' --rank 4 --alpha 1 --config_file 'trainscripts/textsliders/data/config.yaml'"

Could you please help to check this problem? Thank you very much!

"""
*** RuntimeError: Error(s) in loading state_dict for LoRANetwork:
Missing key(s) in state_dict: "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_q.alpha", "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_q.lora_down.weight", "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_q.lora_up.weight", "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_k.alpha", "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_k.lora_down.weight", "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_k.lora_up.weight", "lora_unet_down_blocks_0_attentions_0_transformer_blocks_0_attn1_to_v.alpha",
.......
.......
.......
size mismatch for lora_unet_up_blocks_2_resnets_2_conv2.lora_down.weight: copying a param with shape torch.Size([4, 320, 3, 3]) from checkpoint, the shape in current model is torch.Size([4, 640, 3, 3]).
size mismatch for lora_unet_up_blocks_2_resnets_2_conv2.lora_up.weight: copying a param with shape torch.Size([320, 4, 1, 1]) from checkpoint, the shape in current model is torch.Size([640, 4, 1, 1]).
size mismatch for lora_unet_up_blocks_2_resnets_2_conv_shortcut.lora_down.weight: copying a param with shape torch.Size([4, 640, 1, 1]) from checkpoint, the shape in current model is torch.Size([4, 960, 1, 1]).
size mismatch for lora_unet_up_blocks_2_resnets_2_conv_shortcut.lora_up.weight: copying a param with shape torch.Size([320, 4, 1, 1]) from checkpoint, the shape in current model is torch.Size([640, 4, 1, 1]).
"""

Is it possible to train with checkpoints?

The config file says I can use ckpt or safetensors, but when I tried gsdf/Counterfeit-V3.0 as a test, I got an error. Reading the error, it looks like it won't train unless the model is in diffusers format.

OSError: gsdf/Counterfeit-V3.0 does not appear to have a file named text_encoder/config.json. Checkout 'https://huggingface.co/gsdf/Counterfeit-V3.0/main' for available files.

I was also wondering if it is possible to train with a local model.
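
One possible route for single-file checkpoints is converting them to diffusers format first and pointing the config at the resulting folder; recent diffusers releases can do this directly. A hedged sketch, with illustrative local paths:

import torch
from diffusers import StableDiffusionPipeline

# Reads a single .ckpt/.safetensors file and materializes the diffusers-format
# folders (text_encoder/, unet/, vae/, ...) that the error above is asking for.
pipe = StableDiffusionPipeline.from_single_file(
    "path/to/Counterfeit-V3.0.safetensors",   # local checkpoint path (illustrative)
    torch_dtype=torch.float16,
)
pipe.save_pretrained("Counterfeit-V3.0-diffusers")  # then point the config at this folder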

I have a question about Visual Concept Sliders!

Hi! I'm trying to train a slider to remove the background from an image and replace it with white.

But the visual slider didn't work as I expected. I changed the learning rate and rank but didn't get the desired result. The result is not stable compared to the text slider.

On the right is a green-screen LoRA that I previously created using the copier LoRA technique (https://note.com/kohya_ss/n/nb258da07236f). What would I need to modify to get a similar result with the visual concept slider?

image

Below is the image pair I prepared.
image

Support additional image types & handle resizing more gracefully

First, thank you for publishing your amazing work! My first impression of Textual Concepts training was very good. I'm attempting to train a Visual Concept now, but ran into a couple issues:

  • The train_lora-scale.py script is hardcoded to use .png files (line 216); it should be trivial to extend support to other common file types such as .jpg and maybe .webp (see the sketch at the end of this issue).
  • The script also forces images to resize to 1024x1024. I'm guessing for SD 1.5 this should be 512x512, no? It would also be interesting if the trainer could support aspect-ratio bucketing in the future. 🤔

Thanks!

EDIT: Hmm, I'm a little confused as to why the train_lora-scale-xl script resizes inputs to 512x512 while the train_lora-scale script uses 1024x1024. Isn't this backwards?
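
Here is the kind of change the first bullet has in mind, as a rough sketch rather than the script's actual code (the folder name is illustrative): collect training images by an extension set instead of a single hard-coded *.png pattern.

from pathlib import Path

IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".webp"}
folder = Path("datasets/eyesize/bigsize")
images = sorted(p for p in folder.iterdir() if p.suffix.lower() in IMAGE_EXTS)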

How can I fix the error "LoRAModule.forward() takes 2 positional arguments but 3 were given"

0%| | 0/50 [00:00<?, ?it/s]

TypeError Traceback (most recent call last)
Cell In[6], line 54
52 for scale in scales:
53 generator = torch.manual_seed(seed)
---> 54 images = pipe(prompt, num_images_per_prompt=1, num_inference_steps=50, generator=generator, network=network, start_noise=start_noise, scale=scale, unet=unet).images[0]
55 image_list.append(images)
56 del unet, network, pipe

File ~/miniconda3/envs/sd/lib/python3.10/site-packages/torch/autograd/grad_mode.py:27, in _DecoratorContextManager.__call__.<locals>.decorate_context(*args, **kwargs)
24 @functools.wraps(func)
25 def decorate_context(*args, **kwargs):
26 with self.clone():
---> 27 return func(*args, **kwargs)

Cell In[2], line 313, in __call__(self, prompt, prompt_2, height, width, num_inference_steps, denoising_end, guidance_scale, negative_prompt, negative_prompt_2, num_images_per_prompt, eta, generator, latents, prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds, output_type, return_dict, callback, callback_steps, cross_attention_kwargs, guidance_rescale, original_size, crops_coords_top_left, target_size, negative_original_size, negative_crops_coords_top_left, negative_target_size, network, start_noise, scale, unet)
311 added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids}
312 with network:
--> 313 noise_pred = unet(
314 latent_model_input,
315 t,
316 encoder_hidden_states=prompt_embeds,
317 cross_attention_kwargs=cross_attention_kwargs,
318 added_cond_kwargs=added_cond_kwargs,
319 return_dict=False,
320 )[0]
322 # perform guidance
323 if do_classifier_free_guidance:

File ~/miniconda3/envs/sd/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/miniconda3/envs/sd/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py:966, in UNet2DConditionModel.forward(self, sample, timestep, encoder_hidden_states, class_labels, timestep_cond, attention_mask, cross_attention_kwargs, added_cond_kwargs, down_block_additional_residuals, mid_block_additional_residual, encoder_attention_mask, return_dict)
956 sample, res_samples = downsample_block(
957 hidden_states=sample,
958 temb=emb,
(...)
963 **additional_residuals,
964 )
965 else:
--> 966 sample, res_samples = downsample_block(hidden_states=sample, temb=emb, scale=lora_scale)
968 if is_adapter and len(down_block_additional_residuals) > 0:
969 sample += down_block_additional_residuals.pop(0)

File ~/miniconda3/envs/sd/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/miniconda3/envs/sd/lib/python3.10/site-packages/diffusers/models/unet_2d_blocks.py:1183, in DownBlock2D.forward(self, hidden_states, temb, scale)
1179 hidden_states = torch.utils.checkpoint.checkpoint(
1180 create_custom_forward(resnet), hidden_states, temb
1181 )
1182 else:
-> 1183 hidden_states = resnet(hidden_states, temb, scale=scale)
1185 output_states = output_states + (hidden_states,)
1187 if self.downsamplers is not None:

File ~/miniconda3/envs/sd/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []

File ~/miniconda3/envs/sd/lib/python3.10/site-packages/diffusers/models/resnet.py:637, in ResnetBlock2D.forward(self, input_tensor, temb, scale)
626 input_tensor = (
627 self.downsample(input_tensor, scale=scale)
628 if isinstance(self.downsample, Downsample2D)
629 else self.downsample(input_tensor)
630 )
631 hidden_states = (
632 self.downsample(hidden_states, scale=scale)
633 if isinstance(self.downsample, Downsample2D)
634 else self.downsample(hidden_states)
635 )
--> 637 hidden_states = self.conv1(hidden_states, scale)
639 if self.time_emb_proj is not None:
640 if not self.skip_time_act:

File ~/miniconda3/envs/sd/lib/python3.10/site-packages/torch/nn/modules/module.py:1194, in Module._call_impl(self, *input, **kwargs)
1190 # If we don't have any hooks, we want to skip the rest of the logic in
1191 # this function, and just call forward.
1192 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1193 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194 return forward_call(*input, **kwargs)
1195 # Do not call functions when jit is used
1196 full_backward_hooks, non_full_backward_hooks = [], []

TypeError: LoRAModule.forward() takes 2 positional arguments but 3 were given
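
One plausible cause, judging from the traceback itself, is a version mismatch: the diffusers resnet block here passes an extra positional 'scale' argument into the patched layer's forward, which is exactly what the TypeError complains about, and another issue in this list notes that requirements.txt pins diffusers 0.20.2. A quick hedged first check is whether the installed version matches that pin:

import diffusers

# compare against the version the repo pins (requirements.txt lists diffusers 0.20.2)
print(diffusers.__version__)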

TypeError: issubclass() arg 1 must be a class

When trying to run a command to train SD-XL:
python trainscripts/textsliders/train_lora_xl.py --attributes 'male, female' --name 'agesliderXL' --rank 4 --alpha 1 --config_file 'data/config-xl.yaml

An error occurs:

(sliders) C:\!NeuralNetwork\sliders>python trainscripts/textsliders/train_lora_xl.py --attributes 'male, female' --name 'agesliderXL' --rank 4 --alpha 1 --config_file 'data/config-xl.yaml
Traceback (most recent call last):
File "C:\!NeuralNetwork\sliders\trainscripts\textsliders\train_lora_xl.py", line 18, in <module>
import prompt_util
File "C:\!NeuralNetwork\sliders\trainscripts\textsliders\prompt_util.py", line 44, in <module>
class PromptSettings(BaseModel): # yaml のやつ
File "pydantic\main.py", line 198, in pydantic.main.ModelMetaclass.__new__
File "pydantic\fields.py", line 506, in pydantic.fields.ModelField.infer
File "pydantic\fields.py", line 436, in pydantic.fields.ModelField.__init__
File "pydantic\fields.py", line 552, in pydantic.fields.ModelField.prepare
File "pydantic\fields.py", line 668, in pydantic.fields.ModelField._type_analysis
File "C:\Users\Aleksandr.Antropov\AppData\Local\miniconda3\envs\sliders\lib\typing.py", line 852, in __subclasscheck__
return issubclass(cls, self.__origin__)
TypeError: issubclass() arg 1 must be a class
image

The environment setup and requirements.txt installation were executed verbatim:

conda create -n sliders python=3.9
conda activate sliders

git clone https://github.com/rohitgandikota/sliders.git
cd sliders
pip install -r requirements.txt

There were no errors during the installation of dependencies.

Request to share the parameters used to train Fix_hands.pt

Hi, I was particularly interested in the Fix_hands.pt slider, but was getting mixed results with Juggernaut-XL. I have used EyeSize.pt and it works well. I was curious what parameters were used for the shared LoRA files, if available. Please direct me if this has already been shared. Thanks!!!
