
controlnetinpaint's Introduction

My name is Mikolaj Czerkawski, I am a Research Fellow at the European Space Agency.

My research interests involve computer vision, signal processing, and machine learning.

A recurring theme in my work is learning in data-limited settings. The topics I tend to work on include:

  • Multi-Modal Learning
  • Generative Models
  • Image Synthesis and Manipulation
  • Image Super-Resolution
  • Image-to-Image Translation
  • Model Robustness Assessment
  • Computer Vision for Remote Sensing Applications
  • Computer Vision for Radar Signal Processing

My research involves applying computer vision techniques to real-world applications where (i) the datasets are small or (ii) there is a high risk of poor generalization. So far, this has primarily been done with short-range radar data and satellite imagery.

[LinkedIn] Twitter

controlnetinpaint's People

Contributors

mikonvergence, neelays, remorses


controlnetinpaint's Issues

About 'strength' Parameter in StableDiffusionControlNetInpaintPipeline Compared to StableDiffusionInpaintPipeline

The StableDiffusionInpaintPipeline introduces a strength parameter, as detailed in the diffusers documentation. However, I couldn't locate this parameter in the StableDiffusionControlNetInpaintPipeline.

If I use the parameters num_inference_steps=40 and strength=0.93 in StableDiffusionInpaintPipeline, should I then use num_inference_steps=37 (calculated as 40 * 0.93) in StableDiffusionControlNetInpaintPipeline?
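For reference, a minimal sketch of how diffusers-style img2img/inpaint pipelines typically turn strength into a shortened timestep schedule (this mirrors the standard diffusers helper and is not code from this repository):

# Sketch of the usual diffusers helper: keep only the last `strength` fraction
# of the denoising schedule (names follow the img2img/inpaint pipelines).
def get_timesteps(scheduler, num_inference_steps: int, strength: float):
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    timesteps = scheduler.timesteps[t_start:]
    return timesteps, num_inference_steps - t_start

Under that reading, num_inference_steps=40 with strength=0.93 does execute roughly 37 denoising steps, but those steps start from the input image noised to the first kept timestep; a pipeline without a strength parameter starts from the full schedule, so passing num_inference_steps=37 on its own is not an exact equivalent.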

How to get multiple images for multiple prompts

Hello @mikonvergence, your work is awesome and I have a query about an issue that has been on my mind for days.

I have 10-15 different prompts that I want to run against a single image; on a T4 GPU, the memory fragments even for a single image and a single prompt.

Thanks and Regards,
Satwik Sunnam.
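A rough sketch of one way to handle this on a T4 (assuming pipe, image, mask_image and control_image are set up as in the repo's notebook): generate one prompt at a time and release cached memory between calls.

import torch

prompts = ["a red leather sofa", "a blue velvet sofa"]  # placeholder prompts
results = []
for prompt in prompts:
    out = pipe(
        prompt,
        image=image,
        mask_image=mask_image,
        control_image=control_image,
        num_inference_steps=20,
    ).images[0]
    results.append(out)
    torch.cuda.empty_cache()  # release cached blocks before the next prompt

Loading the models in float16 and calling pipe.enable_attention_slicing() are the usual further memory savings on a 16 GB card.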

Unexpected results when using the Colab example with other images

Hello,

I'm trying to use the provided Google Colab notebook to mask out a garment in an image of a person and change it with a textual prompt (e.g. its color), but I'm encountering issues with the generated image. Specifically, the generated image appears to be of poor quality and has a mixed-up appearance.

Here are my inputs:
The person wearing the garment:

[image]

The same person with the garment region painted grey (this serves as the mask):

[image]

Prompt

text_prompt="A woman wearing a green shirt"

It seems intuitive; however, the output image I'm receiving is not what I expected. I've followed the instructions provided in the repo, but I'm still unable to achieve satisfactory results.

OUTPUT
[image]

note:

  1. I tried converting the grey color of the mask image to black to see if it yields any better results, but unfortunately it did not.
  2. I tried the Canny condition with the image and the mask image to see if it made any difference, but the generated image still looked like this.

Could you please provide some guidance on how to improve the output image quality? If there are any known issues or limitations with the current implementation, please let me know as well.

Cheers
Seth
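One thing worth checking, as a general note on the diffusers convention rather than on this repo specifically: mask_image is expected to be white (255) where the region should be regenerated and black (0) where the original pixels should be kept, so a grey garment region needs to be binarized first. A minimal sketch, with a hypothetical file path and an assumed threshold:

import numpy as np
from PIL import Image

# hypothetical path to the grey-garment mask shown above
mask = np.array(Image.open("mask.png").convert("L"))
# white where the garment should be repainted, black elsewhere (threshold assumed)
binary_mask = Image.fromarray(((mask > 127) * 255).astype(np.uint8))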

Did you retrain the ControlNet for the SD-inpainting backbone?

Hi! Thank you for this repo.

I did not understand whether you retrained ControlNet using the SD-inpainting backbone, or whether you copied over the weights that were trained for the regular SD backbone by the ControlNet authors, and those weights somehow work on the SD-inpainting backbone as well?

Thank you very much,
Thibault
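For reference, a sketch of the pairing the notebook appears to rely on: the publicly released ControlNet weights (trained against the standard SD 1.5 backbone) loaded alongside the inpainting checkpoint. Whether any retraining was involved is exactly what this issue asks, so treat the model IDs and construction below as assumptions rather than a confirmed answer.

import torch
from diffusers import ControlNetModel
from src.pipeline_stable_diffusion_controlnet_inpaint import (
    StableDiffusionControlNetInpaintPipeline,
)

# ControlNet weights released for the standard SD 1.5 backbone
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
# plugged into the SD inpainting backbone (checkpoint ID assumed from the notebook)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    controlnet=controlnet,
    torch_dtype=torch.float16,
)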

TypeError: StableDiffusionControlNetPipeline.prepare_image() missing 1 required positional argument: 'do_classifier_free_guidance'

# (continues from the notebook's setup cells: pipe, text_prompt, image,
#  canny_image and mask_image are defined there)
import torch

pipe.to('cuda')

# generate image
generator = torch.manual_seed(0)
new_image = pipe(
    text_prompt,
    num_inference_steps=20,
    generator=generator,
    image=image,
    control_image=canny_image,
    controlnet_conditioning_scale = 0.5,
    mask_image=mask_image
).images[0]

new_image.save('output/canny_result.png')

Thanks for your great work. Running the above code in the notebook, I get the following error:

Traceback (most recent call last):
  in <module>, line 5
      new_image = pipe(
          text_prompt,
          num_inference_steps=20,
          generator=generator,
  File d:\App\miniconda\envs\aigc\lib\site-packages\torch\autograd\grad_mode.py, line 27, in decorate_context
      return func(*args, **kwargs)
  File c:\Users\Arthur\Downloads\ControlNetInpaint-main\ControlNetInpaint-main\src\pipeline_stable_diffusion_controlnet_inpaint.py, line 394, in __call__
      # 4. Prepare image
      control_image = self.prepare_image(
          control_image,
          width,
          height,
TypeError: StableDiffusionControlNetPipeline.prepare_image() missing 1 required positional argument: 'do_classifier_free_guidance'
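This looks like a diffusers version mismatch: newer releases added required arguments (including do_classifier_free_guidance) to prepare_image(), and the repo's custom pipeline predates that change. A quick check; the version pin mentioned below is an assumption, not something the repo documents:

import diffusers

print(diffusers.__version__)
# If this is newer than the release the repo was written against, either
# downgrade (e.g. `pip install diffusers==0.14.0`, version number assumed)
# or update the self.prepare_image(...) call in
# src/pipeline_stable_diffusion_controlnet_inpaint.py to pass the newly
# required arguments.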

RuntimeError: GET was unable to find an engine to execute this computation

RuntimeError Traceback (most recent call last)
Cell In[9], line 4
1 from controlnet_aux import OpenposeDetector
3 openpose = OpenposeDetector.from_pretrained('lllyasviel/ControlNet')
----> 4 pose_image = openpose(image)
5 pose_image

File /home/pai/lib/python3.9/site-packages/controlnet_aux/open_pose/__init__.py:83, in OpenposeDetector.__call__(self, input_image, detect_resolution, image_resolution, hand_and_face, return_pil)
81 H, W, C = input_image.shape
82 with torch.no_grad():
---> 83 candidate, subset = self.body_estimation(input_image)
84 hands = []
85 faces = []

File /home/pai/lib/python3.9/site-packages/controlnet_aux/open_pose/body.py:44, in Body.__call__(self, oriImg)
42 # data = data.permute([2, 0, 1]).unsqueeze(0).float()
43 with torch.no_grad():
---> 44 Mconv7_stage6_L1, Mconv7_stage6_L2 = self.model(data)
45 Mconv7_stage6_L1 = Mconv7_stage6_L1.cpu().numpy()
46 Mconv7_stage6_L2 = Mconv7_stage6_L2.cpu().numpy()

File /home/pai/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /home/pai/lib/python3.9/site-packages/controlnet_aux/open_pose/model.py:116, in bodypose_model.forward(self, x)
114 def forward(self, x):
--> 116 out1 = self.model0(x)
118 out1_1 = self.model1_1(out1)
119 out1_2 = self.model1_2(out1)

File /home/pai/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /home/pai/lib/python3.9/site-packages/torch/nn/modules/container.py:217, in Sequential.forward(self, input)
215 def forward(self, input):
216 for module in self:
--> 217 input = module(input)
218 return input

File /home/pai/lib/python3.9/site-packages/torch/nn/modules/module.py:1501, in Module._call_impl(self, *args, **kwargs)
1496 # If we don't have any hooks, we want to skip the rest of the logic in
1497 # this function, and just call forward.
1498 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1499 or _global_backward_pre_hooks or _global_backward_hooks
1500 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1501 return forward_call(*args, **kwargs)
1502 # Do not call functions when jit is used
1503 full_backward_hooks, non_full_backward_hooks = [], []

File /home/pai/lib/python3.9/site-packages/torch/nn/modules/conv.py:463, in Conv2d.forward(self, input)
462 def forward(self, input: Tensor) -> Tensor:
--> 463 return self._conv_forward(input, self.weight, self.bias)

File /home/pai/lib/python3.9/site-packages/torch/nn/modules/conv.py:459, in Conv2d._conv_forward(self, input, weight, bias)
455 if self.padding_mode != 'zeros':
456 return F.conv2d(F.pad(input, self._reversed_padding_repeated_twice, mode=self.padding_mode),
457 weight, bias, self.stride,
458 _pair(0), self.dilation, self.groups)
--> 459 return F.conv2d(input, weight, bias, self.stride,
460 self.padding, self.dilation, self.groups)

RuntimeError: GET was unable to find an engine to execute this computation
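This error usually points at the convolution backend (cuDNN) failing to select a kernel, most often because the installed torch build does not match the machine's CUDA/cuDNN setup, or because GPU memory ran out mid-call. A quick diagnostic under that assumption:

import torch

print(torch.__version__)
print(torch.version.cuda)               # CUDA version this torch build targets
print(torch.backends.cudnn.version())   # cuDNN version torch can see
print(torch.cuda.is_available())
# If these disagree with the locally installed driver/toolkit, installing a
# torch build that matches the local CUDA setup typically resolves the
# "unable to find an engine" failure.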

Can this work with SD 2 Inpainting

Thanks a ton for this repo. I have 2 questions:

  1. Is there a way to make it work with SD 2 Inpainting and potentially upcoming inpainting models (XL etc.)?
  2. If I have a ckpt of a custom inpainting model, how can I convert that to the diffusers format? (A sketch of one approach follows below.)
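On question 2, a hedged sketch: newer diffusers releases can load an original .ckpt/.safetensors checkpoint directly and re-save it in the diffusers folder format (the checkpoint path below is a placeholder); the scripts/convert_original_stable_diffusion_to_diffusers.py script in the diffusers source tree does the same conversion from the command line.

from diffusers import StableDiffusionInpaintPipeline

# Load an original single-file inpainting checkpoint (newer diffusers releases)
pipe = StableDiffusionInpaintPipeline.from_single_file(
    "my_inpainting_model.ckpt"  # placeholder path to the custom checkpoint
)
# Write it back out in the diffusers folder layout
pipe.save_pretrained("my_inpainting_model_diffusers")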

No removing effect

Thanks for the great repo. I was trying to remove an object from an image. I used the Canny method, set the prompt to be empty, and decreased controlnet_conditioning_scale to 0. This works on the default image in the Colab but not with any other image; in fact, it produces something else in the masked area. Could you please explain what else should be done to achieve the removal effect?

Inpainting new "concepts"

Great work @mikonvergence!
I have a question that is somewhat related to #1. Say I have a poster image and want to inpaint the face in the poster with a given avatar image like:
[screenshot]
How can I achieve this, given that these avatars are a new "concept" for the LDM? I did try the method you mentioned in that issue, but it did not work out for me.
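One hedged direction, since the base model has never seen these avatars: teach the new concept first (for example with textual inversion or DreamBooth) and then reference the learned token in the inpainting prompt. A sketch, assuming the installed diffusers version exposes load_textual_inversion on this pipeline; the embedding path and token are placeholders:

# pipe, image, mask_image and control_image set up as in the notebook
pipe.load_textual_inversion("learned_embeds.bin", token="<my-avatar>")

result = pipe(
    "a poster featuring a <my-avatar> face",  # prompt using the learned token
    image=image,
    mask_image=mask_image,
    control_image=control_image,
    num_inference_steps=30,
).images[0]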

MultiControlNet support?

In the original ControlNet pipeline, we can pass a list of controlnet models like this

        self.ptxt = StableDiffusionControlNetPipeline.from_pretrained(
                "runwayml/stable-diffusion-v1-5",
                safety_checker=None,
                requires_safety_checker=False,
                controlnet=[
                    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16),
                    ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16)
                ],
                torch_dtype=torch.float16).to("cuda")

Is this supported in this pipeline?

Cheers

promptless inpainting?

Is there a way to do promptless inpainting with ControlNet and the Stable Diffusion 1.5 inpainting model? I want to recreate https://civitai.com/articles/1907 in Colab but don't know how, and I don't want Gradio UIs or a server, because sd-webui can't run on free Colab and my PC is weak.
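A minimal sketch of one way to approximate this in plain Colab, assuming the repo's notebook setup and the standard diffusers call signature: pass an empty prompt and a low guidance scale so the text conditioning has little influence.

result = pipe(
    "",                      # empty prompt
    negative_prompt="",
    guidance_scale=1.0,      # minimize the effect of classifier-free guidance
    num_inference_steps=30,
    image=image,
    mask_image=mask_image,
    control_image=control_image,
).images[0]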

About Training

Hi! Thanks for your great work!
I'd like to ask how the model is trained. Do you train both the inpainting UNet and the ControlNet, or are the two trained separately?
