
svd-temporal-controlnet's People

Contributors

blakeone, ciarastrawberry, eltociear


svd-temporal-controlnet's Issues

some other conditions

hi @CiaraStrawberry,
Thank you for open-sourcing such great work. Currently it only supports depth maps; I'm not sure whether it supports other conditions such as pose. Could I also train my own model on pose or other conditions?
Looking forward to your reply!

Torch not compiled with CUDA enabled

(venv) C:\svd-temporal-controlnet>python run_inference.py
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
    PyTorch 2.1.2+cu121 with CUDA 1201 (you have 2.1.2+cpu)
    Python  3.10.11 (you have 3.10.6)
  Please reinstall xformers (see https://github.com/facebookresearch/xformers#installing-xformers)
  Memory-efficient attention, SwiGLU, sparse and more won't be available.
  Set XFORMERS_MORE_DETAILS=1 for more details
layers per block is 2
Loading pipeline components...: 100%|████████████████████████████████████████████████████| 5/5 [00:00<00:00, 25.25it/s]
Traceback (most recent call last):
  File "C:\svd-temporal-controlnet\run_inference.py", line 274, in <module>
    video_frames = pipeline(validation_image, validation_control_images[:14], decode_chunk_size=8,num_frames=14,motion_bucket_id=100,controlnet_cond_scale=1.0).frames
  File "C:\svd-temporal-controlnet\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "C:\svd-temporal-controlnet\pipeline\pipeline_stable_video_diffusion_controlnet.py", line 441, in __call__
    image_embeddings = self._encode_image(image, device, num_videos_per_prompt, do_classifier_free_guidance)
  File "C:\svd-temporal-controlnet\pipeline\pipeline_stable_video_diffusion_controlnet.py", line 155, in _encode_image
    image = image.to(device=device, dtype=dtype)
  File "C:\svd-temporal-controlnet\venv\lib\site-packages\torch\cuda\__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Also, can you tell me where the downloaded models are stored?
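For reference, this AssertionError means a CPU-only PyTorch build is installed (note the "2.1.2+cpu" in the xFormers warning above). A quick check, with an install command that assumes a CUDA 12.1 setup is appropriate for your machine:

# Check which PyTorch build is installed and whether CUDA is usable.
import torch

print(torch.__version__)          # "2.1.2+cpu" indicates a CPU-only build
print(torch.cuda.is_available())  # False on CPU-only builds

# If a "+cpu" build is reported, reinstall a CUDA wheel, e.g. for CUDA 12.1:
#   pip uninstall torch
#   pip install torch==2.1.2 --index-url https://download.pytorch.org/whl/cu121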

Great work. Curious about more details on training.

Hi,

Kudos on the great work on ControlSVD. I tested it on some of my own cases and the results look very appealing. Do you plan to release more details about training, e.g., the size of the dataset you used, hyper-parameters, etc.? It would also be very helpful if you could release a sample of the training data. Looking forward to your response.

Regards
[attached image: combined_frames_20240227-042221]

How to call this control node?

Hello, I started learning SVD not long ago and installed your plug-in, but I couldn't find the corresponding node configuration in ComfyUI. How can I bring up this control node in ComfyUI?
Thank you.

Sampling of sigma

Hi, it seems that a simple log-normal distribution should be used to sample sigma.

def rand_cosine_interpolated(shape, image_d, noise_d_low, noise_d_high, sigma_data=1., min_value=1e-3, max_value=1e3, device='cpu', dtype=torch.float32):

Replace it with:

import torch

def rand_log_normal(shape, loc=0., scale=1., device='cpu', dtype=torch.float32):
    """Draws samples from a log-normal distribution."""
    # Sample uniformly in (0, 1), nudged away from the endpoints for numerical stability.
    u = torch.rand(shape, dtype=dtype, device=device) * (1 - 2e-7) + 1e-7
    # Invert the normal CDF, then exponentiate to obtain log-normal samples.
    return torch.distributions.Normal(loc, scale).icdf(u).exp()

Relevant discussions can be found here.
pixeli99/SVD_Xtend#21
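For context, a usage sketch of the replacement sampler; the loc/scale values below follow the EDM-style log-normal setting (P_mean=0.7, P_std=1.6) often quoted for SVD fine-tuning, which is an assumption rather than a value taken from this repo:

# Sample per-example noise levels for a batch of 4 training examples.
# loc=0.7, scale=1.6 are assumed EDM-style values, not repo defaults.
sigmas = rand_log_normal([4], loc=0.7, scale=1.6)
print(sigmas)  # e.g. tensor([1.83, 0.95, 4.12, 2.07]); values vary per run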

Size of output

Hi, the pipeline takes height and width as parameters, but I always get 256×256 output from run_inference.
[attached image: temp_2_20231225-022759]

  • merry xmas
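A likely workaround for the 256×256 output, assuming this pipeline forwards height and width like the stock StableVideoDiffusionPipeline, is to pass them explicitly in the run_inference.py call; 576x1024 below is SVD's native resolution, used here as an assumption about what the checkpoint expects:

# Hypothetical variant of the run_inference.py call with explicit size.
video_frames = pipeline(
    validation_image,
    validation_control_images[:14],
    decode_chunk_size=8,
    num_frames=14,
    motion_bucket_id=100,
    controlnet_cond_scale=1.0,
    height=576,   # SVD's native height (assumed appropriate here)
    width=1024,   # SVD's native width
).frames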

Discussion with CFG

For inference in the paper https://arxiv.org/pdf/2211.09800.pdf, C_I and C_T seem to be two conditions, so there should be three inference passes and two guidance scales. You still use only C_T, not C_I, as the condition for inference. Have you tried the same approach that InstructPix2Pix uses?
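For reference, InstructPix2Pix combines the two conditions with three model evaluations and two guidance scales; a minimal sketch of that combination (variable names here are illustrative, not from this repo):

# InstructPix2Pix-style dual classifier-free guidance:
#   eps_uncond = eps(z, no image, no text)
#   eps_img    = eps(z, c_I, no text)
#   eps_full   = eps(z, c_I, c_T)
def combine_dual_cfg(eps_uncond, eps_img, eps_full, s_image=1.5, s_text=7.5):
    return (eps_uncond
            + s_image * (eps_img - eps_uncond)
            + s_text * (eps_full - eps_img))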

What is the motion value?

Hi, thanks for your great work!
I'm trying to adapt your code for my work. However, I don't know what the 'motion value' means; could you explain its usage?
Also, why can the motion value be converted into "add_time_ids"?
Thank you!
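For context: in the stock diffusers StableVideoDiffusionPipeline, motion_bucket_id is one of three scalar conditions (fps, motion_bucket_id, noise_aug_strength) that the UNet receives as added time embeddings; a higher value asks the model for more motion. A simplified sketch of the idea, not this repo's exact code:

import torch

# The three scalars are stacked into "add_time_ids" and later passed through
# sinusoidal embeddings into the UNet's added time embedding.
def get_add_time_ids(fps, motion_bucket_id, noise_aug_strength, batch_size):
    add_time_ids = torch.tensor([[fps, motion_bucket_id, noise_aug_strength]])
    return add_time_ids.repeat(batch_size, 1)  # shape: (batch_size, 3)

print(get_add_time_ids(fps=7, motion_bucket_id=127, noise_aug_strength=0.02, batch_size=2))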

Inquiry on VRAM and training time requirement

Hi, and thank you for this excellent open-source project!

Could you provide details on the GPU requirements for training, specifically regarding VRAM usage and expected training duration?

Thanks for your assistance!

Code in repo not working, outdated?

Oops, it looks like the code in this repo has gotten out of sync with your working copy...

Here are the issues that I've encountered so far: the motion_bucket_ids were on the wrong device, resulting in a runtime error; EulerDiscreteScheduler was being imported improperly, resulting in another runtime error; and the number of frames still seems to be hard-coded to 16 somewhere...
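For anyone hitting the same device error, a hedged sketch of the kind of one-line fix involved (the variable names are illustrative, not necessarily the repo's exact ones):

# Move the motion bucket tensor onto the same device as the latents before
# the UNet consumes it; mismatched devices raise a RuntimeError.
motion_bucket_ids = motion_bucket_ids.to(latents.device)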

I can try to PR my minimal fixes in but I wonder if it might be simpler and more comprehensive to upload the new, working code files that you're training with?

Either way thanks for sharing your project, it looks really exciting! :)

"exp_vml_cpu" not implemented for 'Half'

Hello, author. Thank you for the great open-source project. I encountered the following bug while training the code. Do you have any suggestions?

File "train_svd.py", line 1425, in <module> main() File "train_svd.py", line 1172, in main encoder_hidden_states = encode_image( File "train_svd.py", line 1031, in encode_image pixel_values = _resize_with_antialiasing(pixel_values, (224, 224)) File "train_svd.py", line 253, in _resize_with_antialiasing input = _gaussian_blur2d(input, ks, sigmas) File "train_svd.py", line 333, in _gaussian_blur2d kernel_x = _gaussian(kx, sigma[:, 1].view(bs, 1)) File "train_svd.py", line 320, in _gaussian gauss = torch.exp(-x.pow(2.0) / (2 * sigma.pow(2.0))) RuntimeError: "exp_vml_cpu" not implemented for 'Half'

What is the purpose of the motion values?

Thank you to the author for providing such a great open-source code!

I tried to train this code with OpenPose as the condition, but I can't understand what the motion value represents. What characteristic of the video frames does it reflect, and how should I process a video to obtain this motion value?

I have seen similar questions in the closed issues (link).

Please feel free to enlighten me, thank you.
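If it helps while waiting for an answer: one simple proxy, offered purely as an assumption about what "average motion" might mean here, is the mean absolute pixel difference between consecutive frames:

import numpy as np

def average_motion(frames):
    # `frames` is a sequence of same-shaped frames (e.g. uint8 HxWxC arrays).
    # This is a hypothetical motion proxy, not the repo's actual definition.
    frames = np.asarray(frames, dtype=np.float32)
    diffs = np.abs(frames[1:] - frames[:-1])  # per-pixel change between frames
    return float(diffs.mean())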

sv3d

Thank you for your nice work.
Stability-AI recently released SV3D. https://sv3d.github.io/
Can your ControlNet be used directly on SV3D? Or do you have plans to develop training code?
Thanks!

reference materials

Thank you very much for your work. I still don't know much about the SVD scheduler. Can you recommend some reference materials?

More explanations for Motion value

Thank you to the author for providing such a great open-source code!

When I try to train the code, I don't quite understand what specific values are stored in "average_motion.txt". Could you provide more explanation to help me run the training code successfully?

diffusers==0.28.0 is not suitable for controlnet?

ImportError: cannot import name 'FromOriginalControlnetMixin' from 'diffusers.loaders'
When I use diffusers==0.28.0 I encounter this error, but when I downgrade to 0.27.0 it works. How can I use the ControlNet with diffusers==0.28.0?
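As a workaround sketch: newer diffusers releases folded the ControlNet single-file mixin into a generic model mixin, so a guarded import like the one below may help. The replacement name and module path are assumptions about diffusers >= 0.28; verify them against your installed version:

# Guarded import: diffusers < 0.28 exposes FromOriginalControlnetMixin, while
# newer releases are assumed to provide a generic single-file model mixin.
try:
    from diffusers.loaders import FromOriginalControlnetMixin
except ImportError:
    from diffusers.loaders.single_file_model import (  # assumed >= 0.28 location
        FromOriginalModelMixin as FromOriginalControlnetMixin,
    )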
