volotat / sd-cn-animation

This script automates video stylization using Stable Diffusion and ControlNet.

License: MIT License

Python 100.00%

sd-cn-animation's Issues

API support with SD-WebUI

Could you kindly provide API call compatibility with the SD-WebUI? I appreciate your assistance!

Generating images one frame at a time?

Title. For a few use cases it would be useful to have finer-grained control over what goes in, without having to encode the inputs as video.

RuntimeError('Attempting to deserialize object on a CUDA '

I have installed the repo, cloned the RAFT repo, and downloaded the models.
I modified script.py to update the folders and connect to my SD instance on port 9090.
I ran a first test: it renders the first frame of output.mp4, then stops, and I get the following errors in the SD-CN-Animation console and the Stable Diffusion UI console.

I have a 3060 with 12 GB of VRAM and launch with -xformers --medvram --port 9090 --api

This is the SD-CN-Animation script console output. (I've included the SD output below as well.)

Thanks in advance.

C:\Users\---\AI\SD-CN-Animation>python script.py
OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'
1%|█▌ | 1/99 [00:09<16:06, 9.87s/it]
Traceback (most recent call last):
  File "C:\Users\---\AI\SD-CN-Animation\script.py", line 234, in <module>
    _, alpha_mask, warped_styled = RAFT_estimate_flow_diff(prev_frame, frame, prev_frame_styled)
  File "C:\Users\---\AI\SD-CN-Animation\script.py", line 100, in RAFT_estimate_flow_diff
    RAFT_model.load_state_dict(torch.load(args.model))
  File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 1131, in _load
    result = unpickler.load()
  File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 1083, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 215, in default_restore_location
    result = fn(storage, location)
  File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Here is the SD output


Error running process: C:\Users\---\AI\AUTO\extensions\sd-webui-controlnet\scripts\controlnet.py
Traceback (most recent call last):
  File "C:\Users\---\AI\AUTO\modules\scripts.py", line 417, in process
    script.process(p, *script_args)
  File "C:\Users\---\AI\AUTO\extensions\sd-webui-controlnet\scripts\controlnet.py", line 629, in process
    unit = self.parse_remote_call(p, unit, idx)
  File "C:\Users\---\AI\AUTO\extensions\sd-webui-controlnet\scripts\controlnet.py", line 541, in parse_remote_call
    unit.enabled = selector(p, "control_net_enabled", unit.enabled, idx, strict=True)
AttributeError: 'str' object has no attribute 'enabled'


100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:05<00:00, 1.83it/s]

Total progress: 50it [13:36, 4.54s/it]
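The RuntimeError above names its own remedy: pass map_location to torch.load. The following is a minimal sketch of that remedy, assuming PyTorch is installed; pick_device and load_state_dict_portable are hypothetical helper names, not functions from SD-CN-Animation. On a machine that does have an NVIDIA GPU, the same message may also mean the installed PyTorch build has no CUDA support, which this sketch works around but does not fix.

```python
def pick_device(cuda_available: bool, mps_available: bool = False) -> str:
    """Return the best available device name, falling back to the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def load_state_dict_portable(path: str):
    """Load a checkpoint saved on a CUDA machine, whatever hardware is present.

    Hypothetical helper: the weights are always mapped to the CPU first, and the
    caller moves the model to the returned device afterwards.
    """
    import torch  # imported lazily so pick_device works without PyTorch

    mps = getattr(torch.backends, "mps", None)
    device = pick_device(
        torch.cuda.is_available(),
        mps is not None and mps.is_available(),
    )
    # map_location remaps storages that were saved as CUDA tensors.
    state_dict = torch.load(path, map_location=torch.device("cpu"))
    return state_dict, device
```

A caller would then run model.load_state_dict(state_dict) followed by model.to(device), so the same script works with or without CUDA.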

txt2video generation failure

When I try to do txt2video, it only generates 1 frame and then stops.

Any way to speed it up? It's running 5-10x slower than regular generation

14 seconds per iteration on a 512x768 video.

Is there anything that can be done to speed it up?

Txt2video blurry output after first frame

vid2vid not interacting with SD?

Because of my old GPU, I converted Flow_utils to run on the CPU, which is entirely possible with a good CPU. What confuses me right now is why vid2vid and SD aren't interacting.

altered code

`import numpy as np
import cv2

# RAFT dependencies
import sys
sys.path.append('RAFT/core')

from collections import namedtuple
import torch
import argparse
from raft import RAFT
from utils.utils import InputPadder

RAFT_model = None

def RAFT_estimate_flow(frame1, frame2, device='cpu'):  # changed 'cuda' to 'cpu'
    global RAFT_model
    if RAFT_model is None:
        args = argparse.Namespace(**{
            'model': 'RAFT/models/raft-things.pth',
            'mixed_precision': True,
            'small': False,
            'alternate_corr': False,
            'path': ""
        })
        RAFT_model = torch.nn.DataParallel(RAFT(args))
        # map_location lets weights saved on CUDA load on the chosen device
        RAFT_model.load_state_dict(torch.load(args.model, map_location=device))

        RAFT_model = RAFT_model.module
        RAFT_model.to(device)
        RAFT_model.eval()

    with torch.no_grad():
        frame1_torch = torch.from_numpy(frame1).permute(2, 0, 1).float()[None].to(device)
        frame2_torch = torch.from_numpy(frame2).permute(2, 0, 1).float()[None].to(device)
        padder = InputPadder(frame1_torch.shape)
        image1, image2 = padder.pad(frame1_torch, frame2_torch)

        # estimate optical flow in both directions
        _, next_flow = RAFT_model(image1, image2, iters=20, test_mode=True)
        _, prev_flow = RAFT_model(image2, image1, iters=20, test_mode=True)

        next_flow = next_flow[0].permute(1, 2, 0).cpu().numpy()
        prev_flow = prev_flow[0].permute(1, 2, 0).cpu().numpy()

        # forward-backward consistency: large values indicate occlusion
        fb_flow = next_flow + prev_flow
        fb_norm = np.linalg.norm(fb_flow, axis=2)

        occlusion_mask = fb_norm[..., None].repeat(3, axis=-1)

    return next_flow, prev_flow, occlusion_mask

def compute_diff_map(next_flow, prev_flow, prev_frame, cur_frame, prev_frame_styled):
    h, w = cur_frame.shape[:2]

    next_flow = cv2.resize(next_flow, (w, h))
    prev_flow = cv2.resize(prev_flow, (w, h))

    # build an absolute sampling map for cv2.remap from the relative flow
    flow_map = -next_flow.copy()
    flow_map[:, :, 0] += np.arange(w)
    flow_map[:, :, 1] += np.arange(h)[:, np.newaxis]

    warped_frame = cv2.remap(prev_frame, flow_map, None, cv2.INTER_NEAREST)
    warped_frame_styled = cv2.remap(prev_frame_styled, flow_map, None, cv2.INTER_NEAREST)

    # compute occlusion mask
    fb_flow = next_flow + prev_flow
    fb_norm = np.linalg.norm(fb_flow, axis=2)

    occlusion_mask = fb_norm[..., None]

    diff_mask_org = np.abs(warped_frame.astype(np.float32) - cur_frame.astype(np.float32)) / 255
    diff_mask_org = diff_mask_org.max(axis=-1, keepdims=True)

    diff_mask_stl = np.abs(warped_frame_styled.astype(np.float32) - cur_frame.astype(np.float32)) / 255
    diff_mask_stl = diff_mask_stl.max(axis=-1, keepdims=True)

    # np.maximum takes two inputs (a third positional argument is `out`), so nest the calls
    alpha_mask = np.maximum(np.maximum(occlusion_mask * 0.3, diff_mask_org * 4), diff_mask_stl * 2)
    alpha_mask = alpha_mask.repeat(3, axis=-1)

    #alpha_mask_blured = cv2.dilate(alpha_mask, np.ones((5, 5), np.float32))
    alpha_mask = cv2.GaussianBlur(alpha_mask, (51, 51), 5, cv2.BORDER_DEFAULT)

    alpha_mask = np.clip(alpha_mask, 0, 1)

    return alpha_mask, warped_frame_styled
`

The popup window worked as it was supposed to, letting me view the process and leaving me with a flow.h5 file.

After that I tried to start the UI with bash, but my PC kept saying there is no python3, so instead I added the flags to the batch file and ran that. With SD up and running, I opened the link to the API, which returns {"detail":"Method Not Allowed"}, and running vid2vid.py does nothing apart from creating an empty mp4.

This is the change I made to vid2vid.py:
from cpuFlow_utils import compute_diff_map

I'm fairly new to Python but have been making my way through. Still, I keep running into my PC not recognizing my python3 installation.
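A note on the {"detail":"Method Not Allowed"} reply: a POST-only endpoint answers that way when it is opened in a browser, because the browser sends a GET. On its own, then, that reply does not show the API is broken. The sketch below builds a POST request with the standard library only. The URL is the default img2img address that appears elsewhere on this page and is an assumption about this setup; build_img2img_request and send are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "http://localhost:7860/sdapi/v1/img2img"  # assumed default host and port


def build_img2img_request(payload: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a POST request carrying a JSON body; nothing is sent yet."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 600.0) -> dict:
    """Send the request and decode the JSON reply (needs a running web UI with --api)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

If send(build_img2img_request({...})) also fails, the error it raises is more informative than the browser check.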

slowly reverts back to the original image

Hey there, been trying a few different settings so far trying to get a feel for this.

One thing I've noticed is that the first few frames show a new image (as intended), then the output slowly fades back to the original.
For example, in a 20-frame clip of a person dancing, they start off looking like a new person, but roughly halfway through, the video doesn't look much different from the original gif. By the end it looks almost exactly the same.

This happens less often when frame strength is above 0.5, but then the videos get wilder.
I have tried processing strengths of around 0.65 and 0.25.
I haven't found a setting that retains the original image while replacing the character.

Tested a video with 18 frames, brown haired girl dancing.

Frames 1-10 start with a girl with short orange hair and a hat but slowly fade; by frame 10 the output looks exactly like the original gif, with the long-brown-haired girl again.

Gradio error on generation

Happens on both vid2vid and textvid after clicking 'Generate'.

  File "/content/sd/lib/python3.10/site-packages/gradio/routes.py", line 337, in run_predict
    output = await app.get_blocks().process_api(
  File "/content/sd/lib/python3.10/site-packages/gradio/blocks.py", line 1015, in process_api
    result = await self.call_function(
  File "/content/sd/lib/python3.10/site-packages/gradio/blocks.py", line 843, in call_function
    raise ValueError("Need to enable queue to use generators.")
ValueError: Need to enable queue to use generators.
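In Gradio 3.x this ValueError is raised when an event handler is a generator and the app's queue has not been enabled. The sketch below shows the pattern in a standalone app, assuming the gradio package is installed; process and build_demo are hypothetical names. Inside the web UI, the queue is set up by the web UI itself, so there the likely fix is updating the web UI rather than editing the extension (an assumption, not something confirmed in this thread).

```python
def process(total_frames: int):
    """Generator handler: yields a status string after each (simulated) frame."""
    for done in range(1, total_frames + 1):
        # Real per-frame work would happen here.
        yield f"Frames processed: {done} / {total_frames}"


def build_demo():
    """Wire the generator into a small Gradio app with the queue enabled."""
    import gradio as gr  # imported lazily; requires the gradio package

    def run():
        # A generator function, so Gradio streams each yielded status.
        yield from process(5)

    with gr.Blocks() as demo:
        status = gr.Textbox(label="Status")
        button = gr.Button("Generate")
        button.click(fn=run, inputs=None, outputs=status)

    demo.queue()  # without this, Gradio 3.x rejects generator handlers
    return demo
```

Calling build_demo().launch() would start the app; the queue() call is the part that matters for this error.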

All Text to Video animations change from normal to purple.

Text to Video output seems to shift its hue toward purple and red. I've tried several times with different libraries.

Use PDCNet+ for occlusion mask

There's a similar project, https://github.com/zyddnys/sd_animation_optical_flow, that uses PDCNet+ for optical flow confidence. Have you tried using PDCNet+?

txt2vid error on Apple Silicon Mac

Traceback (most recent call last):
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/routes.py", line 399, in run_predict
    output = await app.get_blocks().process_api(
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1299, in process_api
    result = await self.call_function(
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1036, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 488, in async_iteration
    return next(iterator)
  File "/Users/hein/stable-diffusion-webui/extensions/SD-CN-Animation/scripts/base_ui.py", line 123, in process
    yield from txt2vid.start_process(*args)
  File "/Users/hein/stable-diffusion-webui/extensions/sd-cn-animation/scripts/core/txt2vid.py", line 76, in start_process
    FloweR_load_model(args_dict['width'], args_dict['height'])
  File "/Users/hein/stable-diffusion-webui/extensions/sd-cn-animation/scripts/core/txt2vid.py", line 51, in FloweR_load_model
    FloweR_model.load_state_dict(torch.load(model_path))
  File "/Users/hein/stable-diffusion-webui/modules/safe.py", line 107, in load
    return load_with_extra(filename, extra_handler=global_extra_handler, *args, **kwargs)
  File "/Users/hein/stable-diffusion-webui/modules/safe.py", line 152, in load_with_extra
    return unsafe_torch_load(filename, *args, **kwargs)
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 809, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1172, in _load
    result = unpickler.load()
  File "/opt/homebrew/Cellar/python@3.10/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "/opt/homebrew/Cellar/python@3.10/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pickle.py", line 1254, in load_binpersid
    self.append(self.persistent_load(pid))
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1142, in persistent_load
    typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1116, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 217, in default_restore_location
    result = fn(storage, location)
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Gives me this error after generating the first frame just fine. Is this just not supported on M1/M2 Macs, or should I change some settings?

Error generating Vid2Vid

Hello,

I am getting this error and I am not sure what I am missing. I checked the input size, folder locations, and everything else. I am running it through A1111.

Already up to date.
venv "F:\AI\Automatic1111\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)]
Commit hash: 5ab7f213bec2f816f9c5644becb32eb72c8ffb89
Installing requirements
[Auto-Photoshop-SD] Attempting auto-update...
[Auto-Photoshop-SD] switch branch to extension branch.
checkout_result: Your branch is up to date with 'origin/master'.

[Auto-Photoshop-SD] Current Branch.
branch_result: * master

[Auto-Photoshop-SD] Fetch upstream.
fetch_result:
[Auto-Photoshop-SD] Pull upstream.
pull_result: Already up to date.

Installing SD-CN-Animation requirement: scikit-image==0.19.2

Installing requirements for TemporalKit extension

Launching Web UI with arguments: --opt-sdp-attention --api
No module 'xformers'. Proceeding without it.
python_server_full_path: F:\AI\Automatic1111\stable-diffusion-webui\extensions\Auto-Photoshop-StableDiffusion-Plugin\server/python_server
Civitai Helper: Get Custom Model Folder
Civitai Helper: Load setting from: F:\AI\Automatic1111\stable-diffusion-webui\extensions\Stable-Diffusion-Webui-Civitai-Helper\setting.json
Civitai Helper: No setting file, use default
Better Prompt version is v0.2.0
ControlNet v1.1.150
ControlNet v1.1.150
Loading weights [d10ad6063d] from F:\AI\Automatic1111\stable-diffusion-webui\models\Stable-diffusion\unvailAI3DKXV2_3dkxV2.safetensors
Creating model from config: F:\AI\Automatic1111\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading VAE weights specified in settings: F:\AI\Automatic1111\stable-diffusion-webui\models\VAE\vae-ft-mse-840000-ema-pruned.ckpt
Applying scaled dot product cross attention optimization.
Textual inversion embeddings loaded(4): uiyiuy, beauty512, CharTurner, neg_realism512
Textual inversion embeddings skipped(5): 21charturnerv2, nartfixer, nfixer, nrealfixer, rz-neg-general
Removing ToMe patch (if exists)
Model loaded in 2.3s (create model: 0.2s, apply weights to model: 0.3s, apply half(): 0.4s, load VAE: 0.2s, move model to device: 0.4s, load textual inversion embeddings: 0.7s).
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Startup time: 9.8s (import torch: 1.0s, import gradio: 0.8s, import ldm: 0.3s, other imports: 0.5s, list SD models: 0.1s, load scripts: 1.9s, load SD checkpoint: 2.4s, create ui: 2.5s, gradio launch: 0.1s).
Consuming a byte in the end state
Consuming a byte in the end state
controlnet batch mode
Loading model: control_sd15_hed [fef5e48e]
Loaded state_dict from [F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\models\control_sd15_hed.pth]
Loading config: F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\models\control_sd15_hed.yaml
ControlNet model control_sd15_hed [fef5e48e] loaded.
Error running process: F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py
Traceback (most recent call last):
  File "F:\AI\Automatic1111\stable-diffusion-webui\modules\scripts.py", line 417, in process
    script.process(p, *script_args)
  File "F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py", line 1035, in process
    input_image = HWC3(image['image'])
  File "F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\annotator\util.py", line 6, in HWC3
    assert x.dtype == np.uint8
  File "F:\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\PIL\Image.py", line 529, in __getattr__
    raise AttributeError(name)
AttributeError: dtype

Bug. "Traceback (most recent call last)"

GPU 3070
CPU - I7
OS - Windows 11

Unable to generate, throws an error.

Traceback (most recent call last):
File "C:\Neural networks\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 337, in run_predict
output = await app.get_blocks().process_api(
File "C:\Neural networks\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1015, in process_api
result = await self.call_function(
File "C:\Neural networks\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 843, in call_function
raise ValueError("Need to enable queue to use generators.")
ValueError: Need to enable queue to use generators.

vid2vid Only 1 frame generated? - on mac

On macOS, I seem to be able to generate only 1 frame. After that, it does not seem to process any more.

Frames prepared: 1 / 488; Frames processed: 1 / 488

Enhancements: Time left for completion

Hey, thanks for the awesome work of rewriting the code into an Auto1111 extension. This is great. I'm already trying it and I love how streamlined the process is now. <3

Just a minor thing: in the previous command-line version, I remember a timer that told me how long the run was going to take.
Would it be possible to add "Time elapsed/left" next to "Frames prepared" / "Frames processed"?

Anyways, love the new extension, thanks for this!
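The requested "time left" figure can be estimated from the counters the UI already shows. A minimal sketch, assuming frames take roughly equal time; estimate_time_left and format_duration are hypothetical names, not part of the extension.

```python
def estimate_time_left(elapsed_seconds: float, frames_done: int, frames_total: int) -> float:
    """Estimate remaining seconds from the average time per finished frame."""
    if frames_done <= 0:
        raise ValueError("need at least one finished frame to estimate")
    per_frame = elapsed_seconds / frames_done
    return per_frame * (frames_total - frames_done)


def format_duration(seconds: float) -> str:
    """Format a number of seconds as HH:MM:SS."""
    total = int(round(seconds))
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"
```

For example, 9 of 99 frames finished in 90 seconds gives 10 seconds per frame and 900 seconds left, which format_duration renders as 00:15:00.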

CLI or Compatibility with Remote Instance

I'm currently running AUTOMATIC1111 from a Docker container on a remote instance.
It seems the script doesn't work because it tries to show an image with OpenCV, and my Ubuntu instance has no UI. Is it possible to run it with no display, or with a web interface instead?
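One way to run on a headless instance is to skip the preview window when no display is available. The sketch below is an assumption about how the script could be patched, not its current behaviour; has_display and show_preview are hypothetical names, and the environment-variable check covers Linux only.

```python
import os
import sys


def has_display(environ=None, platform=None) -> bool:
    """Guess whether a GUI window can be opened on this machine."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    if platform.startswith("linux"):
        # X11 sets DISPLAY, Wayland sets WAYLAND_DISPLAY.
        return bool(environ.get("DISPLAY") or environ.get("WAYLAND_DISPLAY"))
    return True  # Windows and macOS desktops normally have a display


def show_preview(window_name, frame, environ=None, platform=None) -> bool:
    """Show a preview window if possible; return False when it was skipped."""
    if not has_display(environ, platform):
        return False
    import cv2  # imported lazily; requires opencv-python

    cv2.imshow(window_name, frame)
    cv2.waitKey(1)
    return True
```

Replacing direct cv2.imshow calls with show_preview would let the same script run both on a desktop and inside a container.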

All videos take on a purple or reddish hue. There seems to be no option which stops this. Anyone know the fix for this?

All of the txt2vid videos I do start turning purple or red after a few frames. Any ideas what could be causing this?

unable to open flow.h5

(raft) I:\stable-diffusion-webui>python vid2vid.py
Traceback (most recent call last):
  File "vid2vid.py", line 145, in <module>
    with h5py.File(FLOW_MAPS, 'r') as f:
  File "C:\Users\david\miniconda3\envs\raft\lib\site-packages\h5py\_hl\files.py", line 567, in __init__
    fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
  File "C:\Users\david\miniconda3\envs\raft\lib\site-packages\h5py\_hl\files.py", line 231, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py\h5f.pyx", line 106, in h5py.h5f.open
FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = 'i:/flow.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
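The traceback shows vid2vid.py looking for the flow file at 'i:/flow.h5', so the file is either not there yet or FLOW_MAPS points at the wrong place. A small guard before opening the file makes that explicit. This is a sketch; require_flow_file is a hypothetical helper, and the advice in its message assumes the flow file is produced by compute_flow.py, as described elsewhere on this page.

```python
from pathlib import Path


def require_flow_file(flow_maps: str) -> Path:
    """Return the flow file path, or fail with a message that says what to do."""
    path = Path(flow_maps).expanduser()
    if not path.is_file():
        raise FileNotFoundError(
            f"Flow file not found: {path}. Run compute_flow.py first, or point "
            "FLOW_MAPS at the .h5 file it wrote."
        )
    return path
```

In vid2vid.py the result would be passed to h5py.File in place of the raw FLOW_MAPS string.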

Reduce image impact on vid2vid

I am trying to use a video only for posing and have the output video entirely dependent on the prompt and controlnet (using openpose). However, the vid2vid always contains a lot of the original video information (beyond the pose). There doesn't seem to be an obvious way to reduce the impact of the original video frames. I'm assuming it's using some sort of img2img, but without a way to change the denoising strength.

Add sampler settings to txt-to-video

It would be nice to have control over the sampling method and sampling steps in txt-to-video.

Windows

Any guide on how to run it on windows?

image quality degradation as generation continues

It seems like my initial frame is okay, but every subsequent image slowly adds more burn-in until the preview becomes a complete acid trip.

also I get this every generation:

Error running process: H:\NovelAI\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py
Traceback (most recent call last):
File "H:\NovelAI\stable-diffusion-webui\modules\scripts.py", line 417, in process
script.process(p, *script_args)
File "H:\NovelAI\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py", line 629, in process
unit = self.parse_remote_call(p, unit, idx)
File "H:\NovelAI\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py", line 541, in parse_remote_call
unit.enabled = selector(p, "control_net_enabled", unit.enabled, idx, strict=True)
AttributeError: 'float' object has no attribute 'enabled'

ATTENTION: Negative prompt weight is set to 3.0
(EDIT: Okay so the negative prompt thing was the result of another extension, I had to turn off a couple of extensions to remove more errors)

Exception ignored in: <generator object process at 0x000001DE3FA46810> Traceback (most recent call last)

Traceback (most recent call last):
File "..\extensions\SD-CN-Animation\scripts\base_ui.py", line 123, in process
yield from txt2vid.start_process(*args)
RuntimeError: generator ignored GeneratorExit

The error message above sometimes appears before the animation finishes in T2V (without CN).
CPU: intel
Graphics: Nvidia RTX
Browser: Google Chrome
Win11
Latest A1111 (commit: 5ab7f213)

The only way I can continue creating is to close A1111 completely and restart it.

Thank you very much for this amazing extension 👍

There's a ghosting problem when the animation is on the same plan for too long.

Hi, the script is awesome, but do you know how I can fix the ghosting problem? Thanks.

compute_flow.py is reacting to something on a pure black background

I use runwayml to remove the background from a video, but compute_flow renders an odd pattern in the black area where it should see nothing. I assume it's compression artifacts, but I also ran the video through After Effects Rotobrush to strip anything out, and it looks the same. If I change the resolution to 2x the video size, nearly the whole frame is white.

OK, I've just rendered it out from After Effects as uncompressed AVI.

This is a solid black background. Am I doing something wrong here?

scikit-image==0.19.2

Every time I run the webui i get

Installing SD-CN-Animation requirement: scikit-image==0.19.2

but other than that, the extension looks very promising, cheers!
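The message repeats on every start because the install check keeps deciding that the requirement is missing. One possible cause, stated here as an assumption rather than a diagnosis, is that the check does not match the installed distribution by name (scikit-image is imported as skimage). A check keyed on the distribution name and version could look like this sketch; needs_install is a hypothetical helper.

```python
from importlib import metadata


def needs_install(distribution: str, wanted: str, get_version=metadata.version) -> bool:
    """Return True only if the distribution is missing or at a different version."""
    try:
        return get_version(distribution) != wanted
    except metadata.PackageNotFoundError:
        return True
```

An install script would call needs_install("scikit-image", "0.19.2") and run pip only when it returns True.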

Fix usage of deprecated arguments

On automatic1111 there is currently this warning on each frame in the console

sd-cn-animation/scripts\core\vid2vid.py:115: FutureWarning: multichannel is a deprecated argument name for match_histograms. It will be removed in version 1.0. Please use channel_axis instead.

On https://github.com/vladmandic/automatic this deprecated argument has already been removed, causing the script to fail at the first step.
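One way to support both old and new scikit-image versions is to choose the keyword from the function's signature at run time. A minimal sketch, assuming frames are arrays with the color channels on the last axis; channel_kwargs is a hypothetical helper, not code from the extension.

```python
import inspect


def channel_kwargs(func) -> dict:
    """Pick the keyword that tells `func` the last axis holds color channels."""
    params = inspect.signature(func).parameters
    if "channel_axis" in params:
        return {"channel_axis": -1}  # current argument name
    if "multichannel" in params:
        return {"multichannel": True}  # deprecated argument name
    return {}
```

The call in vid2vid.py would then read match_histograms(image, reference, **channel_kwargs(match_histograms)), which avoids the FutureWarning where channel_axis exists and still works where only multichannel does.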

Use of multi control nets

I think (suggested by a redditor, not my idea) that using multiple ControlNets would make for a really great version. Can it be added? I tried it and cannot get it to load properly.

I have used multiple ControlNets on stills (depth, hed, color and canny). The result on stills from an mp4 shows virtually no flickering and is very consistent, so if this can be added to the process I believe the results would be a lot better.

It seems simple, but it gets stuck on loading the image for the other ControlNets.

class controlnetRequest():
    def __init__(self, b64_cur_img, b64_hed_img, b64_hed_img1, ds=0.35, w=w, h=h, mask=None, mask1=None):

        self.url = "http://localhost:7860/sdapi/v1/img2img"
        self.body = {
            "init_images": [b64_cur_img],
            "mask": mask,
            "mask_blur": 0,
            "inpainting_fill": 1,
            "inpainting_mask_invert": 0,
            "prompt": PROMPT,
            "negative_prompt": N_PROMPT,
            "seed": SEED,
            "subseed": -1,
            "subseed_strength": 0,
            "batch_size": 1,
            "n_iter": 1,
            "steps": 20,
            "cfg_scale": 7,
            "denoising_strength": ds,
            "width": w,
            "height": h,
            "restore_faces": False,
            "eta": 0,
            "sampler_index": "DPM++ 2S a",
            "control_net_enabled": True,
            "alwayson_scripts": {
                "ControlNet": {
                    "args": [
                        {
                            "input_image": b64_hed_img,
                            "module": "depth",
                            "model": "control_depth-fp16 [400750f6]",
                            "weight": 1,
                            "resize_mode": "Just Resize",
                            "lowvram": False,
                            "processor_res": 720,
                            "guidance": 1,
                            "guessmode": False
                        }
                    ]
                },
                "ControlNet1": {
                    "args": [
                        {
                            "input_image": b64_hed_img1,
                            "module": "hed",
                            "model": "control_hed-fp16 [13fee50b]",
                            "weight": 1,
                            "resize_mode": "Just Resize",
                            "lowvram": False,
                            "processor_res": 720,
                            "guidance": 1,
                            "guessmode": False
                        }
                    ]
                }
            },
        }

results in

line 140, in <module>
image_bytes = base64.b64decode(data_js["images"][0])
KeyError: 'images'
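The KeyError means the API reply contained no "images" field, which usually indicates the server returned an error body instead. One likely cause, offered as an assumption: "ControlNet1" is not the name of any always-on script, so the request is rejected. To my understanding of the sd-webui-controlnet API, several units go into the "args" list of a single ControlNet entry, and the Multi ControlNet setting in the web UI must allow at least that many units. A sketch with hypothetical helper names:

```python
import base64


def controlnet_unit(image_b64: str, module: str, model: str, weight: float = 1.0) -> dict:
    """Describe one ControlNet unit (only a subset of the available fields)."""
    return {"input_image": image_b64, "module": module, "model": model, "weight": weight}


def with_controlnet_units(body: dict, units: list) -> dict:
    """Return a copy of `body` with every unit under one ControlNet entry."""
    merged = dict(body)
    scripts = dict(merged.get("alwayson_scripts", {}))
    scripts["ControlNet"] = {"args": list(units)}
    merged["alwayson_scripts"] = scripts
    return merged


def first_image_bytes(reply: dict) -> bytes:
    """Decode the first returned image, or raise with the server's own message."""
    if "images" not in reply:
        raise RuntimeError(f"API returned no images: {reply.get('detail', reply)}")
    return base64.b64decode(reply["images"][0])
```

With this, the depth and hed units from the class above would be passed as a two-element list, and a failed request would report the server's message rather than a bare KeyError.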

Fix integration with Vladmandic/automatic

Because of deprecated gradio usage, this extension won't work with Vladmandic/automatic. See related issue: vladmandic/automatic#817

Could you collaborate?

txt2vid error on cuda only

Issue on macOS: the machine is CPU only, but the model loads for CUDA only.

Traceback (most recent call last):
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 898, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 549, in async_iteration
    return next(iterator)
  File "/Users/jimmygunawan/stable-diffusion-webui/extensions/SD-CN-Animation/scripts/base_ui.py", line 123, in process
    yield from txt2vid.start_process(*args)
  File "/Users/jimmygunawan/stable-diffusion-webui/extensions/sd-cn-animation/scripts/core/txt2vid.py", line 76, in start_process
    FloweR_load_model(args_dict['width'], args_dict['height'])
  File "/Users/jimmygunawan/stable-diffusion-webui/extensions/sd-cn-animation/scripts/core/txt2vid.py", line 51, in FloweR_load_model
    FloweR_model.load_state_dict(torch.load(model_path))
  File "/Users/jimmygunawan/stable-diffusion-webui/modules/safe.py", line 106, in load
    return load_with_extra(filename, extra_handler=global_extra_handler, *args, **kwargs)
  File "/Users/jimmygunawan/stable-diffusion-webui/modules/safe.py", line 151, in load_with_extra
    return unsafe_torch_load(filename, *args, **kwargs)
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 789, in load
    return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1131, in _load
    result = unpickler.load()
  File "/opt/homebrew/Cellar/python@3.10/3.10.10_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "/opt/homebrew/Cellar/python@3.10/3.10.10_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pickle.py", line 1254, in load_binpersid
    self.append(self.persistent_load(pid))
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1101, in persistent_load
    load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1083, in load_tensor
    wrap_storage=restore_location(storage, location),
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 215, in default_restore_location
    result = fn(storage, location)
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 182, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 166, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Video With No Frames

A video shows up in the folder, but it has only a single frame. I'm using the default resolution, and it's still not working.

Suspend/Resume processing

I'm using Google Colab, which will kick me off after a period of time and, often, the integration with Google Drive just breaks out of nowhere. I'm currently running a larger-format v2v and it's taking hours (for just 60 frames). Which is fine, but I'm a bit paranoid about losing progress if I get kicked off of free Colab and/or if something breaks.

It would be wonderful to be able to manually save progress mid-generation (or automatically save every X frames) and be able to resume later. I suspect something like this would be nice for other setups/use cases as well.
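
One way the requested save-and-resume behaviour could look is sketched below. It is an illustration of the idea only, not code from this project: `load_progress`, `save_progress`, `run_resumable`, and the JSON state file are all hypothetical. A real vid2vid resume would probably also need to persist the last styled frame, since each frame depends on the previous one.

```python
import json
import os


def load_progress(state_path: str) -> int:
    """Return the index of the next frame to process (0 if no state file exists)."""
    if not os.path.exists(state_path):
        return 0
    with open(state_path, "r", encoding="utf-8") as f:
        return int(json.load(f)["next_frame"])


def save_progress(state_path: str, next_frame: int) -> None:
    """Write the state via a temp file so an interruption never leaves it half-written."""
    tmp_path = state_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump({"next_frame": next_frame}, f)
    os.replace(tmp_path, state_path)


def run_resumable(total_frames, process_frame, state_path, save_every=10):
    """Process frames from the saved position, checkpointing every `save_every` frames."""
    start = load_progress(state_path)
    for index in range(start, total_frames):
        process_frame(index)
        if (index + 1) % save_every == 0:
            save_progress(state_path, index + 1)
    save_progress(state_path, total_frames)
```

Keeping the state file on persistent storage (Google Drive, in the Colab case) means a disconnected session loses at most `save_every` frames of work.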

Thanks for making this. It's awesome.

No module named 'raft'

Hi, I get an error when I try to run the script:
C:\AUTOMATIC1111>py script.py
Traceback (most recent call last):
File "C:\AUTOMATIC1111\script.py", line 15, in <module>
from raft import RAFT
ModuleNotFoundError: No module named 'raft'

Apparently I need to install RAFT, but how do I add it to AUTOMATIC1111? Thanks.
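
For context, `raft` in that import is the `raft.py` module inside the `core` folder of the RAFT repository, so the import can only resolve if that folder is on Python's module search path. The sketch below shows one way to check and add it. `add_raft_to_path` is a hypothetical helper, and the clone location you pass in (for example a `RAFT` folder next to `script.py`) is an assumption about your setup.

```python
import os
import sys


def add_raft_to_path(raft_root: str) -> str:
    """Put <raft_root>/core on sys.path so `from raft import RAFT` can resolve."""
    core_dir = os.path.abspath(os.path.join(raft_root, "core"))
    if not os.path.isfile(os.path.join(core_dir, "raft.py")):
        raise FileNotFoundError(f"raft.py not found in {core_dir}; is RAFT cloned there?")
    if core_dir not in sys.path:
        sys.path.insert(0, core_dir)
    return core_dir
```

If the helper raises `FileNotFoundError`, the repository is not where the script expects it, which would produce the same `ModuleNotFoundError` as above.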

webui extension

Hey, I think this script would work better as an extension inside the Automatic1111 webui, preferably as a script you can choose under img2img in the script dropdown.
You would upload a video, it would end up in a temporary folder, the script would render it and then recombine the frames.

I think this script would be a lot more user-friendly if it were available from the Automatic1111 webui, and there doesn't seem to be a good reason for it to be a separate script, since it requires the Automatic1111 webui to function.

If you decide not to, I could do it if you wanted me to.

Help with running it

Hello, I would like to know if someone could help me figure out why I keep getting the same torch-related error.

Here is my issue: when I run the line below, I get a "module torch not found" error even though torch is installed.

'D:\stable-diffusion-webui\SD-CN-Animation>python3 compute_flow.py -i "C:\Users\Arnaud***\Downloads\Remove0417_720p.mov" -o "D:\stable-diffusion-webui\SD-CN-Animation\EXPORT" -v -W 720 -H 540
Traceback (most recent call last):
File "D:\stable-diffusion-webui\SD-CN-Animation\compute_flow.py", line 7, in <module>
from flow_utils import RAFT_estimate_flow
File "D:\stable-diffusion-webui\SD-CN-Animation\flow_utils.py", line 9, in <module>
import torch
ModuleNotFoundError: No module named 'torch' '

What could be the problem?
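
A common cause of this symptom is that the `python3` on the command line is a different interpreter from the one torch was installed into (for example the web UI's own virtual environment). The short check below, a sketch with a hypothetical helper name `module_available`, prints which interpreter is running and whether it can see torch.

```python
import importlib.util
import sys


def module_available(name: str) -> bool:
    """True if the interpreter running this code can import `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("torch importable:", module_available("torch"))
```

Running it with the same command used to launch `compute_flow.py` shows whether that interpreter is the one that has torch installed.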

LoRA compatibility

Is there a way to make LoRA work with your script? I entered the name of my LoRA file in the prompt, but it doesn't work. Thanks.

Several errors when running txt2vid, video generated fine

Using default settings and prompt "lemon" after installing the extension, updating automatic1111 and all my extensions and restarting everything:

100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:03<00:00,  1.99it/s]
Error running postprocess_batch: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilediffusion.py
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\modules\scripts.py", line 453, in postprocess_batch
    script.postprocess_batch(p, *script_args, images=images, **kwargs)
TypeError: Script.postprocess_batch() missing 1 required positional argument: 'enabled'

Error running postprocess_batch: D:\stable-diffusion-webui\extensions\sd-dynamic-thresholding\scripts\dynamic_thresholding.py
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\modules\scripts.py", line 453, in postprocess_batch
    script.postprocess_batch(p, *script_args, images=images, **kwargs)
TypeError: Script.postprocess_batch() missing 8 required positional arguments: 'enabled', 'mimic_scale', 'threshold_percentile', 'mimic_mode', 'mimic_scale_min', 'cfg_mode', 'cfg_scale_min', and 'powerscale_power'

Error running postprocess: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilediffusion.py
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\modules\scripts.py", line 444, in postprocess
    script.postprocess(p, processed, *script_args)
TypeError: Script.postprocess() missing 1 required positional argument: 'enabled'

Error running postprocess: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\modules\scripts.py", line 444, in postprocess
    script.postprocess(p, processed, *script_args)
TypeError: Script.postprocess() missing 1 required positional argument: 'enabled'

Total progress: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:02<00:00,  2.05it/s]
Error running process: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilediffusion.py
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\modules\scripts.py", line 417, in process
    script.process(p, *script_args)
TypeError: Script.process() missing 21 required positional arguments: 'enabled', 'method', 'overwrite_size', 'keep_input_size', 'image_width', 'image_height', 'tile_width', 'tile_height', 'overlap', 'tile_batch_size', 'upscaler_name', 'scale_factor', 'noise_inverse', 'noise_inverse_steps', 'noise_inverse_retouch', 'noise_inverse_renoise_strength', 'noise_inverse_renoise_kernel', 'control_tensor_cpu', 'enable_bbox_control', 'draw_background', and 'causal_layers'

Error running process: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\modules\scripts.py", line 417, in process
    script.process(p, *script_args)
TypeError: Script.process() missing 7 required positional arguments: 'enabled', 'encoder_tile_size', 'decoder_tile_size', 'vae_to_gpu', 'fast_decoder', 'fast_encoder', and 'color_fix'

Error running process_batch: D:\stable-diffusion-webui\extensions\sd-dynamic-thresholding\scripts\dynamic_thresholding.py
Traceback (most recent call last):
  File "D:\stable-diffusion-webui\modules\scripts.py", line 435, in process_batch
    script.process_batch(p, *script_args, **kwargs)
TypeError: Script.process_batch() missing 8 required positional arguments: 'enabled', 'mimic_scale', 'threshold_percentile', 'mimic_mode', 'mimic_scale_min', 'cfg_mode', 'cfg_scale_min', and 'powerscale_power'

 33%|██████████████████████████████████████████████▎                                                                                            | 4/12 [00:02<00:03,  2.07it/s]

However, the video seems to be generated fine, so the errors are either ignored or not critical.

Both vid2vid and txt2vid fail at processed_frame = np.array(processed_frames[0]) with IndexError: list index out of range

sd-cn-animation/scripts\core\vid2vid.py", line 114, in start_process
processed_frame = np.array(processed_frames[0])
IndexError: list index out of range

Do I need to enable other settings for this operation?

Error loading script: base_ui.py

Error loading script: base_ui.py
Traceback (most recent call last):
File "/content/stable-diffusion-webui/modules/scripts.py", line 248, in load_scripts
script_module = script_loading.load_module(scriptfile.path)
File "/content/stable-diffusion-webui/modules/script_loading.py", line 11, in load_module
module_spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/content/stable-diffusion-webui/extensions/SD-CN-Animation/scripts/base_ui.py", line 30, in <module>
from core import vid2vid, txt2vid, utils
ModuleNotFoundError: No module named 'core'

I got this error trying to run it in Colab.

Exception occurs and the script quits at the first frame

script.py", line 235, in <module>
_, alpha_mask, warped_styled = RAFT_estimate_flow_diff(prev_frame, frame, prev_frame_styled)
File "D:\study\SD-CN-Animation\script.py", line 135, in RAFT_estimate_flow_diff
diff_mask = np.abs(warped_frame.astype(np.float32) - frame2.astype(np.float32)) / 255
ValueError: operands could not be broadcast together with shapes (872,488,3) (866,486,3)
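
The two shapes in this traceback differ in a telling way: 872 and 488 are exactly 866 and 486 rounded up to the next multiple of 8. That is consistent with the optical-flow step padding its input to multiples of 8 (RAFT does this) and the padded result then being compared against the unpadded frame. This is an inference from the numbers, not something the traceback confirms. The sketch below, with hypothetical helper names, reproduces the arithmetic and checks whether a resolution would avoid padding altogether.

```python
def pad_to_multiple(size: int, multiple: int = 8) -> int:
    """Round `size` up to the next multiple of `multiple` (unchanged if already aligned)."""
    return -(-size // multiple) * multiple


def is_safe_resolution(height: int, width: int, multiple: int = 8) -> bool:
    """True if neither dimension would need padding."""
    return height % multiple == 0 and width % multiple == 0


print(pad_to_multiple(866), pad_to_multiple(486))
print(is_safe_resolution(866, 486), is_safe_resolution(872, 488))
```

Under that assumption, choosing a processing width and height that are both divisible by 8 would keep the two arrays the same shape.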

[Bug] AttributeError: 'str' object has no attribute '__dict__'

Getting this with vid2vid. v0.6 + https://github.com/vladmandic/automatic:

override_settings: []
Traceback (most recent call last):
  File "e:\.ai\automatic\venv\lib\site-packages\gradio\routes.py", line 399, in run_predict
    output = await app.get_blocks().process_api(
  File "e:\.ai\automatic\venv\lib\site-packages\gradio\blocks.py", line 1299, in process_api
    result = await self.call_function(
  File "e:\.ai\automatic\venv\lib\site-packages\gradio\blocks.py", line 1036, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "e:\.ai\automatic\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "e:\.ai\automatic\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "e:\.ai\automatic\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "e:\.ai\automatic\venv\lib\site-packages\gradio\utils.py", line 488, in async_iteration
    return next(iterator)
  File "e:\.ai\automatic/extensions/sd-cn-animation/scripts\vid2vid.py", line 172, in start_process
    processed_frames, _, _, _ = img2img(args_dict)
  File "e:\.ai\automatic/extensions/sd-cn-animation/scripts\vid2vid.py", line 445, in img2img
    print('script_inputs 1:', args.script_inputs[1].__dict__)
AttributeError: 'str' object has no attribute '__dict__'
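
The failing line is a debug `print` that assumes `args.script_inputs[1]` is an object with a `__dict__`, and here it received a plain string. The sketch below shows a print helper that tolerates both cases. `describe` is a hypothetical name, and this only keeps the debug output from crashing; it does not explain why a string ends up in `script_inputs` under this web UI fork.

```python
def describe(value):
    """Return the attribute dict for objects that have one, else a repr string."""
    try:
        return vars(value)
    except TypeError:
        # str, int and other built-in instances have no __dict__.
        return repr(value)


class Example:
    def __init__(self):
        self.enabled = True


print("script_inputs 1:", describe(Example()))
print("script_inputs 1:", describe("some string"))
```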

[BUG] Duplication effect on moving images with v0.5

Hi, I have a weird bug with v0.5. Does anyone know what it is? Thanks.

duplication_effect.mp4

not sure what this error could be...

I updated the script to the latest version and now I can't seem to avoid this error. Does anybody know what it means?

compute_flow.py: error: unrecognized arguments: Diffusion\stable-diffusion-webui\scripts\SD-CN-Animation-main\1.mp4 Diffusion\stable-diffusion-webui\scripts\SD-CN-Animation-main\flow.h5

I'm guessing the path structure is incorrect on my part. I'm not sure, but any help would be appreciated, and please accept my humblest apology if this error was posted and solved somewhere else.
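
Both unrecognized arguments begin with `Diffusion\`, which suggests the paths contain a space (presumably a folder such as `Stable Diffusion`) and were passed without quotes, so the shell split each path in two. The sketch below reproduces that with `argparse`. Only the `-i` and `-o` flags come from the tool's command line; the long option names, the `D:\Stable Diffusion` prefix, and the helper names are illustrative assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Only -i and -o are modelled; the long option names are illustrative.
    parser = argparse.ArgumentParser(prog="compute_flow.py")
    parser.add_argument("-i", "--input_video")
    parser.add_argument("-o", "--output_file")
    return parser


def leftover_arguments(argv):
    """Return the tokens argparse could not assign to any option."""
    _, extras = build_parser().parse_known_args(argv)
    return extras


# Unquoted: the shell splits each path at the space.
unquoted = ["-i", "D:\\Stable", "Diffusion\\1.mp4",
            "-o", "D:\\Stable", "Diffusion\\flow.h5"]
# Quoted: each path arrives as a single token.
quoted = ["-i", "D:\\Stable Diffusion\\1.mp4",
          "-o", "D:\\Stable Diffusion\\flow.h5"]

print(leftover_arguments(unquoted))
print(leftover_arguments(quoted))
```

Wrapping each path in double quotes on the command line makes it arrive as one argument, as in the `quoted` case.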

Issue with Submitting

I've triple-checked everything and even tried with a fresh install of webui, as well as a rollback to before March 25th. No matter what, I get errors.

(sdoptical) PS D:\Github\SD-CN-Animation> python vid2vid.py
0%|
Traceback (most recent call last):
File "D:\Github\SD-CN-Animation\vid2vid.py", line 175, in <module>
out_image = controlnetRequest(to_b64(frame), to_b64(frame), PROCESSING_STRENGTH, w, h, mask = None).sendRequest()
File "D:\Github\SD-CN-Animation\vid2vid.py", line 123, in sendRequest
image_bytes = base64.b64decode(data_js["images"][0])
KeyError: 'images'
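
A `KeyError: 'images'` at this point means the JSON that came back from the web UI API had no `images` field, which usually indicates the request itself failed and the body holds an error description instead. The sketch below is a hypothetical replacement for the decoding step that surfaces the whole response rather than a bare `KeyError`. `first_image_bytes` is not a function from this project, and the shape of the error payload is not assumed.

```python
import base64


def first_image_bytes(data_js: dict) -> bytes:
    """Decode the first image of an API response, or fail showing what the server sent."""
    images = data_js.get("images")
    if not images:
        # Show the raw response so the server-side reason becomes visible.
        raise RuntimeError(f"API response contains no images: {data_js!r}")
    return base64.b64decode(images[0])
```

Printing the response this way typically reveals whether the API is disabled, the endpoint is wrong, or a request parameter was rejected.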

Any ideas?

Txt2video seems to get stuck after random amount of frames generated

This happened to me on 3 occasions when I was trying to generate max-length videos. Generation stopped randomly, without any error message in the console, at frames 104, 123, and 88, each time with different settings (max fps, length, etc.).

I'll try to have a look through the code soon if we can't seem to diagnose this.

'Need to enable queue to use generators.' error - when running txt2vid

Traceback (most recent call last):
File "C:\Users\ga_ma\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 337, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\ga_ma\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1015, in process_api
result = await self.call_function(
File "C:\Users\ga_ma\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 843, in call_function raise ValueError("Need to enable queue to use generators.")
ValueError: Need to enable queue to use generators.

Vid2Vid different resolution

I was trying to do vid2vid on a video with a resolution other than 1024x576, but it only generates the first frame and then stops.

All frames processed by vid2vid, but no video generated

I'm trying to process a short 5 second video using vid2vid + ControlNet. It seems to work fine and it will process all 150 frames.

However, when it's done, nothing happens. I was expecting to get some kind of link to download the result, but I can't find anything in the UI. There are also no errors reported anywhere in the UI or the Stable Diffusion console.

Am I just blind or is that some kind of bug?

vid2vid

When trying a vid2vid conversion, I get this error. I'm not sure what I'm doing wrong.

This happens with or without ControlNet enabled. Also, a question about ControlNet: can you just use the input directory for ControlNet, or do you have to export the video as frames and feed them to the batch ControlNet?
