volotat / sd-cn-animation
This script automates the video stylization task using StableDiffusion and ControlNet.
License: MIT License
Could you kindly provide API call compatibility with the SD-WebUI? I appreciate your assistance!
Title. For a few use cases it would be useful to have finer grained control over what goes in without having to encode the inputs as video
I have a 3060 with 12 GB of VRAM, launching with --xformers --medvram --port 9090 --api
This is the SD-CN-Animation script console output. (I've included the SD output below as well.)
Thanks in advance.
C:\Users\---\AI\SD-CN-Animation>python script.py
OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'
1%|█▌ | 1/99 [00:09<16:06, 9.87s/it]
Traceback (most recent call last):
File "C:\Users\---\AI\SD-CN-Animation\script.py", line 234, in <module>
_, alpha_mask, warped_styled = RAFT_estimate_flow_diff(prev_frame, frame, prev_frame_styled)
File "C:\Users\---\AI\SD-CN-Animation\script.py", line 100, in RAFT_estimate_flow_diff
RAFT_model.load_state_dict(torch.load(args.model))
File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 789, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 1131, in _load
result = unpickler.load()
File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 1101, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 1083, in load_tensor
wrap_storage=restore_location(storage, location),
File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 215, in default_restore_location
result = fn(storage, location)
File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize
device = validate_cuda_device(location)
File "C:\Users\---\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
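The error text itself names one workaround: pass `map_location` to `torch.load`. Below is a minimal sketch of that idea, assuming the checkpoint is a plain state dict. `pick_device` and `load_state_dict_portable` are hypothetical helper names and are not part of the script.

```python
def pick_device(cuda_available: bool, mps_available: bool = False) -> str:
    """Return the preferred device name, falling back to the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def load_state_dict_portable(model, checkpoint_path: str):
    """Load weights saved on a CUDA machine onto whatever device exists here.

    Hypothetical helper: the model is assumed to match the checkpoint's keys.
    """
    import torch  # imported lazily so pick_device works without torch installed

    mps = getattr(torch.backends, "mps", None)
    device = pick_device(
        torch.cuda.is_available(),
        bool(mps is not None and mps.is_available()),
    )
    # Map every storage to the CPU first, then move the model afterwards.
    state = torch.load(checkpoint_path, map_location="cpu")
    model.load_state_dict(state)
    return model.to(device)
```

Loading to the CPU first and moving the model afterwards avoids relying on the device that was recorded when the checkpoint was saved.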
Here is the SD output
Error running process: C:\Users\---\AI\AUTO\extensions\sd-webui-controlnet\scripts\controlnet.py
Traceback (most recent call last):
File "C:\Users\---\AI\AUTO\modules\scripts.py", line 417, in process
script.process(p, *script_args)
File "C:\Users\---\AI\AUTO\extensions\sd-webui-controlnet\scripts\controlnet.py", line 629, in process
unit = self.parse_remote_call(p, unit, idx)
File "C:\Users\---\AI\AUTO\extensions\sd-webui-controlnet\scripts\controlnet.py", line 541, in parse_remote_call
unit.enabled = selector(p, "control_net_enabled", unit.enabled, idx, strict=True)
AttributeError: 'str' object has no attribute 'enabled'
100%|██████████████████████████████████████████████████████████████████████████████████| 10/10 [00:05<00:00, 1.83it/s]
Total progress: 50it [13:36, 4.54s/it]
When I try txt2video, it generates only 1 frame and then stops.
14 seconds per iteration on a 512x768 video.
Is there anything that can be done to speed it up?
I managed to turn Flow_utils into CPU versions due to my old GPU, it's entirely possible with a good CPU. Confused right now as to why vid2vid and SD aren't interacting.
altered code
`import numpy as np
import cv2

# RAFT dependencies
import sys
sys.path.append('RAFT/core')

from collections import namedtuple
import torch
import argparse
from raft import RAFT
from utils.utils import InputPadder

RAFT_model = None

def RAFT_estimate_flow(frame1, frame2, device = 'cpu'): # Change 'cuda' to 'cpu'
    global RAFT_model
    if RAFT_model is None:
        args = argparse.Namespace(**{
            'model': 'RAFT/models/raft-things.pth',
            'mixed_precision': True,
            'small': False,
            'alternate_corr': False,
            'path': ""
        })

        RAFT_model = torch.nn.DataParallel(RAFT(args))
        RAFT_model.load_state_dict(torch.load(args.model, map_location=device)) # Use map_location to specify device
        RAFT_model = RAFT_model.module
        RAFT_model.to(device)
        RAFT_model.eval()

    with torch.no_grad():
        frame1_torch = torch.from_numpy(frame1).permute(2, 0, 1).float()[None].to(device)
        frame2_torch = torch.from_numpy(frame2).permute(2, 0, 1).float()[None].to(device)

        padder = InputPadder(frame1_torch.shape)
        image1, image2 = padder.pad(frame1_torch, frame2_torch)

        # estimate optical flow
        _, next_flow = RAFT_model(image1, image2, iters=20, test_mode=True)
        _, prev_flow = RAFT_model(image2, image1, iters=20, test_mode=True)

        next_flow = next_flow[0].permute(1,2,0).cpu().numpy()
        prev_flow = prev_flow[0].permute(1,2,0).cpu().numpy()

        fb_flow = next_flow + prev_flow
        fb_norm = np.linalg.norm(fb_flow, axis=2)

        occlusion_mask = fb_norm[..., None].repeat(3, axis = -1)

    return next_flow, prev_flow, occlusion_mask

def compute_diff_map(next_flow, prev_flow, prev_frame, cur_frame, prev_frame_styled):
    h, w = cur_frame.shape[:2]

    next_flow = cv2.resize(next_flow, (w, h))
    prev_flow = cv2.resize(prev_flow, (w, h))

    flow_map = -next_flow.copy()
    flow_map[:,:,0] += np.arange(w)
    flow_map[:,:,1] += np.arange(h)[:,np.newaxis]

    warped_frame = cv2.remap(prev_frame, flow_map, None, cv2.INTER_NEAREST)
    warped_frame_styled = cv2.remap(prev_frame_styled, flow_map, None, cv2.INTER_NEAREST)

    # compute occlusion mask
    fb_flow = next_flow + prev_flow
    fb_norm = np.linalg.norm(fb_flow, axis=2)

    occlusion_mask = fb_norm[..., None]

    diff_mask_org = np.abs(warped_frame.astype(np.float32) - cur_frame.astype(np.float32)) / 255
    diff_mask_org = diff_mask_org.max(axis = -1, keepdims=True)

    diff_mask_stl = np.abs(warped_frame_styled.astype(np.float32) - cur_frame.astype(np.float32)) / 255
    diff_mask_stl = diff_mask_stl.max(axis = -1, keepdims=True)

    # np.maximum takes only two inputs (a third positional argument is `out`),
    # so reduce over all three masks to take their element-wise maximum
    alpha_mask = np.maximum.reduce([occlusion_mask * 0.3, diff_mask_org * 4, diff_mask_stl * 2])
    alpha_mask = alpha_mask.repeat(3, axis = -1)

    #alpha_mask_blured = cv2.dilate(alpha_mask, np.ones((5, 5), np.float32))
    alpha_mask = cv2.GaussianBlur(alpha_mask, (51,51), 5, cv2.BORDER_DEFAULT)

    alpha_mask = np.clip(alpha_mask, 0, 1)

    return alpha_mask, warped_frame_styled
`
The popout window worked as it was supposed to, letting me view the process and leaving me with a flow.h5 file.
After that I tried to bash the ui but my PC kept saying there is no python3, so instead I added the flags into the batch file and just ran that. With SD up and running, I checked the link to the API, which leaves me with {"detail":"Method Not Allowed"}, and upon running vid2vid.py nothing happens aside from an empty mp4 being created.
This is the altering I have done to the vid2vid.py:
from cpuFlow_utils import compute_diff_map
Kinda new to Python and all too but been making my way through, nonetheless I keep running into my PC not recognizing my python3 installation.
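The `{"detail":"Method Not Allowed"}` response mentioned above is what an HTTP server typically returns when an endpoint is called with the wrong method. Opening the link in a browser sends a GET, whereas endpoints such as `/sdapi/v1/img2img` are, as far as I know, meant to be called with POST, so that message alone probably does not mean the API is broken. A minimal sketch of building such a POST request with the standard library follows. The base URL and payload values are assumptions for illustration, and `build_img2img_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_img2img_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for the img2img endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/sdapi/v1/img2img",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed address; use whatever host and port the webui was started with.
req = build_img2img_request("http://localhost:7860", {"prompt": "test", "steps": 1})
print(req.get_method(), req.full_url)
# Sending it needs a running server: urllib.request.urlopen(req)
```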
Hey there, been trying a few different settings so far trying to get a feel for this.
One thing I've noticed is that the first few frames will be a new image (as intended) then it slowly fades back to the original.
For example, a person dancing and it's only 20 frames long, they'll start off looking like a new person, but roughly halfway through the video won't look much different from the original gif. By the end it'll look almost exactly the same.
This happens less often when frame strength is above 0.5 but then the videos get more wild.
Attempting processing strength around 0.65, and 0.25
Haven't really figured a good setting to try and retain the original image but replace the character used.
Tested a video with 18 frames, brown haired girl dancing.
Frames 1 - 10 start with a girl with short orange hair and a hat, but slowly fade, by frame 10 it looks exactly the same as the original gif. Long brown haired girl again.
Happens on both vid2vid and textvid after clicking 'Generate'.
File "/content/sd/lib/python3.10/site-packages/gradio/routes.py", line 337, in run_predict
output = await app.get_blocks().process_api(
File "/content/sd/lib/python3.10/site-packages/gradio/blocks.py", line 1015, in process_api
result = await self.call_function(
File "/content/sd/lib/python3.10/site-packages/gradio/blocks.py", line 843, in call_function
raise ValueError("Need to enable queue to use generators.")
ValueError: Need to enable queue to use generators.
There's a similar project https://github.com/zyddnys/sd_animation_optical_flow that uses PDCNet+ for optical flow confidence, have you tried using PDCNet+?
Traceback (most recent call last):
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/routes.py", line 399, in run_predict
output = await app.get_blocks().process_api(
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1299, in process_api
result = await self.call_function(
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1036, in call_function
prediction = await anyio.to_thread.run_sync(
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 488, in async_iteration
return next(iterator)
File "/Users/hein/stable-diffusion-webui/extensions/SD-CN-Animation/scripts/base_ui.py", line 123, in process
yield from txt2vid.start_process(*args)
File "/Users/hein/stable-diffusion-webui/extensions/sd-cn-animation/scripts/core/txt2vid.py", line 76, in start_process
FloweR_load_model(args_dict['width'], args_dict['height'])
File "/Users/hein/stable-diffusion-webui/extensions/sd-cn-animation/scripts/core/txt2vid.py", line 51, in FloweR_load_model
FloweR_model.load_state_dict(torch.load(model_path))
File "/Users/hein/stable-diffusion-webui/modules/safe.py", line 107, in load
return load_with_extra(filename, extra_handler=global_extra_handler, *args, **kwargs)
File "/Users/hein/stable-diffusion-webui/modules/safe.py", line 152, in load_with_extra
return unsafe_torch_load(filename, *args, **kwargs)
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 809, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1172, in _load
result = unpickler.load()
File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pickle.py", line 1213, in load
dispatch[key[0]](self)
File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pickle.py", line 1254, in load_binpersid
self.append(self.persistent_load(pid))
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1142, in persistent_load
typed_storage = load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1116, in load_tensor
wrap_storage=restore_location(storage, location),
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 217, in default_restore_location
result = fn(storage, location)
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 182, in _cuda_deserialize
device = validate_cuda_device(location)
File "/Users/hein/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 166, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
Gives me this error after generating the first frame just fine. Is this just not supported on M1/M2 Macs, or should I change some settings?
Hello,
I am getting this error and I am not sure what I am missing. I made sure of the input size, folder locations and everything. I am running it through A1111.
Already up to date.
venv "F:\AI\Automatic1111\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)]
Commit hash: 5ab7f213bec2f816f9c5644becb32eb72c8ffb89
Installing requirements
[Auto-Photoshop-SD] Attempting auto-update...
[Auto-Photoshop-SD] switch branch to extension branch.
checkout_result: Your branch is up to date with 'origin/master'.
[Auto-Photoshop-SD] Current Branch.
branch_result: * master
[Auto-Photoshop-SD] Fetch upstream.
fetch_result:
[Auto-Photoshop-SD] Pull upstream.
pull_result: Already up to date.
Installing SD-CN-Animation requirement: scikit-image==0.19.2
Installing requirements for TemporalKit extension
Launching Web UI with arguments: --opt-sdp-attention --api
No module 'xformers'. Proceeding without it.
python_server_full_path: F:\AI\Automatic1111\stable-diffusion-webui\extensions\Auto-Photoshop-StableDiffusion-Plugin\server/python_server
Civitai Helper: Get Custom Model Folder
Civitai Helper: Load setting from: F:\AI\Automatic1111\stable-diffusion-webui\extensions\Stable-Diffusion-Webui-Civitai-Helper\setting.json
Civitai Helper: No setting file, use default
Better Prompt version is v0.2.0
ControlNet v1.1.150
ControlNet v1.1.150
Loading weights [d10ad6063d] from F:\AI\Automatic1111\stable-diffusion-webui\models\Stable-diffusion\unvailAI3DKXV2_3dkxV2.safetensors
Creating model from config: F:\AI\Automatic1111\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Loading VAE weights specified in settings: F:\AI\Automatic1111\stable-diffusion-webui\models\VAE\vae-ft-mse-840000-ema-pruned.ckpt
Applying scaled dot product cross attention optimization.
Textual inversion embeddings loaded(4): uiyiuy, beauty512, CharTurner, neg_realism512
Textual inversion embeddings skipped(5): 21charturnerv2, nartfixer, nfixer, nrealfixer, rz-neg-general
Removing ToMe patch (if exists)
Model loaded in 2.3s (create model: 0.2s, apply weights to model: 0.3s, apply half(): 0.4s, load VAE: 0.2s, move model to device: 0.4s, load textual inversion embeddings: 0.7s).
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
Startup time: 9.8s (import torch: 1.0s, import gradio: 0.8s, import ldm: 0.3s, other imports: 0.5s, list SD models: 0.1s, load scripts: 1.9s, load SD checkpoint: 2.4s, create ui: 2.5s, gradio launch: 0.1s).
Consuming a byte in the end state
Consuming a byte in the end state
controlnet batch mode
Loading model: control_sd15_hed [fef5e48e]
Loaded state_dict from [F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\models\control_sd15_hed.pth]
Loading config: F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\models\control_sd15_hed.yaml
ControlNet model control_sd15_hed [fef5e48e] loaded.
Error running process: F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py
Traceback (most recent call last):
File "F:\AI\Automatic1111\stable-diffusion-webui\modules\scripts.py", line 417, in process
script.process(p, *script_args)
File "F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py", line 1035, in process
input_image = HWC3(image['image'])
File "F:\AI\Automatic1111\stable-diffusion-webui\extensions\sd-webui-controlnet\annotator\util.py", line 6, in HWC3
assert x.dtype == np.uint8
File "F:\AI\Automatic1111\stable-diffusion-webui\venv\lib\site-packages\PIL\Image.py", line 529, in __getattr__
raise AttributeError(name)
AttributeError: dtype
GPU 3070
CPU - I7
OS - Windows 11
Unable to generate, throws an error.
Traceback (most recent call last):
File "C:\Neural networks\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 337, in run_predict
output = await app.get_blocks().process_api(
File "C:\Neural networks\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1015, in process_api
result = await self.call_function(
File "C:\Neural networks\Stable Diffusion\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 843, in call_function
raise ValueError("Need to enable queue to use generators.")
ValueError: Need to enable queue to use generators.
Using macOS, I seem able to generate only 1 frame? then it does not seem to process anymore.
Frames prepared: 1 / 488; Frames processed: 1 / 488
Hey, thanks for doing the awesome work of rewriting the code to make it into an Auto1111 extension. This is great. I'm already trying it and I love how streamlined the process is now. <3
Just a minor thing: in the previous command-line version I remember a timer that told me how much time it was going to take.
Would it be possible to add "Time elapsed/left" next to "Frames prepared" / "Frames processed"?
Anyways, love the new extension, thanks for this!
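For reference, a remaining-time estimate of the kind requested can be derived from the two counters the extension already shows. This is only a sketch under the assumption that frames take roughly equal time. `format_eta` is a hypothetical name, not an existing function in the extension.

```python
def format_eta(frames_done: int, frames_total: int, elapsed_s: float) -> str:
    """Estimate time left from the average time per processed frame."""
    if frames_done <= 0:
        # No frame finished yet, so there is nothing to extrapolate from.
        return f"elapsed {elapsed_s:.0f}s, left: unknown"
    per_frame = elapsed_s / frames_done
    remaining = per_frame * (frames_total - frames_done)
    return f"elapsed {elapsed_s:.0f}s, left ~{remaining:.0f}s"


print(format_eta(10, 100, 50.0))
```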
I'm currently running AUTOMATIC1111 from a Docker container on a remote instance.
It seems the script doesn't work because it tries to show an image with OpenCV, and I have no UI on my Ubuntu instance. Is it possible to run it with no display, or with a web interface instead?
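One possible workaround for the headless case is to skip the preview window when no display is available. The sketch below assumes the preview is shown with `cv2.imshow` and that checking the `DISPLAY` / `WAYLAND_DISPLAY` environment variables is a good enough test on Linux. `can_show_windows` and `maybe_show` are hypothetical helpers.

```python
import os
import sys


def can_show_windows(env=None, platform=None) -> bool:
    """Guess whether a GUI window can be opened on this machine."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform
    if platform.startswith("linux"):
        # On Linux a window needs an X11 or Wayland display to attach to.
        return bool(env.get("DISPLAY") or env.get("WAYLAND_DISPLAY"))
    return True


def maybe_show(window_name: str, frame) -> bool:
    """Show the frame only when a display exists; return whether it was shown."""
    if not can_show_windows():
        return False
    import cv2  # imported lazily; only needed when a window is actually shown

    cv2.imshow(window_name, frame)
    cv2.waitKey(1)
    return True
```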
All of the txt2vid videos I do start turning purple or red after a few frames. Any ideas what could be causing this?
(raft) I:\stable-diffusion-webui>python vid2vid.py
Traceback (most recent call last):
File "vid2vid.py", line 145, in <module>
with h5py.File(FLOW_MAPS, 'r') as f:
File "C:\Users\david\miniconda3\envs\raft\lib\site-packages\h5py\_hl\files.py", line 567, in __init__
fid = make_fid(name, mode, userblock_size, fapl, fcpl, swmr=swmr)
File "C:\Users\david\miniconda3\envs\raft\lib\site-packages\h5py\_hl\files.py", line 231, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py\h5f.pyx", line 106, in h5py.h5f.open
FileNotFoundError: [Errno 2] Unable to open file (unable to open file: name = 'i:/flow.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
I am trying to use a video only for posing and have the output video entirely dependent on the prompt and controlnet (using openpose). However, the vid2vid always contains a lot of the original video information (beyond the pose). There doesn't seem to be an obvious way to reduce the impact of the original video frames. I'm assuming it's using some sort of img2img, but without a way to change the denoising strength.
Would be nice to have a control of sampling method/sampling steps in txt-to-video
Any guide on how to run it on windows?
It seems like my initial frame will be okay, but every image after it slowly adds more burn layers until the preview becomes a complete acid trip.
also I get this every generation:
Error running process: H:\NovelAI\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py
Traceback (most recent call last):
File "H:\NovelAI\stable-diffusion-webui\modules\scripts.py", line 417, in process
script.process(p, *script_args)
File "H:\NovelAI\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py", line 629, in process
unit = self.parse_remote_call(p, unit, idx)
File "H:\NovelAI\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\controlnet.py", line 541, in parse_remote_call
unit.enabled = selector(p, "control_net_enabled", unit.enabled, idx, strict=True)
AttributeError: 'float' object has no attribute 'enabled'
ATTENTION: Negative prompt weight is set to 3.0
(EDIT: Okay so the negative prompt thing was the result of another extension, I had to turn off a couple of extensions to remove more errors)
Traceback (most recent call last):
File "..\extensions\SD-CN-Animation\scripts\base_ui.py", line 123, in process
yield from txt2vid.start_process(*args)
RuntimeError: generator ignored GeneratorExit
The error message above sometimes appears before the animation finishes in T2V (without CN).
CPU: intel
Graphics: Nvidia RTX
Browser: Google Chrome
Win11
Latest A1111 (commit: 5ab7f213)
The only way I can continue creating is to close A1111 completely and restart it.
Thank you very much for this amazing extension 👍
Hi, the script is awesome, but do you know how I can fix the ghosting problem? Thanks.
I use runwayml to remove the background from a video. compute_flow renders an odd pattern in the black area where it should see nothing. I assumed it was compression artifacts, but I also ran the video through After Effects Rotobrush to strip anything out and it looks the same. If I change the resolution to 2x the video size, nearly the whole frame is white.
OK, I've just rendered it out from After Effects into uncompressed AVI.
This is a solid black background; am I doing something wrong here?
Every time I run the webui i get
Installing SD-CN-Animation requirement: scikit-image==0.19.2
but other than that, the extension looks very promising, cheers!
On automatic1111 there is currently this warning on each frame in the console
sd-cn-animation/scripts\core\vid2vid.py:115: FutureWarning: `multichannel` is a deprecated argument name for `match_histograms`. It will be removed in version 1.0. Please use `channel_axis` instead.
On https://github.com/vladmandic/automatic this deprecated argument has already been removed, causing the script to fail at the first step.
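One way to support both old and new scikit-image versions is to choose the keyword argument by inspecting the function's signature. This is a sketch of that idea only. `channel_kwargs` is a hypothetical helper, and it assumes the images keep their color channels on the last axis.

```python
import inspect


def channel_kwargs(func) -> dict:
    """Pick the channel keyword that the given function actually accepts."""
    params = inspect.signature(func).parameters
    if "channel_axis" in params:
        # Newer API: channels are identified by axis index (-1 = last axis).
        return {"channel_axis": -1}
    # Older API: a boolean flag meaning "the last axis holds channels".
    return {"multichannel": True}


# Usage, assuming scikit-image is installed:
# from skimage.exposure import match_histograms
# matched = match_histograms(image, reference, **channel_kwargs(match_histograms))
```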
I think (suggested by a redditor, not my idea) that using multiple ControlNets would make a really great version. Can it be added? I tried it and cannot get it to load properly.
I have used multiple ControlNets on stills (depth, hed, color and canny), and the result on stills from an mp4 has virtually no flickering and is very consistent, so if it can be added to this process I believe the results would be a lot better.
It seems simple, but it gets stuck on loading the image for the other ControlNets.
class controlnetRequest():
    def __init__(self, b64_cur_img, b64_hed_img, b64_hed_img1, ds=0.35, w=w, h=h, mask=None, mask1=None):
        self.url = "http://localhost:7860/sdapi/v1/img2img"
        self.body = {
            "init_images": [b64_cur_img],
            "mask": mask,
            "mask_blur": 0,
            "inpainting_fill": 1,
            "inpainting_mask_invert": 0,
            "prompt": PROMPT,
            "negative_prompt": N_PROMPT,
            "seed": SEED,
            "subseed": -1,
            "subseed_strength": 0,
            "batch_size": 1,
            "n_iter": 1,
            "steps": 20,
            "cfg_scale": 7,
            "denoising_strength": ds,
            "width": w,
            "height": h,
            "restore_faces": False,
            "eta": 0,
            "sampler_index": "DPM++ 2S a",
            "control_net_enabled": True,
            "alwayson_scripts": {
                "ControlNet": {
                    "args": [
                        {
                            "input_image": b64_hed_img,
                            "module": "depth",
                            "model": "control_depth-fp16 [400750f6]",
                            "weight": 1,
                            "resize_mode": "Just Resize",
                            "lowvram": False,
                            "processor_res": 720,
                            "guidance": 1,
                            "guessmode": False
                        }
                    ]
                },
                "ControlNet1": {
                    "args": [
                        {
                            "input_image": b64_hed_img1,
                            "module": "hed",
                            "model": "control_hed-fp16 [13fee50b]",
                            "weight": 1,
                            "resize_mode": "Just Resize",
                            "lowvram": False,
                            "processor_res": 720,
                            "guidance": 1,
                            "guessmode": False
                        }
                    ]
                }
            },
        }
results in
line 140, in <module>
image_bytes = base64.b64decode(data_js["images"][0])
KeyError: 'images'
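As far as I understand the ControlNet extension's API, several units are passed as multiple entries in one `args` list under a single script key, not under separate keys such as `"ControlNet1"`. An unknown script key may be rejected by the webui, in which case the response carries an error message and no `images` field, which would explain the `KeyError`. The sketch below shows that payload shape plus a safer way to read the response. All helper names are hypothetical, and the webui settings may also need to allow more than one ControlNet unit.

```python
def controlnet_unit(input_image_b64: str, module: str, model: str, weight: float = 1.0) -> dict:
    """Describe one ControlNet unit (field names taken from the request above)."""
    return {
        "input_image": input_image_b64,
        "module": module,
        "model": model,
        "weight": weight,
    }


def with_controlnet_units(body: dict, units: list) -> dict:
    """Return a copy of the request body with all units in a single args list."""
    merged = dict(body)
    scripts = dict(merged.get("alwayson_scripts", {}))
    scripts["controlnet"] = {"args": list(units)}  # assumed script key
    merged["alwayson_scripts"] = scripts
    return merged


def first_image_b64(response_json: dict) -> str:
    """Return the first image, or raise with the server's message if none came back."""
    if "images" not in response_json:
        raise RuntimeError(f"API returned no images: {response_json}")
    return response_json["images"][0]


body = with_controlnet_units(
    {"prompt": "example"},
    [
        controlnet_unit("<base64 depth input>", "depth", "control_depth-fp16 [400750f6]"),
        controlnet_unit("<base64 hed input>", "hed", "control_hed-fp16 [13fee50b]"),
    ],
)
print(len(body["alwayson_scripts"]["controlnet"]["args"]))
```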
Because of deprecated gradio usage, this extension won't work with Vladmandic/automatic. See related issue: vladmandic/automatic#817
Could you collaborate?
Issue on macOS: CPU only, but the model loading expects CUDA.
Traceback (most recent call last):
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/routes.py", line 394, in run_predict
output = await app.get_blocks().process_api(
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1075, in process_api
result = await self.call_function(
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 898, in call_function
prediction = await anyio.to_thread.run_sync(
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 549, in async_iteration
return next(iterator)
File "/Users/jimmygunawan/stable-diffusion-webui/extensions/SD-CN-Animation/scripts/base_ui.py", line 123, in process
yield from txt2vid.start_process(*args)
File "/Users/jimmygunawan/stable-diffusion-webui/extensions/sd-cn-animation/scripts/core/txt2vid.py", line 76, in start_process
FloweR_load_model(args_dict['width'], args_dict['height'])
File "/Users/jimmygunawan/stable-diffusion-webui/extensions/sd-cn-animation/scripts/core/txt2vid.py", line 51, in FloweR_load_model
FloweR_model.load_state_dict(torch.load(model_path))
File "/Users/jimmygunawan/stable-diffusion-webui/modules/safe.py", line 106, in load
return load_with_extra(filename, extra_handler=global_extra_handler, *args, **kwargs)
File "/Users/jimmygunawan/stable-diffusion-webui/modules/safe.py", line 151, in load_with_extra
return unsafe_torch_load(filename, *args, **kwargs)
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 789, in load
return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1131, in _load
result = unpickler.load()
File "/opt/homebrew/Cellar/[email protected]/3.10.10_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pickle.py", line 1213, in load
dispatch[key[0]](self)
File "/opt/homebrew/Cellar/[email protected]/3.10.10_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/pickle.py", line 1254, in load_binpersid
self.append(self.persistent_load(pid))
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1101, in persistent_load
load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location))
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 1083, in load_tensor
wrap_storage=restore_location(storage, location),
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 215, in default_restore_location
result = fn(storage, location)
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 182, in _cuda_deserialize
device = validate_cuda_device(location)
File "/Users/jimmygunawan/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/serialization.py", line 166, in validate_cuda_device
raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.
A video shows up in the folder, but it has only a single frame. I'm using the default resolution, but it's still not working.
I'm using Google Colab, which will kick me off after a period of time and, often, the integration with Google Drive just breaks out of nowhere. I'm currently running a larger-format v2v and it's taking hours (for just 60 frames). Which is fine, but I'm a bit paranoid about losing progress if I get kicked off of free Colab and/or if something breaks.
It would be wonderful to be able to manually save progress mid-generation (or automatically save every X frames) and be able to resume later. I suspect something like this would be nice for other setups/use cases as well.
Thanks for making this. It's awesome.
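A resume feature like the one requested could start from the frames already written to disk. The sketch below assumes, purely for illustration, that each finished frame is saved as `frame_00000.png`, `frame_00001.png`, and so on. The naming scheme and `next_frame_index` are hypothetical and do not describe how the extension stores its output. A real implementation would also need to reload the last styled frame, since each new frame depends on the previous one.

```python
import tempfile
from pathlib import Path


def next_frame_index(frames_dir) -> int:
    """Return the index of the first frame that has not been saved yet."""
    indices = set()
    for path in Path(frames_dir).glob("frame_*.png"):
        suffix = path.stem.split("_", 1)[1]
        if suffix.isdigit():
            indices.add(int(suffix))
    # Resume after the longest unbroken run of frames starting at 0.
    expected = 0
    while expected in indices:
        expected += 1
    return expected


# Small demonstration: frames 0, 1 and 3 exist, so frame 2 is the first gap.
with tempfile.TemporaryDirectory() as tmp:
    for i in (0, 1, 3):
        (Path(tmp) / f"frame_{i:05d}.png").touch()
    resume_at = next_frame_index(tmp)
print(resume_at)
```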
Hi, I get an error when I try to run the script:
C:\AUTOMATIC1111>py script.py
Traceback (most recent call last):
File "C:\AUTOMATIC1111\script.py", line 15, in <module>
from raft import RAFT
ModuleNotFoundError: No module named 'raft'
Apparently I need to install raft, but how do I add raft to automatic1111? Thanks.
Hey, I think this script would work better as an extension inside the Automatic1111 webui, preferably as a script you can choose under img2img in the script dropdown.
Then you would upload a video and it would end up in a temporary folder; the extension renders it, then recombines the frames.
I think this script would be a lot more user friendly if it was available from the Automatic1111 webui, and there doesn't seem to be a good reason for it to be a separate script, as it requires the Automatic1111 webui to function.
And if you decide not to, I could do it if you wanted me to.
Hello, I would like to know if someone could help me figure out why I keep getting the same error related to torch.
Here is my issue: when I run the line below, I get a "module torch not found" error even though it is installed.
D:\stable-diffusion-webui\SD-CN-Animation>python3 compute_flow.py -i "C:\Users\Arnaud***\Downloads\Remove0417_720p.mov" -o "D:\stable-diffusion-webui\SD-CN-Animation\EXPORT" -v -W 720 -H 540
Traceback (most recent call last):
File "D:\stable-diffusion-webui\SD-CN-Animation\compute_flow.py", line 7, in <module>
from flow_utils import RAFT_estimate_flow
File "D:\stable-diffusion-webui\SD-CN-Animation\flow_utils.py", line 9, in <module>
import torch
ModuleNotFoundError: No module named 'torch'
What could be the problem?
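A common cause of this kind of error is that the command (`python3` here) resolves to a different interpreter than the one torch was installed into, for example a system Python instead of the webui's venv. That is only a guess for this report. The small diagnostic below, run with the same command as the failing script, prints which interpreter is in use and where it finds a module. `describe_module` is a hypothetical helper.

```python
import importlib.util
import sys


def describe_module(name: str) -> str:
    """Report where the current interpreter would import a top-level module from."""
    spec = importlib.util.find_spec(name)
    where = spec.origin if spec is not None else "NOT FOUND"
    return f"{name}: {where}"


print("interpreter:", sys.executable)
print(describe_module("torch"))
```

If the line for torch says `NOT FOUND`, installing torch with that same interpreter (or running the script with the interpreter that already has it) is the usual next step.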
Is there a way to make LoRA work with your script? I have entered the name of my LoRA file in the prompt, but it doesn't work. Thanks.
Using default settings and prompt "lemon" after installing the extension, updating automatic1111 and all my extensions and restarting everything:
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:03<00:00, 1.99it/s]
Error running postprocess_batch: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilediffusion.py
Traceback (most recent call last):
File "D:\stable-diffusion-webui\modules\scripts.py", line 453, in postprocess_batch
script.postprocess_batch(p, *script_args, images=images, **kwargs)
TypeError: Script.postprocess_batch() missing 1 required positional argument: 'enabled'
Error running postprocess_batch: D:\stable-diffusion-webui\extensions\sd-dynamic-thresholding\scripts\dynamic_thresholding.py
Traceback (most recent call last):
File "D:\stable-diffusion-webui\modules\scripts.py", line 453, in postprocess_batch
script.postprocess_batch(p, *script_args, images=images, **kwargs)
TypeError: Script.postprocess_batch() missing 8 required positional arguments: 'enabled', 'mimic_scale', 'threshold_percentile', 'mimic_mode', 'mimic_scale_min', 'cfg_mode', 'cfg_scale_min', and 'powerscale_power'
Error running postprocess: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilediffusion.py
Traceback (most recent call last):
File "D:\stable-diffusion-webui\modules\scripts.py", line 444, in postprocess
script.postprocess(p, processed, *script_args)
TypeError: Script.postprocess() missing 1 required positional argument: 'enabled'
Error running postprocess: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py
Traceback (most recent call last):
File "D:\stable-diffusion-webui\modules\scripts.py", line 444, in postprocess
script.postprocess(p, processed, *script_args)
TypeError: Script.postprocess() missing 1 required positional argument: 'enabled'
Total progress: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 6/6 [00:02<00:00, 2.05it/s]
Error running process: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\tilediffusion.py
Traceback (most recent call last):
File "D:\stable-diffusion-webui\modules\scripts.py", line 417, in process
script.process(p, *script_args)
TypeError: Script.process() missing 21 required positional arguments: 'enabled', 'method', 'overwrite_size', 'keep_input_size', 'image_width', 'image_height', 'tile_width', 'tile_height', 'overlap', 'tile_batch_size', 'upscaler_name', 'scale_factor', 'noise_inverse', 'noise_inverse_steps', 'noise_inverse_retouch', 'noise_inverse_renoise_strength', 'noise_inverse_renoise_kernel', 'control_tensor_cpu', 'enable_bbox_control', 'draw_background', and 'causal_layers'
Error running process: D:\stable-diffusion-webui\extensions\multidiffusion-upscaler-for-automatic1111\scripts\vae_optimize.py
Traceback (most recent call last):
File "D:\stable-diffusion-webui\modules\scripts.py", line 417, in process
script.process(p, *script_args)
TypeError: Script.process() missing 7 required positional arguments: 'enabled', 'encoder_tile_size', 'decoder_tile_size', 'vae_to_gpu', 'fast_decoder', 'fast_encoder', and 'color_fix'
Error running process_batch: D:\stable-diffusion-webui\extensions\sd-dynamic-thresholding\scripts\dynamic_thresholding.py
Traceback (most recent call last):
File "D:\stable-diffusion-webui\modules\scripts.py", line 435, in process_batch
script.process_batch(p, *script_args, **kwargs)
TypeError: Script.process_batch() missing 8 required positional arguments: 'enabled', 'mimic_scale', 'threshold_percentile', 'mimic_mode', 'mimic_scale_min', 'cfg_mode', 'cfg_scale_min', and 'powerscale_power'
33%|██████████████████████████████████████████████▎ | 4/12 [00:02<00:03, 2.07it/s]
However the video seems to be generated fine, so the errors are ignored or are not critical.
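One possible reading of these tracebacks, stated as an assumption rather than a diagnosis: each extension's callback was invoked without the UI values it normally receives as positional arguments, and the caller logged the failure and carried on. The toy sketch below is not web UI code (`ToyScript` and `run_callback` are hypothetical names); it only reproduces the same kind of `TypeError` and shows how a caller can report it without aborting, which would be consistent with the video still being generated.

```python
class ToyScript:
    """Stand-in for an extension script whose callback needs a UI value."""

    def postprocess(self, p, processed, enabled):
        return f"enabled={enabled}"


def run_callback(script, p, processed, script_args):
    """Call the callback; report a TypeError instead of aborting the run."""
    try:
        return script.postprocess(p, processed, *script_args)
    except TypeError as exc:
        return f"Error running postprocess: {exc}"


# With the expected argument the call succeeds.
ok = run_callback(ToyScript(), None, None, [True])
# With an empty argument list the call fails in the same way as the log above.
failed = run_callback(ToyScript(), None, None, [])
```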
sd-cn-animation/scripts\core\vid2vid.py", line 114, in start_process
processed_frame = np.array(processed_frames[0])
IndexError: list index out of range
Do I need to enable other settings for this operation?
Error loading script: base_ui.py
Traceback (most recent call last):
File "/content/stable-diffusion-webui/modules/scripts.py", line 248, in load_scripts
script_module = script_loading.load_module(scriptfile.path)
File "/content/stable-diffusion-webui/modules/script_loading.py", line 11, in load_module
module_spec.loader.exec_module(module)
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/content/stable-diffusion-webui/extensions/SD-CN-Animation/scripts/base_ui.py", line 30, in <module>
from core import vid2vid, txt2vid, utils
ModuleNotFoundError: No module named 'core'
Got this error trying to run in a colab
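`from core import ...` can only succeed if the directory that contains the `core` package is on `sys.path`. A minimal sketch of one workaround, assuming `core` sits next to `base_ui.py` (as the `scripts\core\vid2vid.py` path in an earlier traceback suggests); `add_to_sys_path` is a hypothetical helper, not part of the extension:

```python
import os
import sys


def add_to_sys_path(directory: str) -> str:
    """Put `directory` at the front of sys.path if it is not already listed."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Hypothetical use at the top of base_ui.py, before `from core import ...`:
# add_to_sys_path(os.path.dirname(os.path.abspath(__file__)))
```

Whether this matches how the extension is loaded in a given Colab notebook is not confirmed here; it only illustrates the import-path mechanism behind the error.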
script.py", line 235, in <module>
_, alpha_mask, warped_styled = RAFT_estimate_flow_diff(prev_frame, frame, prev_frame_styled)
File "D:\study\SD-CN-Animation\script.py", line 135, in RAFT_estimate_flow_diff
diff_mask = np.abs(warped_frame.astype(np.float32) - frame2.astype(np.float32)) / 255
ValueError: operands could not be broadcast together with shapes (872,488,3) (866,486,3)
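The two shapes differ by exactly the amount needed to round 866 and 486 up to the next multiples of 8 (872 and 488). That suggests, as an assumption, that the warped frame still carries padding added for the flow model while `frame2` does not. A minimal sketch of cropping back to the original size; the helper names are hypothetical, and centered padding is also an assumption:

```python
def round_up(value: int, multiple: int = 8) -> int:
    """Smallest multiple of `multiple` that is >= value."""
    return -(-value // multiple) * multiple


def crop_slices(padded_hw, target_hw):
    """Slices that cut a centered (target_h, target_w) window out of a padded frame."""
    (padded_h, padded_w), (target_h, target_w) = padded_hw, target_hw
    top = (padded_h - target_h) // 2
    left = (padded_w - target_w) // 2
    return slice(top, top + target_h), slice(left, left + target_w)


padded = (round_up(866), round_up(486))
rows, cols = crop_slices(padded, (866, 486))
# With NumPy arrays one could then write, before the subtraction:
# warped_frame = warped_frame[rows, cols]
```

Under the same assumption, choosing a processing width and height that are already multiples of 8 would avoid the mismatch without any cropping.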
Getting this with vid2vid. v0.6 + https://github.com/vladmandic/automatic:
override_settings: []
Traceback (most recent call last):
File "e:\.ai\automatic\venv\lib\site-packages\gradio\routes.py", line 399, in run_predict
output = await app.get_blocks().process_api(
File "e:\.ai\automatic\venv\lib\site-packages\gradio\blocks.py", line 1299, in process_api
result = await self.call_function(
File "e:\.ai\automatic\venv\lib\site-packages\gradio\blocks.py", line 1036, in call_function
prediction = await anyio.to_thread.run_sync(
File "e:\.ai\automatic\venv\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "e:\.ai\automatic\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "e:\.ai\automatic\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
result = context.run(func, *args)
File "e:\.ai\automatic\venv\lib\site-packages\gradio\utils.py", line 488, in async_iteration
return next(iterator)
File "e:\.ai\automatic/extensions/sd-cn-animation/scripts\vid2vid.py", line 172, in start_process
processed_frames, _, _, _ = img2img(args_dict)
File "e:\.ai\automatic/extensions/sd-cn-animation/scripts\vid2vid.py", line 445, in img2img
print('script_inputs 1:', args.script_inputs[1].__dict__)
AttributeError: 'str' object has no attribute '__dict__'
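The failing line is a debug `print` that assumes every entry in `args.script_inputs` is an object with a `__dict__`; in this setup one entry is apparently a plain string. A minimal sketch of a tolerant replacement for that print argument; `describe` is a hypothetical helper:

```python
from types import SimpleNamespace


def describe(obj):
    """Return obj's attribute dict when it has one, otherwise its repr."""
    try:
        return vars(obj)
    except TypeError:
        # Built-in values such as str have no __dict__.
        return repr(obj)


print("script_inputs 1:", describe("some string value"))
print("script_inputs 1:", describe(SimpleNamespace(enabled=True)))
```

This only stops the debug output from crashing; whether the rest of `img2img` handles a string entry correctly is a separate question.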
Hi, I have a weird bug with v0.5. Does anyone know what it is? Thanks.
I updated the script to the latest version and now I can't seem to avoid this error. Does anybody know what it means?
compute_flow.py: error: unrecognized arguments: Diffusion\stable-diffusion-webui\scripts\SD-CN-Animation-main\1.mp4 Diffusion\stable-diffusion-webui\scripts\SD-CN-Animation-main\flow.h5
I'm guessing the path structure is incorrect on my part. I'm not sure, but any help would be appreciated, and please accept my apologies if this error was already posted and solved somewhere else.
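Both unrecognized arguments begin with `Diffusion\`, which is consistent with a folder name containing a space being passed without quotes, so the shell splits each path in two. A minimal sketch of that effect using a hypothetical path (`My Folder/1.mp4`) and the `-i` option seen in other compute_flow.py invocations; `shlex` follows POSIX rules, and Windows shells differ in detail but also treat unquoted spaces as separators:

```python
import argparse
import shlex


def parse(command_line: str):
    """Parse a command line string and return (namespace, unrecognized arguments)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-i")
    return parser.parse_known_args(shlex.split(command_line))


# Unquoted: the path is split at the space and the tail becomes a stray argument.
unquoted_args, unquoted_extra = parse("-i My Folder/1.mp4")
# Quoted: the whole path reaches -i as one value.
quoted_args, quoted_extra = parse('-i "My Folder/1.mp4"')
```

If this is the cause, wrapping each path in double quotes on the command line should be enough.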
I've triple-checked everything and even tried a fresh install of the web UI and a rollback to before March 25th as well. No matter what, I get errors.
(sdoptical) PS D:\Github\SD-CN-Animation> python vid2vid.py
0%|
Traceback (most recent call last):
File "D:\Github\SD-CN-Animation\vid2vid.py", line 175, in <module>
out_image = controlnetRequest(to_b64(frame), to_b64(frame), PROCESSING_STRENGTH, w, h, mask = None).sendRequest()
File "D:\Github\SD-CN-Animation\vid2vid.py", line 123, in sendRequest
image_bytes = base64.b64decode(data_js["images"][0])
KeyError: 'images'
Any ideas?
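`KeyError: 'images'` means the JSON returned by the web UI had no `images` key, which usually indicates that the server answered with an error payload instead of a result. Why it did so is not visible in this traceback. A minimal sketch of a check that surfaces the server's actual reply; `first_image_bytes` is a hypothetical helper, and `data_js` mirrors the variable name in the traceback:

```python
import base64


def first_image_bytes(data_js: dict) -> bytes:
    """Decode the first returned image, or raise an error showing what the server sent."""
    images = data_js.get("images")
    if not images:
        raise RuntimeError(f"API response contains no images: {data_js!r}")
    return base64.b64decode(images[0])
```

Printing the full response this way usually makes the underlying cause (for example, a rejected request) readable in the console.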
It happened to me on three occasions when I was trying to generate maximum-length videos. Generation stopped randomly, without any error message in the console, at 104, 123, and 88, each time with different settings (max fps, length, etc.).
I'll try to have a look through the code soon if we can't seem to diagnose this.
Traceback (most recent call last):
File "C:\Users\ga_ma\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 337, in run_predict
output = await app.get_blocks().process_api(
File "C:\Users\ga_ma\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1015, in process_api
result = await self.call_function(
File "C:\Users\ga_ma\A1111\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 843, in call_function
raise ValueError("Need to enable queue to use generators.")
ValueError: Need to enable queue to use generators.
I'm trying to process a short 5 second video using vid2vid + ControlNet. It seems to work fine and it will process all 150 frames.
However, when it's done, nothing happens. I was expecting to get some kind of link to download the result, but I can't find anything in the UI. There are also no errors reported anywhere in the UI or the Stable Diffusion console.
Am I just blind or is that some kind of bug?