
ashleykleynhans / stable-diffusion-docker


Docker image for Stable Diffusion WebUI with ControlNet, After Detailer, Dreambooth, Deforum and ReActor extensions, as well as Kohya_ss and ComfyUI

License: GNU General Public License v3.0

Dockerfile 49.17% Python 2.96% Shell 40.93% HCL 6.94%
deforum deforum-stable-diffusion docker dreambooth face-swap kohya-webui roop runpod stable-diffusion stable-diffusion-webui

stable-diffusion-docker's Introduction

Docker image for A1111 Stable Diffusion Web UI, Kohya_ss and ComfyUI


Now with SDXL support.


Available on RunPod

This image is designed to work on RunPod. You can use my custom RunPod template to launch it on RunPod.

Building the Docker image

Note

You will need to edit the docker-bake.hcl file and update REGISTRY_USER and RELEASE. You can edit the other values too, but these are the most important ones.

Important

In order to cache the models, you will need at least 32GB of CPU/system memory (not VRAM) due to the large size of the models. If you have less than 32GB of system memory, you can comment out or remove the code in the Dockerfile that caches the models.
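As a quick sanity check before building, you can confirm how much system memory the host has. This is a minimal sketch assuming a Linux host with /proc/meminfo:

```shell
# Read total system memory and compare against the 32 GB needed
# to cache the models during the Docker build.
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_gb=$(( mem_kb / 1024 / 1024 ))
if [ "$mem_gb" -lt 32 ]; then
    echo "${mem_gb} GB RAM: consider removing the model-caching steps from the Dockerfile"
else
    echo "${mem_gb} GB RAM: enough to cache the models during the build"
fi
```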

# Clone the repo
git clone https://github.com/ashleykleynhans/stable-diffusion-docker.git

# Download the models
cd stable-diffusion-docker
wget https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned.safetensors
wget https://huggingface.co/stabilityai/sd-vae-ft-mse-original/resolve/main/vae-ft-mse-840000-ema-pruned.safetensors
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
wget https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/resolve/main/sd_xl_refiner_1.0.safetensors
wget https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/resolve/main/sdxl_vae.safetensors

# Log in to Docker Hub
docker login

# Build the image, tag the image, and push the image to Docker Hub
docker buildx bake -f docker-bake.hcl --push

# Same as above but customize registry/user/release:
REGISTRY=ghcr.io REGISTRY_USER=myuser RELEASE=my-release docker buildx \
    bake -f docker-bake.hcl --push

Running Locally

Install Nvidia CUDA Driver

Start the Docker container

docker run -d \
  --gpus all \
  -v /workspace \
  -p 3000:3001 \
  -p 3010:3011 \
  -p 3020:3021 \
  -p 6006:6066 \
  -p 8000:8000 \
  -p 8888:8888 \
  -p 2999:2999 \
  -e JUPYTER_PASSWORD=Jup1t3R! \
  -e ENABLE_TENSORBOARD=1 \
  ashleykza/stable-diffusion-webui:latest

You can substitute the image name and tag with your own.

Ports

| Connect Port | Internal Port | Description                   |
|--------------|---------------|-------------------------------|
| 3000         | 3001          | A1111 Stable Diffusion Web UI |
| 3010         | 3011          | Kohya_ss                      |
| 3020         | 3021          | ComfyUI                       |
| 6006         | 6066          | Tensorboard                   |
| 8000         | 8000          | Application Manager           |
| 8888         | 8888          | Jupyter Lab                   |
| 2999         | 2999          | RunPod File Uploader          |
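Once the container is running, a quick way to see which of the connect ports are reachable is a small port probe. This sketch assumes the port mappings above and uses bash's built-in /dev/tcp, so no extra tools are needed:

```shell
# Probe each mapped (connect) port on the Docker host; a closed port just
# means that service has not finished starting yet.
check_port() {
    timeout 2 bash -c "echo > /dev/tcp/127.0.0.1/$1" 2>/dev/null \
        && echo "port $1: open" \
        || echo "port $1: closed"
}

for port in 3000 3010 3020 6006 8000 8888 2999; do
    check_port "$port"
done
```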

Environment Variables

| Variable           | Description                                       | Default                                 |
|--------------------|---------------------------------------------------|-----------------------------------------|
| VENV_PATH          | Sets the path of the Python venv for the app      | /workspace/venvs/stable-diffusion-webui |
| JUPYTER_PASSWORD   | Sets a password for Jupyter Lab                   | not set (no password)                   |
| DISABLE_AUTOLAUNCH | Prevents the Web UIs from launching automatically | not set (Web UIs launch automatically)  |
| ENABLE_TENSORBOARD | Enables Tensorboard on port 6006                  | enabled                                 |

Logs

Stable Diffusion Web UI, Kohya_ss, and ComfyUI each create log files, and you can tail them to view the logs instead of killing the services.

| Application             | Log file                     |
|-------------------------|------------------------------|
| Stable Diffusion Web UI | /workspace/logs/webui.log    |
| Kohya_ss                | /workspace/logs/kohya_ss.log |
| ComfyUI                 | /workspace/logs/comfyui.log  |
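For example, to inspect all three logs in one pass (a small convenience sketch using the default paths from the table above; use `tail -f <file>` on a single file to follow it live):

```shell
# Print the last lines of each UI log, skipping any that don't exist yet.
for name in webui kohya_ss comfyui; do
    file="/workspace/logs/${name}.log"
    echo "== ${file} =="
    if [ -f "$file" ]; then
        tail -n 20 "$file"
    else
        echo "(log not created yet)"
    fi
done
```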

Community and Contributing

Pull requests and issues on GitHub are welcome. Bug fixes and new features are encouraged.

Appreciate my work?

Buy Me A Coffee

stable-diffusion-docker's People

Contributors

ashleykleynhans, deoxykev, dependabot[bot], elthariel, mnb3000


stable-diffusion-docker's Issues

tail log for kohya_ss on vast.ai

Hi! Many thanks for your images!
I use your kohya_ss image from the templates on vast.ai. Training works fine, but I can't get the full live log.

I tried the command from the README:
tail -f /workspace/logs/kohya_ss.log
but it doesn't work.

I also tried:
tail -f /kohya_ss/setup.log
which works only for the initial commands, not for the training process.

Can you please help me enable the training log?

Slow startup time due to unnecessary disk IO

This container is very slow to start on RunPod: 10 to 20 minutes even on latest-gen Secure Cloud pod types with NVMe storage.
I believe the cause is excess disk IO in pre_start.sh:

rsync -au /stable-diffusion-webui/ /workspace/stable-diffusion-webui/
rm -rf /stable-diffusion-webui

rsync -au /kohya_ss/ /workspace/kohya_ss/
rm -rf /kohya_ss

rsync -au /app-manager/ /workspace/app-manager/
rm -rf /app-manager

These operations do a copy plus a delete of very large directories. This could be significantly optimized by moving the existing files directly to their intended destination instead.

It's also unclear why rsync is being used; it could be for recursive merging into existing directories, but I don't think the target directories exist yet.
If recursive merging is not needed, mv should be used directly.
If recursive merging is needed, it can be achieved with much better performance using find and mv like so:

cd /path/to/source
# Pass each path as a positional parameter rather than substituting {} into
# the script body, which would break on special characters in filenames.
find . -type f -exec bash -c 'file="$1"; dir="${file#.}"; target="/path/to/target${dir%/*}"; mkdir -p "$target"; mv "$file" "$target/"' _ {} \;

Noob can't start the program

Sorry for the noob post.

I have successfully built the Docker container. With the Docker Desktop app I can run the container, but I don't know how to access any of the apps.

I tried to solve it with ChatGPT but couldn't figure it out.

Any help would be appreciated.

RunPod 3.2.0: webui.sh: line 255: 146 Illegal instruction

Hi!

First of all, thank you for this amazing docker image. I use it very often within the RunPod instance.

I encountered an issue deploying a fresh instance. I am unsure if it is related to the image, but I rely on your help to solve it.
Unfortunately, my attempts to fix it were unsuccessful.

When I deploy a new instance on RunPod, Automatic1111 fails to start with the following error:

...
################################################################
Launching launch.py...
################################################################
Using TCMalloc: libtcmalloc_minimal.so.4
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
Version: v1.6.0
Commit hash: 5ef669de080814067961f28357256e8fe27544f4
Launching Web UI with arguments: -f --port 3001 --skip-install --listen --api --xformers --enable-insecure-extension-access --no-half-vae
./webui.sh: line 255:   146 Illegal instruction     (core dumped) "${python_cmd}" -u "${LAUNCH_SCRIPT}" "$@"

Thank you for any help in advance.


Disable components from running based on ENV variables

Hello, would it be possible to implement the ability to disable components in the Dockerfile, such as Automatic1111, ComfyUI etc., to speed up install/restart time?

Here is an example to put in your Dockerfile to enable/disable ComfyUI using a build argument:
ARG COMFYUI_ENABLE=TRUE

# Install ComfyUI based on COMFYUI_ENABLE env variable
RUN if [ "$COMFYUI_ENABLE" = "TRUE" ]; then \
    git clone https://github.com/comfyanonymous/ComfyUI.git /ComfyUI && \
    cd /ComfyUI && \
    python3 -m venv --system-site-packages venv && \
    source venv/bin/activate && \
    pip3 install --no-cache-dir torch==2.0.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 && \
    pip3 install --no-cache-dir xformers==0.0.22 && \
    pip3 install -r requirements.txt && \
    pip3 cache purge && \
    deactivate; \
fi

Then, to build without it:
docker build --build-arg COMFYUI_ENABLE=FALSE -t your_image_name .

"Host is down" error when launching on RunPod

Been getting a "Host is down" error any time I try to launch the template in RunPod for the last few days, even in the default config with no network volume in use.

2024-01-21T15:07:17.650838855Z Container Started, configuration in progress...
2024-01-21T15:07:17.650853685Z Starting Nginx service...
2024-01-21T15:07:17.657525555Z * Starting nginx nginx
2024-01-21T15:07:17.661570040Z ...done.
2024-01-21T15:07:17.661825269Z Running pre-start script...
2024-01-21T15:07:17.662690084Z Template version: 3.11.2
2024-01-21T15:07:17.665610436Z Existing version is the same as the template version, no syncing required.
2024-01-21T15:07:17.667540275Z Starting Stable Diffusion Web UI
2024-01-21T15:07:17.667649574Z Stable Diffusion Web UI started
2024-01-21T15:07:17.667653134Z Log file: /workspace/logs/webui.log
2024-01-21T15:07:17.668197551Z Starting Kohya_ss Web UI
2024-01-21T15:07:17.668323050Z Kohya_ss started
2024-01-21T15:07:17.668324830Z Log file: /workspace/logs/kohya_ss.log
2024-01-21T15:07:17.669866241Z Starting ComfyUI
2024-01-21T15:07:17.672021388Z ComfyUI started
2024-01-21T15:07:17.672024928Z Log file: /workspace/logs/comfyui.log
2024-01-21T15:07:17.673444130Z Starting Tensorboard
2024-01-21T15:07:17.675582517Z ln: failed to create symbolic link '/workspace/logs/dreambooth/dreambooth': File exists
2024-01-21T15:07:17.676185703Z ln: failed to create symbolic link '/workspace/logs/ti/textual_inversion': File exists
2024-01-21T15:07:17.676587351Z Tensorboard Started
2024-01-21T15:07:17.676761160Z All services have been started
2024-01-21T15:07:17.709346984Z sshd: no hostkeys available -- exiting.
2024-01-21T15:07:17.711097634Z System has not been booted with systemd as init system (PID 1). Can't operate.
2024-01-21T15:07:17.711102844Z Failed to connect to bus: Host is down

Updating the version crashes the UIs

Hi ashleykleynhans,

I noticed that every time I update the version of the image (e.g. 1.7.9 to 1.8.4), the UIs have errors; there is nothing in the container log, but the logs of the different apps show:

tail -f /workspace/logs/webui.log

Traceback (most recent call last):
  File "/workspace/stable-diffusion-webui/launch.py", line 39, in <module>
    main()
  File "/workspace/stable-diffusion-webui/launch.py", line 35, in main
    start()
  File "/workspace/stable-diffusion-webui/modules/launch_utils.py", line 390, in start
    import webui
  File "/workspace/stable-diffusion-webui/webui.py", line 14, in <module>
    from fastapi import FastAPI
ModuleNotFoundError: No module named 'fastapi'

tail -f /workspace/logs/kohya_ss.log

Traceback (most recent call last):
  File "/workspace/kohya_ss/setup/validate_requirements.py", line 19, in <module>
    from library.custom_logging import setup_logging
  File "/workspace/kohya_ss/library/custom_logging.py", line 6, in <module>
    from rich.theme import Theme
ModuleNotFoundError: No module named 'rich'

tail -f /workspace/logs/comfyui.log

Traceback (most recent call last):
  File "/workspace/ComfyUI/main.py", line 66, in <module>
    import comfy.utils
  File "/workspace/ComfyUI/comfy/utils.py", line 5, in <module>
    import safetensors.torch
ModuleNotFoundError: No module named 'safetensors'

For Auto1111, I just comment out the --skip-install argument to fix this, and for the other UIs I have to install the missing modules manually (pip install ...).

I noticed that there should be a /post_start.sh script that gets run, but the file is missing from the image. Maybe that's the cause?

Best regards!

CUDA out of memory with webui v1.6 (RunPod template 1.9.0 and above)

Hey Ashleykleynhans,

first, let me thank you for your amazing work; your RunPod template is a godsend for SD training and testing :)

I just wanted to let you know that since yesterday (the release of your 1.9.0 Docker image), many people and I are getting CUDA out of memory errors when trying to train a LoRA with Kohya (using 24 GB VRAM GPUs).

I suspect it's because 1.9.0 and above use the newest Auto1111 SD webui 1.6.0, which is somehow using more VRAM than the previous version (I'm not sure; this is just speculation on my part).

Changing the template to a previous version (anything before the 1.9.0 update) works perfectly fine, so it's not a big deal, but most people are not aware of that, which can create a lot of troubleshooting issues.

I don't know if you can do something about it, or if I should just tell people to use older versions of the image if they get a CUDA error; let me know.

Thanks again for your amazing work ;)

Docker build failed

Thank you for this awesome repo

I cloned it and ran docker build, but I am facing the issue below related to pip packages.
I am using a Mac and Docker version 4.22.0 (117440).

 => ERROR [base 4/5] RUN pip3 install --no-cache-dir torch==2.0.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 &&     pip3 install --no-cache-dir xformers==0.0.22 tensorrt                                                                                                                                                                                                                                   6.9s
------                                                                                                                                                                                                                                                                                                                                                                                                                                          
 > [base 4/5] RUN pip3 install --no-cache-dir torch==2.0.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 &&     pip3 install --no-cache-dir xformers==0.0.22 tensorrt:                                                                                                                                                                                                                                              
0.438 Looking in indexes: https://download.pytorch.org/whl/cu118                                                                                                                                                                                                                                                                                                                                                                                
1.040 Collecting torch==2.0.1                                                                                                                                                                                                                                                                                                                                                                                                                   
1.055   Downloading https://download.pytorch.org/whl/torch-2.0.1-cp310-cp310-manylinux2014_aarch64.whl (74.0 MB)                                                                                                                                                                                                                                                                                                                                
1.857      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.0/74.0 MB 104.2 MB/s eta 0:00:00                                                                                                                                                                                                                                                                                                                                                          
2.450 Collecting torchvision
2.466   Downloading https://download.pytorch.org/whl/torchvision-0.16.0-cp310-cp310-linux_aarch64.whl (14.1 MB)
2.605      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.1/14.1 MB 104.8 MB/s eta 0:00:00
3.120 Collecting torchaudio
3.133   Downloading https://download.pytorch.org/whl/torchaudio-2.1.0-cp310-cp310-linux_aarch64.whl (1.6 MB)
3.160      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 64.3 MB/s eta 0:00:00
3.573 Collecting filelock
3.596   Downloading https://download.pytorch.org/whl/filelock-3.9.0-py3-none-any.whl (9.7 kB)
3.963 Collecting networkx
3.978   Downloading https://download.pytorch.org/whl/networkx-3.0-py3-none-any.whl (2.0 MB)
4.005      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 84.6 MB/s eta 0:00:00
4.426 Collecting sympy
4.440   Downloading https://download.pytorch.org/whl/sympy-1.12-py3-none-any.whl (5.7 MB)
4.505      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.7/5.7 MB 92.0 MB/s eta 0:00:00
4.908 Collecting jinja2
4.928   Downloading https://download.pytorch.org/whl/Jinja2-3.1.2-py3-none-any.whl (133 kB)
4.933      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 133.1/133.1 KB 179.5 MB/s eta 0:00:00
5.321 Collecting typing-extensions
5.344   Downloading https://download.pytorch.org/whl/typing_extensions-4.4.0-py3-none-any.whl (26 kB)
5.756 Collecting torchvision
5.775   Downloading https://download.pytorch.org/whl/torchvision-0.15.2-cp310-cp310-manylinux2014_aarch64.whl (1.2 MB)
5.792      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 90.3 MB/s eta 0:00:00
5.822   Downloading https://download.pytorch.org/whl/torchvision-0.15.1-cp310-cp310-manylinux2014_aarch64.whl (1.2 MB)
5.838      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.2/1.2 MB 94.8 MB/s eta 0:00:00
5.859   Downloading https://download.pytorch.org/whl/torchvision-0.15.0-cp310-cp310-manylinux2014_aarch64.whl (865 kB)
5.872      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 865.1/865.1 KB 81.0 MB/s eta 0:00:00
5.893   Downloading https://download.pytorch.org/whl/torchvision-0.14.1-cp310-cp310-manylinux2014_aarch64.whl (760 kB)
5.905      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 760.3/760.3 KB 72.4 MB/s eta 0:00:00
5.934   Downloading https://download.pytorch.org/whl/torchvision-0.14.0-cp310-cp310-manylinux2014_aarch64.whl (760 kB)
5.951      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 760.3/760.3 KB 48.8 MB/s eta 0:00:00
5.969   Downloading https://download.pytorch.org/whl/torchvision-0.13.1-cp310-cp310-manylinux2014_aarch64.whl (701 kB)
5.981      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 701.7/701.7 KB 67.6 MB/s eta 0:00:00
6.001   Downloading https://download.pytorch.org/whl/torchvision-0.13.0-cp310-cp310-manylinux2014_aarch64.whl (701 kB)
6.013      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 701.7/701.7 KB 67.2 MB/s eta 0:00:00
6.044   Downloading https://download.pytorch.org/whl/torchvision-0.12.0-cp310-cp310-manylinux2014_aarch64.whl (13.7 MB)
6.234      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.7/13.7 MB 79.1 MB/s eta 0:00:00
6.267   Downloading https://download.pytorch.org/whl/torchvision-0.2.0-py2.py3-none-any.whl (48 kB)
6.269      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.8/48.8 KB 159.2 MB/s eta 0:00:00
6.287   Downloading https://download.pytorch.org/whl/torchvision-0.1.6-py3-none-any.whl (16 kB)
6.293 Collecting torchaudio
6.307   Downloading https://download.pytorch.org/whl/torchaudio-2.0.2-cp310-cp310-manylinux2014_aarch64.whl (4.0 MB)
6.358      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.0/4.0 MB 82.7 MB/s eta 0:00:00
6.743 INFO: pip is looking at multiple versions of filelock to determine which version is compatible with other requirements. This could take a while.
6.743 INFO: pip is looking at multiple versions of torchaudio to determine which version is compatible with other requirements. This could take a while.
6.744 INFO: pip is looking at multiple versions of torchvision to determine which version is compatible with other requirements. This could take a while.
6.745 INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
6.745 INFO: pip is looking at multiple versions of torch to determine which version is compatible with other requirements. This could take a while.
6.746 ERROR: Could not find a version that satisfies the requirement MarkupSafe>=2.0 (from jinja2) (from versions: none)
6.746 ERROR: No matching distribution found for MarkupSafe>=2.0
------
Dockerfile:69
--------------------
  68 |     # Install Torch, xformers and tensorrt
  69 | >>> RUN pip3 install --no-cache-dir torch==2.0.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 && \
  70 | >>>     pip3 install --no-cache-dir xformers==0.0.22 tensorrt
  71 |     
--------------------
ERROR: failed to solve: process "/bin/bash -o pipefail -c pip3 install --no-cache-dir torch==2.0.1 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118 &&     pip3 install --no-cache-dir xformers==0.0.22 tensorrt" did not complete successfully: exit code: 1

I tried to work around it by adding RUN pip install "MarkupSafe>=2.0",

but now I am facing a different issue related to tensorrt:

Building wheel for tensorrt (setup.py): started
86.24   Building wheel for tensorrt (setup.py): finished with status 'error'
86.25   error: subprocess-exited-with-error
86.25   
86.25   × python setup.py bdist_wheel did not run successfully.
86.25   │ exit code: 1
86.25   ╰─> [60 lines of output]
86.25       running bdist_wheel
86.25       running build
86.25       running build_py
86.25       creating build
86.25       creating build/lib
86.25       creating build/lib/tensorrt
86.25       copying tensorrt/__init__.py -> build/lib/tensorrt
86.25       running egg_info
86.25       writing tensorrt.egg-info/PKG-INFO
86.25       writing dependency_links to tensorrt.egg-info/dependency_links.txt
86.25       writing requirements to tensorrt.egg-info/requires.txt
86.25       writing top-level names to tensorrt.egg-info/top_level.txt
86.25       reading manifest file 'tensorrt.egg-info/SOURCES.txt'
86.25       adding license file 'LICENSE.txt'
86.25       writing manifest file 'tensorrt.egg-info/SOURCES.txt'
86.25       /usr/lib/python3/dist-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
86.25         warnings.warn(
86.25       installing to build/bdist.linux-aarch64/wheel
86.25       running install
86.25       Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
86.25       ERROR: Could not find a version that satisfies the requirement tensorrt_libs==8.6.1 (from versions: 9.0.0.post11.dev1, 9.0.0.post12.dev1, 9.0.1.post11.dev4, 9.0.1.post12.dev4)
86.25       ERROR: No matching distribution found for tensorrt_libs==8.6.1
86.25       Looking in indexes: https://pypi.org/simple, https://pypi.nvidia.com
86.25       ERROR: Could not find a version that satisfies the requirement tensorrt_libs==8.6.1 (from versions: 9.0.0.post11.dev1, 9.0.0.post12.dev1, 9.0.1.post11.dev4, 9.0.1.post12.dev4)
86.25       ERROR: No matching distribution found for tensorrt_libs==8.6.1
86.25       Traceback (most recent call last):
86.25         File "/tmp/pip-install-pxs9ysjb/tensorrt_a260c7deb996445caf121387ef781256/setup.py", line 40, in run_pip_command
86.25           return call_func([sys.executable, "-m", "pip"] + args, env=env)
86.25         File "/usr/lib/python3.10/subprocess.py", line 369, in check_call
86.25           raise CalledProcessError(retcode, cmd)
86.25       subprocess.CalledProcessError: Command '['/usr/bin/python3', '-m', 'pip', 'install', '--extra-index-url', 'https://pypi.nvidia.com', 'tensorrt_libs==8.6.1', 'tensorrt_bindings==8.6.1']' returned non-zero exit status 1.
86.25       

ROOP extension: HTTP Error 401: Unauthorized

When using the "Stable Diffusion Kohya_ss ComfyUI Ultimate" template on RunPod, the roop extension is not working in the Automatic1111 interface.

Uploading a face image with ROOP enabled, the output has no face-swap effect. It is the same even if I delete my pod and restart a brand new pod.

When checking /workspace/logs/webui.log, it appears that the ROOP extension is not installed correctly:

root@462be51d70b6:/workspace/logs# cat webui.log

################################################################
Install script for stable-diffusion + Web UI
Tested on Debian 11 (Bullseye)
################################################################

################################################################
Running on root user
################################################################

################################################################
Repo already cloned, using it as install directory
################################################################

################################################################
python venv already activate or run without venv: /workspace/venv
################################################################

################################################################
Launching launch.py...
################################################################
Using TCMalloc: libtcmalloc_minimal.so.4
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0]
Version: v1.5.1
Commit hash: 68f336bd994bed5442ad95bad6b6ad5564a5409a
Docker, returning.

*** Error running install.py for extension /workspace/stable-diffusion-webui/extensions/sd-webui-roop.
*** Command: "/workspace/venv/bin/python3" "/workspace/stable-diffusion-webui/extensions/sd-webui-roop/install.py"
*** Error code: 1
*** stderr: Traceback (most recent call last):
***   File "/workspace/stable-diffusion-webui/extensions/sd-webui-roop/install.py", line 25, in <module>
***     download(model_url, model_path)
***   File "/workspace/stable-diffusion-webui/extensions/sd-webui-roop/install.py", line 16, in download
***     request = urllib.request.urlopen(url)
***   File "/usr/lib/python3.10/urllib/request.py", line 216, in urlopen
***     return opener.open(url, data, timeout)
***   File "/usr/lib/python3.10/urllib/request.py", line 525, in open
***     response = meth(req, response)
***   File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
***     response = self.parent.error(
***   File "/usr/lib/python3.10/urllib/request.py", line 563, in error
***     return self._call_chain(*args)
***   File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
***     result = func(*args)
***   File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
***     raise HTTPError(req.full_url, code, msg, hdrs, fp)
*** urllib.error.HTTPError: HTTP Error 401: Unauthorized

Launching Web UI with arguments: -f --port 3001 --skip-install --listen --api --xformers --enable-insecure-extension-access --no-half-vae
2023-08-23 07:14:09.741307: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2023-08-23 07:14:12.951525: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Additional Network extension not installed, Only hijack built-in lora
LoCon Extension hijack built-in lora successfully
[-] ADetailer initialized. version: 23.8.0, num models: 9
2023-08-23 07:14:40,916 - ControlNet - INFO - ControlNet v1.1.238
ControlNet preprocessor location: /workspace/stable-diffusion-webui/extensions/sd-webui-controlnet/annotator/downloads
2023-08-23 07:14:41,173 - ControlNet - INFO - ControlNet v1.1.238
2023-08-23 07:14:42,708 - roop - INFO - roop v0.0.2
2023-08-23 07:14:42,713 - roop - INFO - roop v0.0.2
Loading weights [1a189f0be6] from /workspace/stable-diffusion-webui/models/Stable-diffusion/v1-5-pruned.safetensors
Creating model from config: /workspace/stable-diffusion-webui/configs/v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Model loaded in 2.2s (load weights from disk: 0.4s, create model: 0.7s, apply weights to model: 0.5s, apply half(): 0.2s, move model to device: 0.2s).
Applying attention optimization: xformers... done.
2023-08-23 07:14:48,051 - roop - WARNING - You should at least have one model in models directory, please read the doc here : https://github.com/s0md3v/sd-webui-roop/
2023-08-23 07:14:48,153 - roop - WARNING - You should at least have one model in models directory, please read the doc here : https://github.com/s0md3v/sd-webui-roop/
Deforum ControlNet support: enabled
Running on local URL: http://0.0.0.0:3001

To create a public link, set share=True in launch().
Startup time: 86.3s (launcher: 40.5s, import torch: 17.0s, import gradio: 1.9s, setup paths: 2.8s, other imports: 2.2s, setup codeformer: 0.3s, load scripts: 19.9s, create ui: 1.1s, gradio launch: 0.3s).
100%|██████████| 20/20 [00:01<00:00, 14.11it/s]
Total progress: 100%|██████████| 20/20 [00:01<00:00, 14.00it/s]
100%|██████████| 20/20 [00:01<00:00, 16.33it/s]0:00, 16.26it/s]
download_path: /root/.insightface/models/buffalo_l0, 16.28it/s]
Downloading /root/.insightface/models/buffalo_l.zip from https://github.com/deepinsight/insightface/releases/download/v0.7/buffalo_l.zip...
100%|██████████| 281857/281857 [00:08<00:00, 35214.29KB/s]
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/1k3d68.onnx landmark_3d_68 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/2d106det.onnx landmark_2d_106 ['None', 3, 192, 192] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/det_10g.onnx detection [1, 3, '?', '?'] 127.5 128.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/genderage.onnx genderage ['None', 3, 96, 96] 0.0 1.0
Applied providers: ['CPUExecutionProvider'], with options: {'CPUExecutionProvider': {}}
find model: /root/.insightface/models/buffalo_l/w600k_r50.onnx recognition ['None', 3, 112, 112] 127.5 127.5
set det-size: (640, 640)
*** Error running postprocess_image: /workspace/stable-diffusion-webui/extensions/sd-webui-roop/scripts/faceswap.py
Traceback (most recent call last):
File "/workspace/stable-diffusion-webui/modules/scripts.py", line 575, in postprocess_image
script.postprocess_image(p, pp, *script_args)
File "/workspace/stable-diffusion-webui/extensions/sd-webui-roop/scripts/faceswap.py", line 184, in postprocess_image
result: ImageResult = swap_face(
File "/workspace/stable-diffusion-webui/extensions/sd-webui-roop/scripts/swapper.py", line 132, in swap_face
result = face_swapper.get(result, target_face, source_face)
AttributeError: 'NoneType' object has no attribute 'get'


A1111 Forge fork

Hi! I was wondering whether it is possible to adapt this image to use A1111 Forge, so I forked your setup and made a lot of improvements along the way.

https://github.com/mnb3000/a1111-forge-svd-docker

I decoupled all of the weights from the image, making it slimmer and faster to spin up. The weights are now downloaded on the container's first start, and their SHA-256 checksums are validated after download.
The reduction in image size made it possible to set up a CI/CD pipeline that automatically builds, versions, tags, pushes and releases images.
I also added an optional SDXL Turbo download controlled by an env var, and an SVD 1.1 download from Civitai via an API token.
If you are interested in backporting these kinds of changes to your repos, let me know; I might be able to help you with that.

Prompt Travel Extension Install Issue on RUNPOD.

I am running this on RunPod.io with success, but I am having an issue: when I install the Prompt Travel extension in Stable Diffusion (https://github.com/Kahsolt/stable-diffusion-webui-prompt-travel), it does NOT correctly add MoviePy (https://pypi.org/project/moviepy/) as a dependency. Every startup shows:

[Prompt-Travel] package moviepy not installed, will not be able to generate video

which suggests that the MoviePy package, required for video generation in the Prompt Travel extension, is not installed.

Any suggestions on how to fix this?

Thanks,

Vince
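(Not the maintainer, but a quick way to confirm the diagnosis: check whether moviepy is importable from the Python that actually runs the webui. The venv path in the comment is an assumption based on this template's layout; plain python3 is used here so the snippet runs anywhere.)

```shell
# Check whether moviepy is visible to a given Python interpreter.
# On the pod you would run this with the webui's own interpreter,
# e.g. /workspace/stable-diffusion-webui/venv/bin/python (assumed path).
python3 - <<'PY'
import importlib.util
spec = importlib.util.find_spec("moviepy")
print("moviepy installed" if spec else "moviepy missing")
PY
```

If it reports missing, installing MoviePy with the pip from inside that venv should satisfy the extension; installing it with the system pip will not help, because the webui only sees its own venv.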

Kohya_SS LORA Load config do nothing

Hello Ashleykleynhans,

I wanted to express my gratitude for your outstanding effort. Your Docker template has proven to be an invaluable resource for both software development training and testing purposes. :)

I followed your README on https://hub.docker.com/r/ashleykza/stable-diffusion-webui yesterday. I'm running Ubuntu 22.04 with Portainer 2.19.0, but I executed your installation instructions from the terminal using Docker, just as your installation tutorial suggests.

Everything appears to be working fine, but I've encountered an issue with Kohya_ss: the buttons to load or save configurations don't seem to be functioning; they simply do nothing. I suspected it might be a problem with my browser, so I tried Firefox, Chrome, and Brave on Linux. I even installed Windows 10 on VirtualBox and tested on my mobile device, just to confirm that it isn't a browser-related issue. Unfortunately, I'm still encountering the same error, and I'm at a loss as to why none of the load buttons are working.
[Screenshot from 2023-09-12 20-17-50]
Can you help me?
Thanks a lot!

roop doesn't appear on runpod

Stable Diffusion Kohya_ss ComfyUI Ultimate: the roop model doesn't appear, even when I put it manually in workspace/stable-diffusion-webui/extensions/sd-webui-roop/inswapper_128.onnx
[Screenshot 2023-09-09 at 19-21-58 Stable Diffusion]
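A hedged diagnostic for reports like this one and the NoneType traceback earlier in the log: roop's swapper returns None when inswapper_128.onnx fails to load, which then surfaces as `'NoneType' object has no attribute 'get'`. The extension typically looks for the model in a dedicated models directory rather than the extension root, and the exact location varies by version, so treat the paths below as assumptions. The snippet is demonstrated against a throwaway tree; on the pod, point WEBUI at /workspace/stable-diffusion-webui.

```shell
# Build a throwaway tree so the snippet is runnable anywhere; on a real
# pod you would skip this and set WEBUI=/workspace/stable-diffusion-webui.
WEBUI=/tmp/webui-demo
mkdir -p "$WEBUI/extensions/sd-webui-roop"
: > "$WEBUI/extensions/sd-webui-roop/inswapper_128.onnx"  # placeholder file

# List every copy of the model under the webui tree. On a real install,
# also check its size: a ~500 MB model that shows up as a few KB is a
# truncated download, which produces the same NoneType error.
find "$WEBUI" -name 'inswapper_128.onnx'
```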

rsync to /workspace breaks existing venv

I have been using release 3.10.1 on RunPod for a week or so and have redeployed it several times after the first provisioning, on A100 and A6000 instances. Each time it starts up fine: all of my extra A1111 extensions are still installed, all of my models are still downloaded, and my outputs from my last session are still there.

When I deployed the template with the volume today, it took way longer than usual to start up, and I saw it was deploying a new version (3.11.0). Sadly, after the deployment was complete, A1111 wouldn't start. I looked in the log, and there was a Python error complaining about pytorch_lightning missing. So I activated the venv and installed it. Then something else was missing, so I installed that. I did that a few more times and came to the conclusion that this wasn't just a thing or two missing after the upgrade. So I ran pip3 install -r requirements.txt. I don't remember what the error was after that, but A1111 still wouldn't start.

I rsync the workspace regularly to a local drive, so I still have a copy of everything from the last time it worked. After syncing the old venv back up to the volume, A1111 started up.

I guess I'm not sure if this template is meant to be run from a volume. I tried it and it worked, so I kept rolling with it. Like I said, it's very handy to use this way, it keeps my customizations in place, and lets me fairly quickly spin up an instance when I need it without transferring in a bunch of models and reinstalling plugins.

Thanks for making this. It works great, except for this little mess. It made it easy for me to get up and running with some beefy GPU resources even as a stable diffusion newbie.

ComfyUI ?

Hello ashleykleynhans,

I really appreciate your work, as your image is up to date and very easy to deploy.
Any chance of adding ComfyUI to the stack alongside the WebUI and Kohya?

And one clarification, if I may: can you confirm which volumes you recommend when mounting the image?

Best regards!
lou.

TKinter errors when running locally

I get a _tkinter.TclError: no display name and no $DISPLAY environment variable error when running this locally. I'm using Docker Compose with the following compose.yml config:

version: "3.8"

# https://hub.docker.com/r/ashleykza/stable-diffusion-webui
# https://github.com/ashleykleynhans/stable-diffusion-docker#running-locally

services:
  web:
    image: "ashleykza/stable-diffusion-webui"
    # https://docs.docker.com/compose/gpu-support/
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    volumes:
      - type: volume
        source: display
        target: /tmp/.X11-unix
        volume:
          nocopy: true
      - type: bind
        source: ./dataset
        target: /dataset/ # Don't place it in the 'workspace' directory or all the configs will be synced into it.
    ports:
      - 3000:3001
      - 3010:3011
      - 6006:6066
      - 8888:8888
    working_dir: '/workspace'
    environment:
      JUPYTER_PASSWORD: 'Jup1t3R!' # It needs a password!
      #DISABLE_AUTOLAUNCH: 'enabled' # Need the services enabled b/c we aren't in a web environment that will spin them up at the press of a button
      ENABLE_TENSORBOARD: 1
      # Avoid TKinter errors so Kohya actually does things
      # https://stackoverflow.com/questions/49169055/docker-tkinter-tclerror-couldnt-connect-to-display/49229627#49229627
      # https://askubuntu.com/questions/213678/how-to-install-x11-xorg
      # apt-get -y install xorg openbox
      #DISPLAY: ':0'
      #MPLBACKEND: 'Agg'

volumes:
  display:

I've tried specifying the display, but it can't find it. The other referenced solutions were also attempted to no success. Any thoughts?
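For what it's worth: the commented-out MPLBACKEND route above only silences matplotlib; the _tkinter error itself comes from Tk trying to open a real X display, so it generally needs either a virtual X server (e.g. wrapping the launch in xvfb-run, an untested assumption against this image) or avoiding the Tk dialogs entirely. A minimal sketch of the env-var part:

```shell
# Force matplotlib's non-interactive Agg backend so library code never
# touches $DISPLAY. In the compose file this would live under
# `environment:` rather than being exported in a shell.
export MPLBACKEND=Agg
python3 -c 'import os; print(os.environ["MPLBACKEND"])'
```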

When I try to run the dockerfile, it exits after 1 line of log

Sorry for being nooby, but I'm really new to this environment.

I have installed everything, built the image, and pushed it, but when I try to run it in Docker, it doesn't work.
Here's what the Docker log says: /usr/bin/env: 'bash\r': No such file or directory
Eagerly waiting for your help!
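The `bash\r` in that message means a script has Windows (CRLF) line endings, so the kernel looks for an interpreter literally named `bash\r`; this usually happens when the repo is cloned on Windows with `core.autocrlf` enabled. A self-contained demonstration of the cause and the fix:

```shell
# Create a script with CRLF line endings to reproduce the symptom,
# then strip the carriage returns (dos2unix works too, if installed).
printf '#!/usr/bin/env bash\r\necho hello\r\n' > /tmp/crlf-demo.sh
chmod +x /tmp/crlf-demo.sh
sed -i 's/\r$//' /tmp/crlf-demo.sh   # CRLF -> LF
/tmp/crlf-demo.sh                    # now runs and prints "hello"
```

For the real image, converting the affected scripts before building (or re-cloning with `git config --global core.autocrlf input`) should fix the container.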

Latest Tag implementation

Hello, thank you so much for your Docker image. Could you implement a 'latest' tag so I can keep my system up to date? This would allow me to pull the most recent Docker image as soon as it's available, instead of checking your Docker Hub page to see whether there is a new one.

Thank you

Issue with latest version

Hi
There is an issue with the latest version on RunPod. Two weeks ago it was flawless.

With Kohya_ss, this happens in WD14 Captioning; it looks like the DNN library is not found.

See error message:

compile configuration.
2024-03-16 16:37:41.070504: W tensorflow/core/framework/op_kernel.cc:1839] OP_REQUIRES failed at conv_ops_impl.h:1199 : UNIMPLEMENTED: DNN library is not found.
Traceback (most recent call last):
File "/workspace/kohya_ss/sd-scripts/finetune/tag_images_by_wd14_tagger.py", line 386, in <module>
main(args)
File "/workspace/kohya_ss/sd-scripts/finetune/tag_images_by_wd14_tagger.py", line 290, in main
run_batch(b_imgs)
File "/workspace/kohya_ss/sd-scripts/finetune/tag_images_by_wd14_tagger.py", line 180, in run_batch
probs = model(imgs, training=False)
File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/keras/src/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/tensorflow/python/framework/ops.py", line 5883, in raise_from_not_ok_status
raise core._status_to_exception(e) from None # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.UnimplementedError: Exception encountered when calling layer 'root_conv2d' (type Conv2D).

{{function_node _wrapped__Conv2D_device/job:localhost/replica:0/task:0/device:GPU:0}} DNN library is not found. [Op:Conv2D] name:

Call arguments received by layer 'root_conv2d' (type Conv2D):
• inputs=tf.Tensor(shape=(1, 448, 448, 3), dtype=float32)
Traceback (most recent call last):
File "/workspace/kohya_ss/venv/bin/accelerate", line 8, in <module>
sys.exit(main())
File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/accelerate_cli.py", line 47, in main
args.func(args)
File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 1017, in launch_command
simple_launcher(args)
File "/workspace/kohya_ss/venv/lib/python3.10/site-packages/accelerate/commands/launch.py", line 637, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/workspace/kohya_ss/venv/bin/python3', '/workspace/kohya_ss/sd-scripts/finetune/tag_images_by_wd14_tagger.py', '--batch_size=8', '--general_threshold=0.35', '--character_threshold=0.35', '--caption_extension=.txt', '--caption_separator=,', '--model=SmilingWolf/wd-v1-4-convnextv2-tagger-v2', '--max_data_loader_n_workers=2', '--debug', '--remove_underscore', '--frequency_tags', '/workspace/ia']' returned non-zero exit status 1.
16:37:44-127401 INFO ...captioning done
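`UNIMPLEMENTED: DNN library is not found` usually means TensorFlow cannot locate a compatible cuDNN at runtime; a mismatch between the image's CUDA/cuDNN and the TensorFlow wheel in the kohya_ss venv is one plausible cause (an assumption, not a confirmed diagnosis). A quick, GPU-independent check of what the dynamic linker can see:

```shell
# Diagnostic sketch: report whether any cuDNN shared library is visible
# to the dynamic linker. Prints one of two fixed strings, so it is safe
# to run anywhere, with or without a GPU.
if ldconfig -p 2>/dev/null | grep -q libcudnn; then
    echo "cuDNN found"
else
    echo "cuDNN missing"
fi
```

If it reports missing inside the pod, the TensorFlow wheel and the image's CUDA libraries need to be brought back in step.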
