Not sure where to ask and whether it's a known issue but the Web UI's don't seem to re


WebUI won't recover when running out of VRAM about stable-diffusion-webui-docker HOT 15 CLOSED

abdbarho commented on May 12, 2024

WebUI won't recover when running out of VRAM

from stable-diffusion-webui-docker.

Comments (15)

jasalt commented on May 12, 2024

Yeah, while it's getting a bit complex with differences at that level, I put together an ugly workaround for running the auto profile with "self recovery" when it runs out of memory: simply tail the logs and run the restart command with awk.

After starting up the auto profile normally, run this in another terminal (in WSL):

docker compose --profile auto logs --follow --tail 5 | awk '/illegal memory access was encountered/ {system("docker compose --profile auto restart")}'

After a crash, it should kick docker compose back up again. The original terminal will exit, but the logs can be watched there again with docker compose --profile auto logs --follow. It's a bit of a glitchy experience, but it seems to work for now.
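For anyone who would rather not use awk, the same idea can be sketched in Python. This is only a rough sketch, not something tested against a real crash: the function names (`watch`, `follow_logs`, `restart_service`) are made up for illustration, while the error marker and the two docker compose commands are the ones from the one-liner above.

```python
import subprocess
from typing import Callable, Iterable, Iterator

# Marker taken from the CUDA error message quoted in this thread.
ERROR_MARKER = "illegal memory access was encountered"
# Same restart command as in the awk one-liner.
RESTART_CMD = ["docker", "compose", "--profile", "auto", "restart"]
# Same log command as in the awk one-liner.
LOGS_CMD = ["docker", "compose", "--profile", "auto", "logs", "--follow", "--tail", "5"]


def restart_service() -> None:
    """Run the restart command; raises CalledProcessError if it fails."""
    subprocess.run(RESTART_CMD, check=True)


def watch(lines: Iterable[str], on_error: Callable[[], None] = restart_service) -> int:
    """Call on_error for every log line that contains ERROR_MARKER.

    Returns how many times on_error was called.
    """
    restarts = 0
    for line in lines:
        if ERROR_MARKER in line:
            on_error()
            restarts += 1
    return restarts


def follow_logs() -> Iterator[str]:
    """Yield log lines from the running compose profile, one at a time."""
    with subprocess.Popen(LOGS_CMD, stdout=subprocess.PIPE, text=True) as proc:
        assert proc.stdout is not None
        yield from proc.stdout
```

Calling `watch(follow_logs())` from the directory containing docker-compose.yml would then behave roughly like the pipe above. Like the awk version, the log stream may end when the containers restart, so the watcher might need to be started again.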


jasalt commented on May 12, 2024

Ok, will test that too. I liked the auto profile's live preview a lot.

Thanks for the workaround. I'll keep an eye on the upstream repositories.


jasalt commented on May 12, 2024

I eyed these parts a bit, but I will keep my hands off them for now, as it works well enough for me and I've got to get back to working on other stuff.

On a side note, the default auto optimizations on a 3060 12GB allow a larger resolution, but the render speed drops quite a bit, from 4-5 it/s to around 1-2 it/s. GPU CUDA activity in Task Manager goes up and down like a sawtooth wave during the render, while it stays at a near-constant 100% with the optimization flags removed from docker-compose.yml. I'm guessing it's expected to behave that way with the optimizations. I'm pretty pleased running without them and using the upscaling methods to get to around 1280x1280.


AbdBarho commented on May 12, 2024

@jasalt could you please define what it means to "recover"? Do you mean that the container restarts, or that the app continues to function normally?


jasalt commented on May 12, 2024

By "recover" I mean that the app continues to function normally.


AbdBarho commented on May 12, 2024

@jasalt hmmmm, I don't think I can do anything on the container side. If the container stops on error, you can add restart: on-failure to the docker-compose.yml to restart it. However, it seems that Gradio catches the error but does not recover from it.

In any case, if you just want to restart, you can try docker compose --profile auto restart, same effect as stopping and restarting, with less typing.


jasalt commented on May 12, 2024

Ok, thanks. Will experiment with it. The auto profile has been very stable as long as the resolution doesn't go over 640x640 with 12GB VRAM. It's hard to share access to it with others, however, until there's some solution.


AbdBarho commented on May 12, 2024

@jasalt You might want to check your config, I can generate a 704 x 704 image on 6GB VRAM. You should be able to go up to 1024 with 12GB?


jasalt commented on May 12, 2024

That is, I was getting 512x512 with all optimizations off (12GB VRAM). I tested with the defaults of the hlky profile now and got up to 832 x 960. I didn't inspect the difference in render quality much, but at least no difference is clearly visible.

Adding restart: on-failure to the profile didn't work for restarting after running out of VRAM, but restart: always does. This resets the Gradio public share URL, however, but that shouldn't be a problem with a proper reverse proxy setup. Example config change for the hlky profile which restarts after an error:

  hlky:
    <<: *base_service
    profiles: ["hlky"]
    restart: always
    build: ./services/hlky/
    environment:
      - CLI_ARGS=--optimized-turbo


AbdBarho commented on May 12, 2024

@jasalt the auto profile has more optimizations, maybe you can have larger images with it.

For the restart, you would probably need to create an issue in the respective UI repository asking for the errors to be handled gracefully; then restart: would not be necessary anymore.


jasalt commented on May 12, 2024

Seems like restart: always only works with the hlky profile, whose container exits with code 0 when running out of VRAM:

webui-docker-hlky-1  | !!Runtime error (txt2img)!!
webui-docker-hlky-1  |  CUDA error: an illegal memory access was encountered
webui-docker-hlky-1  | CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
webui-docker-hlky-1  | For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
webui-docker-hlky-1  | exiting...calling os._exit(0)
webui-docker-hlky-1 exited with code 0

After this it restarts. The auto profile handles the out-of-VRAM error differently: it does not exit and restart, but hangs in the Docker compose prompt like so:

...
webui-docker-automatic1111-1  |   File "/stable-diffusion-webui/modules/sd_samplers.py", line 43, in sample_to_image
webui-docker-automatic1111-1  |     x_sample = 255. * np.moveaxis(x_sample.cpu().numpy(), 0, 2)
webui-docker-automatic1111-1  | RuntimeError: CUDA error: an illegal memory access was encountered
webui-docker-automatic1111-1  | CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
webui-docker-automatic1111-1  | For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
webui-docker-automatic1111-1  |
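The two logs above also line up with how the restart policies behave. As a simplified model (my own sketch, not Docker's actual code; `should_restart` is a made-up name, and retry limits and manual stops are ignored), on-failure only reacts to a non-zero exit code, while always reacts to any exit:

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Simplified model of whether a container is restarted after it exits.

    Only covers the "no", "on-failure" and "always" policies.
    """
    if policy == "always":
        return True
    if policy == "on-failure":
        # Restart only when the process reported a failure.
        return exit_code != 0
    if policy == "no":
        return False
    raise ValueError(f"policy not covered by this sketch: {policy!r}")


# hlky exits with code 0 after the CUDA error, so:
print(should_restart("on-failure", 0))  # False
print(should_restart("always", 0))      # True
```

Under this model the hlky container is only brought back by restart: always because it calls os._exit(0). The auto container never exits at all, so neither policy has anything to react to.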


AbdBarho commented on May 12, 2024

Yeah, the hlky fork handles the errors explicitly and exits gracefully here. The auto fork just leaves the error to Gradio, which probably does nothing and leaves the app in an invalid state.


AbdBarho commented on May 12, 2024

Wizard! This still has the cost of reloading the entire app and models from scratch, which takes roughly 20 seconds on my machine, but I think it is still better than nothing.

One thing you could probably do, if you really want to hack it away, is have some code that runs as part of the build and adds a try-catch to the code responsible for all GPU calls.

I already have something similar for adding a link to this repo; maybe you can try it as well.

Or just open an MR against the main repo with your solution.
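A rough sketch of what such a try-catch could look like in Python. Everything here is hypothetical: `exit_on_cuda_error` is a made-up helper, and which function in the UI code it would have to wrap is not something this thread settles. The only part taken from the logs is the "CUDA error" text in the RuntimeError message.

```python
import functools
import os
import sys

# Text that appears in the RuntimeError messages quoted in this thread.
CUDA_ERROR_MARKER = "CUDA error"


def exit_on_cuda_error(func, exit_fn=os._exit, exit_code=1):
    """Wrap func so that a CUDA RuntimeError terminates the process.

    A non-zero exit code is used so that restart: on-failure could
    also pick it up. Other RuntimeErrors are re-raised unchanged.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except RuntimeError as err:
            if CUDA_ERROR_MARKER not in str(err):
                raise
            print(f"fatal GPU error, exiting: {err}", file=sys.stderr, flush=True)
            exit_fn(exit_code)
    return wrapper
```

The function doing the GPU work would be rebound to the wrapped version, e.g. `generate = exit_on_cuda_error(generate)`, where `generate` stands in for whatever the real entry point is. Combined with a restart policy, the container would then come back on its own instead of hanging.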


github-actions commented on May 12, 2024

This issue is stale because it has been open 14 days with no activity. Remove stale label or comment or this will be closed in 7 days.


github-actions commented on May 12, 2024

This issue was closed because it has been stalled for 7 days with no activity.
