Yet Another Stable Diffusion Discord Bot

License: MIT License

Live now on the LAION Discord Server for you to try!

Features

Highly Scalable: Leverages dalle-flow gRPC interface to independently serve images from any number of GPUs, while higher memory calls to the gRPC through the bot are forked onto individual instances of Python.
Support For Other Popular Models: Latent diffusion GLID3XL or DALLE-MEGA can easily be turned on in addition to Stable Diffusion through dalle-flow for text-to-image generation.
Support For Low VRAM GPUs: The Stable Diffusion fork supports image generation on GPUs with >= 7 GB of VRAM.
Supports Slash and Legacy Style Commands: While Discord is moving towards the new slash style commands that feature auto-completion functions, YASD Discord Bot also features direct commands prefixed with > -- whichever you find easier.
Easy User Interface Including Buttons and Loading Indicators: Riffing and upscaling your creations has never been easier! It even comes with a manual!
Stores All Images and Prompts by Default: Never lose your previous generations!

Changelog
Content advisory
What do I need?
Installation
What can it do?
User Manual
Something is broken
Closing Remarks
License

Changelog

2022-10-23: Added support for RunwayML inpainting/outpainting, replacing outriffing. This is now the recommended model.
2022-10-01: Added clipseg to automatically detect and mask images for inpainting, added RealESRGAN upscaler as an alternative to SwinIR.
2022-09-24: Add usage of the SD concepts library, add subprompts and negative/positive conditioning, add ability to prevent users from making prompts if they have not been on the server long enough, remove the optimized-sd branch since we have now moved to a local stable-diffusion branch that is more optimized than that one.
2022-09-11: Add optional NSFW spoiler filter and NSFW wordlist filter. Added the ability to set default steps and queue any quantity of images on a per user basis with a new flag.
2022-09-06: Added the ability to make images of any size and riff into different sizes ("outriffing").
2022-09-05: The sd-lite branch has been merged upstream, so now low VRAM is available with docker images too.
2022-08-30: optimized-sd branch has moved to sd-lite branch, which will be merged upstream. Includes small bugfixes and enhanced interpolation. Upstream docker image is now functional, so instructions have been added for installing that.
2022-08-30: Updated to add slash commands in addition to legacy commands, added a manual link instead of help, added multi-user support (more than one user may now use the bot at a time without waiting), added interpolate command.
2022-08-28: Add ability to use with low VRAM cards through optimized dalle-flow branch optimized-sd.
2022-08-27: Add content advisory.
2022-08-26: Stable Diffusion branch merged into upstream dalle-flow. Added docker installation instructions.
2022-08-24: Added k_lms and other k-diffusion samplers, with k_lms now the default. DDIM is still selectable with the "(sampler=ddim)" argument.

Content advisory

This bot does not come equipped with an NSFW filter by default and will make any content out of the box. Please be sure to read and agree with the license for the weights, as well as the MIT license, and abide by all applicable laws and regulations in your respective area.

To enable the NSFW filter to automatically add the spoiler tag to any potential NSFW images, use the flag --nsfw-auto-spoiler. You must first pip install -r requirements_nsfw_filter.txt to get the modules required for this.

To enable NSFW prompt detection via BERT, use the flag --nsfw-prompt-detection and be sure to pip install -r requirements_nsfw_filter.txt.

To reject any prompts if they contain a word within a wordlist, use the --nsfw-wordlist flag, e.g. --nsfw-wordlist bad_words.txt. The wordlist should be strings separated by newlines.
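
As a rough illustration of the wordlist format, the sketch below parses newline-separated words and checks a prompt against them. The function names parse_wordlist and prompt_is_rejected are hypothetical and are not taken from the bot's source; the bot's actual matching rules may differ.

```python
def parse_wordlist(text):
    """Return the set of lowercased words, one per line, skipping blank lines."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def prompt_is_rejected(prompt, banned_words):
    """Return True if any whitespace-separated word of the prompt is banned.

    Surrounding punctuation is stripped, so "Foo," matches the entry "foo".
    This is an assumed matching rule, chosen only for illustration.
    """
    words = (word.strip(".,!?;:\"'()[]").lower() for word in prompt.split())
    return any(word in banned_words for word in words)
```

With a file such as bad_words.txt, the text would come from reading the file, for example open("bad_words.txt", encoding="utf-8").read().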

What do I need?

Python 3.9+ with pip and virtualenv installed (Ubuntu 22.04 works great!)

CUDA runtime environment installed

An NVIDIA GPU with >= 7 GB of VRAM

If running with a low VRAM GPU, you will not have access to the >upscale endpoint because it will run out of VRAM. Buying an RTX 3090 or renting a server with one is recommended.

Installation

This installation is intended for users of Debian- or Arch-flavored Linux. YMMV. You will need to have Python 3 and pip installed.

sudo apt install python3 python3-pip
sudo pip3 install virtualenv

Docker installation (docker image)

Install the Nvidia docker container environment if you have not already.

Pull the dalle-flow docker image with:

docker pull jinaai/dalle-flow:latest

Log into Huggingface, agree to RunwayML's terms of service, go to RunwayML's repository for the latest version, then download sd-v1-5-inpainting.ckpt. Rename that to model.ckpt and then, from that directory, run the following commands:

mkdir ~/ldm
mkdir ~/ldm/stable-diffusion-v1
mv model.ckpt ~/ldm/stable-diffusion-v1/model.ckpt

Then run the container with this command:

sudo docker run -e DISABLE_CLIP="1" \
  -e DISABLE_DALLE_MEGA="1" \
  -e DISABLE_GLID3XL="1" \
  -e ENABLE_CLIPSEG="1" \
  -e ENABLE_REALESRGAN="1" \
  -e ENABLE_STABLE_DIFFUSION="1" \
  -p 51005:51005 \
  -it \
  -v ~/ldm:/dalle/stable-diffusion/models/ldm/ \
  -v $HOME/.cache:/home/dalle/.cache \
  --gpus all \
  jinaai/dalle-flow

Somewhere else, clone this repository and follow these steps:

git clone https://github.com/AmericanPresidentJimmyCarter/yasd-discord-bot/
cd yasd-discord-bot
python3 -m virtualenv env
source env/bin/activate
pip install -r requirements.txt

Then you can start the bot with:

cd src
python -m bot YOUR_DISCORD_BOT_TOKEN -g YOUR_GUILD_ID

Be sure you have the "Message Content Intent" flag set to be on in your bot settings!

Here, YOUR_DISCORD_BOT_TOKEN is your bot token and YOUR_GUILD_ID is the integer ID of your server (right-click the server name, then click "Copy ID"). Supplying the guild ID is optional, but it makes the slash commands available to your server almost instantly. Once the bot is connected, you can read about how to use it with >help.

The bot uses the folders as a bus to store/shuttle data. All images created are stored in images/.

OPTIONAL: If you aren't running jina on the same box, you will need to change the address the bot connects to, which is declared as the constant JINA_SERVER_URL in imagetool.py.
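
For orientation, here is a minimal sketch of how such a constant could be resolved. Only the port 51005 comes from the docker run command above; the default value, the grpc:// scheme, and the YASD_JINA_SERVER_URL environment variable are assumptions for illustration and are not taken from imagetool.py.

```python
import os

# Assumed default: a dalle-flow server on the same machine, reachable on the
# port published by the docker run command (51005).
DEFAULT_JINA_SERVER_URL = "grpc://127.0.0.1:51005"


def resolve_jina_server_url(environ=os.environ):
    """Return the server URL, preferring a hypothetical environment override."""
    return environ.get("YASD_JINA_SERVER_URL", DEFAULT_JINA_SERVER_URL)
```

In the bot as described, the simpler route is to edit the JINA_SERVER_URL constant directly so that it points at the remote host.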

Docker installation (build docker image yourself)

Install the Nvidia docker container environment if you have not already.

Make a folder for dalle-flow:

mkdir ~/dalle
cd ~/dalle
git clone https://github.com/jina-ai/dalle-flow
cd dalle-flow

Download the weights as described in the previous section (sd-v1-5-inpainting.ckpt, renamed to model.ckpt), then from the directory containing model.ckpt run:

mkdir ~/ldm
mkdir ~/ldm/stable-diffusion-v1
mv model.ckpt ~/ldm/stable-diffusion-v1/model.ckpt

In the dalle-flow folder (cd ~/dalle/dalle-flow), build with this command:

docker build --build-arg GROUP_ID=$(id -g ${USER}) --build-arg USER_ID=$(id -u ${USER}) -t jinaai/dalle-flow .

Then run the container with this command:

sudo docker run -e DISABLE_CLIP="1" \
  -e DISABLE_DALLE_MEGA="1" \
  -e DISABLE_GLID3XL="1" \
  -e ENABLE_CLIPSEG="1" \
  -e ENABLE_REALESRGAN="1" \
  -e ENABLE_STABLE_DIFFUSION="1" \
  -p 51005:51005 \
  -it \
  -v ~/ldm:/dalle/stable-diffusion/models/ldm/ \
  -v $HOME/.cache:/home/dalle/.cache \
  --gpus all \
  jinaai/dalle-flow

Somewhere else, clone this repository and follow these steps:

git clone https://github.com/AmericanPresidentJimmyCarter/yasd-discord-bot/
cd yasd-discord-bot
python3 -m virtualenv env
source env/bin/activate
pip install -r requirements.txt

Then you can start the bot with:

cd src
python -m bot YOUR_DISCORD_BOT_TOKEN -g YOUR_GUILD_ID

Be sure you have the "Message Content Intent" flag set to be on in your bot settings!

The bot uses the folders as a bus to store/shuttle data. All images created are stored in images/.

OPTIONAL: If you aren't running jina on the same box, you will need to change the address the bot connects to, which is declared as the constant JINA_SERVER_URL in imagetool.py.

Native installation

Follow the instructions for dalle-flow to install and run that server. The steps you need to follow can be found under "Run natively". Once flow is up and running, proceed to the next step.

At this time, if you haven't already, you will need to put the stable diffusion weights into dalle/stable-diffusion/models/ldm/stable-diffusion-v1/model.ckpt.

Need to download the weights? Log into Huggingface, agree to RunwayML's terms of service, go to RunwayML's repository for the latest version, then download sd-v1-5-inpainting.ckpt. Rename that to model.ckpt and put it into the location specified above.

To start jina with old models disabled when you're all done:

python flow_parser.py --enable-stable-diffusion --disable-dalle-mega --disable-glid3xl
jina flow --uses flow.tmp.yml

Jina should display lots of pretty pictures to tell you it's working. It may take a bit on first boot to load everything.

Somewhere else, clone this repository and follow these steps:

git clone https://github.com/AmericanPresidentJimmyCarter/yasd-discord-bot/
cd yasd-discord-bot
python3 -m virtualenv env
source env/bin/activate
pip install -r requirements.txt

Then you can start the bot with:

cd src
python -m bot YOUR_DISCORD_BOT_TOKEN -g YOUR_GUILD_ID

Be sure you have the "Message Content Intent" flag set to be on in your bot settings!

The bot uses the folders as a bus to store/shuttle data. All images created are stored in images/.

OPTIONAL: If you aren't running jina on the same box, you will need to change the address the bot connects to, which is declared as the constant JINA_SERVER_URL in imagetool.py.

What can it do?

Generate images from text (/image foo bar)
Generate images from text with a frozen seed and variations in array format (/image [foo, bar])
Generate images from text while exploring seeds (/image foo bar (seed_search=t))
Generate images from images (and optionally prompts) (>image2image foo bar)
Diffuse ("riff") on images it has previously generated (/riff <id> <idx>)
Interpolate between two prompts (/interpolate <prompt 1> <prompt 2>)
Use any k-diffusion sampler to generate images
Outpaint images directionally or in all directions at once
Inpaint images with a mask selected automatically from text
Queue generations per user and restrict the user queue to n-many generations at a time
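
To make the array format concrete, the sketch below expands a prompt such as "/image [foo, bar]" into one prompt per option. It is an illustration of the idea only; expand_prompt_array is a hypothetical name and the bot's real parser may behave differently (for example with nested or multiple arrays).

```python
import re

# Matches the first bracketed group that contains no nested brackets.
ARRAY_PATTERN = re.compile(r"\[([^\[\]]*)\]")


def expand_prompt_array(prompt):
    """Expand the first [a, b, c] group into one prompt per option."""
    match = ARRAY_PATTERN.search(prompt)
    if match is None:
        return [prompt]
    options = [option.strip() for option in match.group(1).split(",")]
    return [
        prompt[:match.start()] + option + prompt[match.end():]
        for option in options
        if option
    ]
```

Each expanded prompt would then be generated with the same seed, which is what makes the variations directly comparable.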

Examples:

>image A United States twenty dollar bill with [Jerry Seinfeld, Jason Alexander, Michael Richards, Julia Louis-Dreyfus]'s portrait in the center (seed=2)

>image2image Still from Walt Disney's The Princess and the Frog, 2001 (iterations=4, strength=0.6, scale=15)

Attached image

Output image
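
The parenthesized options in the examples above, such as (seed=2) or (iterations=4, strength=0.6, scale=15), can be separated from the prompt with a small parser. The sketch below is an assumption about the syntax inferred from those examples; split_prompt_and_options is a hypothetical name and values are left as strings.

```python
import re

# A trailing "(key=value, ...)" group with no nested parentheses.
OPTIONS_PATTERN = re.compile(r"\(([^()]*=[^()]*)\)\s*$")


def split_prompt_and_options(text):
    """Split 'prompt (key=value, ...)' into the prompt and a dict of strings."""
    match = OPTIONS_PATTERN.search(text)
    if match is None:
        return text.strip(), {}
    options = {}
    for pair in match.group(1).split(","):
        key, _, value = pair.partition("=")
        options[key.strip()] = value.strip()
    return text[:match.start()].strip(), options
```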

>interpolate Sonic the Hedgehog portrait | European Hedgehog Portrait

Something is broken

Open an issue here.

Closing remarks

Be cool, stay in school.

License

MIT


yasd-discord-bot's Issues

When DMing the bot, completion responses are errors

DMing my bot is a feature I'd like to keep, but the prompt completion responses all return errors:

Got unknown error on riff "UeeGqXSH1x18" index 3: 'NoneType' object has no attribute 'id'

Store additional metadata in Jina DocArray tags

Title.
Some good tags to have:

Which user requested the image
Which channel the request was made in
Which guild the request was made in
ID of the original if it is a riff or an upscale

Print commands after using slash commands

It would be good to print the legacy commands after using slash commands so that people might easily copy paste the commands of others.

image2image retry does not appear to use the source image

Support image sizes from 1:2 to 2:1

Before we could only make squares.

Have an "original image size" select option in the outriffing box for images that were uploaded and which have unusual sizes

Option to select the number of images being output

Not sure if this is already an option or if it's possible to make it configurable?

For example something like below?

image foo bar (samples=2)
image foo bar (images=4)

Role to bypass 72 hour wait time

Add option for users with certain role bypass 72 hour wait time.

Use CLIPTokenizer to ensure that prompts/subprompts do not exceed the number of allowed tokens (77)

https://huggingface.co/docs/transformers/v4.22.1/en/model_doc/clip#transformers.CLIPTokenizer

AttributeError: 'NoneType' object has no attribute 'id'

Getting an error after issuing >help command in discord, any ideas?

<discord.ext.commands.context.Context object at 0x7f68d61344f0> Command raised an exception: AttributeError: 'NoneType' object has no attribute 'id'

Error when trying to run server.

For some reason the server just won't start. I don't know why, and this is the third time I've tried it (running Linux Mint 21).
I'm really hoping you understand these logs better than I do.

Logs:
logs.txt

Embed all the parameters used to deterministically generate an image in an invisible watermark

Riff buttons should not be deactivated

Love the new buttons - just a minor improvement. When a riff-button is pressed it is deactivated. This is fine for upscaling but if I want to do another riff of the same image I have to fall back to the traditional copy-pasting.

Thank you for your work - huge improvement!!

Riff should not reuse the same seed, similar to Retry

Bug causing a deep fried effect on every 4th image.

Riffing on prompts with commas cause serialization crashes

Consider substituting full width commas: ，
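
A minimal sketch of that substitution, assuming the crash comes from ASCII commas colliding with a comma-separated serialization format. The helper names are hypothetical.

```python
FULL_WIDTH_COMMA = "\uff0c"  # the "，" character suggested above


def protect_commas(prompt):
    """Replace ASCII commas so the prompt survives comma-delimited storage."""
    return prompt.replace(",", FULL_WIDTH_COMMA)


def restore_commas(prompt):
    """Reverse protect_commas before the prompt is shown or reused."""
    return prompt.replace(FULL_WIDTH_COMMA, ",")
```

One caveat: restore_commas also converts full-width commas that the user typed on purpose, so the round trip is only exact for prompts that did not contain them originally.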

Progress bars would be a nice to have

For image2image add a "resize=false" option that lets you resize the image before shipping to imagetool, preventing outriff

bot will not start

I get an error at start. The error is below.
Traceback (most recent call last):
File "bot.py", line 26, in <module>
currently_fetching_ai_image: dict[str, Union[str, bool]] = {}
TypeError: 'type' object is not subscriptable
Running on Windows with Python 3.8.
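
This error is expected on Python 3.8: the built-in dict only became subscriptable in Python 3.9 (PEP 585), which matches the README's Python 3.9+ requirement. For anyone who must stay on 3.8, one possible workaround, shown here as a sketch rather than as a change the project has made, is to use typing.Dict for the annotation:

```python
from typing import Dict, Union

# typing.Dict is subscriptable on Python 3.8, unlike the built-in dict.
currently_fetching_ai_image: Dict[str, Union[str, bool]] = {}
```

Upgrading to Python 3.9 or newer avoids the need for any change.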

resample_prior being set to False not being displayed on interpolate

Feature request

A quality of life improvement for the bot would be implementing buttons, such as MJ has, for redoing the prompt or upscaling each of the 4 images.
A good reference for buttons using discord.py seems to be https://gist.github.com/lykn/bac99b06d45ff8eed34c2220d86b6bf4.
Another improvement would be that the message "Now beginning work on..." could be edited with the final result.

very nice work on the bot.

[Feature Request] Prompt Weights

Not sure if this is a thing, but my tests and the manual don't seem to include this. Negative prompting would be nice as well, but based on what I've seen it's more difficult to implement.

Buttons becoming defunct after a while

Both the riff and upscale buttons throw an error when pressed after a couple of minutes.

Riff local settings are used on the Retry button instead of the ones from the previous riff

See:

        if original_request['api'] == 'stablediffuse':
            sampler = original_request['sampler']
            scale = original_request['scale']
            steps = int(original_request['steps'])
            latentless = original_request['latentless']
            strength = original_request['strength']
            await _riff(
                interaction.channel, interaction.user,
                self.short_id_parent, self.idx_parent,
                height=self.pixels_height,
                latentless=latentless,
                sampler=sampler,
                scale=scale,
                steps=steps,
                strength=strength,
                width=self.pixels_width)

">upscale" ignores the index

When using the manual ">upscale" command the bot ignores the index and uses 0. However when you use "/upscale" or the buttons it works fine.

[Feature Request] Button to rerun image command with new seeds

Title. All the parameters should be the same but we should be able to run it again to see more variations.

It would be nice if the same button existed on riffs as well to create more riffs from the same base image.

Split into multiple modules to make the repo easier to read/maintain

Truncate alert embed descriptions if they exceed 1024 characters
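
A minimal sketch of such truncation, using the 1024-character limit named in the title. The function name and the "..." suffix are illustrative choices.

```python
MAX_DESCRIPTION_LENGTH = 1024  # limit named in the issue title


def truncate_description(text, limit=MAX_DESCRIPTION_LENGTH, suffix="..."):
    """Return text unchanged if it fits, else cut it and append the suffix.

    Assumes limit is larger than len(suffix).
    """
    if len(text) <= limit:
        return text
    return text[:limit - len(suffix)] + suffix
```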

Buttons do not work when `_` or `-` in short ID

Solution: use better short ID generator from stdlib eg ''.join(random.choices(string.ascii_lowercase + string.ascii_uppercase + string.digits, k=12))
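
The one-liner above, written out as a runnable sketch. string.ascii_letters is the stdlib shorthand for the lowercase and uppercase alphabets combined; make_short_id is a hypothetical name.

```python
import random
import string

# Letters and digits only, so IDs never contain "_" or "-".
SHORT_ID_ALPHABET = string.ascii_letters + string.digits


def make_short_id(length=12):
    """Return a random alphanumeric ID of the given length."""
    return "".join(random.choices(SHORT_ID_ALPHABET, k=length))
```

random.choices is not cryptographically secure. If IDs ever need to be unguessable, secrets.choice from the standard library could be used in a loop instead.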

regex riff ID to ensure that it's actually an ID before prodding the filesystem
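
A sketch of such a check, assuming short IDs are exactly 12 ASCII letters or digits, as with the ID "UeeGqXSH1x18" quoted in an earlier error message. If the bot's existing IDs can also contain "_" or "-", the character class would need to include them.

```python
import re

# Assumed format: exactly 12 ASCII letters or digits.
SHORT_ID_PATTERN = re.compile(r"[A-Za-z0-9]{12}")


def is_valid_short_id(candidate):
    """Return True only if the whole string looks like a short ID."""
    return SHORT_ID_PATTERN.fullmatch(candidate) is not None
```

Using fullmatch rather than search means path fragments such as "../" can never slip through as part of a longer string.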

Resolutions producing varying results

Some resolutions do not work, here is a list
Square resolutions 512x512 and above = working
Square resolutions less than 512x512 = noisy, fuzzy, weirdly noisy
Any vertical or landscape resolutions = stretched into a square
Vertical or landscape resolutions less than 512x512 = noisy, fuzzy, weirdly noisy AND stretched into a square

Assuming all resolutions are multiples of 64
resolution edited in dalle/dalle-flow/executors/stable/config.yml
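
As a sketch of how requested sizes could be validated before they reach the model, the function below checks that both sides are multiples of 64, as assumed above. The aspect ratio bounds of 1:2 to 2:1 are borrowed from a separate issue and are an assumption here, not a confirmed limit of the model.

```python
def is_supported_size(width, height, multiple=64):
    """Return True if both sides are positive multiples of `multiple`
    and the aspect ratio lies between 1:2 and 2:1 (assumed bounds)."""
    if width <= 0 or height <= 0:
        return False
    if width % multiple or height % multiple:
        return False
    return 0.5 <= width / height <= 2.0
```

This check says nothing about the noisy results reported below 512x512; it only rejects sizes that fall outside the stated assumptions.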

Fix seed reuse deepfrying images on iterations>=2

Repos do not exist

git clone https://github.com/CompVis/latent-diffusio324n.git
git clone https://github.com/StableDiffusion/latent-diffusion.git

both do not exist

interpolate throws a DocumentArray error

I tried to run
/interpolate and filled out the two prompts

and got
Now beginning work on "interpolate "cat sitting on a sofa" to dog sitting on a sofa" for murmur5786. Please be patient until I finish that.
Got unknown error on interpolate "cat sitting on a sofa" to dog sitting on a sofa: <DocumentArray (length=0) at 140684863006336> is empty

Include slash command version in the embed response

Stuff like /image prompt:testing 123, test fooo height:384 sampler:ddim scale:10 seed:123 works too if you hit tab after pasting it in.

Fix stacktrace dumps that are longer than 2000 characters crashing and pretty format the stacktraces
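
One way to avoid the crash is to split long text into several messages. The sketch below breaks text into chunks of at most 2000 characters, preferring line boundaries; chunk_message is a hypothetical helper, not code from the bot.

```python
MAX_MESSAGE_LENGTH = 2000  # limit named in the issue title


def chunk_message(text, limit=MAX_MESSAGE_LENGTH):
    """Split text into pieces of at most `limit` characters.

    Chunks end on line breaks where possible; a single line longer than
    the limit is hard-split.
    """
    chunks = []
    current = ""
    for line in text.splitlines(keepends=True):
        while len(line) > limit:
            # Flush what we have, then emit the oversized line in slices.
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

If each chunk is wrapped in a code block for pretty formatting, the limit passed in should be lowered to leave room for the wrapper characters.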

getting error on run on windows

Hi, I get an error when running, from this line:
intents = discord.Intents(messages=True, message_content=True)
The error is: AttributeError: 'Intents' object has no attribute 'message_content'

It looks like a discord.py error.
discord.py 1.7.3

Add support for image2image with a slash command.

Image2image is already available via > command, but users in my server were having trouble understanding those commands. I managed to get it hackily working by converting the command into a slash and requiring the user to specify an image url.

Add a console side progress bar when restarting the bot to show the loading of buttons.

Currently there is no info on how many buttons, how quick it is going, and when to expect the bot live again. Info like this would be useful for 'quick restarts' to inform users on downtime.

[Feature Request] Implement Textual Inversion with img2img

https://www.youtube.com/watch?v=WsDykBTjo20
https://github.com/hlky/sd-enable-textual-inversion

Create more advanced images from other images with textual inversion

Discord API randomly fails a lot with buttons and slash commands, try to add more helpful messages

Probably shouldn't disable the upscale button on failure too.

Select menus for aspect ratio riffs may not reflect what is stored in the bot

There is currently no way to asynchronously set the selected aspect ratio in the client, so if multiple users are tweaking aspect ratios at the same time it will be impossible to tell what the aspect ratio actually is.

[Feature Request] Be able to lower the default amount of steps.

The title

image2image slash command is missing resize boolean

torch is required even if the NSFW filter isn't enabled

>image2image crashes if the image is too big

Issues with documentation

Firstly love your work, great project!

Just a few things/issues I came across following the documentation for native install.

Got an error that the version of Jax was too old but was able to fix that by not specifying the version, which installed the latest and seems to work fine. pip install jax
I had to clone the original SwinIR repo because upscaling was throwing an error about missing main_test_swinir.py script and models/network_swinir.py (this was since you added the buttons, previously did not have this issue.)
Not related to your code, but updating to Python 3.10 on Ubuntu 20.04.5 LTS was a massive pain. I eventually got it working. More of an FYI for Ubuntu users.

[Feature Request] Add commands to a queue instead of rejecting them

Title. It makes it difficult to manage generations as the bot has to be babysat to try out multiple ideas

VRAM troubles

You mentioned that you need 16 GB+ VRAM to run at full resolution.

I have the bot working, but it runs out of Memory instantly at any attempt

RuntimeError: CUDA out of memory. Tried to allocate 512.00 MiB (GPU 0; 7.79 GiB total capacity; 5.08 GiB already allocated; 88.56 MiB free; 5.20 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

Is there any way I can either lower the image resolution or somehow use an optimized fork to run the bot on my pitiful 8GB 3060 Ti? I'm just trying to set this bot up to use in my friends' and my Discord server. I'm also not using docker.

Bot does not notify users with a Discord ping when a generation is finished

The bot edits its own message on completion, which does not notify users with a new ping, because it isn't sending a new message.

Add regex to NSFW word list
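
A rough sketch of what regex support could look like, assuming each line of the wordlist is treated as a case-insensitive regular expression. The function names are hypothetical and this is not how the bot currently reads its wordlist.

```python
import re


def compile_patterns(wordlist_text):
    """Compile each non-blank line as a case-insensitive regular expression."""
    return [
        re.compile(line.strip(), re.IGNORECASE)
        for line in wordlist_text.splitlines()
        if line.strip()
    ]


def prompt_matches(prompt, patterns):
    """Return True if any pattern matches anywhere in the prompt."""
    return any(pattern.search(prompt) for pattern in patterns)
```

An invalid pattern in the file raises re.error at load time, so a real implementation would probably want to report the offending line instead of failing on startup.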
