Giter VIP home page Giter VIP logo

compel's Introduction

Compel

A text prompt weighting and blending library for transformers-type text embedding systems, by @damian0815.

With a flexible and intuitive syntax, you can re-weight different parts of a prompt string and thus re-weight the different parts of the embeddning tensor produced from the string.

Tested and developed against Hugging Face's StableDiffusionPipeline but it should work with any diffusers-based system that uses an Tokenizer and a Text Encoder of some kind.

Adapted from the InvokeAI prompting code (also by @damian0815). For now, the syntax is fully documented here - note however that cross-attention control .swap() is currently ignored by Compel.

Installation

pip install compel

Demo

see compel-demo.ipynb

Open In Colab

Quickstart

with Hugging Face diffusers >=0.12:

from diffusers import StableDiffusionPipeline
from compel import Compel

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

# upweight "ball"
prompt = "a cat playing with a ball++ in the forest"
conditioning = compel.build_conditioning_tensor(prompt)
# or: conditioning = compel([prompt])

# generate image
images = pipeline(prompt_embeds=conditioning, num_inference_steps=20).images
images[0].save("image.jpg")

For batched input, use the call interface to compel:

import torch

from diffusers import StableDiffusionPipeline
from compel import Compel

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
compel = Compel(tokenizer=pipeline.tokenizer, text_encoder=pipeline.text_encoder)

prompts = ["a cat playing with a ball++ in the forest", "a dog playing with a ball in the forest"]
prompt_embeds = compel(prompts)
images = pipeline(prompt_embeds=prompt_embeds).images

images[0].save("image0.jpg")
images[1].save("image1.jpg")

Changelog

1.0.3 - better defaults for .swap (damian0815#8)

1.0.2 - fix padding for non-truncated batched embeddings (damian0815#9)

1.0.1 - fix for InvokeAI's --free_gpu_mem option

1.0.0 - new downweighting algorithm

Downweighting now works by applying an attention mask to remove the downweighted tokens, rather than literally removing them from the sequence. This behaviour is the default, but the old behaviour can be re-enabled by passing downweight_mode=DownweightMode.REMOVE on init of the Compel instance.

Formerly, downweighting a token worked by both multiplying the weighting of the token's embedding, and doing an inverse-weighted blend with a copy of the token sequence that had the downweighted tokens removed. The intuition is that as weight approaches zero, the tokens being downweighted should be actually removed from the sequence. However, removing the tokens resulted in the positioning of all downstream tokens becoming messed up. The blend ended up blending a lot more than just the tokens in question.

As of v1.0.0, taking advice from @keturn and @bonlime (damian0815#7) the procedure is by default different. Downweighting still involves a blend but what is blended is a version of the token sequence with the downweighted tokens masked out, rather than removed. This correctly preserves positioning embeddings of the other tokens.

Also a bugfix: fix black images on weight 0 (invoke-ai/InvokeAI#2832)

0.1.10 - add support for prompts longer than the model's max token length.

To enable, initialize Compel with truncate_long_prompts=False (default is True). Prompts that are longer than the model's max_token_length will be chunked and padded out to an integer multiple of max_token_length.

Note that even if you don't use a negative prompt, you'll need to build a conditioning tensor for a negative prompt of at least "", and use compel.pad_conditioning_tensors_to_same_length(), otherwise the you'll get an error about mismatched conditioning tensor lengths:

compel = Compel(..., truncate_long_prompts=False)
prompt = "a cat playing with a ball++ in the forest, amazing, exquisite, stunning, masterpiece, skilled, powerful, incredible, amazing, trending on gregstation, greg, greggy, greggs greggson, greggy mcgregface, ..." # very long prompt
conditioning = compel.build_conditioning_tensor(prompt)
negative_prompt = "" # it's necessary to create an empty prompt - it can also be very long, if you want
negative_conditioning = compel.build_conditioning_tensor(negative_prompt)
[conditioning, negative_conditioning] = compel.pad_conditioning_tensors_to_same_length([conditioning, negative_conditioning])

0.1.9 - broken

0.1.8 - downgrade Python min version to 3.7

0.1.7 - InvokeAI compatibility

compel's People

Contributors

damian0815 avatar mystfit avatar patrickvonplaten avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.