
T2I-Adapter

โฌDownload Models | ๐Ÿ’ปHow to Test | ๐Ÿ’ฅ Huggingface Gradio

๐ŸฐAdapter Zoo

T2I-Adapters naturally support using multiple adapters together; the running command is given under Combine multiple Adapters below.

🚩 New Features/Updates

Official implementation of T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models.

We propose T2I-Adapter, a simple and small (~70M parameters, ~300M storage space) network that can provide extra guidance to pre-trained text-to-image models while freezing the original large text-to-image models.

T2I-Adapter aligns internal knowledge in T2I models with external control signals. We can train various adapters according to different conditions, and achieve rich control and editing effects.
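As a rough sketch of this wiring (assumed for illustration, not the repository's actual code), the adapter maps the condition image to one feature map per UNet encoder scale, and these maps are simply added to the frozen encoder's features:

```python
import numpy as np

# Toy illustration of the T2I-Adapter wiring (NOT the official code):
# a small trainable network turns the condition (e.g. a sketch) into
# multi-scale features, which are added to the frozen UNet encoder's
# feature maps; only the adapter's parameters are ever trained.
def adapter_features(cond, scales=(64, 32, 16, 8)):
    # stand-in for the small adapter network: one feature map per scale
    return [np.resize(cond, (s, s)) for s in scales]

def frozen_encoder_with_adapter(x, adapter_feats):
    outputs = []
    h = x
    for f in adapter_feats:
        h = np.resize(h, f.shape)  # stand-in for a frozen UNet encoder block
        h = h + f                  # adapter guidance is injected additively
        outputs.append(h)
    return outputs

feats = frozen_encoder_with_adapter(np.zeros((64, 64)),
                                    adapter_features(np.ones((64, 64))))
```

Because the guidance is purely additive, the pre-trained text-to-image weights never change, which is what makes the adapter plug-and-play.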

โฌ Download Models

Put the downloaded models in the T2I-Adapter/models folder.

  1. The T2I-Adapters can be downloaded from https://huggingface.co/TencentARC/T2I-Adapter.
  2. The pretrained Stable Diffusion v1.4 model can be downloaded from https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/tree/main. You need to download the sd-v1-4.ckpt file.
  3. [Optional] If you want to use the Anything v4.0 model, you can download the pretrained weights from https://huggingface.co/andite/anything-v4.0/tree/main. You need to download the anything-v4.0-pruned.ckpt file.
  4. The pretrained clip-vit-large-patch14 folder can be downloaded from https://huggingface.co/openai/clip-vit-large-patch14/tree/main. Remember to download the whole folder!
  5. The pretrained keypose detection models are FasterRCNN (human detection), from https://download.openmmlab.com/mmdetection/v2.0/faster_rcnn/faster_rcnn_r50_fpn_1x_coco/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth, and HRNet (pose detection), from https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_256x192-b9e0b3ab_20200708.pth.

After downloading, the folder structure should be like this:
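The original layout diagram is not reproduced here; based on the download list above, a plausible layout is the following (the exact adapter checkpoint names are assumptions — see the Hugging Face repository for the real file names):

```
T2I-Adapter/models/
├── t2iadapter_*.pth                  # the T2I-Adapter checkpoints
├── sd-v1-4.ckpt
├── anything-v4.0-pruned.ckpt         # optional
├── clip-vit-large-patch14/           # the whole folder
├── faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth
└── hrnet_w48_coco_256x192-b9e0b3ab_20200708.pth
```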

🔧 Dependencies and Installation

pip install -r requirements.txt

💻 How to Test

Depth Adapter

python test_depth.py --prompt "Stormtrooper's lecture, best quality, extremely detailed" --path_cond examples/depth/sd.png --ckpt models/v1-5-pruned-emaonly.ckpt --type_in image --sampler ddim --scale 9 --cond_weight 1.5

Huggingface Gradio

Sketch Adapter

  • Sketch to Image Generation

python test_sketch.py --prompt "A car with flying wings" --path_cond examples/sketch/car.png --ckpt models/sd-v1-4.ckpt --type_in sketch

  • Image to Sketch to Image Generation

python test_sketch.py --prompt "A beautiful girl" --path_cond examples/sketch/human.png --ckpt models/sd-v1-4.ckpt --type_in image

  • The adapter is trained on Stable Diffusion v1.4, but it generalizes to other models, such as Anything-v4, an anime diffusion model

python test_sketch.py --prompt "1girl, masterpiece, high-quality, high-res" --path_cond examples/anything_sketch/human.png --ckpt models/anything-v4.0-pruned.ckpt --ckpt_vae models/anything-v4.0.vae.pt --type_in image


Keypose Adapter

  • Keypose to Image Generation

python test_keypose.py --prompt "A beautiful girl" --path_cond examples/keypose/iron.png --type_in pose

  • Image to Image Generation

python test_keypose.py --prompt "A beautiful girl" --path_cond examples/sketch/human.png --type_in image

  • Generating anime images with the Anything-v4 model

python test_keypose.py --prompt "A beautiful girl" --path_cond examples/sketch/human.png --ckpt models/anything-v4.0-pruned.ckpt --ckpt_vae models/anything-v4.0.vae.pt --type_in image


Segmentation Adapter

python test_seg.py --prompt "A black Honda motorcycle parked in front of a garage" --path_cond examples/seg/motor.png


Combine multiple Adapters

python test_composable_adapters.py --prompt "An all white kitchen with an electric stovetop" --seg_cond_path examples/seg_sketch/mask.png --sketch_cond_path examples/seg_sketch/edge.png --sketch_cond_weight 0.5
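Conceptually (a hypothetical sketch, not the repository's code), composable guidance takes a per-adapter weighted sum of the multi-scale features before injecting them into the UNet, which is what flags such as --sketch_cond_weight control:

```python
import numpy as np

# Hypothetical sketch of composable adapters (not the repo's code):
# each adapter yields one feature map per scale; the maps are combined
# as a weighted sum, so e.g. --sketch_cond_weight 0.5 halves the
# sketch adapter's influence relative to the segmentation adapter.
def combine_adapter_features(features_per_adapter, weights):
    combined = []
    for scale_feats in zip(*features_per_adapter):  # iterate over scales
        combined.append(sum(w * f for w, f in zip(weights, scale_feats)))
    return combined

seg = [np.ones((64, 64)), np.ones((32, 32))]               # weight 1.0
sketch = [np.full((64, 64), 2.0), np.full((32, 32), 2.0)]  # weight 0.5
out = combine_adapter_features([seg, sketch], weights=[1.0, 0.5])
```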


Local editing with adapters

python test_sketch_edit.py --prompt "A white cat" --path_cond examples/edit_cat/edge_2.png --path_x0 examples/edit_cat/im.png --path_mask examples/edit_cat/mask.png

Stable Diffusion + T2I-Adapters (only ~70M parameters, ~300M storage space)

The following is the detailed structure of a Stable Diffusion model with the T2I-Adapter.

🚀 Interesting Applications

Stable Diffusion results guided with the sketch T2I-Adapter

The corresponding edge maps are predicted by PiDiNet. The sketch T2I-Adapter generalizes well to other similar sketch types, for example, sketches from the Internet and user scribbles.

Stable Diffusion results guided with the keypose T2I-Adapter

The keypose results are predicted by MMPose. With keypose guidance, the keypose T2I-Adapter can also help to generate animals with the same keypose, for example, pandas and tigers.

T2I-Adapter with Anything-v4.0

Once the T2I-Adapter is trained, it can act as a plug-and-play module and be seamlessly integrated into finetuned diffusion models without re-training, for example, Anything-v4.0.

✨ Anything results with the plug-and-play sketch T2I-Adapter (no extra training)

Anything results with the plug-and-play keypose T2I-Adapter (no extra training)

Local editing with the sketch adapter

When combined with the inpainting mode of Stable Diffusion, we can realize local editing with user-specific guidance.
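A minimal sketch of the masked-blending idea (assumed for illustration; the actual scripts may differ): at each denoising step, latents outside the user mask are kept from the original image, so only the masked region follows the new guidance.

```python
import numpy as np

# Toy illustration of mask-based local editing (assumed, not the
# repo's exact code): keep the original image outside the mask and
# the adapter-guided result inside it, at every denoising step.
def blend_latents(x_guided, x_orig, mask):
    return mask * x_guided + (1.0 - mask) * x_orig

mask = np.zeros((4, 4))
mask[:2, :] = 1.0                        # edit only the top half
x = blend_latents(np.full((4, 4), 9.0),  # guided result
                  np.zeros((4, 4)),      # original image latents
                  mask)
```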

✨ Change the head direction of the cat

✨ Add rabbit ears on the head of Iron Man.

Combine different concepts with adapter

The adapter can be used to enhance SD's ability to combine different concepts.

✨ A car with flying wings. / A doll in the shape of letter 'A'.

Sequential editing with the sketch adapter

We can realize sequential editing with adapter guidance.

Composable Guidance with multiple adapters

Stable Diffusion results guided with the segmentation and sketch adapters together.

🤗 Acknowledgements

Thanks to haofanwang for providing a tutorial on using T2I-Adapter with diffusers.


Logo materials: adapter, lightbulb

Contributors

mc-e, tothebeginning, xinntao, bzboys, eltociear
