pnp-diffusers's Introduction

Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation (CVPR 2023)

Links: arXiv | Hugging Face Spaces | TI2I

teaser

To apply plug-and-play diffusion features, follow these steps:

  1. Setup
  2. Latent extraction
  3. Running PnP

Setup

Create the environment and install the dependencies by running:

conda create -n pnp-diffusers python=3.9
conda activate pnp-diffusers
pip install -r requirements.txt
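
After creating the environment, a quick sanity check can confirm it is usable. The snippet below is illustrative and not part of the repository; it simply reports the installed diffusers version and whether PyTorch sees a GPU, since Stable Diffusion inference is impractical without one.

import torch
import diffusers

print("diffusers version:", diffusers.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))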

Latent Extraction

We first compute the intermediate noisy latents of the structure guidance image. To do that, run:

python preprocess.py --data_path <path_to_guidance_image> --inversion_prompt <inversion_prompt>

where <inversion_prompt> should describe the content of the guidance image. The intermediate noisy latents will be saved under the path latents_forward/<image_name>, where <image_name> is the filename of the provided guidance image.
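
Conceptually, this step performs a deterministic DDIM inversion of the guidance image and records the noisy latent at every timestep. The sketch below only illustrates that idea; the model ID, prompt, image path, number of steps, latent scaling constant, and save layout are assumptions made for illustration, and the actual implementation lives in preprocess.py.

import os
import torch
from PIL import Image
from torchvision import transforms as T
from diffusers import StableDiffusionPipeline, DDIMScheduler

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5").to(device)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe.scheduler.set_timesteps(50)  # assumed number of inversion steps

# Encode the guidance image into the VAE latent space (0.18215 is the SD latent scale).
image = Image.open("guidance.png").convert("RGB").resize((512, 512))
x = T.ToTensor()(image).unsqueeze(0).to(device) * 2 - 1
with torch.no_grad():
    latent = pipe.vae.encode(x).latent_dist.mean * 0.18215

# Embed the inversion prompt (placeholder text standing in for <inversion_prompt>).
tokens = pipe.tokenizer("a photo of a horse", padding="max_length",
                        max_length=pipe.tokenizer.model_max_length,
                        truncation=True, return_tensors="pt")
with torch.no_grad():
    text_emb = pipe.text_encoder(tokens.input_ids.to(device))[0]

# Deterministic DDIM inversion: walk the clean latent toward higher noise levels,
# saving the intermediate noisy latent at every timestep.
os.makedirs("latents_forward/guidance", exist_ok=True)
timesteps = list(reversed(pipe.scheduler.timesteps))  # small t -> large t
alphas = pipe.scheduler.alphas_cumprod
with torch.no_grad():
    for i, t in enumerate(timesteps):
        eps = pipe.unet(latent, t, encoder_hidden_states=text_emb).sample
        a_prev = alphas[timesteps[i - 1]].item() if i > 0 else pipe.scheduler.final_alpha_cumprod.item()
        a_t = alphas[t].item()
        # Inverted DDIM step: recover the predicted clean latent, then re-noise it to level t.
        x0 = (latent - (1 - a_prev) ** 0.5 * eps) / a_prev ** 0.5
        latent = a_t ** 0.5 * x0 + (1 - a_t) ** 0.5 * eps
        torch.save(latent.cpu(), f"latents_forward/guidance/noisy_latent_{int(t)}.pt")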

Running PnP

Run the following command to apply PnP to the structure guidance image:

python pnp.py --config_path <pnp_config_path>

where <pnp_config_path> is the path to a YAML config file. The config includes fields for the guidance image path, the PnP output path, the translation prompt, the guidance scale, the PnP feature and self-attention injection thresholds, and additional hyperparameters. See an example config in config_pnp.yaml.
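
For reference, such a config can also be written programmatically. In the sketch below, only pnp_attn_t and pnp_f_t are field names confirmed by the shipped example config; every other key and value is an illustrative placeholder, so compare against config_pnp.yaml before use.

import yaml

# Illustrative config; pnp_attn_t / pnp_f_t are confirmed field names,
# the rest are placeholders -- check config_pnp.yaml for the real schema.
config = {
    "image_path": "data/horse.png",       # structure guidance image (placeholder key)
    "output_path": "PNP-results/horse",   # where the translated image is written (placeholder key)
    "prompt": "a photo of a zebra",       # translation prompt (placeholder key)
    "guidance_scale": 7.5,                # classifier-free guidance scale (placeholder key)
    "pnp_attn_t": 0.5,                    # self-attention injection threshold (from the example config)
    "pnp_f_t": 0.8,                       # feature injection threshold (from the example config)
}

with open("my_pnp_config.yaml", "w") as f:
    yaml.safe_dump(config, f)

The resulting file can then be passed to the script as python pnp.py --config_path my_pnp_config.yaml.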

Citation

@InProceedings{Tumanyan_2023_CVPR,
    author    = {Tumanyan, Narek and Geyer, Michal and Bagon, Shai and Dekel, Tali},
    title     = {Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {1921-1930}
}

pnp-diffusers's Issues

Way of obtaining the latent embeddings

Great work. I see in your paper that the latents are extracted during the denoising process, but in this code you set 'latents_forward' as the default. Can you explain this? Is there any difference between the two settings, and which one do you think is better?

Diffusers with a newer version

Hi, the code no longer works with newer diffusers versions (e.g. 0.25.0) because the UNet now expects a different input.
Is there a quick fix for handling this?

Injection thresholds

Hello, in your paper I found the following information on the injection threshold values:
We set our default injection thresholds to: τA = 25, τf = 40 out of the 50 sampling steps; for primitive guidance image, we found that τA = τf = 25 to work better.
I'm not quite sure what you mean by a primitive guidance image, and in your config you have pnp_attn_t: 0.5 and pnp_f_t: 0.8, so I'm trying to understand the implications of these values without having to try them all. Do you have more information on the subject?
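
One plausible reading, consistent with the defaults quoted from the paper, is that pnp_attn_t and pnp_f_t are fractions of the total number of sampling steps; under that assumption the config values map to the quoted step counts:

n_steps = 50                      # sampling steps used in the paper's defaults
pnp_attn_t, pnp_f_t = 0.5, 0.8    # values from the example config

# Assuming the thresholds are fractions of the total sampling steps:
tau_A = int(pnp_attn_t * n_steps)  # 25 of 50 steps -> tau_A = 25
tau_f = int(pnp_f_t * n_steps)     # 40 of 50 steps -> tau_f = 40
print(tau_A, tau_f)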
