Conditional Score Guidance for Text-Driven Image-to-Image Translation

Implementation of "Conditional Score Guidance for Text-Driven Image-to-Image Translation".

Colab Demo

Open demo.ipynb in Google Colab.

Edit synthetic images

You can generate an image with Stable Diffusion and then edit it with CSG. Note that --posterior_guidance is a hyperparameter related to the guidance scale.

python src/edit_synthetic_images.py \
    --results_folder "output/synth_edit" \
    --prompt_str "a high resolution painting of a cat eating a hamburger" \
    --task "cat2squirrel" \
    --random_seed 0 \
    --mask_res 16 --posterior_guidance <GUIDANCE_SCALE>

The synthesized and edited images are saved in the output/synth_edit directory.
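Since <GUIDANCE_SCALE> is left open above, it can help to try a few values side by side. The following is a minimal sketch and not part of this repository: the helper names build_synth_edit_command and sweep, the example scale values, and the per-scale results folders are assumptions made for illustration. Only the script path and the flags are taken from the command above. By default it only prints the commands (dry run).

```python
import shlex
import subprocess


def build_synth_edit_command(guidance_scale, results_folder):
    """Return the argv list for src/edit_synthetic_images.py.

    The flags and fixed values are copied from the README command;
    only the guidance scale and results folder vary.
    """
    return [
        "python", "src/edit_synthetic_images.py",
        "--results_folder", results_folder,
        "--prompt_str", "a high resolution painting of a cat eating a hamburger",
        "--task", "cat2squirrel",
        "--random_seed", "0",
        "--mask_res", "16",
        "--posterior_guidance", str(guidance_scale),
    ]


def sweep(scales, dry_run=True):
    """Build one command per guidance scale; execute them only if dry_run is False."""
    commands = []
    for scale in scales:
        # Hypothetical layout: one results folder per scale, so runs do not
        # overwrite each other.
        cmd = build_synth_edit_command(scale, f"output/synth_edit_pg{scale}")
        commands.append(cmd)
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder values, not recommendations from the authors.
    sweep([1, 5, 10])
```

Calling sweep(..., dry_run=False) from the repository root would run each command in turn and stop at the first failure.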

Edit real images

You can edit a real image with CSG. First, run DDIM inversion with the following command:

python src/inversion.py \
    --input_image "data/cat.png" \
    --results_folder "output/test_cat"

Then, perform image editing:

python src/edit_real_images.py \
    --inversion "output/test_cat/inversion/cat.pt" \
    --prompt "output/test_cat/prompt/cat.txt" \
    --task_name "cat2dog" \
    --results_folder "output/test_cat/" \
    --mask_res 16 --posterior_guidance <GUIDANCE_SCALE>
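The two steps above can be chained from one script. This is a sketch under stated assumptions, not code from the repository: the function names real_edit_commands and run_real_edit are hypothetical, and it assumes that src/inversion.py names its outputs after the input image's file stem (as data/cat.png producing inversion/cat.pt and prompt/cat.txt suggests). The script paths and flags are taken from the two commands above.

```python
import subprocess
from pathlib import Path


def real_edit_commands(input_image, results_folder, task_name, guidance_scale):
    """Return (inversion_cmd, edit_cmd) as argv lists for the two README steps."""
    # Assumption: output files are named after the input image's stem.
    stem = Path(input_image).stem
    root = Path(results_folder)
    inversion_cmd = [
        "python", "src/inversion.py",
        "--input_image", str(input_image),
        "--results_folder", root.as_posix(),
    ]
    edit_cmd = [
        "python", "src/edit_real_images.py",
        "--inversion", (root / "inversion" / f"{stem}.pt").as_posix(),
        "--prompt", (root / "prompt" / f"{stem}.txt").as_posix(),
        "--task_name", task_name,
        "--results_folder", root.as_posix(),
        "--mask_res", "16",
        "--posterior_guidance", str(guidance_scale),
    ]
    return inversion_cmd, edit_cmd


def run_real_edit(input_image, results_folder, task_name, guidance_scale):
    """Run inversion first, then editing; raise if either step fails."""
    for cmd in real_edit_commands(input_image, results_folder, task_name, guidance_scale):
        subprocess.run(cmd, check=True)
```

For example, run_real_edit("data/cat.png", "output/test_cat", "cat2dog", scale) would reproduce the two commands above for a scale value of your choosing.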

Afterwards, the output/test_cat directory is organized as follows:

output/test_cat
  ├── inversion
  │   ├── cat.pt
  │   └── ...
  ├── prompt
  │   ├── cat.txt
  │   └── ...
  ├── edit
  │   ├── cat.png
  │   └── ...
  ├── mask_no_attn
  │   ├── cat.png
  │   └── ...
  ├── mask
  │   ├── cat.png
  │   └── ...
  └── reconstruction
      ├── cat.png
      └── ...

The image reconstructed from DDIM inversion is saved in reconstruction/, and the edited image is saved in edit/. You can also inspect the content mask in mask_no_attn/ and the smoothed content mask in mask/.
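To confirm that a run produced every file in the layout above, a small check like the following may be useful. It is an illustrative sketch, not part of the repository: the names EXPECTED and missing_outputs are hypothetical, and the sub-directories and file extensions are read off the directory tree shown above.

```python
from pathlib import Path

# Sub-directory -> file extension, as listed in the directory tree above.
EXPECTED = {
    "inversion": ".pt",
    "prompt": ".txt",
    "edit": ".png",
    "mask_no_attn": ".png",
    "mask": ".png",
    "reconstruction": ".png",
}


def missing_outputs(results_folder, name):
    """Return the expected files for image `name` that are absent from results_folder."""
    root = Path(results_folder)
    missing = []
    for subdir, ext in EXPECTED.items():
        path = root / subdir / f"{name}{ext}"
        if not path.is_file():
            missing.append(path.as_posix())
    return missing


if __name__ == "__main__":
    # Reports nothing if every expected file for "cat" is present.
    for path in missing_outputs("output/test_cat", "cat"):
        print("missing:", path)
```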

Visualize cross-attention maps

You can save and visualize cross-attention maps for synthetic images. First, run save_attention_synth.py to save the cross-attention maps:

python src/save_attention_synth.py \
    --results_folder "output/synth_edit" \
    --prompt_str "a high resolution painting of a cat eating a hamburger" \
    --task "cat2squirrel" \
    --random_seed 0 \
    --mask_res 16 --posterior_guidance <GUIDANCE_SCALE> \
    --save_path "attention_map"

Then, run visualize_attention.py to visualize the saved cross-attention maps:

python src/visualize_attention.py --save_path "attention_map"

Requirements

The dependencies are listed in requirements.txt. Install them with:

pip install -r requirements.txt

Acknowledgments

This implementation is based on pix2pix-zero and prompt-to-prompt.
