diffusionclip's People

Contributors

chenxwh, gwang-kim

diffusionclip's Issues

Trying to reproduce the paper results

Thank you for your great work.
I am trying to fine-tune my own model.
I am running on the CelebA-HQ dataset and trying to train models that change a face into a Neanderthal or Pixar style, but I get bad results.

My configuration for Neanderthal fine-tuning:
mode = "clip_finetune_eff"  # clip_finetune fails with a GPU out-of-memory error
exp = './runs/finetune_Neanderthal'
edit_attr = "neanderthal"
n_train_img = 50
n_test_img = 10
n_iter = 50
t_0 = 500
n_inv_step = 40
n_train_step = 6
n_test_step = 40
lr_clip_finetune = 8e-6
id_loss_w = 0
l1_loss_w = 1

The results I get:

After 5 epochs (train):

train_43_2_clip_Neanderthal_5_ngen6

After 8 epochs:

train_43_2_clip_Neanderthal_8_ngen6

After 10 epochs:

train_43_2_clip_Neanderthal_10_ngen6

After 12 epochs:

train_37_2_clip_Neanderthal_12_ngen6

The same happens for Pixar (with edit_attr = "pixar").
After 5 epochs:

train_49_2_clip_3D_render_in_the_style_of_Pixar_5_ngen6

After 10 epochs:

train_49_2_clip_3D_render_in_the_style_of_Pixar_10_ngen6

After 15 epochs:

train_7_2_clip_3D_render_in_the_style_of_Pixar_15_ngen6

What am I doing wrong?
Thank you.

Add web demo/model to Hugging Face

Hi, would you be interested in adding DiffusionCLIP to Hugging Face? The Hub offers free hosting, and it would make your work more accessible and visible to the rest of the ML community.

Examples from other organizations:
Keras: https://huggingface.co/keras-io
Microsoft: https://huggingface.co/microsoft
Facebook: https://huggingface.co/facebook

Example Spaces with repos:
GitHub: https://github.com/salesforce/BLIP
Spaces: https://huggingface.co/spaces/salesforce/BLIP

GitHub: https://github.com/facebookresearch/omnivore
Spaces: https://huggingface.co/spaces/akhaliq/omnivore

Here are guides for adding Spaces, models, and datasets to your org:

How to add a Space: https://huggingface.co/blog/gradio-spaces
How to add a model: https://huggingface.co/docs/hub/adding-a-model
How to upload a dataset: https://huggingface.co/docs/datasets/upload_dataset.html

Please let us know if you would be interested. If you have any questions, we can also help with the technical implementation.

About the LICENSE

Thank you for releasing your work. I really appreciate your work!

I think it may be useful for my project, and I would like to know which software license this repo uses. If possible, I'd like to reuse or modify your code for my own purposes.

How to see the reconstructed face?

Dear author,

The command "python main.py --edit_one_image" lets me see the edited face, but I also want to see the reconstructed face without any editing. How can I do that?

Thanks!

How to make a custom dataset

Thanks for your work. I have a question: I noticed that your dataset is in .pth format. I plan to build a dataset for my own work, but I don't know how to create one in .pth format. Could you suggest a solution? Thank you!

Several questions about the paper

  1. Main paper: in equation (1), it should be $\sqrt{1-\alpha_t}\, w$.
  2. Supplementary material: Algorithm 4, line 5 has the same problem as 1).
  3. Supplementary material: Algorithm 1, lines 13 and 14, and Algorithm 2, lines 7 and 8. Shouldn't these $x_{\tau_s}^{(i)}$ carry a hat?
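
For reference, the first point is consistent with the standard DDPM closed-form forward sample, sketched here in the notation of the question. This assumes that $\alpha_t$ denotes the cumulative product $\prod_{s=1}^{t}(1-\beta_s)$ and that $w$ is standard Gaussian noise; it is written from the standard formulation, not quoted from the paper.

```latex
\begin{align}
q(x_t \mid x_0) &= \mathcal{N}\!\left(\sqrt{\alpha_t}\, x_0,\ (1-\alpha_t)\, I\right), \\
x_t &= \sqrt{\alpha_t}\, x_0 + \sqrt{1-\alpha_t}\, w, \qquad w \sim \mathcal{N}(0, I).
\end{align}
```

The coefficient on $w$ is a standard deviation, so it is the square root of the variance $1-\alpha_t$.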

Different results in ViT-B/16 and ViT-L/14@336px

Hello dear authors, I have a question about the model choice and parameter optimization.
I set "Human->Zombie"; here are the results:
a. ViT-B/16: a single epoch already gives a good result, as follows:
image
b. ViT-L/14@336px: with this model the results look strange. I don't know whether different parameters are needed to fine-tune the diffusion model (left to right: epochs 1-5).
image
image
image
image
image

Finetuning bedroom models

Hi,

This work is amazing!
I am trying to reproduce the bedroom editing results you have shown in the paper.
I can achieve them when I use the fine-tuned model provided on Google Drive.
However, when I try to fine-tune the model myself, I get very poor results.
This is the command I used for fine-tuning the model. Could you tell me if some of the hyperparameters are set incorrectly here?

python main.py --clip_finetune \
            --config bedroom.yml      \
            --exp ./test_runs/bedrooms_full        \
            --edit_attr "bedroom_princess"  \
            --do_train 1             \
            --do_test 1              \
            --n_train_img 50         \
            --n_test_img 10          \
            --n_iter 5               \
            --t_0 500                \
            --n_inv_step 40          \
            --n_train_step 6         \
            --n_test_step 40         \
            --lr_clip_finetune 8e-6  \
            --id_loss_w 0            \
            --l1_loss_w 1

-Gaurav

Request the code to train the Diffusion Model

Hi, your work is amazing and inspiring. I am following it and trying to train the diffusion model on another dataset. Could you share the training code? Thank you very much.

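I have a question and hope the authors can answer

If I change n_train_step to 2 during training and train for longer, can I get acceptable results? My 12 GB of VRAM cannot run the code with n_train_step=6.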

Creating my .ckpt files

Hi

I want to create my own .ckpt file trained on the CUB dataset. Can you please guide me on how to do that?

Thanks

Editing a specific input image

Thank you for your work.
Is there an example of how to edit a specific face?
For example, turning a face from right to left, or making an angry face, like the example in the paper:

image

thanks

Where is the reconstruction code of Table 1?

Thanks for your great work! I find it amazing that the reconstruction error is so small in DiffusionCLIP. Where can I find the reconstruction code for Table 1? Many thanks.

Looking forward to your reply.

What is LMDB_train?

lmdb.Error: data/celeba_hq/LMDB_train: No such file or directory
I want to train a new effect on the CelebA-HQ dataset. Thank you.
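
The error says the directory the loader tries to open does not exist. A minimal sketch of a pre-flight check is shown below, assuming each split lives in a folder named LMDB_<split> under the dataset root, as the path in the error message suggests. The function name `missing_lmdb_dirs` and the default split list are hypothetical and not part of the repository.

```python
from pathlib import Path


def missing_lmdb_dirs(dataset_root, splits=("train", "test")):
    """Return the expected LMDB directories that are absent under dataset_root.

    Assumption: each split is stored in a folder named LMDB_<split>, as the
    path in the error message (data/celeba_hq/LMDB_train) suggests.
    """
    root = Path(dataset_root)
    expected = [root / f"LMDB_{split}" for split in splits]
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    # The root below is the one named in the error message.
    missing = missing_lmdb_dirs("data/celeba_hq")
    if missing:
        print("Missing LMDB directories:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected LMDB directories are present.")
```

Running a check like this before training reports every missing split at once instead of failing on the first one inside the dataset constructor.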

celeba_hq.ckpt and other pretrained models

hi,
I got an error when running.

Downloading: "https://image-editing-test-12345.s3-us-west-2.amazonaws.com/checkpoints/celeba_hq.ckpt" to C:\Users\xx/.cache\torch\hub\checkpoints\celeba_hq.ckpt
ERROR - main.py - 2023-04-22 20:48:44,842 - Traceback (most recent call last):
  File "main.py", line 212, in main
    runner.clip_finetune()
  File "E:\DL\generation\diffusionCLIP\DiffClip\diffusionclip.py", line 82, in clip_finetune
    init_ckpt = torch.hub.load_state_dict_from_url(url, map_location=self.device)
  File "D:\ProgramData\anaconda3\envs\diffusionclip\lib\site-packages\torch\hub.py", line 731, in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "D:\ProgramData\anaconda3\envs\diffusionclip\lib\site-packages\torch\hub.py", line 597, in download_url_to_file
    u = urlopen(req)
  File "D:\ProgramData\anaconda3\envs\diffusionclip\lib\urllib\request.py", line 222, in urlopen
    return opener.open(url, data, timeout)
  File "D:\ProgramData\anaconda3\envs\diffusionclip\lib\urllib\request.py", line 531, in open
    response = meth(req, response)
  File "D:\ProgramData\anaconda3\envs\diffusionclip\lib\urllib\request.py", line 640, in http_response
    response = self.parent.error(
  File "D:\ProgramData\anaconda3\envs\diffusionclip\lib\urllib\request.py", line 569, in error
    return self._call_chain(*args)
  File "D:\ProgramData\anaconda3\envs\diffusionclip\lib\urllib\request.py", line 502, in _call_chain
    result = func(*args)
  File "D:\ProgramData\anaconda3\envs\diffusionclip\lib\urllib\request.py", line 649, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 403: Forbidden

The pretrained models cannot be downloaded automatically by the code. Could you share them on Google Drive?

Thanks.
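
Until the hosted file is reachable again, one workaround is to prefer a manually placed local copy and to fail with a clearer message when the download is refused. The sketch below illustrates that idea with the standard library only. The function name `fetch_checkpoint` and its `downloader` parameter are hypothetical, and this is not how the repository loads checkpoints.

```python
import os
import urllib.error
import urllib.request


def fetch_checkpoint(url, local_path, downloader=urllib.request.urlretrieve):
    """Return a path to the checkpoint, preferring an existing local copy.

    If local_path already exists, no network request is made. Otherwise the
    file is downloaded; an HTTP error (such as 403 Forbidden) is turned into
    a FileNotFoundError that says where to place the file manually.
    """
    if os.path.isfile(local_path):
        return local_path
    try:
        downloader(url, local_path)
    except urllib.error.HTTPError as err:
        raise FileNotFoundError(
            f"Could not download {url} (HTTP {err.code}). "
            f"Place the checkpoint manually at {local_path} and rerun."
        ) from err
    return local_path
```

The returned path could then be passed to whatever loading call the training code uses. The `downloader` argument exists only so the behavior can be exercised without network access.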

GPU VRAM load slowly increases

Hello.

Thank you for your code. I am executing clip_finetune() on CelebA-HQ-256x256 while monitoring my GPU VRAM usage, and I notice a gradual increase in VRAM while the latents are precomputed from the given dataset. Is this normal? Which part of the code is responsible for this behavior? I expected VRAM usage to reach its peak at the start and remain steady throughout training. Given this behavior, a large enough n_precomp_img will eventually lead to an out-of-memory error, which is definitely not desired.

Thanks in advance.

Training script.

Hi,
I am trying to use DiffusionCLIP for a different problem on different datasets, and I was wondering if you have the training script available. Also, I tried to open diffusionclip.py, but everything seemed to be on a single line. Is it possible to get the indented version?

Many thanks!

More face attributes

Congratulations on the incredible work. I have been playing with it, and it is impressive how it edits faces better than GAN inversion.
I'm disappointed that there aren't many face attributes to test.
Unfortunately, I don't have enough resources to train attributes myself.
I would really appreciate it if you could add these face attributes:
male -> female
child -> old man
old man -> child

tanned model error

Downloading checkpoint/human_tanned_t201.pth ...

HttpError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pydrive/files.py in FetchMetadata(self, fields, fetch_all)
236 fields=fields)
--> 237 .execute(http=self.http)
238 except errors.HttpError as error:

5 frames
HttpError: <HttpError 404 when requesting https://www.googleapis.com/drive/v2/files/15Twto21spGLwiby7_yGO_xquFXtfJIAo?fields=alternateLink%2CappDataContents%2CcanComment%2CcanReadRevisions%2Ccopyable%2CcreatedDate%2CdefaultOpenWithLink%2Cdescription%2CdownloadUrl%2Ceditable%2CembedLink%2Cetag%2CexplicitlyTrashed%2CexportLinks%2CfileExtension%2CfileSize%2CfolderColorRgb%2CfullFileExtension%2CheadRevisionId%2CiconLink%2Cid%2CimageMediaMetadata%2CindexableText%2CisAppAuthorized%2Ckind%2Clabels%2ClastModifyingUser%2ClastModifyingUserName%2ClastViewedByMeDate%2CmarkedViewedByMeDate%2Cmd5Checksum%2CmimeType%2CmodifiedByMeDate%2CmodifiedDate%2CopenWithLinks%2CoriginalFilename%2CownedByMe%2CownerNames%2Cowners%2Cparents%2Cpermissions%2Cproperties%2CquotaBytesUsed%2CselfLink%2Cshareable%2Cshared%2CsharedWithMeDate%2CsharingUser%2Cspaces%2Cthumbnail%2CthumbnailLink%2Ctitle%2CuserPermission%2Cversion%2CvideoMediaMetadata%2CwebContentLink%2CwebViewLink%2CwritersCanShare&alt=json returned "File not found: 15Twto21spGLwiby7_yGO_xquFXtfJIAo". Details: "File not found: 15Twto21spGLwiby7_yGO_xquFXtfJIAo">

During handling of the above exception, another exception occurred:

ApiRequestError Traceback (most recent call last)
/usr/local/lib/python3.7/dist-packages/pydrive/files.py in FetchMetadata(self, fields, fetch_all)
237 .execute(http=self.http)
238 except errors.HttpError as error:
--> 239 raise ApiRequestError(error)
240 else:
241 self.uploaded = True

ApiRequestError: <HttpError 404 when requesting https://www.googleapis.com/drive/v2/files/15Twto21spGLwiby7_yGO_xquFXtfJIAo?fields=alternateLink%2CappDataContents%2CcanComment%2CcanReadRevisions%2Ccopyable%2CcreatedDate%2CdefaultOpenWithLink%2Cdescription%2CdownloadUrl%2Ceditable%2CembedLink%2Cetag%2CexplicitlyTrashed%2CexportLinks%2CfileExtension%2CfileSize%2CfolderColorRgb%2CfullFileExtension%2CheadRevisionId%2CiconLink%2Cid%2CimageMediaMetadata%2CindexableText%2CisAppAuthorized%2Ckind%2Clabels%2ClastModifyingUser%2ClastModifyingUserName%2ClastViewedByMeDate%2CmarkedViewedByMeDate%2Cmd5Checksum%2CmimeType%2CmodifiedByMeDate%2CmodifiedDate%2CopenWithLinks%2CoriginalFilename%2CownedByMe%2CownerNames%2Cowners%2Cparents%2Cpermissions%2Cproperties%2CquotaBytesUsed%2CselfLink%2Cshareable%2Cshared%2CsharedWithMeDate%2CsharingUser%2Cspaces%2Cthumbnail%2CthumbnailLink%2Ctitle%2CuserPermission%2Cversion%2CvideoMediaMetadata%2CwebContentLink%2CwebViewLink%2CwritersCanShare&alt=json returned "File not found: 15Twto21spGLwiby7_yGO_xquFXtfJIAo". Details: "File not found: 15Twto21spGLwiby7_yGO_xquFXtfJIAo">

Gender Dataset

Hi,

could you give more information on the gender dataset you used?

regards,
Dyah

Great work! A few queries...

Could you please provide the code for obtaining the quantitative results reported in Table 1 and Table 3 of the paper?

Error trying to train

Hi there, I'm trying to fine-tune your FFHQ model but get this error:

Loading ResNet ArcFace
Prepare identity latent
precomputed/CelebA_HQ_train_t500_nim100_ninv40_pairs.pth
ERROR - main.py - 2022-04-05 01:37:56,717 - Traceback (most recent call last):
  File "main.py", line 211, in main
    runner.clip_finetune()
  File "/content/DiffusionCLIP/diffusionclip.py", line 143, in clip_finetune
    train_dataset, test_dataset = get_dataset(self.config.data.dataset, DATASET_PATHS, self.config)
  File "/content/DiffusionCLIP/datasets/data_utils.py", line 13, in get_dataset
    train_dataset, test_dataset = get_celeba_dataset(dataset_paths['CelebA_HQ'], config)
  File "/content/DiffusionCLIP/datasets/CelebA_HQ_dataset.py", line 55, in get_celeba_dataset
    train_transform, config.data.image_size)
  File "/content/DiffusionCLIP/datasets/CelebA_HQ_dataset.py", line 16, in __init__
    meminit=False,
lmdb.Error: /data/DiffusionCLIP/celeba_hq/LMDB_train: No such file or directory

Text editing in non-isolated images

Hi,
Thanks for your work. I am trying the pretrained models on a few test images to see what the results look like. I tried tennis_baseball_t500.pth: it works well when the tennis ball is well isolated, but not so well when the object is part of a scene. The paper says fine-tuning needs 30 or so images. Were these images well isolated? If I replace them with images where the tennis ball is a small part of the image, will the performance improve?

Text_dic for ImageNet manipulation

Hi,
Nice work! Which text_dic did you use for ImageNet manipulation? I didn't find it in text_dic.py. Also, does the definition of text_dic affect the results? If so, how can I define it better? Could you share some of your experience? Thank you.
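
To make the question concrete, the sketch below shows one plausible shape for such a dictionary: an edit name mapped to a pair of source and target text lists. The two entries reuse prompts that appear elsewhere on this page ("Dog" to "Bear" in a training log, "Human" to "Zombie" in another issue). The names `EDIT_TEXTS` and `lookup_edit_texts` are hypothetical, and the actual layout of text_dic.py may differ.

```python
# Hypothetical mapping: edit name -> (source texts, target texts).
# The structure is an assumption for illustration, not the repository's format.
EDIT_TEXTS = {
    "dog_bear": (["Dog"], ["Bear"]),
    "human_zombie": (["Human"], ["Zombie"]),
}


def lookup_edit_texts(edit_attr, table=EDIT_TEXTS):
    """Return (source_texts, target_texts) for edit_attr.

    Raises KeyError listing the known edit names when edit_attr is missing,
    so a typo in the edit name is easy to spot.
    """
    try:
        return table[edit_attr]
    except KeyError:
        known = ", ".join(sorted(table))
        raise KeyError(f"Unknown edit_attr {edit_attr!r}; known: {known}") from None


if __name__ == "__main__":
    source_texts, target_texts = lookup_edit_texts("dog_bear")
    print(source_texts, "->", target_texts)
```

With a layout like this, a new edit is added by inserting one more entry that pairs a neutral description of the source domain with a description of the target.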

Fine-tuning error

Hello,
I am trying to fine-tune a pre-trained model on the AFHQ dataset for the dog_bear task using Colab.
I have successfully saved the pre-trained model and set up the dataset.

data
└── afhq
    ├── LMDB_test
    │   ├── data.mdb
    │   └── lock.mdb
    ├── LMDB_train
    │   ├── data.mdb
    │   └── lock.mdb
    ├── LMDB_val
    │   ├── data.mdb
    │   └── lock.mdb
    └── raw_images
        ├── test
        │   └── images
        ├── test
        │   └── images
        └── val
            └── images

However, a ValueError occurs when I try to run the following cell:
!python main.py --clip_finetune_eff \
    --config afhq.yml \
    --exp ./runs/test \
    --edit_attr dog_bear \
    --do_train 1 \
    --do_test 1 \
    --n_train_img 50 \
    --n_test_img 10 \
    --n_iter 5 \
    --t_0 500 \
    --n_inv_step 40 \
    --n_train_step 6 \
    --n_test_step 40 \
    --lr_clip_finetune 8e-6 \
    --id_loss_w 0 \
    --l1_loss_w 1
INFO - main.py - 2024-06-13 17:40:33,558 - Using device: cuda

INFO - main.py - 2024-06-13 17:40:33,559 - Exp instance id = 39862
INFO - main.py - 2024-06-13 17:40:33,559 - Exp comment =
INFO - main.py - 2024-06-13 17:40:33,559 - Config =
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
./runs/test_FT_dog_dog_bear_t500_ninv40_ngen6_id0.0_l11.0_lr8e-06
['Dog']
-> ['Bear']
Improved diffusion Model loaded.
Setting optimizer with lr=8e-06
Loading losses
Prepare identity latent
precomputed/dog_train_t500_nim100_ninv40_pairs.pth
ERROR - main.py - 2024-06-13 17:40:44,029 - Traceback (most recent call last):
  File "/content/DiffusionCLIP/main.py", line 213, in main
    runner.clip_finetune_eff()
  File "/content/DiffusionCLIP/diffusionclip.py", line 423, in clip_finetune_eff
    loader_dic = get_dataloader(train_dataset, test_dataset, bs_train=self.args.bs_train,
  File "/content/DiffusionCLIP/datasets/data_utils.py", line 23, in get_dataloader
    train_loader = DataLoader(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 350, in __init__
    sampler = RandomSampler(dataset, generator=generator) # type: ignore[arg-type]
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/sampler.py", line 143, in __init__
    raise ValueError(f"num_samples should be a positive integer value, but got num_samples={self.num_samples}")
ValueError: num_samples should be a positive integer value, but got num_samples=0

Could you help me?
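
The traceback shows RandomSampler rejecting a dataset of length 0, so the training dataset handed to DataLoader appears to be empty. A minimal sketch of a guard that reports this earlier and more clearly follows. The helper name `require_non_empty` is hypothetical, and it only assumes the dataset object supports `len()`.

```python
def require_non_empty(dataset, name):
    """Return len(dataset), raising a descriptive ValueError if it is 0.

    Intended to be called right before a data loader is built, so an empty
    split is reported with its name instead of as num_samples=0.
    """
    n = len(dataset)
    if n == 0:
        raise ValueError(
            f"{name} is empty (0 samples). Check that the dataset path and "
            "its contents match what the loader expects."
        )
    return n


if __name__ == "__main__":
    # Stand-in lists; in practice these would be the dataset objects.
    print(require_non_empty(["sample_a", "sample_b"], "train_dataset"))
    try:
        require_non_empty([], "test_dataset")
    except ValueError as err:
        print(err)
```

If the guard fires, the next things to inspect are the dataset path in the configuration and whether the LMDB for that split actually contains entries.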
