
stylegan-nada's Introduction

StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators (SIGGRAPH 2022)

Open In Colab · Kaggle · arXiv · CGP · Hugging Face Spaces

[Project Website] [Replicate.ai Project]

StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators
Rinon Gal, Or Patashnik, Haggai Maron, Amit H. Bermano, Gal Chechik, Daniel Cohen-Or

Abstract:
Can a generative model be trained to produce images from a specific domain, guided by a text prompt only, without seeing any image? In other words: can an image generator be trained blindly? Leveraging the semantic power of large scale Contrastive-Language-Image-Pre-training (CLIP) models, we present a text-driven method that allows shifting a generative model to new domains, without having to collect even a single image from those domains. We show that through natural language prompts and a few minutes of training, our method can adapt a generator across a multitude of domains characterized by diverse styles and shapes. Notably, many of these modifications would be difficult or outright impossible to reach with existing methods. We conduct an extensive set of experiments and comparisons across a wide range of domains. These demonstrate the effectiveness of our approach and show that our shifted models maintain the latent-space properties that make generative models appealing for downstream tasks.

Description

This repo contains the official implementation of StyleGAN-NADA, a Non-Adversarial Domain Adaptation method for image generators. At a high level, our method works using two paired generators. We initialize both using a pre-trained model (for example, FFHQ). We hold one generator constant and train the other by demanding that the direction between their generated images in CLIP space aligns with a given textual direction.
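
The core objective can be summarized with a short sketch (a simplified illustration, not the repository's exact implementation; clip_model is assumed to be a loaded CLIP model, and the image batches are assumed to already be preprocessed to CLIP's input format):

import torch
import torch.nn.functional as F
import clip

def directional_clip_loss(clip_model, img_frozen, img_trainable, source_text, target_text, device="cuda"):
    # Text direction in CLIP space: target prompt minus source prompt.
    tokens = clip.tokenize([source_text, target_text]).to(device)
    with torch.no_grad():
        text_features = clip_model.encode_text(tokens).float()
    text_dir = text_features[1] - text_features[0]

    # Image direction in CLIP space: trainable generator output minus frozen generator output.
    feat_frozen = clip_model.encode_image(img_frozen).float()
    feat_trainable = clip_model.encode_image(img_trainable).float()
    img_dir = feat_trainable - feat_frozen

    # Train the adapted generator so the image direction aligns with the text direction.
    return (1.0 - F.cosine_similarity(img_dir, text_dir.unsqueeze(0))).mean()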

The following diagram illustrates the process:

We set up a colab notebook so you can play with it yourself :) Let us know if you come up with any cool results!

We've also included inversion in the notebook (using ReStyle) so you can use the paired generators to edit real images. Most edits will work well with the pSp version of ReStyle, which also allows for more accurate reconstructions. In some cases, you may need to switch to the e4e based encoder for better editing at the cost of reconstruction accuracy.

Updates

18/05/2022 (A) Added HuggingFace Spaces demo
18/05/2022 (B) Added (partial) StyleGAN-XL support
03/10/2021 (A) Interpolation video script now supports InterFaceGAN-based editing.
03/10/2021 (B) Updated the notebook with support for target style images.
03/10/2021 (C) Added replicate.ai support. You can now run inference or generate videos without needing to set up anything or work with code.
22/08/2021 Added a script for generating cross-domain interpolation videos (similar to the top video in the project page).
21/08/2021 (A) Added the ability to mimic styles from an image set. See the usage section.
21/08/2021 (B) Added dockerized UI tool.
21/08/2021 (C) Added link to drive with pre-trained models.

Generator Domain Adaptation

We provide many examples of converted generators in our project page. Here are a few samples:

Setup

The code relies on the official implementation of CLIP, and the Rosinality pytorch implementation of StyleGAN2.

Requirements

  • Anaconda
  • Pretrained StyleGAN2 generator (can be downloaded from here). You can also download a model from here and convert it with the provided script (see the example below). See the colab notebook for examples.
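
For reference, a conversion call might look like the sketch below (the paths are placeholders; check the script's arguments for your setup):

python convert_weight.py --repo /path/to/stylegan2-ada-pytorch --gen /path/to/ffhq.pkl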

In addition, run the following commands:

conda install --yes -c pytorch pytorch=1.7.1 torchvision cudatoolkit=<CUDA_VERSION>
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
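
To quickly verify the CLIP installation, an optional sanity check from a Python shell:

import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)  # downloads the weights on first use
tokens = clip.tokenize(["a photo", "a sketch"]).to(device)
with torch.no_grad():
    print(model.encode_text(tokens).shape)  # expected: torch.Size([2, 512])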

Usage

To convert a generator from one domain to another, use the colab notebook or run the training script in the ZSSGAN directory:

python train.py --size 1024 \
                --batch 2 \
                --n_sample 4 \
                --output_dir /path/to/output/dir \
                --lr 0.002 \
                --frozen_gen_ckpt /path/to/stylegan2-ffhq-config-f.pt \
                --iter 301 \
                --source_class "photo" \
                --target_class "sketch" \
                --auto_layer_k 18 \
                --auto_layer_iters 1 \
                --auto_layer_batch 8 \
                --output_interval 50 \
                --clip_models "ViT-B/32" "ViT-B/16" \
                --clip_model_weights 1.0 1.0 \
                --mixing 0.0 \
                --save_interval 150

You should adjust size to match the resolution of the pre-trained model; the source_class and target_class descriptions control the direction of change. For an explanation of each argument (and a few additional options), please consult ZSSGAN/options/train_options.py. For most modifications the default parameters should be good enough. See the colab notebook for more detailed directions.

21/08/2021 Instead of using source and target texts, you can now target a style represented by a few images. Simply replace the --source_class and --target_class options with:

--style_img_dir /path/to/img/dir

where the directory should contain a few images (png, jpg or jpeg) with the style you want to mimic. There is no need to normalize or preprocess the images in any form.
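
For example, an image-guided run adapted from the text-guided command above might look like this (paths and iteration counts are illustrative):

python train.py --size 1024 --batch 2 --n_sample 4 \
                --output_dir /path/to/output/dir \
                --lr 0.002 \
                --frozen_gen_ckpt /path/to/stylegan2-ffhq-config-f.pt \
                --iter 301 \
                --style_img_dir /path/to/img/dir \
                --auto_layer_k 18 --auto_layer_iters 1 --auto_layer_batch 8 \
                --output_interval 50 --save_interval 150 \
                --clip_models "ViT-B/32" "ViT-B/16" --clip_model_weights 1.0 1.0 \
                --mixing 0.0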

Some results of converting an FFHQ model using children's drawings, LSUN Cars using Dali paintings and LSUN Cat using abstract sketches:

18/05/2022 StyleGAN3 / StyleGAN-XL models can be trained by appending the --sg3 or --sgxl flags to the training command. Please note that StyleGAN-3 based models (and XL among them) may display grid artifacts under fine-tuning, and that neither model currently supports layer freezing.

See the accompanying Colab notebook for an example of training with SG3.
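
As a rough sketch, a StyleGAN3-based run could adapt the text-guided command above as follows (the checkpoint path is a placeholder, the layer-freezing flags are dropped since layer freezing is not currently supported for SG3/XL models, and the remaining flags may need adjusting for your checkpoint):

python train.py --size 1024 --batch 2 --n_sample 4 \
                --output_dir /path/to/output/dir \
                --lr 0.002 \
                --frozen_gen_ckpt /path/to/stylegan3/model.pt \
                --iter 301 \
                --source_class "photo" --target_class "sketch" \
                --output_interval 50 --save_interval 150 \
                --clip_models "ViT-B/32" "ViT-B/16" --clip_model_weights 1.0 1.0 \
                --mixing 0.0 \
                --sg3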

Pre-Trained Models

We provide a Google Drive containing an assortment of models used in the paper, tweets and other locations. If you want access to a model not yet included in the drive, please let us know.

Docker

We now provide a simple dockerized interface for training models. The UI currently supports a subset of the colab options, but does not require repeated setups.

In order to use the docker version, you must have a CUDA compatible GPU and must install nvidia-docker and docker-compose first.

After cloning the repo, simply run:

cd StyleGAN-nada/
docker-compose up
  • Downloading the docker for the first time may take a few minutes.
  • While the docker is running, the UI should be available under http://localhost:8888/
  • The UI was tested using an RTX3080 GPU with 16GB of RAM. Smaller GPUs may run into memory limits with large models.

If you find the UI useful and want it expanded to allow easier access to saved models, support for real image editing, etc., please let us know.

Editing Video

In order to generate a cross-domain editing video (such as the one at the top of our project page), prepare a set of edited latent codes in the original domain and run the following generate_videos.py script in the ZSSGAN directory:

python generate_videos.py --ckpt /model_dir/pixar.pt             \
                                 /model_dir/ukiyoe.pt            \
                                 /model_dir/edvard_munch.pt      \
                                 /model_dir/botero.pt            \
                          --out_dir /output/video/               \
                          --source_latent /latents/latent000.npy \
                          --target_latents /latents/
  • The script relies on ffmpeg to function. On linux it can be installed by running sudo apt install ffmpeg
  • The argument to --ckpt is a list of model checkpoints used to fill the grid.
    • The number of models must be a perfect square, e.g. 1, 4, 9...
  • The argument to --target_latents can be either a directory containing a set of .npy w-space latent codes, or a list of individual files.
  • Please see the script for more details.

We provide example latent codes for the same identity used in our video. If you want to generate your own, we recommend using StyleCLIP, InterFaceGAN, StyleFlow, GANSpace or any other latent space editing method.
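
If you generate your own codes, each file passed to the script is simply a saved w-space latent. A minimal sketch of dumping one to disk (the shape here assumes a w+ code such as ReStyle/e4e produce for a 1024px generator; check the provided example latents for the exact shape generate_videos.py expects):

import numpy as np
import torch

# `latent` stands in for a real inverted or edited w+ code, e.g. of shape [18, 512].
latent = torch.randn(18, 512)
np.save("/latents/latent000.npy", latent.detach().cpu().numpy())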

03/10/2021 We now provide editing directions for use in video generation. To use the built-in directions, omit the --target_latents argument. You can use specific editing directions from the available list by passing them with the --edit_directions flag. See generate_videos.py for more information.
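
For instance, a run using the built-in directions might look like this (the names passed to --edit_directions are hypothetical; the real list is defined in generate_videos.py):

python generate_videos.py --ckpt /model_dir/pixar.pt /model_dir/ukiyoe.pt /model_dir/edvard_munch.pt /model_dir/botero.pt \
                          --out_dir /output/video/ \
                          --source_latent /latents/latent000.npy \
                          --edit_directions age smile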

Related Works

The concept of using CLIP to guide StyleGAN generation results was introduced in StyleCLIP (Patashnik et al.).

We invert real images into the GAN's latent space using ReStyle (Alaluf et al.).

Editing directions for video generation were taken from Anycost GAN (Lin et al.).

Citation

If you make use of our work, please cite our paper:

@misc{gal2021stylegannada,
      title={StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators}, 
      author={Rinon Gal and Or Patashnik and Haggai Maron and Gal Chechik and Daniel Cohen-Or},
      year={2021},
      eprint={2108.00946},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Additional Examples

Our method can be used to enable out-of-domain editing of real images, using pre-trained, off-the-shelf inversion networks. Here are a few more examples:

stylegan-nada's People

Contributors

rinongal


stylegan-nada's Issues

About the bug when running Demo

Hi Rinon,

I'm so interested in the work and tried to run the demo on the project page by inputting my own picture, but I got an error like this:


"local variable 'shape' referenced before assignment"

Do you know how to fix it?

Btw, what is the source style of the demo when I specify an input image? Is it extracted from the input image? I'm a bit confused about this.

Thanks a lot for your time!

target_img_list None

Hey, just a heads up. The colab linked in the readme doesn't work as provided. To fix it you have to add "target_img_list": None, into the training_args dictionary.
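
A sketch of the described fix (only the added key is shown; the rest of the training_args dictionary stays as in the notebook):

training_args = {
    # ... the notebook's existing arguments ...
    "target_img_list": None,  # added so text-guided training does not expect target images
}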

Super cool project, thanks for your work!

Problem in target domain colour transfer

Hi,

Thanks for sharing the great work.

I trained StyleGAN-ADA on a custom dataset and am using the trained weights to initialise StyleGAN-NADA (both, the trainable generator and the frozen generator). I am trying to train StyleGAN-NADA under image guidance, and the image prompt I am using is a set of 5 images, which the StyleGAN-ADA model has already seen during training. However, although StyleGAN-NADA learns to perform domain transfer (in my case, essentially, colour transfer, as the object shape stays roughly the same) onto source images which are close in colour to the target image prompt, the domain (colour) transfer fails when the source images are much different in colour. For example, if a light green object is shown in the target image prompt, it will result in the light green colour being successfully transferred onto a dark green object in a source image, but not onto a pink or blue object in the source image.

Any help would be really appreciated in solving this problem!

how to control the performance of new domain

Thank you for sharing this. Using a CLIP model as the supervisory signal to train G is a good idea.

As your paper says, it can generate unseen domains, like "Pig", "Super Saiyan", or "Nicolas Cage".

I think this means only G has not seen data from those domains, while CLIP knows them, so it can supervise G to generate them.

I have two questions:
1) How can I improve the generation quality for "Pig" or "Super Saiyan" (other than with a latent mapper) by using "Pig" images?
2) How can I generate a new domain that CLIP has also never seen, like some medical images, when I have a few examples of them?

How to train the ghost car

Hello, I am using the default settings in the notebook to train the ghost car, with training iter=800 instead, but the result I get is far from yours in the paper. Can you share the command you used to train the ghost car?

Question with quantitative evaluation on image diversity

Hello Rinongal,

As far as I know, you mentioned that you clustered the generated images using K-medoids, since your method does not use a training set.
How did you set the number of clusters K? (the few-shot adaptation methods choose K to be the number of training samples)

Can I ask you to provide details on evaluating image diversity quantitatively? If you can provide the source code for quantitative evaluation, I would be grateful. Thanks.

Size of output.

Right now the size of the output is 1024x1024 pixels. Can I change the size of the output?

StyleGAN how to generate B image using A source image

Hi,

Thanks for sharing the great work.

I am studying the StyleGAN and it is new for me.

I want to know how to use StyleGAN to create new B images from A images, i.e. how I can change the input images for StyleGAN.

Any help would be really appreciated in solving this problem!

Building an Inference Model

Hello, I'm trying to build an inference model to test several things that are exhibited under Additional Examples in your README, such as turning photos into a cubism painting style,

but it is a bit challenging to find proper pre-trained models (if you've got one) for it.

If you do happen to have them, would you be able to share the link for it?

thanks in advance :)

Local run issue

After cloning https://huggingface.co/spaces/rinong/StyleGAN-NADA/tree/main

Attempting to run locally for video generation throws an error:

OS: Win 11 x64
CPU: 8700k
GPU: 3090
Ram: 32GB

Anaconda3 - python3.8:

conda activate chunkmogrify

cd C:\Anaconda3\envs\chunkmogrify\Lib\site-packages\StyleGAN-NADA

python app.py

Browser:
http://127.0.0.1:7860/

After adding any image, and selecting Edit image:

Aligned image has shape: (256, 256)
Traceback (most recent call last):
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\gradio\routes.py", line 275, in predict
output = await app.blocks.process_api(body, username, session_state)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\gradio\blocks.py", line 274, in process_api
predictions = await run_in_threadpool(block_fn.fn, *processed_input)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\starlette\concurrency.py", line 39, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\anyio\to_thread.py", line 28, in run_sync
return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\anyio_backends_asyncio.py", line 818, in run_sync_in_worker_thread
return await future
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\anyio_backends_asyncio.py", line 754, in run
result = context.run(func, *args)
File "app.py", line 194, in inner
return func(self, *args, edit_choices)
File "app.py", line 231, in edit_image
return self.predict(input, output_styles, edit_choices=edit_choices)
File "app.py", line 252, in predict
inverted_latent = self.invert_image(input)
File "app.py", line 152, in invert_image
images, latents = self.run_on_batch(transformed_image.unsqueeze(0))
File "app.py", line 300, in run_on_batch
images, latents = self.e4e_net(
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Anaconda3\envs\chunkmogrify\Lib\site-packages\StyleGAN-NADA\e4e\models\psp.py", line 61, in forward
codes = self.encoder(x)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Anaconda3\envs\chunkmogrify\Lib\site-packages\StyleGAN-NADA\e4e\models\encoders\psp_encoders.py", line 174, in forward
x = self.input_layer(x)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\torch\nn\modules\container.py", line 141, in forward
input = module(input)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\torch\nn\modules\conv.py", line 447, in forward
return self._conv_forward(input, self.weight, self.bias)
File "C:\Anaconda3\envs\chunkmogrify\lib\site-packages\torch\nn\modules\conv.py", line 443, in _conv_forward
return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.FloatTensor) should be the same

The above RuntimeError also occurred with:

WSL2 - Ubuntu 20.04.4 LTS

Anaconda3 - python3.9

_run_ninja_build fails on pytorch 1.7 but succeeds on pytorch 1.4

Hi, thanks for your excellent work! It is really instructive. However, I ran into a strange compilation problem while training the network. It reports "subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1" while building extension 'fused'. This only happens on pytorch 1.7.1; nothing goes wrong with torch 1.4 (but the CLIP models seem to require torch 1.7). I have been stuck on this for many days and still cannot find a solution. Any suggestions? BTW, torch 1.7 was installed using pip rather than conda; does this matter? Many thanks for the help. The environment I use is listed below:
ubuntu 20.14 pytorch 1.7.1 torchvision 0.8.2 torchaudio 0.7.2 CUDA 10.1 ninja 1.8.2

how to set style_img_dir

Hello. You said in the readme that we can replace the source class and target class with style_img_dir. But how do we do that in practice? Are two different image paths assigned to the same option?

Docker image old version of code

Am I right that the docker container has an old version of the code?
I try to pass images to train the model, but the result is different; the model does not use the images for training.

TypeError: tensor is not a torch image.

Hi,
Beautiful work!
Installing dependencies on a new conda environment and running as instructed gives the error:

Loading base models...
Models loaded! Starting training...
torch.Size([1, 3, 1024, 1024])
Traceback (most recent call last):
File "train_colab.py", line 144, in
[sampled_src, sampled_dst], clip_loss = net(sample_z)
File "/home/user/miniconda3/envs/nada/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/user/dev/StyleGAN-nada/ZSSGAN/model/ZSSGAN.py", line 278, in forward
clip_loss = torch.sum(torch.stack([self.clip_model_weights[model_name] * self.clip_loss_models[model_name](frozen_img, self.source_class, trainable_img, self.target_class) for model_name in self.clip_model_weights.keys()]))
File "/home/user/dev/StyleGAN-nada/ZSSGAN/model/ZSSGAN.py", line 278, in
clip_loss = torch.sum(torch.stack([self.clip_model_weights[model_name] * self.clip_loss_models[model_name](frozen_img, self.source_class, trainable_img, self.target_class) for model_name in self.clip_model_weights.keys()]))
File "/home/user/miniconda3/envs/nada/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/user/dev/StyleGAN-nada/ZSSGAN/criteria/clip_loss.py", line 299, in forward
clip_loss += self.lambda_direction * self.clip_directional_loss(src_img, source_class, target_img, target_class)
File "/home/user/dev/StyleGAN-nada/ZSSGAN/criteria/clip_loss.py", line 181, in clip_directional_loss
src_encoding = self.get_image_features(src_img)
File "/home/user/dev/StyleGAN-nada/ZSSGAN/criteria/clip_loss.py", line 109, in get_image_features
image_features = self.encode_images(img)
File "/home/user/dev/StyleGAN-nada/ZSSGAN/criteria/clip_loss.py", line 80, in encode_images
images = self.preprocess(images).to(self.device)
File "/home/user/miniconda3/envs/nada/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 60, in call
img = t(img)
File "/home/user/miniconda3/envs/nada/lib/python3.7/site-packages/torchvision/transforms/transforms.py", line 163, in call
return F.normalize(tensor, self.mean, self.std, self.inplace)
File "/home/user/miniconda3/envs/nada/lib/python3.7/site-packages/torchvision/transforms/functional.py", line 201, in normalize
raise TypeError('tensor is not a torch image.')
TypeError: tensor is not a torch image.

Latent code generation

I am trying to provide my own latent codes for the "generate_videos.py" script. I am using ReStyle to generate the latent code from my own pictures. When I start the script, everything is fine for the first 120 frames. After that, every image is composed of multiple compressed random images. Is this a common issue?

I validated the latent code I am using. The shape of my latent code is identical to the shape of the latent code you provided. I noticed, though, that the range of values of my own latent code is much greater than yours. Do you scale the values produced by ReStyle? Could this be the source of the error?

The script works just fine for the latent code that you provided.

Issue with Colab notebook

Hello,

Today when I tried to run the 3rd step (training) of the Colab notebook I got the following error. Could you please assist?

Loading base models...

AttributeError Traceback (most recent call last)
in ()
69
70 print("Loading base models...")
---> 71 net = ZSSGAN(args)
72 print("Models loaded! Starting training...")
73

/content/stylegan_nada/ZSSGAN/model/ZSSGAN.py in init(self, args)
154
155 # Set up frozen (source) generator
--> 156 self.generator_frozen = SG2Generator(args.frozen_gen_ckpt, img_size=args.size, channel_multiplier=args.channel_multiplier).to(self.device)
157 self.generator_frozen.freeze_layers()
158 self.generator_frozen.eval()

AttributeError: 'Namespace' object has no attribute 'channel_multiplier'

Generate interpolation videos from trained models

I was wondering whether there's a possibility to share a script to generate videos from the trained model as shown on your project page. The .pt format is not easily usable for exporting videos from the original NVidia implementation. Any suggestion on how to implement it would be of much help.

Small models for 11GB GPUs

Hi. Thanks for opensourcing this amazing project. I am trying to train the network but I got OOM problem as I don't have any 16GB GPU. Could you please let me know which small models can I try on a 11GB GPU? Thanks so much!

From Human to Shrek and Thanos

Thanks for the great work!

I've managed to replicate most of your examples, except for some cases. Could you possibly provide the setup for the Human -> Shrek and Human -> Thanos translations?

Targeting Images Google Colab

Hi! I'm quite new when it comes to coding, and I've been struggling to do style targeting from a set of images on the colab. I've tried replacing the source class and target class options, but I only stumble upon a bunch of errors no matter what I do. Is that function not available yet on colab, or am I simply not understanding what I'm supposed to do correctly?

Black images on prediction

The net returns only NaN for me.

Maybe it is connected with converting the sg2 model?
Also, it is really hard for me to set up the model locally. I have tried a lot of different ways.

Error generating samples

Hi again :)
Once I train a model, I try to generate samples not with "# Step 4: Generate samples with the new model", but with the sample generator included in the cloned stylegan2-ada (rosinality) repo:
!python /content/stylegan_ada/generate.py --seeds 10,11 --network /content/output/checkpoint/000001.pt --outdir /content/

I'm getting this error when loading the file:

Loading networks from "/content/output/checkpoint/000001.pt"...
Traceback (most recent call last):
File "/content/stylegan_ada/generate.py", line 121, in
main()
File "/content/stylegan_ada/generate.py", line 116, in main
generate_images(**vars(args))
File "/content/stylegan_ada/generate.py", line 28, in generate_images
_G, _D, Gs = pickle.load(fp)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

Am I doing something wrong? Is it a bug? Many thanks.

from Human to Plastic Puppet

Thanks for the excellent work!
I want to train the generator from Human to Plastic Puppet using your adaptive layer-freezing approach, but I can't reproduce the result (Figure 13 in the paper) with your training script. I set improve_shape=True and lambda_global=1.0 in the script, but I'm not sure how to set the other parameters, such as "lambda_global", "lambda_patch", "lambda_texture", "lambda_manifold" and so on.
Can you share your training script for the adaptive layer-freezing approach?

No checkpoint saved

Hi, thanks for the excellent work!
I set save_interval to 50, but no checkpoint is saved in output_dir.

Gradio related inquiry

How might one load/include a generated .pt for use in the Gradio implementation after the command below has been completed?

Ex:

python train.py --size 1024 --batch 2 --n_sample 4 --output_dir output_images/images_style --lr 0.002 --frozen_gen_ckpt pretrained_models/stylegan2-ffhq-config-f.pt --iter 601 --style_img_dir images_style --auto_layer_k 18 --auto_layer_iters 1 --auto_layer_batch 8 --output_interval 50 --clip_models "ViT-B/32" "ViT-B/16" --clip_model_weights 1.0 1.0 --mixing 0.0 --save_interval 150

Result:

output_images/images_style/checkpoint/000600.pt

how to save the pkl file from colab?

I'm using this colab repo.
The new pkl file should end up in the checkpoints folder, right?
Or do I have to add some particular lines of code?
https://github.com/rinongal/StyleGAN-nada/blob/StyleGAN3-NADA/stylegan3_nada.ipynb

Edit:
I found the code below, but it doesn't work with the normal StyleGAN, is that right? And if so, how do I use it after generating?
I also found out how to get the .pkl files with save_interval, but those don't work with the normal StyleGAN either:
model_name = "network-snapshot-011120.pt"
torch.save(
    {
        "g_ema": net.generator_trainable.generator.state_dict(),
        "g_optim": g_optim.state_dict(),
    },
    f"{ckpt_dir}/{model_name}",
)
!ls /content/output/checkpoint

Is there any way to convert the file back into a normal StyleGAN model? This repo states that it can convert a stylegan2-nada .pt to a stylegan2 .pkl, so maybe this is also possible for StyleGAN3?
https://github.com/eps696/stylegan2ada
All my pictures are slightly tilted to the left; I normally use the visualizer to fix that, but it doesn't work with these files. :(

How to train on multiple GPUs?

Hello!
I want to train this model on multiple GPUs; can you give me some advice?
Looking forward to your reply!
Thanks.

How to use style mapper

Hi, thanks for your impressive work! I am not sure how to use a style mapper to improve performance in some cases. Could you please share some tips or scripts for training a style mapper and how to use it with a fine-tuned generator?

RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED

Hi! When I try to start training, I run into this problem:


Traceback (most recent call last):
File "train.py", line 147, in
train(args)
File "train.py", line 86, in train
[sampled_src, sampled_dst], loss = net(sample_z)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/pycharm_project_1/ZSSGAN/model/ZSSGAN.py", line 278, in forward
clip_loss = torch.sum(torch.stack([self.clip_model_weights[model_name] * self.clip_loss_models[model_name](frozen_img, self.source_class, trainable_img, self.target_class) for model_name in self.clip_model_weights.keys()]))
File "/mnt/pycharm_project_1/ZSSGAN/model/ZSSGAN.py", line 278, in
clip_loss = torch.sum(torch.stack([self.clip_model_weights[model_name] * self.clip_loss_models[model_name](frozen_img, self.source_class, trainable_img, self.target_class) for model_name in self.clip_model_weights.keys()]))
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/mnt/pycharm_project_1/ZSSGAN/criteria/clip_loss.py", line 294, in forward
clip_loss += self.lambda_direction * self.clip_directional_loss(src_img, source_class, target_img, target_class)
File "/mnt/pycharm_project_1/ZSSGAN/criteria/clip_loss.py", line 175, in clip_directional_loss
self.target_direction = self.compute_text_direction(source_class, target_class)
File "/mnt/pycharm_project_1/ZSSGAN/criteria/clip_loss.py", line 113, in compute_text_direction
source_features = self.get_text_features(source_class)
File "/mnt/pycharm_project_1/ZSSGAN/criteria/clip_loss.py", line 97, in get_text_features
text_features = self.encode_text(tokens).detach()
File "/mnt/pycharm_project_1/ZSSGAN/criteria/clip_loss.py", line 73, in encode_text
return self.model.encode_text(tokens)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/clip/model.py", line 344, in encode_text
x = self.transformer(x)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/clip/model.py", line 199, in forward
return self.resblocks(x)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
input = module(input)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/clip/model.py", line 186, in forward
x = x + self.attention(self.ln_1(x))
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/clip/model.py", line 183, in attention
return self.attn(x, x, x, need_weights=False, attn_mask=self.attn_mask)[0]
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/modules/activation.py", line 987, in forward
attn_mask=attn_mask)
File "/root/miniconda3/envs/myconda/lib/python3.7/site-packages/torch/nn/functional.py", line 4790, in multi_head_attention_forward
attn_output_weights = torch.bmm(q, k.transpose(1, 2))
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasGemmStridedBatchedExFix( handle, opa, opb, m, n, k, (void*)(&falpha), a, CUDA_R_16F, lda, stridea, b, CUDA_R_16F, ldb, strideb, (void*)(&fbeta), c, CUDA_R_16F, ldc, stridec, num_batches, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)


My environment is:
Python3.7
CUDA 11.1
cuDNN 8.0.5
Pytorch 1.8.1
Ubuntu 18.04

Do you have any ideas about this? Thanks for your help!

Pre-Trained Models

Hi there!

Thanks for your awesome project!

I was going through your project and found the image with the different examples.

There are a few examples whose pre-trained models I can't find in your shared drive (e.g. shrek, simpsons, thanos and a few more). Would it be possible to add the pre-trained models of those examples as well?

Thanks in advance!

output PT format

Hi Rinongal, first of all many thanks for your amazing repo :)
I assume that the generated .pt models under the output/checkpoint folder are also in the PyTorch (rosinality) format, is that correct?
Is there any way to transform them into a "standard" PyTorch or TensorFlow format?

Thanks again and sorry if it's off-topic
S.

Uploading trained model to Aws Model Registry

Hi @rinongal, I have fine-tuned the StyleGAN-NADA model and saved it with torch.save(net, "model.pt"), but that didn't save the model class with it, so it can't be deployed to the AWS Model Registry as a standalone .pt file to serve users. Can you help me with how to package the entire model together with its class?

Style transfer of "White Walker"

Hi, thanks for your impressive work! I tried to transfer the style with "White Walker" but the result is not as good as in the paper (especially the hair and mouth). How should I reproduce the result in your paper?

Directional loss without templates

Hi all. I noticed that by default the prompts used to compute the text direction are composed from a template. This generates many directions, and their mean is then used. This behaviour is not documented in the paper; am I missing something? Can I obtain the same results by only using two prompts (source and target), as described originally?
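
For reference, a minimal sketch of this kind of template-based prompt averaging (the template list here is illustrative, not the repository's actual one):

import torch
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)

templates = ["a photo of a {}.", "a painting of a {}.", "a cropped photo of a {}."]  # illustrative only

def mean_text_embedding(class_name):
    prompts = [t.format(class_name) for t in templates]
    tokens = clip.tokenize(prompts).to(device)
    with torch.no_grad():
        features = model.encode_text(tokens).float()
    features = features / features.norm(dim=-1, keepdim=True)
    return features.mean(dim=0)

# Mean target embedding minus mean source embedding gives the text direction.
text_direction = mean_text_embedding("sketch") - mean_text_embedding("photo")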

Colab 'Namespace' object has no attribute 'clip_models'` error

     41 args = Namespace(**training_args)
     42
---> 43 net = ZSSGAN(args)
     44
     45 g_reg_ratio = 4 / 5

/content/stylegan_nada/ZSSGAN/model/ZSSGAN.py in __init__(self, args)
    172                 lambda_texture=args.lambda_texture,
    173                 clip_model=model_name)
--> 174                 for model_name in args.clip_models}
    175
    176         self.clip_model_weights = {model_name: weight for model_name, weight in zip(args.clip_models, args.clip_model_weights)}

AttributeError: 'Namespace' object has no attribute 'clip_models'

While running sample colab notebook Step 3, this error happens. Is there any fix for this?

Training setups

Hello. Thank you for this inspiring work!

I have tried to train a few generators for various styles using the train.py script and have noticed that the results depend a lot on the chosen setup: number of iterations, source and target classes, number of layers to train at each iteration (i.e. auto_layer_k) etc. I assume that seeing more examples of "successful" setups (those that lead to qualitative results) can help a lot to actually understand how to select the parameters effectively.

Could you share the commands you used to launch the training for some of the styles mentioned in the article, please?

I am particularly interested in these transformations:

  1. human -> white walker
  2. human -> werewolf
  3. human -> tolkien elf
  4. photo -> painting in the style of Edvard Munch

Image Quality

Hi,

I have a model trained at 2K resolution with StyleGAN-ADA and the results are of really good quality. I've tried StyleGAN-NADA to transfer my trained model to a new domain using the style-imgs option, and the results in the new domain are quite good; however, the resulting images have lower quality than the original images generated by StyleGAN-ADA.
The new images are quite blurred, so I'm wondering if it's possible to adjust some parameters of the model to get sharper images.

StyleGAN3 Port?

I was not able to find a StyleGAN3 version of this project, so I gave it a shot but got stuck because the StyleGAN2 and StyleGAN3 models are apparently quite different.

For instance, the forward function for StyleGan 3 takes two tensors 'z' and 'c'

 def forward(self, z, c, truncation_cutoff=None, update_emas=False):

While the StyleGAN2 forward function takes only a single tensor 'styles':

def forward(
    self,
    styles,
    return_latents=False,
    inject_index=None,
    truncation=1,
    truncation_latent=None,
    input_is_latent=False,
    input_is_s_code=False,
    noise=None,
    randomize_noise=True,
):

Any suggestions? This repo seems quite powerful, and it would be nice if it could lose the old TF support.
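
As a purely illustrative sketch (not the repository's approach), one way to bridge the two interfaces is a thin wrapper that exposes a StyleGAN3 generator through a StyleGAN2-style call, assuming an unconditional model so the label tensor c is empty:

import torch
import torch.nn as nn

class SG3ToSG2Adapter(nn.Module):
    # Illustrative only: assumes an unconditional model (c_dim == 0) and ignores
    # SG2-specific options such as noise injection and style mixing.
    def __init__(self, sg3_generator):
        super().__init__()
        self.g = sg3_generator

    def forward(self, styles, input_is_latent=False, **unused_sg2_kwargs):
        z = styles[0] if isinstance(styles, (list, tuple)) else styles
        c = torch.zeros(z.shape[0], 0, device=z.device)  # empty label for unconditional models
        if input_is_latent:
            img = self.g.synthesis(z)  # assumes z is already a w+ code of shape [B, num_ws, w_dim]
        else:
            img = self.g(z, c)         # matches the SG3 forward(z, c, ...) signature shown above
        return img, None               # rosinality-style callers expect (image, latents)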

Question about specifying the style

Hi Rinon,

About the style of images, I have two questions:

  1. Can I specify both the source style (--source_class) and target style (--target_class) using several images? I know from the code that I can specify the target style with images, but how about the source class?

  2. Can you give us the list of all the available styles that can be used in the code? I'm a bit confused about how to set some styles correctly, like the ukiyo-e style: should I set it to "ukiyoe" or "ukiyo-e"? I'd like to reproduce the results in the paper.

  3. Finally, how can I generate the sampling results shown in the paper by myself? I ran the code and the saved results are four photos each time. How can I generate diverse results like the paper shows?

Question about directional loss implementation.

Thank you for your great work and I really like it.

One question that occurred to me is the purpose of normalizing edit_direction, since the directional loss is a cosine similarity, which already normalizes the compared features. In my experiments it shows almost the same visual results.

edit_direction /= edit_direction.clone().norm(dim=-1, keepdim=True)

I also noticed that some of the normalizations use .clone().norm() while others do not. What is the difference in their effects?
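
As a quick standalone check of the cosine-similarity point (unrelated to the repository's code), normalizing one argument first does not change the result:

import torch
import torch.nn.functional as F

a = torch.randn(4, 512)
b = torch.randn(4, 512)
a_normed = a / a.norm(dim=-1, keepdim=True)

# Cosine similarity is invariant to rescaling either argument.
print(torch.allclose(F.cosine_similarity(a, b), F.cosine_similarity(a_normed, b), atol=1e-6))  # True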

Training on Colab error for StyleGan2

ModuleNotFoundError Traceback (most recent call last)
in ()
----> 1 from ZSSGAN.model.ZSSGAN import ZSSGAN
2
3 import numpy as np
4
5 import torch

/content/stylegan_nada/ZSSGAN/model/ZSSGAN.py in ()
15 from ZSSGAN.model.sg2_model import Generator, Discriminator
16 from ZSSGAN.criteria.clip_loss import CLIPLoss
---> 17 import ZSSGAN.legacy as legacy
18
19 def requires_grad(model, flag=True):

ModuleNotFoundError: No module named 'ZSSGAN.legacy'

Problems trying to convert the sg2 model

I am currently trying to run the program and I need to convert the ffhq.pkl model to a .pt one.

When I enter python stylegan_nada\convert_weight.py --repo stylegan_ada --gen models/ffhq.pkl it says:

Traceback (most recent call last):
  File "stylegan_nada\convert_weight.py", line 11, in <module>
    from ZSSGAN.model.sg2_model import Generator, Discriminator
  File "C:\Users\msk4x\Documents\Projekte\stylegan\stylegan_nada\ZSSGAN\sg2_model.py", line 11, in <module>
    from op import FusedLeakyReLU, fused_leaky_relu, upfirdn2d, conv2d_gradfix

How can I fix this?

Getting an error 'tensor is not a torch image.' when trying to run train.py

Initializing networks...
  0%|                                                                                                             | 0/301 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 147, in <module>
    train(args)
  File "train.py", line 86, in train
    [sampled_src, sampled_dst], loss = net(sample_z)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/ubuntu/efs/saeid/facelab/StyleGAN-nada/ZSSGAN/model/ZSSGAN.py", line 260, in forward
    train_layers = self.determine_opt_layers()
  File "/home/ubuntu/efs/saeid/facelab/StyleGAN-nada/ZSSGAN/model/ZSSGAN.py", line 216, in determine_opt_layers
    w_loss = [self.clip_model_weights[model_name] * self.clip_loss_models[model_name].global_clip_loss(generated_from_w, self.target_class) for model_name in self.clip_model_weights.keys()]
  File "/home/ubuntu/efs/saeid/facelab/StyleGAN-nada/ZSSGAN/model/ZSSGAN.py", line 216, in <listcomp>
    w_loss = [self.clip_model_weights[model_name] * self.clip_loss_models[model_name].global_clip_loss(generated_from_w, self.target_class) for model_name in self.clip_model_weights.keys()]
  File "/home/ubuntu/efs/saeid/facelab/StyleGAN-nada/ZSSGAN/criteria/clip_loss.py", line 190, in global_clip_loss
    image  = self.preprocess(img)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 60, in __call__
    img = t(img)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/torchvision/transforms/transforms.py", line 163, in __call__
    return F.normalize(tensor, self.mean, self.std, self.inplace)
  File "/home/ubuntu/anaconda3/envs/python3/lib/python3.6/site-packages/torchvision/transforms/functional.py", line 201, in normalize
    raise TypeError('tensor is not a torch image.')
TypeError: tensor is not a torch image.
