lynnho / cyclegan-tensorflow-2

License: MIT License

Python 98.33% Shell 1.67%

cyclegan image-translation tensorflow tensorflow2 gan gans

cyclegan-tensorflow-2's Introduction

News

We have re-implemented CycleGAN in TensorFlow 2! The old versions are here: v1, v0.

CycleGAN - Tensorflow 2

Tensorflow 2 implementation of CycleGAN.

Paper: Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks

Author: Jun-Yan Zhu et al.

Exemplar results

summer2winter

row 1: summer -> winter -> reconstructed summer, row 2: winter -> summer -> reconstructed winter

horse2zebra

row 1: horse -> zebra -> reconstructed horse, row 2: zebra -> horse -> reconstructed zebra

apple2orange

row 1: apple -> orange -> reconstructed apple, row 2: orange -> apple -> reconstructed orange

Usage

Environment
- Python 3.6
- TensorFlow 2.2, TensorFlow Addons 0.10.0
- OpenCV, scikit-image, tqdm, oyaml
- we recommend Anaconda or Miniconda; you can then create the TensorFlow 2.2 environment with the commands below
```
conda create -n tensorflow-2.2 python=3.6

source activate tensorflow-2.2

conda install scikit-image tqdm tensorflow-gpu=2.2

conda install -c conda-forge oyaml

pip install tensorflow-addons==0.10.0
```
- NOTICE: if you create a new conda environment, remember to activate it before any other command
```
source activate tensorflow-2.2
```
Dataset
- download the summer2winter dataset
```
sh ./download_dataset.sh summer2winter_yosemite
```
- download the horse2zebra dataset
```
sh ./download_dataset.sh horse2zebra
```
- see download_dataset.sh for more datasets

Example of training

CUDA_VISIBLE_DEVICES=0 python train.py --dataset summer2winter_yosemite

tensorboard for loss visualization

tensorboard --logdir ./output/summer2winter_yosemite/summaries --port 6006

Example of testing

CUDA_VISIBLE_DEVICES=0 python test.py --experiment_dir ./output/summer2winter_yosemite


cyclegan-tensorflow-2's Issues

Generator loss starts around 2000 and never decreases

I am working on anomaly generation from normal data. The dataset consists of 41 features, all integers, so I pad each sample with the value 255 23 times to make an 8x8 image. The discriminator loss keeps fluctuating, and even after 1000 iterations the generator loss is still in the 3000 range.
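The padding step described in this issue can be sketched as follows. This is only an illustration: `pad_to_square` is a hypothetical helper, not part of this repository, and the sample values are made up.

```python
def pad_to_square(features, side=8, pad_value=255):
    """Pad a flat feature vector and reshape it into a side x side grid."""
    total = side * side
    if len(features) > total:
        raise ValueError(f"{len(features)} features do not fit in {side}x{side}")
    padded = list(features) + [pad_value] * (total - len(features))
    # Split the flat vector into rows of equal length.
    return [padded[r * side:(r + 1) * side] for r in range(side)]


sample = list(range(41))          # stand-in for one 41-feature record
grid = pad_to_square(sample)
print(len(grid), len(grid[0]))    # 8 8
print(grid[-1][-1])               # 255
```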

The output is empty when I run test.py

The output is empty when I run test.py, although train.py runs normally and both datasets/monet2photo/testA and datasets/monet2photo/testB contain images.
The last traceback in the report is:
  File "F:\python\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "F:\python\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
    op_def=op_def)
  File "F:\python\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
    self._traceback = tf_stack.extract_stack()

Training on 2 GPUs

Hi,
Thanks for your amazing work. I want to train this model on two RTX 2080 Ti GPUs. Do you think that is easy with the existing code, or do I need to make changes to run it on 2 GPUs? Thanks

Add requirements.txt or setup.py

When I try to run the training it fails with the following error (missing dependencies)

$ CUDA_VISIBLE_DEVICES=0 python train.py --dataset summer2winter_yosemite
Traceback (most recent call last):
  File "train.py", line 42, in <module>
    py.args_to_yaml(py.join(output_dir, 'settings.yml'), args)
  File "/content/CycleGAN-Tensorflow-2/pylib/argument.py", line 81, in args_to_yaml
    serialization.save_yaml(path, vars(namespace), **kwagrs)
  File "/content/CycleGAN-Tensorflow-2/pylib/serialization.py", line 36, in save_yaml
    import oyaml as yaml
ModuleNotFoundError: No module named 'oyaml'

Can you add a requirements.txt or a setup.py so that the setup can be easy
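Until a requirements file exists, a small pre-flight check can report missing dependencies before training starts. The sketch below is not part of the repository; the module list is an assumption derived from the Environment section (`cv2` and `skimage` are the import names of OpenCV and scikit-image).

```python
import importlib.util

# Import names for the packages listed in the Environment section (assumed list).
REQUIRED_MODULES = ["tensorflow", "tensorflow_addons", "cv2", "skimage", "tqdm", "oyaml"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Running it inside the activated conda environment shows which packages still need to be installed.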

Image size in the dataset

I noticed that the images in the example dataset are all square, with a resolution of 256*256. Does that mean I have to resize my own images to the same size when preparing my own dataset?
PS: My images have an aspect ratio of 1920:1080.
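One way to feed non-square images to a square pipeline is to resize so that the shorter side matches a load size and then take a centered square crop. The sketch below only computes the geometry. The function name is hypothetical, the parameter names are borrowed from the `load_size` and `crop_size` arguments visible in a traceback further down this page, and the values 286 and 256 are assumptions rather than confirmed defaults.

```python
def resize_and_center_crop_plan(width, height, load_size=286, crop_size=256):
    """Return the resized (w, h) and the centered crop box (left, top, right, bottom)."""
    if crop_size > load_size:
        raise ValueError("crop_size must not exceed load_size")
    # Scale so the shorter side equals load_size, keeping the aspect ratio.
    scale = load_size / min(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Take a crop_size x crop_size window from the middle of the resized image.
    left = (new_w - crop_size) // 2
    top = (new_h - crop_size) // 2
    return (new_w, new_h), (left, top, left + crop_size, top + crop_size)


size, box = resize_and_center_crop_plan(1920, 1080)
print(size)  # (508, 286)
print(box)   # (126, 15, 382, 271)
```

A center crop discards the left and right margins of a 1920:1080 frame; whether that is acceptable depends on where the relevant content sits in the images.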

Should I change the batch size?

Pardon me if this is a silly question.
I am using your code to train a model on my own datasets (1214 images for styleA and 1921 images for styleB; the image size is 256*256), with the default batch size of 1.
The training process is really slow: each epoch takes nearly 9 hours. Is this speed normal?
If not, should I change the batch size? What batch size gives a good balance between computational efficiency and accuracy?
Thank you in advance.
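The reported numbers can be turned into a per-step time, which is easier to compare across machines. This is a back-of-the-envelope sketch; it assumes one epoch iterates over the larger of the two image sets, which the issue does not confirm.

```python
import math


def seconds_per_iteration(hours_per_epoch, n_a, n_b, batch_size=1):
    """Estimate time per training step from the reported epoch duration."""
    # Assumption: one epoch iterates over the larger of the two image sets.
    steps_per_epoch = math.ceil(max(n_a, n_b) / batch_size)
    return hours_per_epoch * 3600 / steps_per_epoch


print(round(seconds_per_iteration(9, 1214, 1921), 1))  # 16.9
```

Under that assumption each step takes roughly 17 seconds, so it may be worth checking whether TensorFlow is actually using the GPU before tuning the batch size.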

Using the wgan configuration gives bad results

Hi,
I've tried to train the model with the "wgan" options turned on, and got pretty bad results.
For example (iter 240000):

I can see a grid-like pattern on the first translation (the cycled image looks fine), but I'm not sure what is causing it.
The grid is present on every result (except at the very beginning).

The command I'm using:
python3 train.py --output=output_wgan --adversarial_loss_mode=wgan --gradient_penalty_mode=wgan-gp
I wonder whether I should change some of the weights for better results, and if so, to which values?

Input 'filename' of 'ReadFile' Op has type float32 that does not match expected type of string.

I am getting this error while trying to run train.py.
Any ideas?

loss D is zero from the beginning

Thanks for your excellent implementation. I want to translate synthetic haze to real haze. When I train CycleGAN on my dataset, the loss of both discriminators is zero from the beginning, and the quality of the translated images is not as good as I expected. How can I improve this?
Here is the loss of D from TensorBoard:

And here is my dataset.
This is syn_haze:

And this is real_haze:

The dimension of the output of the discriminator

Hello!

I find that the output of the discriminator (h4) has dimension 32 x 32 x 1, and the code then calculates the loss:

a2b_dis = models.discriminator(a2b, 'b', reuse=True)
# losses
g_loss_a2b = tf.identity(ops.l2_loss(a2b_dis, tf.ones_like(a2b_dis)), name='g_loss_a2b')

I am confused, as I think the dimension of the output of the discriminator should be 1.

Could you please give some hints?

THX
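For context, the CycleGAN paper uses PatchGAN discriminators, which score overlapping image patches rather than the whole image, so a 2-D map of scores is expected. A least-squares loss over such a map still reduces to one scalar. The sketch below uses plain lists and assumes that `ops.l2_loss` is a mean of squared differences; that is an assumption about the quoted code, not something the issue confirms.

```python
def l2_loss(scores, targets):
    """Mean squared error over every position of a 2-D score map."""
    diffs = [
        (s - t) ** 2
        for score_row, target_row in zip(scores, targets)
        for s, t in zip(score_row, target_row)
    ]
    return sum(diffs) / len(diffs)


# A 32 x 32 map of discriminator scores, one per image patch (channel axis dropped).
patch_scores = [[0.5] * 32 for _ in range(32)]
ones = [[1.0] * 32 for _ in range(32)]   # plays the role of tf.ones_like(a2b_dis)

print(l2_loss(patch_scores, ones))  # 0.25
```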

Computer reboots when running test.py

I don't know why this happens. Can anyone help me? Thanks.

Outputs images look strange

I'm utilising the Adobe-MIT dataset to try and train the network to learn a professional editing style. Training set A is original unedited images, training set B is edited images. Images are all 8bit jpg files.

The outputs I get from the trained network using test.py look strange, wondering if anyone has insight.

The network seems to have learnt to invert part or all of the image. Could this be due to an overflow somewhere?

For these examples the network was trained on paired data, though I can't imagine that would have caused problems. Everything else is default.

Images are not being saved.

Every 100 iterations during training, when the image-saving code is reached, the model raises an exception. The checkpoint is saved, but no image is generated.
The same happens during testing: no image is generated.
I am using the TensorFlow v1 code.

Converting saved checkpoints to pb file

I'm trying to convert the saved ckpt files into a pb file.

I'm using the following to do so:

import tensorflow as tf

saver = tf.train.import_meta_graph('./checkpoints/pre_norm2pro/Epoch_(209)_(1100of2100).ckpt.meta', clear_devices=True)
graph = tf.get_default_graph()
input_graph_def = graph.as_graph_def()
sess = tf.Session()
saver.restore(sess, "./checkpoints/pre_norm2pro/Epoch_(209)_(1100of2100).ckpt.data")

The errors I'm getting are:

2018-06-04 16:23:27.608892: W tensorflow/core/util/tensor_slice_reader.cc:95] Could not open ./checkpoints/pre_norm2pro/Epoch_(209)_(1100of2100).ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
2018-06-04 16:23:27.608914: W tensorflow/core/framework/op_kernel.cc:1192] Data loss: Unable to open table file ./checkpoints/pre_norm2pro/Epoch_(209)_(1100of2100).ckpt.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

The model was trained successfully and runs inference using test.py

A complete log file is attached
log.txt

Edit: Also I assume the names of the output nodes are 'pred' and 'h4'?

Order of training D and G?

Hi, thanks for sharing this nice and simple implementation.
Have you tried changing the order of training D and G?
That is, you currently train G first and then the Ds. How about training the Ds first and then G?
It seems to me that training G against randomly initialized Ds is of little use.

conversion is not proper

I have attached one of the images from "sample_images_while_training" folder

This image was produced after 48 epochs, where
left image = input
middle = orange if the input is an apple, or vice versa
right = reconstruction of the input image

The problems I am facing: in the first row, the middle image (orange) has a stalk, which it should not have. Also, in the second row, the textures of the apple and the orange look the same.

Could you please suggest a solution to these problems?

Errors while changing batch size

I get an error when I try to change the batch size from 1 to 2 for the apple2orange dataset. The error log is:

Traceback (most recent call last):
  File "train.py", line 184, in <module>
    coord.join(threads)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/usr/local/lib/python3.5/dist-packages/six.py", line 693, in reraise
    raise value
  File "train.py", line 177, in <module>
    im.imwrite(im.immerge(sample_opt, 2, 3), '%s/Epoch_(%d)_(%dof%d).jpg' % (save_dir, epoch, it_epoch, batch_epoch))
  File "/home/vishal/vishal/btp/CYCLE_GAN/CycleGAN-Tensorflow-PyTorch-Simple-master/image_utils.py", line 132, in immerge
    img[j * h:j * h + h, i * w:i * w + w, ...] = image
ValueError: could not broadcast input array from shape (256,256,3) into shape (0,256,3)
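The shape `(0,256,3)` in the error is what an out-of-grid slot looks like: the call `im.immerge(sample_opt, 2, 3)` asks for a grid of six slots, and a larger sample batch would need more. The sketch below reproduces only the slicing arithmetic with plain ranges. It assumes that `2, 3` are rows and columns and that images are placed in row-major order, mirroring the indexing on the last traceback line; the helper name is hypothetical.

```python
def slot_shape(index, n_rows, n_cols, h, w):
    """Shape of the canvas region that image number `index` would be written to."""
    # Assumed layout, mirroring the slicing on the last traceback line:
    # column i and row j are derived from the flat image index.
    i, j = index % n_cols, index // n_cols
    canvas_h, canvas_w = n_rows * h, n_cols * w
    # Python slicing clips silently, so an out-of-grid row yields zero height.
    rows = range(canvas_h)[j * h:j * h + h]
    cols = range(canvas_w)[i * w:i * w + w]
    return len(rows), len(cols)


print(slot_shape(5, 2, 3, 256, 256))  # (256, 256): last valid slot of a 2 x 3 grid
print(slot_shape(6, 2, 3, 256, 256))  # (0, 256): a seventh image has no slot
```

If this reading is right, the number of rows passed to `immerge` would need to grow with the batch size, or only one sample per batch should be written to the grid.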

Translated output are blurry

Hi Lynn Ho,

I applied the TensorFlow 2 CycleGAN implementation to my own dataset. The translated output images are blurry. I went through the CycleGAN paper but could not find a parameter to tune against blurriness. Do you have any idea how this can be solved? Also, how do the gradient penalty mode and gradient penalty weight parameters in the train.py script work?

colored droplets on the result image

Colored dots (or droplets) are appearing on the result image.
I am really curious about what may have caused this issue.

Any tips or ideas on how to reduce such noise?

out of memory

When I run the PyTorch code, I get this issue. I have four 12 GB GPUs. What can I do?

using image from camera

Hi,
I'm using an image from a camera, and the resulting image is rotated. If I use images from online datasets, the code works perfectly.

Thank you

It seems tensorflow-addons no longer exists?

The error is below:
(base) xx:~/Workspace/CycleGAN-Tensorflow-2$ pip install tensorflow-addons
Collecting tensorflow-addons
Could not find a version that satisfies the requirement tensorflow-addons (from versions: )
No matching distribution found for tensorflow-addons

'len_dataset' is not defined in train.py

Hi, I'm a beginner. Why is "'len_dataset' is not defined" reported when I run train.py? What value should len_dataset be set to?

How to Resume Training From checkpoint

Hi,
I see the code in train.py

But I am not sure whether the training would resume from the latest checkpoint.
Does it resume from the latest checkpoint?

Since tensorflow addons is not available for windows this cannot be used.

Is there a way to run this without depending on tensorflow-addons?

a network structure diagram of the code

I am a beginner. Could you please provide a network structure diagram for the code (detailed down to the parameters of the convolutional and normalization layers)? Thank you!

Network model

Dear sir,
Thank you for the code you provided. I can already run it, but I still have some questions:
1. The training results are not very good. I trained for 6 epochs and the results do not look good; I don't know whether that is because the number of epochs is too small. How do yours look?
2. I read your code and it is very good, but there are some points I do not understand, such as the settings of some parameters. Can you provide the model of the network?
Thank you so much

Environment setup

Since TensorFlow 2.0 was released only recently and the related libraries are updated frequently, the prerequisites in the README may need further clarification. Over the past few days I ran into several problems setting up the environment, which turned out to be caused by incompatibilities between library versions and the code.

To run the current code, use the following versions of tensorflow-gpu and tensorflow-addons:
Tensorflow 2.0 Alpha: pip install tensorflow-gpu==2.0.0-alpha0
Tensorflow Addons 0.3.1: pip install tensorflow-addons==0.3.1

I tried the following combinations, and they did not work:
Tensorflow 2.0.0 + Tensorflow Addons 0.6.0
Tensorflow 2.0 Alpha + Tensorflow Addons 0.4.0 (or >0.4.0)

For the lower-level configuration, we need gcc, a GPU driver, CUDA and cuDNN. For everyone's reference, here is my configuration:
gcc 4.8.5
GPU: GeForce GTX 1080 Ti (the driver must be compatible with the GPU; you can find and download it here: https://www.nvidia.com/Download/index.aspx?lang=en-us )
CUDA 10.0
cuDNN 7.4.1

How to use multiple GPUs to train this network?

I have two GPUs. How do I modify the code to use both of them?

Choosing a good identity_loss_weight

Is 0.0 a good choice for a default value?

According to the paper, the default should be 0.5, shouldn't it?

Code: py.arg('--identity_loss_weight', type=float, default=0.0)

Gradient penalty interpolation is wrong

Hi, I really like your code.
I found a small detail at line 94 of
https://github.com/LynnHo/CycleGAN-Tensorflow-2/blob/master/tf2gan/loss.py
The operation should be inter = b + alpha * (a - b), not inter = a + alpha * (b - a); making this change fixes training with the gradient penalty.
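For reference, both forms describe points on the same line segment between a and b: b + alpha * (a - b) equals a + (1 - alpha) * (b - a). The sketch below only illustrates that algebraic identity with plain lists. It does not reproduce tf2gan/loss.py, and it says nothing about why the change helped in the reporter's run; the function names and values are made up.

```python
def interp_from_a(a, b, alpha):
    """inter = a + alpha * (b - a), element-wise."""
    return [x + alpha * (y - x) for x, y in zip(a, b)]


def interp_from_b(a, b, alpha):
    """inter = b + alpha * (a - b), element-wise."""
    return [y + alpha * (x - y) for x, y in zip(a, b)]


a, b = [0.0, 2.0], [4.0, -2.0]
print(interp_from_a(a, b, 0.25))  # [1.0, 1.0]
print(interp_from_b(a, b, 0.75))  # [1.0, 1.0]
```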

No training progress?

Hello, thanks for the great job!

It is the only TF implementation that works on Hiptensorflow and ROCm without problems.

However, I have observed that during training there is no progress...
After 9 epochs images in sample_images folder are not recognisable at all.

Everything looks pixelated and unrecognizable... And 0 epoch pictures actually look better (shapes can be seen)

Is it ok after 9 epochs?

AssertionError running train.py

AssertionError: images should be in the range of [-1.0,1.0]!

What does this mean?

Training stops at the 100th iteration of the inner epoch loop with this error message.
I changed nothing and just ran it with your modules.
Please help.
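The assertion message says that the image handed to the saving code contained values outside [-1.0, 1.0]. A quick diagnostic can tell a slight overshoot apart from NaN values, which would point to diverged training. The helpers below are hypothetical and work on a flat list of pixel values; they are not the repository's own checks.

```python
import math


def check_image_range(pixels, low=-1.0, high=1.0):
    """Return a short diagnosis for a flat list of pixel values."""
    if any(math.isnan(p) for p in pixels):
        return "contains NaN"
    lo, hi = min(pixels), max(pixels)
    if lo < low or hi > high:
        return f"out of range: min={lo}, max={hi}"
    return "ok"


def to_uint8(p):
    """Map a value in [-1.0, 1.0] to an integer in [0, 255]."""
    return int(round((p + 1.0) / 2.0 * 255.0))


print(check_image_range([-1.0, 0.0, 1.0]))        # ok
print(check_image_range([0.2, float("nan")]))     # contains NaN
print(check_image_range([-1.5, 0.3]))             # out of range: min=-1.5, max=0.3
print(to_uint8(-1.0), to_uint8(1.0))              # 0 255
```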

train on my own collected data failed

Traceback (most recent call last):
  File "train.py", line 112, in <module>
    a_test_pool = data.ImageData(sess, a_test_img_paths, batch_size, load_size=load_size, crop_size=crop_size)
  File "/home/aven/cycleGANtf/data.py", line 35, in __init__
    repeat)
  File "/home/aven/cycleGANtf/data.py", line 69, in _image_batch
    dataset = dataset.map(_parse_func, num_parallel_calls=num_threads)
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 988, in map
    return ParallelMapDataset(self, map_func, num_parallel_calls)
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 2230, in __init__
    super(ParallelMapDataset, self).__init__(input_dataset, map_func)
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 2198, in __init__
    map_func, "Dataset.map()", input_dataset)
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 1454, in __init__
    self._function.add_to_graph(ops.get_default_graph())
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/framework/function.py", line 481, in add_to_graph
    self._create_definition_if_needed()
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/framework/function.py", line 337, in _create_definition_if_needed
    self._create_definition_if_needed_impl()
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/framework/function.py", line 346, in _create_definition_if_needed_impl
    self._capture_by_value, self._caller_device)
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/framework/function.py", line 863, in func_graph_from_py_func
    outputs = func(*func_graph.inputs)
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/data/ops/dataset_ops.py", line 1392, in tf_data_structured_function_wrapper
    ret = func(*nested_args)
  File "/home/aven/cycleGANtf/data.py", line 57, in _parse_func
    img = tf.read_file(path)
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 528, in read_file
    "ReadFile", filename=filename, name=name)
  File "/usr/local/anaconda3/envs/cycleGANtf/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 533, in _apply_op_helper
    (prefix, dtypes.as_dtype(input_arg.type).name))
TypeError: Input 'filename' of 'ReadFile' Op has type float32 that does not match expected type of string.

I got it running on the downloaded horse2zebra dataset, but when I switch to my own collected data it gives the error above. I tried modifying the data type a bit, but that failed. Any suggestions on how to work this out?
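One possible cause, which the traceback alone does not confirm, is an empty list of image paths: TensorFlow converts an empty Python list to a float32 tensor by default, and that would produce exactly this type mismatch in `ReadFile`. A guard like the one below fails early with a clearer message. The helper name and the `*.jpg` pattern are assumptions for illustration.

```python
import glob
import os


def list_images(directory, pattern="*.jpg"):
    """Return sorted image paths, failing early if the folder has no matches."""
    paths = sorted(glob.glob(os.path.join(directory, pattern)))
    if not paths:
        # An empty Python list carries no string elements, so TensorFlow
        # converts it to an empty float32 tensor by default.
        raise FileNotFoundError(f"no files matching {pattern!r} in {directory!r}")
    return paths
```

If the custom images use another extension, such as `.png` or `.JPG`, a pattern that matches only `*.jpg` would return nothing, so the folder layout and file extensions are worth checking first.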

What do the samples mean?

This is more of a question than an issue, but what do the samples mean that are generated during training? I would be really grateful if someone could explain, please

Identity loss computed wrongly?

A2B_id_loss = identity_loss_fn(A, A2B)

Here it seems the identity loss is computed between an instance of A and an instance of B. To my understanding the identity loss is supposed to check that an image from domain A is not altered by a generator towards domain A. In other words, it should be computed between G_B2A(A) and A
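The distinction raised here can be shown with toy stand-ins. In the CycleGAN paper the identity term feeds a generator an image that is already in its target domain, so for domain A it compares G_B2A(A) with A. The generators below are placeholder functions on flat lists, not the repository's models, and the L1 form of the loss is an assumption.

```python
def l1_loss(x, y):
    """Mean absolute error between two equally sized flat lists."""
    return sum(abs(p - q) for p, q in zip(x, y)) / len(x)


def g_b2a(image):
    """Toy stand-in for the B->A generator: here it leaves images untouched."""
    return list(image)


def g_a2b(image):
    """Toy stand-in for the A->B generator: here it shifts every pixel."""
    return [p + 0.5 for p in image]


A = [0.1, -0.3, 0.8]                    # a real image from domain A
identity_loss = l1_loss(A, g_b2a(A))    # compares G_B2A(A) with A
translation_gap = l1_loss(A, g_a2b(A))  # compares G_A2B(A) with A

print(identity_loss)  # 0.0
```

With these toy generators, comparing A against G_A2B(A) penalizes the translation itself, while the identity term is zero whenever G_B2A leaves domain-A images unchanged.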