
gcp-colorization's Introduction

Towards Vivid and Diverse Image Colorization with Generative Color Prior, ICCV 2021

arXiv
Yanze Wu, Xintao Wang, Yu Li, Honglun Zhang, Xun Zhao, Ying Shan
ARC Lab, Tencent PCG


Installation

Core dependencies

  • CUDA >= 10.0 (tested on 10.0, 10.2, and 11.1)
  • gcc >= 7.3
  • PyTorch >= 1.6 (tested on 1.6, 1.7.1, and 1.9.0)
  • Python 3 (tested on 3.8)
  • yacs
  • yacs

Install the DCN package (only required for torchvision < 0.9.0)

# you can skip this step if you are using torchvision >= 0.9.0
cd ops
python setup.py develop
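If you are unsure whether the DCN build step applies to your environment, the decision can be sketched as a small version check. This is a hypothetical helper for illustration; in practice you can simply import torchvision and read torchvision.__version__:

```python
# Decide whether the DCN package must be built: it is only needed
# when torchvision is older than 0.9.0.
def needs_dcn(torchvision_version: str) -> bool:
    major, minor = (int(x) for x in torchvision_version.split(".")[:2])
    return (major, minor) < (0, 9)

print(needs_dcn("0.8.2"))   # True  -> run `python setup.py develop` in ops/
print(needs_dcn("0.10.0"))  # False -> the DCN build can be skipped
```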

Download the Pre-trained Models

Download the pretrained models from Google Drive and put them into the assets folder. If you want to reproduce the results reported in our ICCV 2021 paper for academic purposes, check the model zoo. Otherwise, just use the default options, which load our best model.

Inference

Test on in-the-wild images

  1. Predict ImageNet labels (0-999) for the images

    1. install the awesome timm package: pip install timm
    2. use a SOTA classification model from timm to predict the labels
      python predict_imagenet_label.py testcase_in_the_wild --model beit_large_patch16_512 --pretrained
      here the testcase_in_the_wild folder contains the images you want to test
    3. you will get the label map in assets/predicted_label_for_user_image.txt
  2. Run colorization inference

    python main.py --expname inference_in_the_wild --test_folder testcase_in_the_wild DATA.FULL_RES_OUTPUT True
    options:
        --expname: the results will be saved in the results/{expname}-{time} folder
        --test_folder: the folder containing the images you want to test
        --bs: batch size
        DATA.FULL_RES_OUTPUT: True or False. If set to True,
                              the full-resolution results will be saved in results/{expname}-{time}/full_resolution_results;
                              the batch size must be 1 when this flag is set to True
        DATA.CENTER_CROP: whether to center-crop the input images.
                          This flag and DATA.FULL_RES_OUTPUT cannot both be set to True
  3. If everything goes well, you will get results similar to those in visual_results.md.
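The option constraints above can be summarized in a small check. This is a hypothetical helper written only to restate the rules; the actual validation lives inside the repo's yacs-based config handling, and the argument names below merely mirror the config keys:

```python
# Restate the documented constraints:
#  - DATA.FULL_RES_OUTPUT and DATA.CENTER_CROP cannot both be True
#  - batch size must be 1 when DATA.FULL_RES_OUTPUT is True
def validate_options(full_res_output: bool, center_crop: bool, bs: int) -> None:
    if full_res_output and center_crop:
        raise ValueError("DATA.FULL_RES_OUTPUT and DATA.CENTER_CROP cannot both be True")
    if full_res_output and bs != 1:
        raise ValueError("batch size must be 1 when DATA.FULL_RES_OUTPUT is True")

validate_options(full_res_output=True, center_crop=False, bs=1)  # OK, no exception
```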

Test images from ImageNet val set

  • The most worry-free way is to make sure the images' names are consistent with the official ImageNet names, because we provide the GT ImageNet labels in imagenet_val_label_map.txt. You can check testcase_imagenet for examples.

  • Also, the test images should preferably be color images rather than grayscale images. If you want to test on grayscale images, please read the read_data_from_dataiter function in base_solver.py, prepare the grayscale images following that pipeline, and adapt the related code.

  • Inference

    python main.py --expname inference_imagenet --test_folder testcase_imagenet DATA.FULL_RES_OUTPUT False DATA.CENTER_CROP False
  • If everything goes well, you will get the following quantitative results on the full 50,000 ImageNet validation images (yes, the FID is better than the number reported in our ICCV 2021 paper):

    eval@256x256       FID↓    Colorfulness↑   ΔColorfulness↓
    w/o center crop    1.325   34.89           3.45
    w/ center crop     1.262   34.74           4.12
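For reference, the Colorfulness column in tables like the one above is commonly computed with the Hasler & Süsstrunk measure. The sketch below is an illustrative NumPy implementation of that metric, not the exact evaluation script used for the paper:

```python
import numpy as np

# Hasler & Suesstrunk colorfulness: combine the spread and magnitude of
# the opponent color components rg = R - G and yb = (R + G) / 2 - B.
def colorfulness(img: np.ndarray) -> float:
    # img: H x W x 3 float array with RGB channels
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    rg = r - g
    yb = 0.5 * (r + g) - b
    std_root = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    mean_root = np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)
    return float(std_root + 0.3 * mean_root)

gray = np.full((64, 64, 3), 128.0)  # a flat gray image is not colorful at all
print(colorfulness(gray))           # 0.0
```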

Diverse Colorization

You can achieve diverse colorization by:

  1. adding random noise to the latent code

    python main.py --expname inference_random_diverse_color --latent_direction -1 --test_folder testcase_in_the_wild/testcase19_from_eccv2022_bigcolor.png DATA.FULL_RES_OUTPUT True
    (figure: input image and three diverse colorization results)
  2. walking through the interpretable latent space

    we use the method described in Unsupervised Discovery of Interpretable Directions in the GAN Latent Space to find the interpretable directions for BigGAN. By setting latent_direction to one of the color-relevant directions we found (e.g., 4, 6, 23), you can achieve more controllable diverse colorization.

    P.S.: we also provide the checkpoint (i.e., the full direction matrix), so you can find more color-relevant directions yourself.

    # get the label first
    python predict_imagenet_label.py testcase_diverse --model beit_large_patch16_512 --pretrained
    # the 1st direction we found
    python main.py --expname inference_diverse_color_dir4 --latent_direction 4 --test_folder testcase_diverse/test_diverse_inp0.png
    # the 2nd direction we found
    python main.py --expname inference_diverse_color_dir6 --latent_direction 6 --test_folder testcase_diverse/test_diverse_inp1.png
    python main.py --expname inference_diverse_color_dir6 --latent_direction 6 --test_folder testcase_diverse/test_diverse_inp2.png
    # the 3rd direction we found
    python main.py --expname inference_diverse_color_dir23 --latent_direction 23 --test_folder testcase_diverse/test_diverse_inp3.png
    Input Diverse colorization Diverse colorization Diverse colorization
    (figure: input image and three diverse colorization results)
  3. changing the category

    you can modify assets/predicted_label_for_user_image.txt to assign a different ImageNet category to an image
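Editing the label file by hand can be sketched as below. Note the line format is an ASSUMPTION (one "<image_name> <label_id>" pair per line); inspect the file actually produced by predict_imagenet_label.py before relying on it:

```python
# Swap the predicted ImageNet label of one image so the colorizer
# conditions on a different category.
# ASSUMPTION: each line of assets/predicted_label_for_user_image.txt
# reads "<image_name> <label_id>" -- verify against your own file.
def change_label(lines: list[str], image_name: str, new_label: int) -> list[str]:
    out = []
    for line in lines:
        name, _, _label = line.partition(" ")
        if name == image_name:
            line = f"{name} {new_label}"
        out.append(line)
    return out

labels = ["cat.png 281", "dog.png 207"]
print(change_label(labels, "dog.png", 250))  # retarget dog.png to class 250
```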

TODO

  • add colab demo

Citation

If you find this project useful for your research, please consider citing our paper:

@inproceedings{wu2021towards,
  title={Towards vivid and diverse image colorization with generative color prior},
  author={Wu, Yanze and Wang, Xintao and Li, Yu and Zhang, Honglun and Zhao, Xun and Shan, Ying},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  year={2021}
}

Acknowledgement

This project borrows code from CoCosNet, DGP, and BigGAN-PyTorch. The DCN code comes from an early version of mmcv, and predict_imagenet_label.py is from timm. Thanks to the authors for sharing their awesome projects.


gcp-colorization's Issues

The training codes?

Thank you for your work and your patience in answering my earlier issue.
I noticed that the training part of GCP-Colorization doesn't seem to be included in the currently released code. When I tried to reproduce your results, I encountered some difficulties due to the lack of some implementation details. Do you have any plans to open-source the training code? If so, when will you release it?
Thanks for your work again and looking forward to your reply.

file issue

Which folder do you put the pre-trained models in?

Question about inference

/home/fzh/.conda/envs/video/lib/python3.8/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: /home/fzh/.conda/envs/video/lib/python3.8/site-packages/torchvision/image.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
  warn(f"Failed to load image Python extension: {e}")
Cannot import deform_conv_ext. You can ignore this message if you are using torchvision >= 0.9.0. Otherwise you may need to check whether the DCN has been successfully installed.
Adding attention layer in D at resolution 64
Adding attention layer in E at resolution 64
Adding attention layer in G at resolution 64
Traceback (most recent call last):
  File "main.py", line 32, in <module>
    main()
  File "main.py", line 28, in main
    sol.run()
  File "/home/fzh/workspace/GCP-Colorization/solvers/base_solver.py", line 29, in run
    self.test()
  File "/home/fzh/workspace/GCP-Colorization/solvers/refcolor_solver.py", line 67, in test
    self.test_dl = data.get_loader(cfg=self.cfg, ds=self.cfg.DATA.NAME)
  File "/home/fzh/workspace/GCP-Colorization/data/__init__.py", line 21, in get_loader
    dataset = dataset_cls(cfg)
  File "/home/fzh/workspace/GCP-Colorization/data/imagenet_inference.py", line 99, in __init__
    assert img_name in label_map
AssertionError

Which specific ImageNet dataset should be used to train, to reproduce the results of your experiments in the paper?

Thank you for your excellent work. I was greatly inspired after reading your paper and wanted to reproduce your experimental results, but I encountered a problem during training. There are many versions of the ImageNet dataset to choose from on the official website:

https://image-net.org/download-images.php

Which ImageNet subset did you use to train your network?
Since I don't know enough about the ImageNet dataset, this may be a stupid question. Thanks, and looking forward to your answer.
