
CSIE 5612: Digital Image Processing (DIP) Final Project (Style Transfer on Image and Video)

This is the final project for the Digital Image Processing (DIP) course (CSIE 5612), Fall 2020, at National Taiwan University (NTU), Taiwan. The project renders images and videos in four required styles: sketch, ink painting (ink), watercolor (water), and oil painting (oil). It is implemented on top of a fork of an unofficial PyTorch implementation of the paper Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization [Huang+, ICCV 2017]. We are grateful to both the PyTorch implementation and the authors' original Torch implementation, which were very useful.
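
At the core of the method is the adaptive instance normalization (AdaIN) layer, which aligns the channel-wise mean and standard deviation of the content features with those of the style features. A minimal sketch of that operation in PyTorch (the epsilon value and tensor layout are assumptions for illustration; the implementation in this repository may differ in detail):

import torch

def adaptive_instance_norm(content_feat: torch.Tensor,
                           style_feat: torch.Tensor,
                           eps: float = 1e-5) -> torch.Tensor:
    """AdaIN: rescale content features with style statistics.

    Both inputs are assumed to be VGG feature maps of shape (N, C, H, W).
    """
    # Per-sample, per-channel statistics over the spatial dimensions.
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True)

    # Normalize the content features, then re-scale and re-center with style statistics.
    return s_std * (content_feat - c_mean) / c_std + s_mean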

Requirements

Please install the requirements with pip install -r requirements.txt

  • Python 3.5+
  • PyTorch 0.4+
  • TorchVision
  • Pillow

(optional, for training)

  • tqdm
  • TensorboardX

Usage

Download models

Download vgg_normalized.pth/decoder.pth and put them under models/.

Test

Use --content and --style to provide the paths to the content and style images, respectively.

CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content input/content/cornell.jpg --style input/style/woman_with_hat_matisse.jpg

You can also run the code on directories of content and style images using --content_dir and --style_dir. It will save every possible combination of content and styles to the output directory.

CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content_dir input/content --style_dir input/style

This is an example of mixing four styles by specifying the --style and --style_interpolation_weights options.

CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content input/content/avril.jpg --style input/style/picasso_self_portrait.jpg,input/style/impronte_d_artista.jpg,input/style/trial.jpg,input/style/antimonocromatismo.jpg --style_interpolation_weights 1,1,1,1 --content_size 512 --style_size 512 --crop
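
Under the hood, style interpolation can be understood as applying AdaIN once per style image and blending the resulting feature maps with the given weights before decoding. A rough sketch of that idea, reusing the adaptive_instance_norm helper sketched above (variable names are illustrative, not the repository's own):

import torch

def interpolate_styles(content_feat, style_feats, weights):
    """Blend AdaIN-transformed features from several style images.

    content_feat: (1, C, H, W) content features.
    style_feats:  list of (1, C, H, W) style features, one per style image.
    weights:      one non-negative weight per style, e.g. 1,1,1,1.
    """
    weights = torch.tensor(weights, dtype=torch.float32)
    weights = weights / weights.sum()   # normalize the interpolation weights
    blended = torch.zeros_like(content_feat)
    for w, style_feat in zip(weights, style_feats):
        blended = blended + w * adaptive_instance_norm(content_feat, style_feat)
    return blended                      # the decoder turns this back into an image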

Some other options:

  • --content_size: New (minimum) size for the content image. Keeps the original size if set to 0.
  • --style_size: New (minimum) size for the style image. Keeps the original size if set to 0.
  • --alpha: Adjusts the degree of stylization. It should be a value between 0.0 and 1.0; the default is 1.0 (a sketch of how alpha blends the content and AdaIN features follows this list).
  • --preserve_color: Preserves the color of the content image.
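
As in the AdaIN paper, the content-style trade-off controlled by --alpha can be implemented by interpolating between the original content features and the AdaIN output before decoding. A minimal sketch (again reusing adaptive_instance_norm from above; the encoder/decoder arguments are placeholders, not the exact interface of test.py):

def stylize(encoder, decoder, content_img, style_img, alpha=1.0):
    """Stylize a content image; alpha trades content fidelity against stylization."""
    assert 0.0 <= alpha <= 1.0
    content_feat = encoder(content_img)                 # frozen VGG encoder
    style_feat = encoder(style_img)
    t = adaptive_instance_norm(content_feat, style_feat)
    t = alpha * t + (1.0 - alpha) * content_feat        # alpha=1.0 means full stylization
    return decoder(t)                                   # decode back to image space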

Train

Use --content_dir and --style_dir to provide the directories of content and style images, respectively.

CUDA_VISIBLE_DEVICES=<gpu_id> python train.py --content_dir <content_dir> --style_dir <style_dir>

For more details and parameters, please refer to the --help option.
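
Training follows the objective described in the AdaIN paper: a content loss between the re-encoded output and the AdaIN target, plus a style loss that matches channel-wise means and standard deviations across several VGG layers. A condensed sketch of that loss (the layer setup and the style weight are assumptions mirroring the paper, not necessarily the exact values used in train.py):

import torch.nn.functional as F

def mean_std(feat, eps=1e-5):
    """Channel-wise mean and std over the spatial dims of a (N, C, H, W) map."""
    return feat.mean(dim=(2, 3), keepdim=True), feat.std(dim=(2, 3), keepdim=True) + eps

def adain(content_feat, style_feat):
    c_mean, c_std = mean_std(content_feat)
    s_mean, s_std = mean_std(style_feat)
    return s_std * (content_feat - c_mean) / c_std + s_mean

def adain_loss(encoders, decoder, content_img, style_img, style_weight=10.0):
    """Content + style loss in the spirit of the AdaIN paper.

    encoders: callables mapping an image to VGG features at increasing depths
              (e.g. relu1_1 ... relu4_1); the deepest one defines the content loss.
    """
    style_feats = [enc(style_img) for enc in encoders]
    content_feat = encoders[-1](content_img)

    t = adain(content_feat, style_feats[-1])
    output = decoder(t)

    out_feats = [enc(output) for enc in encoders]
    loss_content = F.mse_loss(out_feats[-1], t)

    loss_style = 0.0
    for out_f, style_f in zip(out_feats, style_feats):
        out_mean, out_std = mean_std(out_f)
        sty_mean, sty_std = mean_std(style_f)
        loss_style = loss_style + F.mse_loss(out_mean, sty_mean) + F.mse_loss(out_std, sty_std)

    return loss_content + style_weight * loss_style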

I share the model trained by this code here

Style Transfer on Image

For better output results, we work on content-color preservation, luminance-only style transfer, and spatial control.

Associated files

  • my_test.py

Reproducing results

The results in the presentation and all of the necessary content/style images can be found in the folder img. To reproduce all the results, run

./reproduce.sh

or run

./lake.sh
./woman.sh
./street.sh

respectively to reproduce results for each image.

Trying other images

Use --content and --style to provide the paths to the content and style images, respectively.

python my_test.py --content imgs/content/woman/woman.jpg --style imgs/style/woman/oil.png

Some other options:

  • --content_size: New (minimum) size for the content image. Keeps the original size if set to 0.
  • --style_size: New (minimum) size for the style image. Keeps the original size if set to 0.
  • --alpha: Adjusts the degree of stylization. It should be a value between 0.0 and 1.0; the default is 1.0.
  • --preserve_color: Preserves the color of the content image using a histogram-matching technique.
  • --luminance_only: Preserves the color of the content image using a luminance-transfer technique (see the sketch after this list).
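
The luminance-only option follows a common recipe: convert to a luminance/chroma color space, stylize only the luminance channel, and recombine it with the original chroma. A rough sketch of that idea using Pillow's YCbCr conversion (stylize_gray is a hypothetical callable, e.g. built from the stylize sketch above, not an actual entry point of my_test.py):

from PIL import Image

def luminance_only_transfer(content_path, stylize_gray):
    """Stylize only the luminance channel and keep the content's original chroma."""
    content = Image.open(content_path).convert("YCbCr")
    y, cb, cr = content.split()

    styled_y = stylize_gray(y)                       # style transfer on the luminance only
    styled_y = styled_y.resize(y.size).convert("L")  # back to a single-channel image

    # Recombine the stylized luminance with the untouched chroma channels.
    return Image.merge("YCbCr", (styled_y, cb, cr)).convert("RGB")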

Spatial controlling:

Use --mask to provide the respective paths to the mask images. Provide a different --style and --alpha for each mask to stylize different parts of the image separately; a compositing sketch follows the example command below.

python my_test.py \
--content imgs/content/lake/lake_bw.jpg \
--style imgs/style/lake/ink_front.jpg,imgs/style/lake/ink_back.jpg \
--mask imgs/content/lake/lake_foreground.jpg,imgs/content/lake/lake_background.jpg \
--alpha 0.8,1 \
--style_size 250 \
--crop_style \
--content_size 0 \
--output imgs/reproduce/lake/ink
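
Spatial control can be thought of as stylizing the whole image once per style and then compositing the results with the masks. A minimal sketch of the compositing step with NumPy (masks are assumed to be grayscale images in [0, 255] that roughly partition the content; names are illustrative):

import numpy as np
from PIL import Image

def composite_by_masks(stylized_images, mask_paths):
    """Blend several stylized versions of the same content according to masks."""
    result = np.zeros_like(np.asarray(stylized_images[0], dtype=np.float32))
    for styled, mask_path in zip(stylized_images, mask_paths):
        mask = Image.open(mask_path).convert("L").resize(styled.size)
        weight = np.asarray(mask, dtype=np.float32)[..., None] / 255.0
        result += weight * np.asarray(styled, dtype=np.float32)
    return Image.fromarray(np.clip(result, 0, 255).astype(np.uint8))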

Style Transfer on Video (Method 1)

For better output results, we work on content-color preservation, luminance-only style transfer, and video deflickering.

Associated files

  • my_video.py
  • luminance_adjustment.py
  • temporal_smoothing.py

Style transfer

Baseline

Use --style to select the desired style: oil/water/sketch/ink.

python my_video.py --style [style options]

By default, for the sketch and ink styles, the content frames are converted to grayscale and then stylized using all the information in the style image, i.e., both its luminance and its color. For the oil and water styles, only the luminance of the content is style-transferred. The additional argument --UV_color can be specified to transfer oil and water using all the information in the style image while preserving the color of the content after stylization.

Output file(s): (output videos can be found in the directory /output)

  • [specified_style].mp4: (for sketch and ink) The input video ntu.mp4 style-transferred into the specified style. The color of the output video depends on the style image.

  • luma_[specified_style].mp4: (for oil and water) The input video ntu.mp4 style-transferred into the specified style. Style transfer is applied to the luminance channel only.

  • UV_[specified_style].mp4: (for oil and water) The input video ntu.mp4 style-transferred into the specified style. Stylization uses all the information in the style image, while the U and V channels of the content are preserved.

Note that these output files still contain flicker. To perform video deflickering, follow the instructions below.
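
Conceptually, the video pipeline stylizes the clip frame by frame, which is also why flicker appears in the first place. A rough sketch of such a per-frame loop with OpenCV (stylize_frame and the codec choice are assumptions for illustration, not the actual internals of my_video.py):

import cv2

def stylize_video(in_path, out_path, stylize_frame):
    """Read a video, stylize every frame independently, and write the result."""
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    writer = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (width, height))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        writer.write(stylize_frame(frame))   # independent per-frame style transfer
    cap.release()
    writer.release()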

Video deflickering

Temporal smoothing:

python temporal_smoothing.py [videoPath]

Output file(s): (Output videos can be found in the directory /output)

  • TS_[videoName].mp4: the temporally smoothed version of the input video.
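
A simple form of temporal smoothing blends each stylized frame with the previous output, e.g. with an exponential moving average. A minimal sketch of that idea (the blending factor is an assumption, not necessarily what temporal_smoothing.py uses):

import numpy as np

def temporal_smooth(frames, momentum=0.7):
    """Exponentially average a sequence of frames to suppress flicker."""
    smoothed, prev = [], None
    for frame in frames:
        frame = frame.astype(np.float32)
        prev = frame if prev is None else momentum * prev + (1.0 - momentum) * frame
        smoothed.append(prev.astype(np.uint8))
    return smoothed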

Luminance adjustment: (suggested)

python luminance_adjustment.py [videoPath]

Output file(s): (Output videos can be found in the directory /output)

  • LA_[videoName].mp4: the luminance-adjusted version of the input video.
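
One way to implement luminance-based deflickering is to re-align each stylized frame's luminance statistics (mean and standard deviation) with those of the corresponding original frame. The sketch below illustrates that idea only; it is an assumption about the approach, not a description of what luminance_adjustment.py actually does:

import cv2
import numpy as np

def adjust_luminance(stylized_frame, original_frame):
    """Match the stylized frame's luminance statistics to the original frame."""
    styled = cv2.cvtColor(stylized_frame, cv2.COLOR_BGR2YUV).astype(np.float32)
    orig = cv2.cvtColor(original_frame, cv2.COLOR_BGR2YUV).astype(np.float32)

    y = styled[..., 0]
    y = (y - y.mean()) / (y.std() + 1e-5)                          # normalize stylized luminance
    styled[..., 0] = y * orig[..., 0].std() + orig[..., 0].mean()  # rescale to the original stats

    styled = np.clip(styled, 0, 255).astype(np.uint8)
    return cv2.cvtColor(styled, cv2.COLOR_YUV2BGR)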

Style Transfer on Video (Method 2)

  • Unzip OptiaclFlow.zip; the main folder is /fast-artistic-videos-master:
    cd /fast-artistic-videos-master

Environment Setting

1. Install torch7:
   http://torch.ch/docs/getting-started.html#_
2. Update the Torch packages:
   luarocks install torch
   luarocks install nn
   luarocks install image
   luarocks install lua-cjson
   luarocks install hdf5
3. Install loadcaffe

Sample command

  • Run with optical flow (this takes around 1-2 hours):
    ./stylizeVideo_deepflow.sh input/ntu.mp4 models/checkpoint-picasso-video.t7

Keep original color of video

python3 preserved.py

Output image

Output frames will be in /ntu with filenames of the form 0ut-00xxx.png

Output video

Output video will be in /fast-artistic-videos-master with filename ntu-stylized.mp4

Pre-trained style models

  • Pre-trained style models are in the /models folder:
    checkpoint-picasso-video.t7
    checkpoint-WomanHat-video.t7
    checkpoint-scream-video.t7
    checkpoint-schlief-video.t7
    checkpoint-mosaic-video.t7
    checkpoint-candy-video.t7

If there is any problem, please contact:
