richardoey / travelgan_with_perceptual_loss

Implementation code for the thesis project "Photo-to-Emoji Transformation with TraVeLGAN and Perceptual Loss", the final project of my master's studies.

Python 100.00%
emoji-transformation photo-to-emoji image-transformations gan generative pytorch deep-learning generative-adversarial-network siamese-network vgg19

travelgan_with_perceptual_loss's Introduction

Photo-to-Emoji Transformation with TraVeLGAN and Perceptual Loss

PyTorch implementation of the thesis project "Photo-to-Emoji Transformation with TraVeLGAN and Perceptual Loss" (Chinese title: "基於TraVeLGAN與Perceptual Loss實現照片轉換表情符號之應用").

Getting Started (Training)

Steps:

  1. Download all of the files and folders in this repo and prepare the datasets. This project uses the CelebA dataset and a Bitmoji dataset. For the Bitmoji dataset, run python create_emojis.py and set the number of Bitmoji images in the num_emojis variable.

  2. Put the CelebA training images in the dataset/CelebA/trainA/ folder and the CelebA test images in dataset/CelebA/test.

  3. Put all of the Bitmoji images in the dataset/Bitmoji folder.

  4. Set up the config file configs/cifar.json. The settings you will usually change are the number of epochs, n_save_steps, and batch_size. I use batch_size=32 for faster convergence.

  5. Run the program with the command:

python train.py --log log_photo2emoji --project_name photo2emoji  
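Before launching training, it can help to confirm that the folders from steps 2 and 3 are in place. The following is a minimal sketch, not part of the repository: the helper name missing_dataset_dirs and the constant EXPECTED_DIRS are hypothetical, and the folder list is taken only from the steps above.

```python
from pathlib import Path

# Folders named in steps 2 and 3 above; adjust if your layout differs.
EXPECTED_DIRS = (
    "dataset/CelebA/trainA",
    "dataset/CelebA/test",
    "dataset/Bitmoji",
)


def missing_dataset_dirs(repo_root="."):
    """Return the expected dataset folders that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All dataset folders found.")
```

Run it from the repository root; an empty result means all three folders exist, though it does not check that they contain images.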

Testing

Steps:

  1. Set the saved_model key in config.json to ./log_photo2emoji/model_500.pt, or to the checkpoint for whichever iteration you want to use.

  2. Run the program with the command:

python testAtoB.py --project_name photo2emoji --log log_photo2emoji

NB: You can download the pretrained model from the OneDrive Link and place it in the log_photo2emoji folder.
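Step 1 of the testing instructions can also be scripted. The sketch below assumes the config is a flat JSON object that holds a saved_model key, as described above; the helper name set_saved_model is hypothetical and the repository may handle its config differently.

```python
import json
from pathlib import Path


def set_saved_model(config_path, checkpoint_path):
    """Point the config's saved_model key at checkpoint_path and save the file."""
    config_file = Path(config_path)
    config = json.loads(config_file.read_text(encoding="utf-8"))
    # Only this key is changed; every other setting is written back untouched.
    config["saved_model"] = checkpoint_path
    config_file.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return config
```

For example, set_saved_model("configs/config.json", "./log_photo2emoji/model_500.pt") would select the 500-epoch checkpoint, assuming the config file lives at that path.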

Folder structure

The following shows basic folder structure.

├── configs # config.json folder
├── dataset
│   ├── CelebA # Domain A (not included in this repo)
│   │   ├── trainA 
│   │   └── trainA_pair # edge-promoting results of CelebA to be saved here
│   │
│   ├── Bitmoji # Domain B (not included in this repo)
│   │   ├── trainB
│   │   └── trainB_pair # edge-promoting results of Bitmoji to be saved here
│   │
│   ├── bitmoji_api_info.md
│   ├── create_emojis.py
│   └── create_emojis_parallel.py
│
├── networks
│   └── default.py  # the Generator, Discriminator, Siamese network
│
├── photo2emoji # will be created using --project_name photo2emoji command
├── log_photo2emoji
│   └── model_500.pt # download this file (link at Pretrained Section)
│
├── samples # result samples folder
├── edge_promoting.py
├── losses.py   # loss functions code 
├── testAtoB.py # test code
├── train.py
├── trainer.py
└── utils.py

Result Samples

[Images: result samples 1 to 10]

Comparison

  1. TraVeLGAN (Original)

[Images: samples 1 to 5]

  2. TraVeLGAN + Perceptual Loss

[Images: samples 1 to 5]

Pretrained Model

You can download the pretrained model of this implementation (after 500 epochs) from the OneDrive Link.

Acknowledgments

This implementation code is inspired by


travelgan_with_perceptual_loss's Issues

Training won't start, can't understand the error

Hello! Thanks for the amazing idea and work.
Training won't start and fails with the following error:
File "/content/TraVeLGAN_with_perceptual_loss/losses.py", line 71, in forward
v_o = e_o[pairs[:, 0]] - e_o[pairs[:, 1]]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

I added a few print calls to TraVeLGAN_with_perceptual_loss/losses.py for debugging, as follows:
class TravelLoss(nn.Module):
    def __init__(self):
        super(TravelLoss, self).__init__()
        self.pair_selector = NegativePairSelector()
        self.angle_dist = nn.CosineSimilarity()
        self.mag_dist = nn.MSELoss(reduction='mean')

    # embedding_network = siamese network
    def forward(self, x_o, x_t, embedding_network):
        print('x_o, x_o.size(0)', x_o, x_o.size(0))  # added for debugging
        pairs = self.pair_selector(x_o.size(0))
        print('pairs', pairs)  # added for debugging
        e_o = embedding_network(x_o)
        print('e_o', e_o)  # added for debugging
        v_o = e_o[pairs[:, 0]] - e_o[pairs[:, 1]]
        e_t = embedding_network(x_t)
        v_t = e_t[pairs[:, 0]] - e_t[pairs[:, 1]]
        return (self.mag_dist(v_o, v_t) - self.angle_dist(v_o, v_t)).mean()

And here is the full output:

/content/TraVeLGAN_with_perceptual_loss
Namespace(hparams='config', input_size=1024, log='./log_photo2emoji/', project_name='photo2emoji')
Random Seed: 6203
Loading data..
Domain A edge-promoting start!!
Domain B edge-promoting start!!
Model loaded on device : cuda
saved model opts.log: ./log_photo2emoji/
Start training..
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at /pytorch/c10/core/TensorImpl.h:1156.)
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
x_o, x_o.size(0) tensor([[[[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
...,
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.],
[1., 1., 1., ..., 1., 1., 1.]],

     [[1., 1., 1.,  ..., 1., 1., 1.],
      [1., 1., 1.,  ..., 1., 1., 1.],
      [1., 1., 1.,  ..., 1., 1., 1.],
      ...,
      [1., 1., 1.,  ..., 1., 1., 1.],
      [1., 1., 1.,  ..., 1., 1., 1.],
      [1., 1., 1.,  ..., 1., 1., 1.]],

     [[1., 1., 1.,  ..., 1., 1., 1.],
      [1., 1., 1.,  ..., 1., 1., 1.],
      [1., 1., 1.,  ..., 1., 1., 1.],
      ...,
      [1., 1., 1.,  ..., 1., 1., 1.],
      [1., 1., 1.,  ..., 1., 1., 1.],
      [1., 1., 1.,  ..., 1., 1., 1.]]]], device='cuda:0') 1

pairs []
e_o tensor([[[[-0.2140, -0.1830, -0.1830, ..., -0.1830, -0.1830, -0.1962],
[-0.0688, -0.0689, -0.0689, ..., -0.0689, -0.0689, -0.0770],
[-0.0688, -0.0689, -0.0689, ..., -0.0689, -0.0689, -0.0770],
...,
[-0.0688, -0.0689, -0.0689, ..., -0.0689, -0.0689, -0.0770],
[-0.0688, -0.0689, -0.0689, ..., -0.0687, -0.0689, -0.0770],
[ 0.0322, 0.0649, 0.0649, ..., 0.0645, 0.0649, 0.0525]],

     [[ 0.0994,  0.1673,  0.1673,  ...,  0.1673,  0.1673,  0.1499],
      [ 0.1358,  0.1945,  0.1945,  ...,  0.1945,  0.1945,  0.1619],
      [ 0.1358,  0.1945,  0.1945,  ...,  0.1945,  0.1945,  0.1619],
      ...,
      [ 0.1358,  0.1945,  0.1945,  ...,  0.1945,  0.1945,  0.1619],
      [ 0.1358,  0.1945,  0.1945,  ...,  0.1943,  0.1945,  0.1619],
      [ 0.0830,  0.1578,  0.1578,  ...,  0.1570,  0.1578,  0.1395]],

     [[ 0.0879,  0.1331,  0.1331,  ...,  0.1331,  0.1331,  0.0967],
      [ 0.0671,  0.0954,  0.0954,  ...,  0.0954,  0.0954,  0.0433],
      [ 0.0671,  0.0954,  0.0954,  ...,  0.0954,  0.0954,  0.0433],
      ...,
      [ 0.0671,  0.0954,  0.0954,  ...,  0.0954,  0.0954,  0.0433],
      [ 0.0671,  0.0954,  0.0954,  ...,  0.0953,  0.0954,  0.0433],
      [ 0.0205,  0.0348,  0.0348,  ...,  0.0351,  0.0348, -0.0099]],

     ...,

     [[ 0.2573,  0.2636,  0.2636,  ...,  0.2636,  0.2636,  0.3076],
      [ 0.2788,  0.2641,  0.2641,  ...,  0.2641,  0.2641,  0.3184],
      [ 0.2788,  0.2641,  0.2641,  ...,  0.2641,  0.2641,  0.3184],
      ...,
      [ 0.2788,  0.2641,  0.2641,  ...,  0.2642,  0.2641,  0.3184],
      [ 0.2788,  0.2641,  0.2641,  ...,  0.2640,  0.2641,  0.3184],
      [ 0.2795,  0.2725,  0.2725,  ...,  0.2728,  0.2725,  0.2982]],

     [[-0.2516, -0.2385, -0.2385,  ..., -0.2385, -0.2385, -0.3603],
      [-0.3489, -0.3108, -0.3108,  ..., -0.3108, -0.3108, -0.4396],
      [-0.3489, -0.3108, -0.3108,  ..., -0.3108, -0.3108, -0.4396],
      ...,
      [-0.3489, -0.3108, -0.3108,  ..., -0.3108, -0.3108, -0.4396],
      [-0.3489, -0.3108, -0.3108,  ..., -0.3112, -0.3108, -0.4396],
      [-0.3442, -0.3314, -0.3314,  ..., -0.3307, -0.3314, -0.4594]],

     [[ 0.1763,  0.1709,  0.1709,  ...,  0.1709,  0.1709,  0.1492],
      [ 0.1521,  0.1422,  0.1422,  ...,  0.1422,  0.1422,  0.1239],
      [ 0.1521,  0.1422,  0.1422,  ...,  0.1422,  0.1422,  0.1239],
      ...,
      [ 0.1521,  0.1422,  0.1422,  ...,  0.1422,  0.1422,  0.1239],
      [ 0.1521,  0.1422,  0.1422,  ...,  0.1420,  0.1422,  0.1239],
      [ 0.2000,  0.1805,  0.1805,  ...,  0.1804,  0.1805,  0.1575]]]],
   device='cuda:0', grad_fn=<AddBackward0>)

Traceback (most recent call last):
  File "train.py", line 101, in <module>
    print('model.gen_update(x_a, x_b)',model.gen_update(x_a, x_b))
  File "/content/TraVeLGAN_with_perceptual_loss/trainer.py", line 172, in gen_update
    travel_loss = self.travel_loss(x_a, x_ab, self.siam) +
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/TraVeLGAN_with_perceptual_loss/losses.py", line 71, in forward
    v_o = e_o[pairs[:, 0]] - e_o[pairs[:, 1]]
IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

As far as I can see, the "pairs" variable is empty, but it should not be. What is the problem?
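One possible reading of the output above: x_o.size(0) prints 1, so the batch holds a single image, and no pair of distinct indices can be drawn from one item. The sketch below illustrates this under the assumption that the selector enumerates index pairs within the batch. all_index_pairs is a hypothetical stand-in, since the code of NegativePairSelector is not shown here. An empty list converted to a NumPy array is 1-dimensional with shape (0,), which would explain why pairs[:, 0] raises the IndexError.

```python
from itertools import combinations


def all_index_pairs(batch_size):
    """Return every unordered pair (i, j) with i < j from range(batch_size).

    Hypothetical stand-in for the repository's pair selector.
    """
    return list(combinations(range(batch_size), 2))


if __name__ == "__main__":
    # A batch of one image yields no pairs; larger batches yield n*(n-1)/2.
    for n in (1, 2, 4):
        print(n, len(all_index_pairs(n)), all_index_pairs(n))
```

If this is the cause, using a batch size of at least 2 and making sure no trailing batch of size 1 reaches the loss (for example with drop_last=True, if the data is loaded through torch.utils.data.DataLoader) should avoid the empty array.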
