
subpixel's Introduction

subpixel: A subpixel convolutional neural network implementation with TensorFlow

Left: input images / Right: output images with 4x super-resolution after 6 epochs:

See more examples inside the images folder.

At CVPR 2016, Shi et al. from Twitter VX (previously Magic Pony) published a paper called Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network [1]. Here we propose a reimplementation of their method and discuss future applications of the technology.

But first let us discuss some background.

Convolutions, transposed convolutions and subpixel convolutions

Convolutional neural networks (CNN) are now standard neural network layers for computer vision. Transposed convolutions (sometimes referred to as deconvolutions) are the GRADIENTS of a convolutional layer. Transposed convolutions were, as far as we know, first used by Zeiler and Fergus [2] for visualization purposes while improving their AlexNet model.

For visualization purposes, recall that the convolutions in the present subject are a sequence of inner products of a given filter (or kernel) with pieces of a larger image. This operation is highly parallelizable, since the kernel is the same throughout the image. People used to refer to convolutions as locally connected layers with shared parameters. Check out the figure below by Dumoulin and Visin [3]:

[Figure: animation of a convolution as a sliding inner product, from Dumoulin and Visin [3]]

Note though that convolutional neural networks can be defined with strides, or we can follow the convolution with maxpooling, to downsample the input image. The equivalent backward operation of a convolution with strides, in other words its gradient, is an upsampling operation where zeros are filled in between the non-zero pixels, followed by a convolution with the kernel rotated 180 degrees. See the representation, copied from Dumoulin and Visin again:

[Figure: transposed convolution as zero-interleaved upsampling followed by a convolution, from Dumoulin and Visin [3]]
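
To make the zero-filling concrete, here is a minimal numpy sketch of that backward operation (our own toy example, not part of this repo), assuming a 3x3 input and a 2x2 kernel:

import numpy as np
from scipy.signal import correlate2d

x = np.arange(9.0).reshape(3, 3)  # low-resolution input
k = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # the original convolution kernel

up = np.zeros((6, 6))
up[::2, ::2] = x                  # zeros filled in between the non-zero pixels

# Apply the kernel rotated 180 degrees (a CNN "convolution" is really a
# cross-correlation, so correlating with the rotated kernel matches the text).
y = correlate2d(up, k[::-1, ::-1], mode="same")
print(y.shape)                    # (6, 6): a 2x upsampled feature map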

For classification purposes, all that we need is the feedforward pass of a convolutional neural network to extract features at different scales. But for applications such as image super-resolution and autoencoders, both downsampling and upsampling operations are necessary in a feedforward pass. The community took inspiration from how the gradients are implemented in CNNs and applied them as a feedforward layer instead.

But as one may have observed, the upsampling operation as implemented above with strided convolution gradients adds zero values to upscale the image, and those zeros have to be later filled in with meaningful values. Maybe even worse, these zero values have no gradient information that can be backpropagated through.

To cope with that problem, Shi et al. [1] proposed what we argue to be one of the most useful recent convnet tricks (at least in my opinion as a generative model researcher!). They proposed a subpixel convolutional neural network layer for upscaling. This layer essentially uses regular convolutional layers followed by a specific type of image reshaping called a phase shift. In other words, instead of putting zeros in between pixels and having to do extra computation, they calculate more convolutions in lower resolution and resize the resulting map into an upscaled image. This way, no meaningless zeros are necessary. Check out the figure below from their paper. Follow the colors to get an intuition about how they do the image resizing. Check this paper for further understanding.

[Figure: the efficient subpixel convolution (phase shift) layer, from Shi et al. [1]]

Next we will discuss our implementation of this method and later what we foresee to be the implications of it everywhere where upscaling in convolutional neural networks was necessary.

Subpixel CNN layer

Following Shi et al., the equation for implementing the phase shift for CNNs is:

PS(T)_{x,y,c} = T_{floor(x/r), floor(y/r), C * r * mod(y,r) + C * mod(x,r) + c}

where r is the upscaling factor, C is the number of output channels, and T is the input tensor of shape (H, W, C * r^2).

In numpy, we can write this as

import numpy as np

def PS(I, r):
  # Phase shift: rearrange an (H, W, C*r**2) tensor into an (H*r, W*r, C) image
  assert len(I.shape) == 3
  assert r > 0
  r = int(r)
  C = I.shape[2] // (r ** 2)
  O = np.zeros((I.shape[0] * r, I.shape[1] * r, C))
  for x in range(O.shape[0]):
    for y in range(O.shape[1]):
      for c in range(C):
        a = x // r
        b = y // r
        d = C * r * (y % r) + C * (x % r) + c
        O[x, y, c] = I[a, b, d]
  return O
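
As a quick sanity check (our own toy example, not from the original README), the rearrangement turns an (H, W, C * r^2) map into an (H * r, W * r, C) image:

I = np.random.randn(8, 8, 3 * 4 ** 2)  # H = W = 8, C = 3, r = 4
O = PS(I, 4)
print(O.shape)  # prints (32, 32, 3)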

To implement this in TensorFlow we would have to create a custom operator and its equivalent gradient. But after staring at the image depiction of the resulting operation for a few minutes, we noticed how to write it using just regular reshape, split and concatenate operations. To see this, note that phase shift simply goes through the different channels of the output convolutional map and builds up neighborhoods of r x r pixels. We can do the same with a few lines of TensorFlow code:

def _phase_shift(I, r):
    # Helper function with main phase shift operation
    bsize, a, b, c = I.get_shape().as_list()
    X = tf.reshape(I, (bsize, a, b, r, r))
    X = tf.transpose(X, (0, 1, 2, 4, 3))  # bsize, a, b, r, r
    X = tf.split(1, a, X)  # a, [bsize, b, r, r]
    X = tf.concat(2, [tf.squeeze(x) for x in X])  # bsize, b, a*r, r
    X = tf.split(1, b, X)  # b, [bsize, a*r, r]
    X = tf.concat(2, [tf.squeeze(x) for x in X])  # bsize, a*r, b*r
    return tf.reshape(X, (bsize, a*r, b*r, 1))

def PS(X, r, color=False):
  # Main OP that you can arbitrarily use in your TensorFlow code
  if color:
    Xc = tf.split(3, 3, X)
    X = tf.concat(3, [_phase_shift(x, r) for x in Xc])
  else:
    X = _phase_shift(X, r)
  return X
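
For example (a hypothetical usage sketch of our own, written against the same TensorFlow 0.x API as the code above), upscaling a batch of 32x32 feature maps with 48 = 3 * 4**2 channels by r = 4 produces 128x128 color images:

X = tf.placeholder("float32", shape=(16, 32, 32, 48))  # 3 * 4**2 = 48 channels
Y = PS(X, 4, color=True)  # Y has shape (16, 128, 128, 3)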

The remainder of this library is an implementation of a subpixel CNN, using the proposed PS implementation, for super-resolution of celebA face images. The code was written on top of carpedm20/DCGAN-tensorflow, so follow the same instructions to use it:

$ python download.py --dataset celebA  # if this doesn't work, you will have to download the dataset by hand somewhere else
$ python main.py --dataset celebA --is_train True --is_crop True

Subpixel CNN future is bright

Here we want to forecast that subpixel CNNs are going to ultimately replace transposed convolutions (deconv, conv grad, or whatever you call it) in feedforward neural networks. Phase shift's gradient is much more meaningful, and the resizing operations are virtually free computationally. Our implementation is a high-level one, using default TensorFlow OPs. But next we will rewrite everything in Keras, so that an even larger community can use it. Plus, a CUDA backend level implementation would be even more appreciated.

But for now we want to encourage the community to experiment with replacing deconv layers with subpixel operations everywhere (a minimal sketch of the swap follows the list below). By everywhere we mean:

  • Conv-deconv autoencoders
    Similar to super-resolution, include subpixel in other autoencoder implementations, replacing deconv layers
  • Style transfer networks
    This didn't work as a lazy plug-and-play in our experiments. We have to look more carefully
  • Deep Convolutional Generative Adversarial Networks (DCGAN)
    We started doing this, but as predicted we have to change hyperparameters. The network power is totally different from deconv layers.
  • Segmentation Networks (SegNets)
    ULTRA LOW hanging fruit! This one will be the easiest. Free paper, you're welcome!
  • wherever upscaling is done with zero padding
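
As a concrete pattern for those swaps, here is a minimal sketch written against the DCGAN-tensorflow-style helpers this repo builds on (conv2d is that repo's convolution helper; x, r, c_out and the layer name are hypothetical): replace a deconv layer that upscales by r with a stride-1 convolution into c_out * r**2 channels, followed by PS:

# Before (transposed convolution upscaling by r), roughly:
#   h = deconv2d(x, [batch_size, r * h_in, r * w_in, c_out], name='g_up')
# After (subpixel): stride-1 conv to c_out * r**2 channels, then phase shift.
h = conv2d(x, c_out * r ** 2, k_h=3, k_w=3, d_h=1, d_w=1, name='g_up')
h = PS(h, r, color=(c_out == 3))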

Join us in the revolution to get rid of meaningless zeros in feedforward convnets; give suggestions here, and try our code!

Sample results

The top row is the input, the middle row is the output, and the bottom row is the ground truth.

by @dribnet

References

[1] Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network. By Shi et al.
[2] Visualizing and Understanding Convolutional Networks. By Zeiler and Fergus.
[3] A guide to convolution arithmetic for deep learning. By Dumoulin and Visin.

Further reading

Alex J. Champandard made a really interesting analysis of this topic in this thread.
For discussions about the differences between phase shift and a straight-up resize, please see the companion notebook and this thread.

subpixel's People

Contributors

baocin, dandandan, dribnet, edersantana, goldsmith, kyleyee23, morgangiraud, resistor


subpixel's Issues

when loss down?

Hi, I have some questions.
1. In the first epoch, the loss drops from 0.1 to 0.05. After 25 epochs there is no obvious change. Is this normal?
2. The samples generated during training look like this:

[image]

Some lines can be seen near the edges of the images. Is something wrong?

By the way, how do I use the trained model to generate a super-resolution image?

Thank you

_phase_shift(I, r) error while using batch size 1

I think there might be a problem with the function _phase_shift(I, r). My code was crashing when an image was resized from 1x1 to 2x2, probably because of some kind of dimension mismatch.
I changed it to:

def _phase_shift(I, r):
    # Helper function with main phase shift operation
    bsize, a, b, c = I.get_shape().as_list()
    X = tf.reshape(I, (bsize, a, b, r, r))
    X = tf.transpose(X, (0, 1, 2, 4, 3))  # bsize, a, b, 1, 1
    X = tf.split(1, a, X)  # a, [bsize, b, r, r]
    X = tf.concat(2, [tf.squeeze(x, squeeze_dims=1) for x in X])  # bsize, b, a*r, r
    X = tf.split(1, b, X)  # b, [bsize, a*r, r]
    X = tf.concat(2, [tf.squeeze(x, squeeze_dims=1) for x in X])  # bsize, a*r, b*r
    return tf.reshape(X, (bsize, a*r, b*r, 1))

It's just adding squeeze_dims=1. Now it seems to work with batch size = 1.

Difference from tf.depth_to_space

Hi,

I was amazed by this great idea compared to conv2d_transpose. Looking at the TensorFlow documentation I found tf.depth_to_space, which seems to do something similar (at least for the non-color version); am I mistaken? What's the difference? What's the reason you did not use this op?

TypeError: Input 'split_dim' of 'Split' Op has type float32 that does not match expected type of int32.

Traceback (most recent call last):
File "main.py", line 58, in
tf.app.run()
File "/home/yashiro32/virtualenvironment/neural_style_transfer/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 39, in main
dataset_name=FLAGS.dataset, is_crop=FLAGS.is_crop, checkpoint_dir=FLAGS.checkpoint_dir)
File "/home/yashiro32/virtualenvironment/neural_style_transfer/projects/subpixel/model.py", line 58, in init
self.build_model()
File "/home/yashiro32/virtualenvironment/neural_style_transfer/projects/subpixel/model.py", line 75, in build_model
self.G = self.generator(self.inputs)
File "/home/yashiro32/virtualenvironment/neural_style_transfer/projects/subpixel/model.py", line 167, in generator
h2 = PS(h2, 4, color=True)
File "/home/yashiro32/virtualenvironment/neural_style_transfer/projects/subpixel/subpixel.py", line 20, in PS
Xc = tf.split(3, 3, X)
File "/home/yashiro32/virtualenvironment/neural_style_transfer/local/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1214, in split
split_dim=axis, num_split=num_or_size_splits, value=value, name=name)
File "/home/yashiro32/virtualenvironment/neural_style_transfer/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 3261, in _split
num_split=num_split, name=name)
File "/home/yashiro32/virtualenvironment/neural_style_transfer/local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 513, in apply_op
(prefix, dtypes.as_dtype(input_arg.type).name))
TypeError: Input 'split_dim' of 'Split' Op has type float32 that does not match expected type of int32.

Testing our dataset

Hi,
I have trained the model using the celebA dataset and now I am going to test my own data. How should I do it? Where will the super-resolution test data (the output) be saved? In which file exactly?

Thanks for your help. :)

How does subpixel handle odd image sizes?

Such as an input image with size (?, 501, 301, ?),
considering a process:
images -> (501, 301)
pool -> (251, 151)
subpixel -> (502, 302)

I think it might need a tf.image.resize_bilinear at each subpixel convolution?

Evaluation

Will you release the quantitative evaluation code?

Implementation for keras

Hi, thanks for the nice work. I was wondering if you are planning to release Keras code for the subpixel layer soon? I am currently working on a problem where I am using Keras APIs. Thanks!

Thanks!

Query trained model

Hi, first a big thank you for publishing this work.
I am trying to use a trained model and query it with a new probe image.
This seems to me to be very important functionality; after all, that is what you train the network for, right?
But I couldn't find it anywhere. I tried writing something, but I get poor results.
Here is what I came up with; any insights would be most appreciated.
Thanks,
Omer

import os
import cv2
import numpy as np
from model import DCGAN
from utils import get_image, image_save, save_images
import tensorflow as tf
from scipy.misc import imresize

flags = tf.app.flags
flags.DEFINE_integer("batch_size", 64, "The size of batch images [64]")
flags.DEFINE_integer("image_size", 128, "The size of image to use")
flags.DEFINE_string("checkpoint_dir", "/home/omer/work/sub_pixel/models",
                    "Directory name to read the checkpoints [checkpoint]")
flags.DEFINE_string("test_image_dir", "/home/omer/work/sub_pixel/data/celebA/valid",
                    "Directory name of the images to evaluate")
flags.DEFINE_string("out_dir", "/home/omer/work/sub_pixel/out", "Directory name of to save results in")

FLAGS = flags.FLAGS


def doresize(x, shape):
    x = np.copy((x + 1.) * 127.5).astype("uint8")
    y = imresize(x, shape)
    return y


def main():
    with tf.Session() as sess:
        dcgan = DCGAN(sess, image_size=FLAGS.image_size, image_shape=[FLAGS.image_size, FLAGS.image_size, 3],
                      batch_size=FLAGS.batch_size,
                      dataset_name='celebA', is_crop=False, checkpoint_dir=FLAGS.checkpoint_dir)
        res = dcgan.load(FLAGS.checkpoint_dir)
        if not res:
            print ("failed loading model from path:" + FLAGS.checkpoint_dir)
            return

        i = 0
        files = []
        num_batches = len(os.listdir(FLAGS.test_image_dir)) / FLAGS.batch_size
        completed_batches = 0
        input_images = np.zeros(shape=(FLAGS.batch_size, FLAGS.image_size, FLAGS.image_size, 3))
        for f in os.listdir(FLAGS.test_image_dir):
            try:
                img_path = os.path.join(FLAGS.test_image_dir, f)
                if os.path.isdir(img_path):
                    i += 1
                    continue
                img = get_image(img_path, FLAGS.image_size, False)
                files.append(f)
                input_images[i] = img

                if i == FLAGS.batch_size - 1 or i == len(os.listdir(FLAGS.test_image_dir)) - 1:
                    batch_ready(dcgan, input_images, sess, files)

                    i = 0
                    input_images = np.zeros(shape=(FLAGS.batch_size, FLAGS.image_size, FLAGS.image_size, 3))
                    files = []
                    completed_batches += 1
                    print('done batch {0} out of {1}'.format(completed_batches, num_batches))
                else:
                    i += 1
            except Exception as e:
                print("problem working on:" + f)
                print (str(e))
                i += 1


def batch_ready(dcgan, input_images, sess, files):
    input_resized = [doresize(xx, (32, 32, 3)) for xx in input_images]
    sample_input_resized = np.array(input_resized).astype(np.float32)
    sample_input_images = np.array(input_images).astype(np.float32)
    output_images = sess.run(fetches=[dcgan.G],
                             feed_dict={dcgan.inputs: sample_input_resized, dcgan.images: sample_input_images})
    save_results(output_images, files)


def save_results(output_images, files):
    for k in range(0, len(files)):
        out_path = os.path.join(FLAGS.out_dir, files[k] + '_.png')
        out_img = output_images[0][k]

        # out_correct = ((out_img + 1) * 127.5).astype(np.uint8)
        # out_correct = cv2.cvtColor(out_correct, cv2.COLOR_RGB2BGR)
        # cv2.imshow('image', out_correct)
        # cv2.waitKey(0)

        image_save(out_img, out_path)


if __name__ == '__main__':
    main()


What to do with grayscale images?

I get an error trying grayscale images:
ValueError: Error when checking target: expected SubPixel to have shape (250, 250, 3) but got array with shape (250, 250, 1)
Thank you in advance!

_phase_shift does not generalize to batchsize 1

The following lines
line 12 X = tf.concat(2, [tf.squeeze(x) for x in X]) # bsize, b, a*r, r
line 14 X = tf.concat(2, [tf.squeeze(x) for x in X]) # bsize, a*r, b*r

in subpixel.py cause the first dimension to be dropped when the batch size is one. In fact, line 12 causes the dimension drop and line 14 throws an error.

I propose the following change:
X = tf.concat(2, [tf.squeeze(x, axis = 1) for x in X]) # bsize, b, a*r, r
X = tf.concat(2, [tf.squeeze(x, axis = 1) for x in X]) # bsize, a*r, b*r

g_loss is nan for gpu

The logs seem to show that the g_loss value is always nan on GPU, but strangely it works fine on CPU. Running with CUDA 7.5, cuDNN v4, a GTX 1070, and TensorFlow 0.10.0rc0.

Please confirm PS(I,r)

Please confirm that the function

def PS(I, r):
  assert len(I.shape) == 3
  assert r>0
  r = int(r)
  O = np.zeros((I.shape[0]*r, I.shape[1]*r, I.shape[2]/(r*2)))
  for x in range(O.shape[0]):
    for y in range(O.shape[1]):
      for c in range(O.shape[2]):
        c += 1
        a = np.floor(x/r).astype("int")
        b = np.floor(y/r).astype("int")
        d = c*r*(y%r) + c*(x%r)
        print a, b, d
        O[x, y, c-1] = I[a, b, d]
  return O

does not implement PS(T)_{x,y,c}:

PS(T)_{x,y,c} = T_{floor(x/r), floor(y/r), C * r * mod(y,r) + C * mod(x,r) + c}

In particular, the line in PS(I,r)

d = c*r*(y%r) + c*(x%r)

does not implement the index in PS(T)_{x,y,c}

C * r * mod(y,r) + C * mod(x,r) + c

where capital letter C and small letter c are two different things.
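
To make the mismatch concrete (our own toy check, not part of the original issue), evaluate both index formulas at one output position:

r, C = 2, 3            # upscale factor and number of output channels
x, y, c = 1, 1, 2      # a sample output position, with 0-based channel c
d_paper = C * r * (y % r) + C * (x % r) + c          # 6 + 3 + 2 = 11
d_code = (c + 1) * r * (y % r) + (c + 1) * (x % r)   # the function's 1-based c: 9
print(d_paper, d_code)  # 11 9 -> the two indexings disagree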

Pre-trained model

Thank you very much for this work, but would you release the pre-trained model? Thanks.

Not using the phase shift

Sorry if I am mistaken, but I never see the phase shift called in your code. In the model.py file you make three deconv2d layers. Looking at the code for these, it's just regular conv transposes. Where do you use the phase shift in your main code?

Dynamic image size [batch, None, None, 3] error.

Hi thank you for releasing this implementation.

I am trying to evaluate the model on images of different sizes, so I would like to define a placeholder as:

t_image = tf.placeholder('float32', [1, None, None, 3], name='input_image')

then when I use the subpixel layer, I will get:

    X = tf.reshape(I, (bsize, a, b, r, r))
  File "/ssd2/Workspace/env3/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2451, in reshape
    name=name)
  File "/ssd2/Workspace/env3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 508, in apply_op
    (input_name, err))
ValueError: Tried to convert 'shape' to a tensor and failed. Error: None values not supported.

Do you have any idea how to solve this problem? If we cannot use a dynamic size, we need to define a different inference graph for each image size, which is resource- and time-consuming.

Many thanks in advance.

The first paper that proposed subpixel convolution?

Hello, thanks for sharing the work. I want to know the first paper that proposed subpixel convolution, not the efficient subpixel convolution that you discuss in this work. Please answer me.

ValueError: not enough values to unpack (expected 4, got 0)

I met this problem when I ran the code:
Traceback (most recent call last):
File "main.py", line 58, in
tf.app.run()
File "/root/anaconda3/lib/python3.5/site-packages/tensorflow/python/platform/app.py", line 44, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 39, in main
dataset_name=FLAGS.dataset, is_crop=FLAGS.is_crop, checkpoint_dir=FLAGS.checkpoint_dir)
File "/home/subpixel-master/model.py", line 58, in init
self.build_model()
File "/home/subpixel-master/model.py", line 75, in build_model
self.G = self.generator(self.inputs)
File "/home/subpixel-master/model.py", line 167, in generator
h2 = PS(h2, 4, color=True)
File "/home/subpixel-master/subpixel.py", line 21, in PS
X = tf.concat(3, [_phase_shift(x, r) for x in Xc])
File "/home/subpixel-master/subpixel.py", line 21, in
X = tf.concat(3, [_phase_shift(x, r) for x in Xc])
File "/home/subpixel-master/subpixel.py", line 7, in _phase_shift
bsize, a, b, c = I.get_shape().as_list()
ValueError: not enough values to unpack (expected 4, got 0)

Bug in Phase Shift?

As an example

        x3 = np.arange(10*528*528*48).reshape(10, 528, 528, 48)
        X3 = tf.placeholder("float32", shape=(10, 528, 528, 48), name="X")# tf.Variable(x, name="X")
        Y3 = PS(X3, 3, color=True)
        y3 = sess.run(Y3, feed_dict={X3: x3})

This works fine when the second arg to PS is, say, 4, but it chokes when it's 3:

  File "/home/user/Dev/subpixel/model.py", line 75, in build_model
    self.G = self.generator(self.inputs)
  File "/home/user/Dev/subpixel/model.py", line 191, in generator
    h2 = PS(h2, 3, color=True)
  File "/home/user/Dev/subpixel/subpixel.py", line 21, in PS
    Xc = tf.split(3, 3, X)
  File "/home/user/Dev/subpixel/subpixel.py", line 9, in _phase_shift
    X = tf.reshape(I, (bsize, a, b, r, r))
  File "/home/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2166, in reshape
    name=name)
  File "/home/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 748, in apply_op
    op_def=op_def)
  File "/home/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2405, in create_op
    set_shapes_for_outputs(ret)
  File "/home/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1790, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/home/user/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1696, in _ReshapeShape
    (num_elements, known_elements))
ValueError: input has 44605440 elements, which isn't divisible by 2509056

Any ideas?

Keras SubPixel 3D

Hi, thank you for the Subpixel 2D implementation in Keras; I am able to run the 2D implementation.
I am currently working on 3D convolutions and trying to implement subpixel for a 3D convolution. I was wondering if there is an implementation for 3D convolution, and/or are there any specific parameters that I need to take into account when implementing Subpixel 3D?

Tensorflow error on main.py

Running on OS X, I get this error on any Python file I try to run:

{'batch_size': 64,
 'beta1': 0.5,
 'checkpoint_dir': 'checkpoint',
 'dataset': 'celebA',
 'epoch': 25,
 'image_size': 128,
 'is_crop': True,
 'is_train': False,
 'learning_rate': 0.0002,
 'sample_dir': 'samples',
 'train_size': inf,
 'visualize': False}
Traceback (most recent call last):
  File "main.py", line 58, in <module>
    tf.app.run()
  File "/Library/Python/2.7/site-packages/tensorflow/python/platform/default/_app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "main.py", line 39, in main
    dataset_name=FLAGS.dataset, is_crop=FLAGS.is_crop, checkpoint_dir=FLAGS.checkpoint_dir)
  File "/Users/fraserhemp/Documents/subpixel/model.py", line 58, in __init__
    self.build_model()
  File "/Users/fraserhemp/Documents/subpixel/model.py", line 70, in build_model
    self.G = self.generator(self.inputs)
  File "/Users/fraserhemp/Documents/subpixel/model.py", line 155, in generator
    h2 = PS(h2, 4, color=True)
  File "/Users/fraserhemp/Documents/subpixel/subpixel.py", line 21, in PS
    X = tf.concat(3, [_phase_shift(x, r) for x in Xc])
  File "/Users/fraserhemp/Documents/subpixel/subpixel.py", line 9, in _phase_shift
    X = tf.reshape(I, (bsize, a, b, r, r))
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 682, in reshape
    name=name)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 411, in apply_op
    as_ref=input_arg.is_ref)
  File "/Library/Python/2.7/site-packages/tensorflow/python/framework/ops.py", line 529, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/constant_op.py", line 178, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/Library/Python/2.7/site-packages/tensorflow/python/ops/constant_op.py", line 161, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))
  File "/Library/Python/2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 319, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/Library/Python/2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 259, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

image size problem

@dribnet The last commit where things work for me is 0bee08b
After that I'm getting the following error:

I tensorflow/core/common_runtime/gpu/gpu_device.cc:839] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 680, pci bus id: 0000:04:00.0)
Traceback (most recent call last):
  File "main.py", line 58, in <module>
    tf.app.run()
  File "/home/eder/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv))
  File "main.py", line 42, in main
    dcgan.train(FLAGS)
  File "/home/eder/python/ponynet/model.py", line 108, in train
    save_images(sample_images, [8, 8], './samples/reference.png')
  File "/home/eder/python/ponynet/utils.py", line 21, in save_images
    return imsave(inverse_transform(images), size, image_path)
  File "/home/eder/python/ponynet/utils.py", line 40, in imsave
    return scipy.misc.imsave(path, merge(images, size))
  File "/home/eder/python/ponynet/utils.py", line 30, in merge
    h, w = images.shape[1], images.shape[2]
IndexError: tuple index out of range

I checked that images is an empty tensor. Do you know where that might have been introduced?

Inconsistent use of tabs and spaces

Hello,

In model.py there are several tabs instead of spaces; you should remove them.

Also, in the docker folder there is a vim backup file ;)

Cheers,
~Nico

using deconv2d instead of PS

Hi, I am trying to compare the performance of deconvolution and subpixel convolution. I changed the generator as follows:
def generator(self, z):

    self.h0, self.h0_w, self.h0_b = deconv2d(z, [self.batch_size, 32, 32, self.gf_dim], k_h=1, k_w=1, d_h=1, d_w=1, name='g_h0', with_w=True)
    h0 = lrelu(self.h0)
    self.h1, self.h1_w, self.h1_b = deconv2d(h0, [self.batch_size, 32, 32, self.gf_dim], name='g_h1', d_h=1, d_w=1, with_w=True)
    h1 = lrelu(self.h1)
    h2, self.h2_w, self.h2_b = deconv2d(h1, [self.batch_size, 128, 128, 3], d_h=1, d_w=1, name='g_h2', with_w=True)
    return tf.nn.tanh(h2)

But it doesn't work:

Traceback (most recent call last):
  File "main.py", line 58, in <module>
    tf.app.run()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 42, in main
    dcgan.train(FLAGS)
  File "/home/zehaohuang/subpixel_sr/model.py", line 95, in train
    .minimize(self.g_loss, var_list=self.g_vars)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 196, in minimize
    grad_loss=grad_loss)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/optimizer.py", line 253, in compute_gradients
    colocate_gradients_with_ops=colocate_gradients_with_ops)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gradients.py", line 491, in gradients
    in_grad.set_shape(t_in.get_shape())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 408, in set_shape
    self._shape = self._shape.merge_with(shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 583, in merge_with
    (self, other))
ValueError: Shapes (64, 128, 128, 64) and (64, 32, 32, 64) are not compatible

Is there something wrong with my change?

BTW, h2 = tf.depth_to_space(h2, 4) works well; the PS function can be replaced by tf.depth_to_space!

main.py error

When I run the command python main.py --dataset mnist --is_train True --is_crop True, there are some errors.

tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 30, in run
sys.exit(main(sys.argv))
File "main.py", line 36, in main
dataset_name=FLAGS.dataset, is_crop=FLAGS.is_crop, checkpoint_dir=FLAGS.checkpoint_dir)
File "/home/qyq/q2015work/Image_Reconstruction/subpixel-master/model.py", line 58, in init
self.build_model()
File "/home/qyq/q2015work/Image_Reconstruction/subpixel-master/model.py", line 75, in build_model
self.G = self.generator(self.inputs)
File "/home/qyq/q2015work/Image_Reconstruction/subpixel-master/model.py", line 167, in generator
h2 = PS(h2, 4, color=True)
File "/home/qyq/q2015work/Image_Reconstruction/subpixel-master/subpixel.py", line 21, in PS
X = tf.concat(3, [_phase_shift(x, r) for x in Xc])
File "/home/qyq/q2015work/Image_Reconstruction/subpixel-master/subpixel.py", line 12, in _phase_shift
X = tf.concat(2, [tf.squeeze(x, axis=1) for x in X]) # bsize, b, a*r, r
TypeError: squeeze() got an unexpected keyword argument 'axis'

I'll let you know if I make progress as well. Thanks for the help!

AttributeError: 'DCGAN' object has no attribute 'g_bn0'

When I specify --visualize True as an arg, I get the following

I tensorflow/stream_executor/dso_loader.cc:116] successfully opened CUDA library libcublas.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:116] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:116] successfully opened CUDA library libcufft.so.8.0 locally
I tensorflow/stream_executor/dso_loader.cc:116] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:116] successfully opened CUDA library libcurand.so.8.0 locally
{'batch_size': 64,
 'beta1': 0.5,
 'checkpoint_dir': 'checkpoint',
 'dataset': 'celebC',
 'epoch': 2,
 'image_size': 400,
 'is_crop': True,
 'is_train': True,
 'learning_rate': 0.0002,
 'sample_dir': 'samples',
 'train_size': inf,
 'visualize': True}
I tensorflow/core/common_runtime/gpu/gpu_device.cc:951] Found device 0 with properties: 
name: TITAN X (Pascal)
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:01:00.0
Total memory: 11.90GiB
Free memory: 11.35GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1041] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN X (Pascal), pci bus id: 0000:01:00.0)

 [*] Reading checkpoints...
 [*] Load SUCCESS
Epoch: [ 0] [   0/   3] time: 3.6289, g_loss: 0.38425747
WARNING:tensorflow:*******************************************************
WARNING:tensorflow:TensorFlow's V1 checkpoint format is deprecated; V2 will become the default shortly after 10/31/2016.
WARNING:tensorflow:Consider switching to the more efficient V2 format now:
WARNING:tensorflow:   `tf.train.Saver(write_version=tf.train.SaverDef.V2)`
WARNING:tensorflow:to prevent breakage.
WARNING:tensorflow:*******************************************************
Epoch: [ 0] [   1/   3] time: 8.5043, g_loss: 0.34239292
Epoch: [ 0] [   2/   3] time: 10.3694, g_loss: 0.28749919
Epoch: [ 1] [   0/   3] time: 12.3451, g_loss: 0.35043076
Epoch: [ 1] [   1/   3] time: 14.3204, g_loss: 0.29150483
Epoch: [ 1] [   2/   3] time: 16.3195, g_loss: 0.23925565
Traceback (most recent call last):
  File "main.py", line 60, in <module>
    tf.app.run()
  File "/home/markb/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 49, in main
    to_json("./web/js/layers.js", [dcgan.h0_w, dcgan.h0_b, dcgan.g_bn0],
AttributeError: 'DCGAN' object has no attribute 'g_bn0'

Further, when I disable this call in hopes of getting sampler images to save, I get the following

Traceback (most recent call last):
  File "main.py", line 60, in <module>
    tf.app.run()
  File "/home/markb/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 57, in main
    visualize(sess, dcgan, FLAGS, OPTION)
  File "/home/markb/Dev/subpixel/utils.py", line 148, in visualize
    if option == 0:
AttributeError: 'DCGAN' object has no attribute 'sampler'

I can't find any mention of the dcgan's sampler in the code. Is there an easy way of fixing this?

Thanks!
Mark

error in ipython script

Your script:

1 def pony2(I, r):
2     a, b, c = I.shape
3     B = I.reshape(a/r, b/r, r, r).transpose(0, 1, 3, 2)
4     B = np.concatenate([B[i] for i in range(8)], axis=1)
5     B = B.transpose(1, 2, 0)
6     B = np.concatenate([B[:, :, i] for i in range(8)], axis=1)
7     return B

Changed code:

1 def pony2(I, r):
2     a, b, c = I.shape
3     B = I.reshape(a, b, r, r).transpose(0, 1, 3, 2)
4     B = np.concatenate([B[i] for i in range(8)], axis=1)
5     B = B.transpose(1, 2, 0)
6     B = np.concatenate([B[:, :, i] for i in range(8)], axis=1)
7     return B

The code in your third line is in error. Please change it.

Testing the model

After training the model with the following line:
python main.py --dataset celebA --is_train true --is_crop true
how can I test the model?

I've tried python main.py --dataset celebA --is_train false but it just prints this:

{'batch_size': 64,
 'beta1': 0.5,
 'checkpoint_dir': 'checkpoint',
 'dataset': 'celebA',
 'epoch': 25,
 'gpu': '0',
 'image_size': 128,
 'is_crop': True,
 'is_train': False,
 'learning_rate': 0.0002,
 'sample_dir': 'samples',
 'train_size': inf,
 'visualize': False}
2017-12-07 20:30:13.309369: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2017-12-07 20:30:13.415852: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:892] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-12-07 20:30:13.416285: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Found device 0 with properties: name: GeForce 840M major: 5 minor: 0 memoryClockRate(GHz): 1.124 pciBusID: 0000:08:00.0 totalMemory: 1.96GiB freeMemory: 1.55GiB
2017-12-07 20:30:13.416310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce 840M, pci bus id: 0000:08:00.0, compute capability: 5.0)
[*] Reading checkpoints...

Then it terminates.

Training example does not run

The README.md says to run python main.py --dataset celebA --is_train True --is_crop True. But when I do this, it crashes with the error:

Traceback (most recent call last):
  File "main.py", line 58, in <module>
    tf.app.run()
  File "/usr/local/anaconda2/envs/subpixel/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 30, in run
    sys.exit(main(sys.argv[:1] + flags_passthrough))
  File "main.py", line 39, in main
    dataset_name=FLAGS.dataset, is_crop=FLAGS.is_crop, checkpoint_dir=FLAGS.checkpoint_dir)
  File "/develop/nets/subpixel/model.py", line 58, in __init__
    self.build_model()
  File "/develop/nets/subpixel/model.py", line 64, in build_model
    self.up_inputs = tf.image.resize_images(self.inputs, self.image_shape[0], self.image_shape[1], tf.image.ResizeMethod.NEAREST_NEIGHBOR)
  File "/usr/local/anaconda2/envs/subpixel/lib/python2.7/site-packages/tensorflow/python/ops/image_ops.py", line 787, in resize_images
    raise ValueError('\'size\' must be a 1-D Tensor of 2 elements: '
ValueError: 'size' must be a 1-D Tensor of 2 elements: new_height, new_width

Running the same example from the carpedm20/DCGAN-tensorflow repo works fine for me.

Why subpixel rather than deconv?

It seems like you highly recommend using subpixel to replace deconv. But as introduced by Wenzhe Shi, subpixel upsampling is identical to deconv upsampling, so I don't understand why you do this when the deconv layer is already well developed in existing deep learning tools.

Problem when running main.py with mnist. Does anyone have the same problem, or do you know how this happens? Please give me some advice! Thanks

File "main.py", line 67, in
tf.app.run()
File "/home/fyh/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "main.py", line 51, in main
dcgan.train(FLAGS)
File "/home/fyh/ESPCN/subpixel-master/model.py", line 141, in train
save_images(sample_input_images, [8, 8], os.path.join(self.sample_dir, "inputs_small.png"))
File "/home/fyh/ESPCN/subpixel-master/utils.py", line 22, in save_images
return imsave(inverse_transform(images[:num_im]), size, image_path)
File "/home/fyh/ESPCN/subpixel-master/utils.py", line 41, in imsave
return scipy.misc.imsave(path, merge(images, size))
File "/home/fyh/ESPCN/subpixel-master/utils.py", line 31, in merge
h, w = images.shape[1], images.shape[2]

Why can't it run on CPU?

Hi, thanks for your sharing. I tried to run the code on CPU but failed; could you help me out?
I changed the link for installing tensorflow (to the CPU version).

Inference - model and code

Could you provide the model data and sample code for doing inference, i.e. generating the super-resolution images that you provide as examples? Thank you.

how to use the trained model to generate super-res images

Probably a noob question but I have trained the model on the celebA dataset. How should I use this model to generate super-res versions of arbitrary images?

In main.py if the training mode is False, the model is simply loaded from checkpoints. How do I run the inference stage/forward pass of the loaded model?
