
pixel-cnn's Introduction

Status: Archive (code is provided as-is, no updates expected)

pixel-cnn++

This is a Python3 / Tensorflow implementation of PixelCNN++, as described in the following paper:

PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications, by Tim Salimans, Andrej Karpathy, Xi Chen, Diederik P. Kingma, and Yaroslav Bulatov.

Our work builds on PixelCNNs, originally proposed by van den Oord et al. in June 2016. PixelCNNs are a class of powerful generative models with tractable likelihood that are also easy to sample from. The core convolutional neural network computes a probability distribution over the value of one pixel conditioned on the values of the pixels to the left of and above it. Below are example samples from a model trained on CIFAR-10 that achieves 2.92 bits per dimension (compared to 3.03 for the PixelCNN in van den Oord et al.):

Samples from the model (left) and samples from a model that is conditioned on the CIFAR-10 class labels (right):

Improved PixelCNN papers

This code supports multi-GPU training of our improved PixelCNN on CIFAR-10 and Small ImageNet, and it is easy to adapt to additional datasets. Training on a machine with 8 Maxwell TITAN X GPUs reaches 3.0 bits per dimension in about 10 hours and takes approximately 5 days to converge to 2.92.

Setup

To run this code you need the following:

  • a machine with multiple GPUs
  • Python3
  • Numpy, TensorFlow and imageio packages:
pip install numpy tensorflow-gpu imageio

Training the model

Use the train.py script to train the model. To train the default model on CIFAR-10 simply use:

python3 train.py

At a minimum you will probably want to change --data_dir and --save_dir, which point to the directory where the data is downloaded (if not already available) and the directory where checkpoints are saved.
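
For example, with placeholder paths (adjust them for your system):

python3 train.py --data_dir=/path/to/data --save_dir=/path/to/checkpoints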

I want to train on fewer GPUs. To train on fewer GPUs we recommend using CUDA_VISIBLE_DEVICES to narrow the visibility to only a few GPUs and then running the script. Don't forget to adjust the --nr_gpu flag accordingly; see the example below.
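
For example, to use only the first two GPUs (an illustrative invocation, not taken from the README):

CUDA_VISIBLE_DEVICES=0,1 python3 train.py --nr_gpu=2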

I want to train on my own dataset. Have a look at the DataLoader classes in the data/ folder. You will have to write an analogous data iterator for your own dataset, and the code should work from there; a rough sketch follows below.
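
As a very rough sketch of what such an iterator might look like (the exact interface train.py expects is defined by the CIFAR-10 loader in data/; the class and method names below are illustrative assumptions, not the repo's API):

import numpy as np

class MyDataLoader(object):
    """Illustrative data iterator for a custom dataset (not part of the repo).
    Assumes images are stored as a single uint8 array of shape (N, H, W, 3)."""

    def __init__(self, images, batch_size, shuffle=True, rng=None):
        self.images = images                      # uint8, (N, H, W, 3)
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.rng = rng if rng is not None else np.random.RandomState(1)

    def get_observation_size(self):
        return self.images.shape[1:]              # (H, W, 3)

    def __iter__(self):
        idx = np.arange(len(self.images))
        if self.shuffle:
            self.rng.shuffle(idx)
        # yield full batches only, dropping any incomplete final batch
        for start in range(0, len(idx) - self.batch_size + 1, self.batch_size):
            yield self.images[idx[start:start + self.batch_size]]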

Pretrained model checkpoint

You can download our pretrained (TensorFlow) model that achieves 2.92 bpd on CIFAR-10 here (656MB).

Citation

If you find this code useful please cite us in your work:

@inproceedings{Salimans2017PixeCNN,
  title={PixelCNN++: Improving the PixelCNN with Discretized Logistic Mixture Likelihood and Other Modifications},
  author={Tim Salimans and Andrej Karpathy and Xi Chen and Diederik P. Kingma},
  booktitle={ICLR},
  year={2017}
}


pixel-cnn's People

Contributors

atpaino, christopherhesse, dpkingma, jachiam, karpathy, prichemond, timsalimans, yburda


pixel-cnn's Issues

Windows: No one_hot kernel in tensorflow-gpu-v1.0

On Windows, while running the code as provided, I received the following error:
Cannot assign a device to node 'one_hot': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.

There were some problems with the one_hot kernel in TensorFlow, and I believe it is disabled for GPU in TensorFlow 1.0 (tensorflow/tensorflow#6822).

Tensorflow: tensorflow-gpu 1.0.1, tensorflow-gpu 1.1.0rc0
OS: Windows 10, CUDA: 8.0, CUDNN: 5.1, Python 3.5, NVIDIA GTX 1080

A possible workaround is to allow soft placement, but I am not sure how big an impact it has on performance.
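
For what it's worth, a minimal sketch of soft placement (illustrative only, not a patch to train.py):

import tensorflow as tf

# With allow_soft_placement=True, TensorFlow falls back to the CPU for ops
# that have no GPU kernel (such as one_hot here) instead of raising a
# device-placement error.
config = tf.ConfigProto(allow_soft_placement=True)

with tf.device('/gpu:0'):
    y = tf.one_hot([0, 1, 2], depth=4)

with tf.Session(config=config) as sess:
    print(sess.run(y))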

The output of conv2d should be updated after g and b are updated in data dependent initialization.

The initial values for g and b are used to keep the pre-activation values normally distributed. After the tf.assign operations for g and b, the output of the current conv2d layer changes, so the input to the next layer changes as well. I think the initialization of g and b for the next layer should depend on the new conv2d output.
So I think the customized conv2d in nn.py should be modified as follows:

def conv2d(x_, num_filters, filter_size=[3, 3], stride=[1, 1], pad='SAME', nonlinearity=None, init_scale=1., counters={},
           init=False, ema=None, **kwargs):
    ''' convolutional layer '''
    name = get_name('conv2d', counters)
    with tf.variable_scope(name):
        V = get_var_maybe_avg('V', ema, shape=filter_size + [int(x_.get_shape()[-1]), num_filters], dtype=tf.float32,
                              initializer=tf.random_normal_initializer(0, 0.05), trainable=True)
        g = get_var_maybe_avg('g', ema, shape=[num_filters], dtype=tf.float32,
                              initializer=tf.constant_initializer(1.), trainable=True)
        b = get_var_maybe_avg('b', ema, shape=[num_filters], dtype=tf.float32,
                              initializer=tf.constant_initializer(0.), trainable=True)

        # use weight normalization (Salimans & Kingma, 2016)
        W = tf.reshape(g, [1, 1, 1, num_filters]) * tf.nn.l2_normalize(V, [0, 1, 2])

        # calculate convolutional layer output
        x = tf.nn.bias_add(tf.nn.conv2d(x_, W, [1] + stride + [1], pad), b)

        if init:  # normalize x
            m_init, v_init = tf.nn.moments(x, [0, 1, 2])
            scale_init = init_scale / tf.sqrt(v_init + 1e-10)
            with tf.control_dependencies([g.assign(g * scale_init), b.assign_add(-m_init * scale_init)]):
                # x = tf.identity(x)
                W = tf.reshape(g, [1, 1, 1, num_filters]) * tf.nn.l2_normalize(V, [0, 1, 2])
                x = tf.nn.bias_add(tf.nn.conv2d(x_, W, [1] + stride + [1], pad), b)

        # apply nonlinearity
        if nonlinearity is not None:
            x = nonlinearity(x)

        return x

Why is your code so unusual?

Why is your network architecture so unusual? What happens if you use a more conventional network instead? I tried that, but it failed, and I don't know why.

Object_detection_image

I use TensorFlow 1.14 with CUDA 10 and cuDNN in a virtual environment with Anaconda, following the steps at https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10.
I trained the model, but when I want to validate it on an image I get this issue:

Traceback (most recent call last):
File "C:\Users\Tensorflow\models\research\object_detection\Object_detection_image.py", line 96, in
(boxes, scores, classes, num) = sess.run([detection_boxes, detection_scores, detection_classes, num_detections],feed_dict={image_tensor: image_expanded})
File "D:\Users\Cristian\Miniconda3\envs\tf14\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
run_metadata_ptr)
File "D:\Users\Cristian\Miniconda3\envs\tf14\lib\site-packages\tensorflow\python\client\session.py", line 1142, in _run
np_val = np.asarray(subfeed_val, dtype=subfeed_dtype)
File "D:\Users\Cristian\Miniconda3\envs\tf14\lib\site-packages\numpy\core\numeric.py", line 538, in asarray
return array(a, dtype, copy=False, order=order)
TypeError: int() argument must be a string, a bytes-like object or a number, not 'NoneType'

How are weights trained?

Dear friends

I would like to know how the weights are trained, because in the code the weights are already loaded for the current session.

Thank you for your help

Understanding edge cases

Just wanted to make sure my understanding is right. Looking at the formulas, in the edge case, maximizing the probability for x = -1 means maximizing this function:
logProb(μ, σ) = (-1 - μ)/σ - log(1 + e^((-1 - μ)/σ))
which means that at the optimum μ → -∞.
Does this work because at inference time the predicted values are clipped to the effective pixel range?
How big is the impact on the neighboring edge values, for which the optimal μ is -0.997, etc.?

arg_scope

No need to copy the arg_scope code; it is now part of tf.contrib.framework.

You can use
@tf.contrib.framework.add_arg_scope and tf.contrib.framework.arg_scope().
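
For example, a small sketch of how these are used (my_conv is a made-up layer here, assuming TF 1.x with contrib available):

import tensorflow as tf
from tensorflow.contrib.framework import add_arg_scope, arg_scope

@add_arg_scope
def my_conv(x, num_filters, nonlinearity=None):
    # toy layer: a plain conv followed by an optional nonlinearity
    h = tf.layers.conv2d(x, num_filters, kernel_size=3, padding='same')
    return nonlinearity(h) if nonlinearity is not None else h

x = tf.placeholder(tf.float32, [None, 32, 32, 3])
# every my_conv call inside the scope picks up nonlinearity=tf.nn.elu
with arg_scope([my_conv], nonlinearity=tf.nn.elu):
    h = my_conv(x, 64)
    h = my_conv(h, 64)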

What do these lines do?

x_pad = tf.concat([x,tf.ones(xs[:-1]+[1])],3) # add channel of ones to distinguish image from padding later on
u_list = [nn.down_shift(nn.down_shifted_conv2d(x_pad, num_filters=nr_filters, filter_size=[2, 3]))] # stream for pixels above
ul_list = [nn.down_shift(nn.down_shifted_conv2d(x_pad, num_filters=nr_filters, filter_size=[1,3])) + nn.right_shift(nn.down_right_shifted_conv2d(x_pad, num_filters=nr_filters, filter_size=[2,1]))]

PixelCNN source code

Hello, does anybody have the PixelCNN source code? I'd like to give it a run. Please let me know.

nr_gpus

What's the smallest number of GPUs you can get away with? I have access to 2 and I keep getting OOM errors (I tried batch sizes 32 and 16). The default value for this parameter seems to be 8. Are you using 8, or have you been able to make it work with fewer?

Best,
Shriphani

GPU error & other things

Hi, I have a problem here.
When I run train.py, this error occurs:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1356, in _do_call
return fn(*args)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1339, in _run_fn
self._extend_graph()
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1374, in _extend_graph
tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation model_1/ones: {{node model_1/ones}}was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:1, /job:localhost/replica:0/task:0/device:XLA_GPU:2, /job:localhost/replica:0/task:0/device:XLA_GPU:3 ]. Make sure the device specification refers to a valid device.
[[model_1/ones]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "train.py", line 203, in
sess.run(initializer)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1350, in _do_run
run_metadata)
File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation model_1/ones: node model_1/ones (defined at /home/CYM/pixel-cnn/pixel_cnn_pp/model.py:36) was explicitly assigned to /device:GPU:0 but available devices are [ /job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:XLA_CPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:0, /job:localhost/replica:0/task:0/device:XLA_GPU:1, /job:localhost/replica:0/task:0/device:XLA_GPU:2, /job:localhost/replica:0/task:0/device:XLA_GPU:3 ]. Make sure the device specification refers to a valid device.
[[model_1/ones]]
Should I reinstall tensorflow-gpu? I don't know why this happens.
There is also another question about train.py, line 172:
x = np.cast[np.float32]((x - 127.5) / 127.5) # input to pixelCNN is scaled from uint8 [0,255] to float in range [-1,1]
The comment says the input to PixelCNN is scaled from uint8 [0,255] to float in the range [-1,1].
Why is this operation done? Can you explain?
Thanks!

Not understanding "log probability in the center of the bin"

Could anybody explain the default values used in the extreme cases?

    # log probability in the center of the bin, to be used in extreme cases
    # (not actually used in our code)
    log_pdf_mid = mid_in - log_scales - 2. * tf.nn.softplus(mid_in)

How to understand the softmax sampling

Hi,

In the code, logistic distribution sampling is done by:

    # sample mixture indicator from softmax
    sel = tf.one_hot(tf.argmax(logit_probs - tf.log(-tf.log(tf.random_uniform(
        logit_probs.get_shape(), minval=1e-5, maxval=1. - 1e-5))), 3), depth=nr_mix, dtype=tf.float32)

I could not understand why it is done this way.

Do I need to compute the CDF from the pdf, and then subtract a random uniform number from the CDF and take the log?

Why does subtracting a random uniform number here make sense?
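
For context (my own illustration, not from the repo): the snippet implements the Gumbel-max trick. Adding independent Gumbel noise -log(-log(u)) to the logits and taking the argmax yields a sample from the corresponding softmax distribution, so no explicit CDF needs to be computed. A quick NumPy check:

import numpy as np

def sample_from_logits(logit_probs, n_samples=100000, rng=np.random):
    # Gumbel-max trick: argmax(logits - log(-log(u))) is a sample from softmax(logits)
    u = rng.uniform(1e-5, 1. - 1e-5, size=(n_samples, logit_probs.shape[-1]))
    return np.argmax(logit_probs - np.log(-np.log(u)), axis=-1)

logits = np.array([1.0, 0.0, -1.0])
samples = sample_from_logits(logits)
softmax = np.exp(logits) / np.exp(logits).sum()
# empirical frequencies should be close to softmax(logits)
print(np.bincount(samples, minlength=3) / float(len(samples)), softmax)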

Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):

I followed the instructions at
https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10
TensorFlow 1.14 (GPU)
CUDA 10
cuDNN 10.1
but when I ran python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_coco.config
I got this problem:

E1130 23:32:13.621114 22540 tf_should_use.py:71] ==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'init_ops/report_uninitialized_variables/boolean_mask/GatherV2:0' shape=(?,) dtype=string>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
File "train.py", line 184, in
tf.app.run() File "D:\Users\Cristian\Miniconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\platform\app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef) File "D:\Users\Cristian\Miniconda3\envs\tensorflow1\lib\site-packages\absl\app.py", line 321, in run
raise File "D:\Users\Cristian\Miniconda3\envs\tensorflow1\lib\site-packages\absl\app.py", line 250, in _run_main
sys.exit(main(argv)) File "D:\Users\Cristian\Miniconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
return func(*args, **kwargs) File "train.py", line 180, in main
graph_hook_fn=graph_rewriter_fn) File "C:\Users\tensorflow\models\research\object_detection\legacy\trainer.py", line 416, in train
saver=saver) File "D:\Users\Cristian\Miniconda3\envs\tensorflow1\lib\site-packages\tensorflow\contrib\slim\python\slim\learning.py", line 796, in train
should_retry = True File "D:\Users\Cristian\Miniconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\util\tf_should_use.py", line 193, in wrapped
return _add_should_use_warning(fn(*args, **kwargs))

Logistic output layer training slowly

Hi,

As the paper mentions, the logistic output layer aims to reduce the computational complexity of the softmax output layer.

But I found that with the logistic output layer the model converges slowly for my network (not PixelCNN, a different network).

The logistic output layer works for my network, but it takes much longer for the model to converge.

Is there any suggestion for improving the convergence of the logistic output layer?

My quantization is finer than 256 levels, say 2^12 or 2^16.

Any plans to release pre-trained Small ImageNet model?

This Repo is a great resource. Thank you very much.

I notice the readme mentions results on Small ImageNet. Are there any plans to release a pre-trained model, as you've done with CIFAR-10? What does the ImageNet model achieve in bits per dim on the validation set?

7000 loss value when training cifar10

Hello,
I am trying to train CIFAR-10 with the nn conv2d and dense layers, and with initialization I get a loss of 7229.39 at the first step (and after a while it is still around 5000). I am training with the same model architecture proposed in the weight normalization article (Salimans & Kingma). However, with the older nn dense and conv2d implementation this does not happen (nor does it when I skip the initialization). This is the implementation I use that gives a proper loss (around 2.3 at the first steps):

def conv2d(x, num_filters, filter_size=[3,3],  pad='SAME', stride=[1,1], nonlinearity=None, init_scale=1., init=False, name=''):
    with tf.variable_scope(name):
        V = tf.get_variable('V', shape=filter_size+[int(x.get_shape()[-1]),num_filters], dtype=tf.float32,
                              initializer=tf.random_normal_initializer(0, 0.05), trainable=True)
        g = tf.get_variable('g', shape=[num_filters], dtype=tf.float32,
                              initializer=tf.constant_initializer(1.), trainable=True)
        b = tf.get_variable('b', shape=[num_filters], dtype=tf.float32,
                              initializer=tf.constant_initializer(0.), trainable=True)
        
        
        if init:  # normalize x
            v_norm = tf.nn.l2_normalize(V,[0,1,2])
            x = tf.nn.conv2d(x, v_norm, strides=[1] + stride + [1],padding=pad)
            m_init, v_init = tf.nn.moments(x, [0,1,2])
            scale_init=init_scale/tf.sqrt(v_init + 1e-08)
            g = g.assign(scale_init)
            b = b.assign(-m_init*scale_init)
            x = tf.reshape(scale_init,[1,1,1,num_filters])*(x-tf.reshape(m_init,[1,1,1,num_filters]))
        else:
            W = tf.reshape(g, [1, 1, 1, num_filters]) * tf.nn.l2_normalize(V, [0, 1, 2])
            
            # calculate convolutional layer output
            x = tf.nn.bias_add(tf.nn.conv2d(x, W, [1] + stride + [1], pad), b)
            
        
        # apply nonlinearity
        if nonlinearity is not None:
            x = nonlinearity(x)

        return x

I've investigated the most recent implementation of nn (the dense and conv2d layers). With that implementation, on a tiny example the mean is about 0.001 and the variance is 0.95. With the code above I get about -10^-7 and 1.0005. Am I missing something here, or does the code in the nn library not do the same thing as the code above?
Here is the demo code I used for test:

import tensorflow as tf
import numpy as np

sess = tf.Session()

padding='SAME'
init=True
num_filters=96
filter_size=[3,3]
stride=[1,1]
init_scale=1.
pad='SAME'
x = tf.get_variable('x',shape=[100,32,32,3],dtype=tf.float32,
                      initializer=tf.random_normal_initializer(0,1.0), trainable=True)
V = tf.get_variable('V', shape=filter_size+[int(x.get_shape()[-1]),num_filters], dtype=tf.float32,
                      initializer=tf.random_normal_initializer(0, 0.05), trainable=True)
g = tf.get_variable('g', shape=[num_filters], dtype=tf.float32,
                      initializer=tf.constant_initializer(1.), trainable=True)
b = tf.get_variable('b', shape=[num_filters], dtype=tf.float32,
                      initializer=tf.constant_initializer(0.), trainable=True)

# use weight normalization (Salimans & Kingma, 2016)
W = tf.reshape(g, [1, 1, 1, num_filters]) * tf.nn.l2_normalize(V, [0, 1, 2])

# calculate convolutional layer output
x = tf.nn.bias_add(tf.nn.conv2d(x, W, [1] + stride + [1], pad), b)

if init:  # normalize x
    m_init, v_init = tf.nn.moments(x, [0,1,2])
    scale_init = init_scale / tf.sqrt(v_init + 1e-10)
    with tf.control_dependencies([g.assign(g * scale_init), b.assign_add(-m_init * scale_init)]):
        x = tf.identity(x)


init = tf.global_variables_initializer()

sess.run(init)

# mean and var should be zero and unit after initialization
a = sess.run(x)
print(np.mean(a))
print(np.var(a))
sess.close()

Also, I don't understand why the code uses assign_add instead of assign. I think the steps before initialization happen before the assign, so the moments in the init step are computed not from t = V*x/||V|| but from the full output of the layer. I assume the whole initialization step is therefore scaled by g and shifted by the bias.

unexpected keyword argument 'keepdims'

When running train.py I get the following issue:

C:\Users\cknau\Downloads\pixel-cnn-master\pixel-cnn-master>python train2.py
input args:
{
"data_dir":"D:\PixelCNN\dataset",
"save_dir":"D:\PixelCNN\samples",
"data_set":"cifar",
"save_interval":20,
"load_params":false,
"nr_resnet":5,
"nr_filters":160,
"nr_logistic_mix":10,
"resnet_nonlinearity":"concat_elu",
"class_conditional":false,
"energy_distance":false,
"learning_rate":0.001,
"lr_decay":0.999995,
"batch_size":16,
"init_batch_size":16,
"dropout_p":0.5,
"max_epochs":5000,
"nr_gpu":8,
"polyak_decay":0.9995,
"num_samples":1,
"seed":1
}
Traceback (most recent call last):
File "train2.py", line 120, in
loss_gen.append(loss_fun(tf.stop_gradient(xs[i]), out))
File "C:\Users\cknau\Downloads\pixel-cnn-master\pixel-cnn-master\pixel_cnn_pp\nn.py", line 83, in discretized_mix_logistic_loss
log_probs = tf.reduce_sum(log_probs,3) + log_prob_from_logits(logit_probs)
File "C:\Users\cknau\Downloads\pixel-cnn-master\pixel-cnn-master\pixel_cnn_pp\nn.py", line 27, in log_prob_from_logits
m = tf.reduce_max(x, axis, keepdims=True)
TypeError: reduce_max() got an unexpected keyword argument 'keepdims'

Does anyone know what's going on here? I have NumPy 1.13, so that's not the issue.

How many epochs to train for CIFAR-10

Hi,
I'm wondering how many epochs one should train PixelCNN++ from scratch on CIFAR-10 to get a reasonable result (e.g., an Inception score around ~5.5).
I can see in train.py that max_epochs defaults to 5000. Does one actually need to train that long?

Why change to a discretized loss function?

I really don't understand why it's necessary to change the logistic's pdf to its cdf (which is the sigmoid)... Can't the likelihood be estimated directly using the logistic's pdf?
Is it some stabilization consideration?

Best

How to understand the loss function corner cases for 0 and 255

Hi,

In the loss function, the code gives special treatment to the values 0 and 255:

    # log probability for edge case of 0 (before scaling)
    log_cdf_plus = plus_in - tf.nn.softplus(plus_in)
    # log probability for edge case of 255 (before scaling)
    log_one_minus_cdf_min = -tf.nn.softplus(min_in)

I could not understand this code. How should I understand it?
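
Not an authoritative explanation, but these two lines follow from standard identities for the logistic distribution, whose CDF is the sigmoid: log sigmoid(z) = z - softplus(z) and log(1 - sigmoid(z)) = -softplus(z). Here z is plus_in (the standardized upper edge of the bin, used when the pixel value is 0) or min_in (the standardized lower edge, used when it is 255). A quick numerical check:

import numpy as np

def softplus(z):
    return np.log1p(np.exp(z))

def sigmoid(z):
    return 1. / (1. + np.exp(-z))

z = np.linspace(-5., 5., 11)
# edge case of 0:   log CDF(z)      = z - softplus(z)
assert np.allclose(np.log(sigmoid(z)), z - softplus(z))
# edge case of 255: log(1 - CDF(z)) = -softplus(z)
assert np.allclose(np.log1p(-sigmoid(z)), -softplus(z))
print('identities hold')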

PixelCNN at test time

Hi,

I have an issue with model performance at test time. When running the provided code I get the following log:

Iteration 0, time = 2875s, train bits_per_dim = 4.3207, test bits_per_dim = 9.4122
Iteration 1, time = 2832s, train bits_per_dim = 3.8210, test bits_per_dim = 9.4122
Iteration 2, time = 2833s, train bits_per_dim = 3.6672, test bits_per_dim = 9.4122

My guess is that you have accidentally removed the call to "maintain_averages_op" from the training loop. Code search suggests this op is never used:
https://github.com/openai/pixel-cnn/search?utf8=%E2%9C%93&q=maintain_averages_op

"Object was never used" errors after upgrading to Tensorflow 1.2

After upgrading to TensorFlow 1.2, running train.py produces many (567) instances of the following error:

ERROR:tensorflow:==================================
Object was never used (type <class 'tensorflow.python.framework.ops.Tensor'>):
<tf.Tensor 'model_1/conv2d_0/stack:0' shape=(3,) dtype=int32>
If you want to mark it as used call its "mark_used()" method.
It was originally created here:
['File "train.py", line 135, in <module>\n    dropout_p=args.dropout_p, **model_opt)', 'File "/home/malcolm/data/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 261, in __call__\n    return self._call_func(args, kwargs, check_for_new_variables=True)', 'File "/home/malcolm/data/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/template.py", line 217, in _call_func\n    result = self._func(*args, **kwargs)', 'File "/media/bruce/MoreData/malcolm/tpc/base-pixel-cnn/pixel_cnn_pp/model.py", line 40, in model_spec\n    x_pad, num_filters=nr_filters, filter_size=[2, 3]))]  # stream for pixels above', 'File "/home/malcolm/data/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args\n    return func(*args, **current_args)', 'File "/media/bruce/MoreData/malcolm/tpc/base-pixel-cnn/pixel_cnn_pp/nn.py", line 358, in down_shifted_conv2d\n    return conv2d(x, num_filters, filter_size=filter_size, pad=\'VALID\', stride=stride, **kwargs)', 'File "/home/malcolm/data/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args\n    return func(*args, **current_args)', 'File "/media/bruce/MoreData/malcolm/tpc/base-pixel-cnn/pixel_cnn_pp/nn.py", line 238, in conv2d\n    tf.assert_variables_initialized([V, g, b])', 'File "/home/malcolm/data/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 170, in wrapped\n    return _add_should_use_warning(fn(*args, **kwargs))', 'File "/home/malcolm/data/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 139, in _add_should_use_warning\n    wrapped = TFShouldUseWarningWrapper(x)', 'File "/home/malcolm/data/anaconda3/lib/python3.6/site-packages/tensorflow/python/util/tf_should_use.py", line 96, in __init__\n    stack = [s.strip() for s in traceback.format_stack()]']
==================================

It still seems to run OK after printing the errors, though. The culprit seems to be this line, which appears three times in pixel_cnn_pp/nn.py:

tf.assert_variables_initialized([V, g, b])

It seems that this function expects its return value to be used, and that TF 1.2 introduced a check to print an error if the return value isn't used. As a workaround, commenting out this line (in each place it appears) removes the errors.

filter sizes

Hi,

Thank you a lot for sharing the code for your paper. It's really helpful.

I understand that the downward stream covers the vertical dependencies (i.e. vertical stack from the van den Oord paper), but I'm not sure how the downward + right stream works:
If I'm correct, the downward + rightward stream uses a filter_size of [2,2] between the residual blocks. Why is the vertical part needed? Would a filter_size of [1,2] not cover all the horizontal dependencies?

And does the padded channel of ones from the beginning act like a bias term?

Thank you

Training not starting for 64x64x3 images

Hi, thank you for this TensorFlow implementation of PixelCNN.

When I try to use images of size 64x64x3, the TensorFlow graph compiles without errors but the training does not start. The only thing that happens is that "starting training" is written to stdout, then nothing happens.
When I use the Linux "top" command, the Python process is shown with ~3 % CPU usage. "nvidia-smi" shows that TensorFlow has memory allocated but that there is nothing being computed on the GPUs.
I don't receive any error messages.

I can also invoke this behavior by simply adding the following lines to "./data/cifar10_data.py" in the __init__ function, right after self.data = np.transpose(self.data, (0,2,3,1)):

self.data = np.tile(self.data,[1,2,2,1]) # (N,32,32,3) -> (N,64,64,3)

About my machine:
CentOS Linux release 7.3.1611

2 x GeForce GTX 1080 Ti
Nvidia Driver 381.22

Python: 3.4.5 (default, Nov 9 2016, 16:24:59)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-11)]

TensorFlow version 1.1.0
Numpy version 1.12.1

Thank you in advance.

about pixel-cnn/model.py, line38

Hi, thanks for the code!

I have a question about pixel-cnn/model.py, line 38. It appends a channel of ones to the tensor x.
But I'm wondering how this is actually used -- it seems that after the convolution of x_pad the ones channel has already disappeared, and I didn't find the corresponding code that uses the ones channel to distinguish the image from the padding. Moreover, since PixelCNN already applies shift corrections everywhere, it seems there is no need to further distinguish the image x.

samples

After training, the network parameters are fixed. So each time the generation starts from an all-zero tensor, the output is the same, and the network generates the same pixels every time. I checked the function sample_from_model() and there is no randomness. So my question is: how can the network generate different pictures while starting from the same all-zero tensor?

tf.train.import_meta_graph throws a ValueError

Description

I downloaded the pre-trained model here (provided in the README) and tried loading it, but I was not able to import the graph definitions from the meta file.

Environment info

Operating System: Ubuntu 14.04.3 LTS
Installed version of CUDA: 8.0.0
pip package: https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-0.12.1-cp34-cp34m-linux_x86_64.whl
Tensorflow version ( import tensorflow as tf; print(tf.VERSION); ): 0.12.1

Code to reproduce

import os
import sys
import time
import json
import argparse

import numpy as np
import tensorflow as tf

# Model downloaded to models folder
saver = tf.train.import_meta_graph("models/params_cifar.ckpt.meta")

# Throws error
# ValueError: Shape must be at most rank 1 but is rank 2 for 'model/conv2d_0/l2_normalize/Sum' (op: 'Sum') with input shapes: [2,3,4,160], [1,3].

Trace

$ python3 snips/load_model.py 
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/common_shapes.py", line 670, in _call_cpp_shape_fn_impl
    status)
  File "/usr/lib/python3.4/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be at most rank 1 but is rank 2 for 'model/conv2d_0/l2_normalize/Sum' (op: 'Sum') with input shapes: [2,3,4,160], [1,3].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "snips/load_model.py", line 11, in <module>
    saver = tf.train.import_meta_graph("models/params_cifar.ckpt.meta")
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/training/saver.py", line 1526, in import_meta_graph
    **kwargs)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/meta_graph.py", line 502, in import_scoped_meta_graph
    producer_op_list=producer_op_list)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/importer.py", line 380, in import_graph_def
    ops.set_shapes_for_outputs(op)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 1617, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/ops.py", line 1568, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "/usr/local/lib/python3.4/dist-packages/tensorflow/python/framework/common_shapes.py", line 675, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Shape must be at most rank 1 but is rank 2 for 'model/conv2d_0/l2_normalize/Sum' (op: 'Sum') with input shapes: [2,3,4,160], [1,3].

Training fails with other datasets

Hi,

I would like to ask about training PixelCNN++ on another data set.

Do we just need to change hyperparameters when training PixelCNN++ on another data set, or do we need to add more regularization, or something else?

I tested with the SVHN data set, but it fails to learn.
There seems to be an overfitting problem, even though dropout is enabled during training.

I tried changing the initial learning rate and the learning rate schedule, but it still doesn't help.

error while training

Hi,

this is the error (first line is the run command) [images I use are jpg]:

python3 train.py --data_dir=/path/to/my/images --save_dir=./
input args:
 {
    "max_epochs":5000,
    "num_samples":1,
    "dropout_p":0.5,
    "nr_gpu":8,
    "save_dir":"./",
    "save_interval":20,
    "seed":1,
    "data_set":"cifar",
    "data_dir":"/path/to/my/images",
    "nr_resnet":5,
    "nr_logistic_mix":10,
    "batch_size":16,
    "lr_decay":0.999995,
    "energy_distance":false,
    "class_conditional":false,
    "polyak_decay":0.9995,
    "resnet_nonlinearity":"concat_elu",
    "init_batch_size":16,
    "load_params":false,
    "learning_rate":0.001,
    "nr_filters":160
}
>> Downloading cifar-10-python.tar.gz 100.0%
Successfully downloaded cifar-10-python.tar.gz 170498071 bytes.
Traceback (most recent call last):
  File "train.py", line 103, in <module>
    init_pass = model(x_init, h_init, init=True, dropout_p=args.dropout_p, **model_opt)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/template.py", line 278, in __call__
    result = self._call_func(args, kwargs, check_for_new_variables=False)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/ops/template.py", line 217, in _call_func
    result = self._func(*args, **kwargs)
  File "/home/ubuntu/enhancer/pixel-cnn/pixel_cnn_pp/model.py", line 37, in model_spec
    u_list = [nn.down_shift(nn.down_shifted_conv2d(x_pad, num_filters=nr_filters, filter_size=[2, 3]))] # stream for pixels above
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
    return func(*args, **current_args)
  File "/home/ubuntu/enhancer/pixel-cnn/pixel_cnn_pp/nn.py", line 303, in down_shifted_conv2d
    return conv2d(x, num_filters, filter_size=filter_size, pad='VALID', stride=stride, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 181, in func_with_args
    return func(*args, **current_args)
  File "/home/ubuntu/enhancer/pixel-cnn/pixel_cnn_pp/nn.py", line 210, in conv2d
    x = tf.nn.l2_normalize(x, axis=[0,1,2])
TypeError: l2_normalize() got an unexpected keyword argument 'axis'

originally defined at:
  File "train.py", line 100, in <module>
    model = tf.make_template('model', model_spec)

Does the implementation satisfy the causality constraint?

Hi,

The idea behind autoregressive modeling (PixelCNN, PixelCNN++, ...) is that pixels are generated sequentially, each depending on the previous pixels; this is the so-called causality constraint.

With up- and down-sampling, I think the constraint is not satisfied.

It could be because of the up- and down-sampling layers in your network.

Please correct me if I am wrong.

Training results in NaN (commit 2b03725)

Running train.py --nr_gpu=1 results in NaN for the loss from bits_per_dim on tensorflow 1.5.0.
With some API changes (keepdims -> keep_dims, axis= -> ) the same NaN occurs on tensorflow 1.4.x. Disabling the data driven initialization of the weight normalization (commenting out sess.run(init_pass, feed_dict)) however does not result in NaN loss.
Any clues?

Configuration setup for executing the implementation

Here's an example setup I'm using to execute the program:

Python 3, Cuda 7.5, Tensorflow 0.11, #GPUs = 2

Every time, I hit this error:

Traceback (most recent call last):
  File "train.py", line 120, in <module>
    dropout_p=args.dropout_p, **model_opt)
...........
  File "/home/users/mrinmoy/Venv/p3-tf3-gpu/lib/python3.4/site-packages/tensorflow/python/framework/tensor_util.py", line 290, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got list containing Tensors of type '_Message' instead.

originally defined at:
  File "train.py", line 116, in <module>
    model = tf.make_template('model', model_spec)

Seems like it's TensorFlow-specific, but the authors did not mention any version dependency on TensorFlow. Any workaround?

Is it possible to load the model on CPU and just do a forward pass on an image?

Hi, thanks for sharing this. I was wondering if it's possible to load the model on CPU and just do a simple forward pass on an image. I've tried the following:

import tensorflow as tf
sess = tf.Session()
model = tf.train.import_meta_graph('params_cifar.ckpt.meta')
model.restore(sess, 'params_cifar.ckpt')

and I get the following error

E tensorflow/core/client/tensor_c_api.cc:485] Cannot assign a device to node 'Variable_1134': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
Colocation Debug Info:
Colocation group had the following types and devices:
Identity: CPU
Assign: CPU
Variable: CPU
	 [[Node: Variable_1134 = Variable[container="", dtype=DT_FLOAT, shape=[100], shared_name="", _device="/device:GPU:0"]()]]
---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-22-13b033dc4e64> in <module>()
----> 1 new_saver.restore(sess, 'params_cifar.ckpt')

/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.pyc in restore(self, sess, save_path)
   1127       raise ValueError("Restore called with invalid save path %s" % save_path)
   1128     sess.run(self.saver_def.restore_op_name,
-> 1129              {self.saver_def.filename_tensor_name: save_path})
   1130
   1131   @staticmethod

/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
    380     try:
    381       result = self._run(None, fetches, feed_dict, options_ptr,
--> 382                          run_metadata_ptr)
    383       if run_metadata:
    384         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
    653     movers = self._update_with_movers(feed_dict_string, feed_map)
    654     results = self._do_run(handle, target_list, unique_fetches,
--> 655                            feed_dict_string, options, run_metadata)
    656
    657     # User may have fetched the same tensor multiple times, but we

/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
    721     if handle is None:
    722       return self._do_call(_run_fn, self._session, feed_dict, fetch_list,
--> 723                            target_list, options, run_metadata)
    724     else:
    725       return self._do_call(_prun_fn, self._session, handle, feed_dict,

/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_call(self, fn, *args)
    741         except KeyError:
    742           pass
--> 743       raise type(e)(node_def, op, message)
    744
    745   def _extend_graph(self):

InvalidArgumentError: Cannot assign a device to node 'Variable_1134': Could not satisfy explicit device specification '/device:GPU:0' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0
Colocation Debug Info:
Colocation group had the following types and devices:
Identity: CPU
Assign: CPU
Variable: CPU
	 [[Node: Variable_1134 = Variable[container="", dtype=DT_FLOAT, shape=[100], shared_name="", _device="/device:GPU:0"]()]]
Caused by op u'Variable_1134', defined at:
  File "/home/user/miniconda2/bin/ipython", line 6, in <module>
    sys.exit(IPython.start_ipython())
  File "/home/user/miniconda2/lib/python2.7/site-packages/IPython/__init__.py", line 119, in start_ipython
    return launch_new_instance(argv=argv, **kwargs)
  File "/home/user/miniconda2/lib/python2.7/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/home/user/miniconda2/lib/python2.7/site-packages/IPython/terminal/ipapp.py", line 348, in start
    self.shell.mainloop()
  File "/home/user/miniconda2/lib/python2.7/site-packages/IPython/terminal/interactiveshell.py", line 455, in mainloop
    self.interact()
  File "/home/user/miniconda2/lib/python2.7/site-packages/IPython/terminal/interactiveshell.py", line 446, in interact
    self.run_cell(code, store_history=True)
  File "/home/user/miniconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2717, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/home/user/miniconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2821, in run_ast_nodes
    if self.run_code(code, result):
  File "/home/user/miniconda2/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-15-d6797432c2fa>", line 1, in <module>
    new_saver = tf.train.import_meta_graph('params_cifar.ckpt.meta')
  File "/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1458, in import_meta_graph
    return _import_meta_graph_def(read_meta_graph_file(meta_graph_or_file))
  File "/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1348, in _import_meta_graph_def
    producer_op_list=producer_op_list)
  File "/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/framework/importer.py", line 279, in import_graph_def
    op_def=op_def)
  File "/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2310, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/user/miniconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1232, in __init__
    self._traceback = _extract_stack()
