
wgan-gp's Introduction

WGAN-GP

A PyTorch implementation of the paper "Improved Training of Wasserstein GANs".

Prerequisites

Python, NumPy, SciPy, Matplotlib

A recent NVIDIA GPU

A recent master build of PyTorch

Progress

  • gan_toy.py : Toy datasets (8 Gaussians, 25 Gaussians, Swiss Roll). (Finished 2017.5.8.)

  • gan_language.py : Character-level language model. (Discriminator and Generator both use nn.Conv1d. Finished 2017.6.23; results updated 2017.6.27.)

  • gan_mnist.py : MNIST. (Finished 2017.6.26; running results below. Discriminator and Generator both use nn.Conv1d.)

  • gan_64x64.py : 64x64 architectures. (Looking forward to your pull request.)

  • gan_cifar.py : CIFAR-10. (Great thanks to robotcator.)

Results

  • Toy Dataset

    Some sample results; see the results/toy/ folder for details.

    • 8gaussians, iteration 154500 (frame1612)

    • 25gaussians, iteration 48500 (frame485)

    • swissroll, iteration 69400 (frame694)

  • MNIST Dataset

    Some sample results; see the results/mnist/ folder for details.

    (sample images: mnist_samples_91899, mnist_samples_199999)

  • Billion Word Language Generation (Using CNN, character-level)

    Some sample results after 8699 epochs are detailed in sample.

    I haven't run enough epochs because this is very time-consuming.

    He moved the mat all out clame t

    A fosts of shores forreuid he pe

    It whith Crouchy digcloued defor

    Pamreutol the rered in Car inson

    Nor op to the lecs ficomens o fe

    In is a " nored by of the ot can

    The onteon I dees this pirder ,

    It is Brobes aoracy of " medurn

    Rame he reaariod to thim atreast

    The stinl who herth of the not t

    The witl is f ont UAy Y nalence

    It a over , tose sho Leloch Cumm

  • CIFAR-10 Dataset

    Some sample results; see the results/cifar10/ folder for details.

    (sample image)

Acknowledgements

Based on the implementations igul222/improved_wgan_training and martinarjovsky/WassersteinGAN.

wgan-gp's People

Contributors

caogang, robotcator


wgan-gp's Issues

grad_outputs for gradient penalty

Hi, could you specify the purpose of grad_outputs? It is not clear from the PyTorch tutorial. Also, why should it be torch.ones for the penalty term?
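For reference, a rough way to see what grad_outputs does (a hedged sketch using the modern PyTorch API with a hypothetical tiny critic, not the repo's 2017 Variable-based code): since the critic produces one score per sample, autograd.grad computes a vector-Jacobian product, and grad_outputs is that vector; passing all ones weights every sample's gradient equally, which is exactly what the per-sample penalty needs.

    import torch
    from torch import autograd

    critic = torch.nn.Linear(10, 1)                        # hypothetical stand-in critic
    interpolates = torch.randn(4, 10, requires_grad=True)
    disc_interpolates = critic(interpolates)                # shape [4, 1]: one score per sample

    # grad_outputs is the "v" in the vector-Jacobian product v^T * d(outputs)/d(inputs);
    # all ones means each per-sample score contributes its own gradient with weight 1.
    gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
                              grad_outputs=torch.ones_like(disc_interpolates),
                              create_graph=True)[0]         # create_graph=True keeps the graph
                                                            # so the penalty itself can be backpropagated
    print(gradients.shape)                                  # torch.Size([4, 10])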

After adding self-implemented Layer-Normalization, the backward time of gradient_penalty became large

My implementation of layer-normalization is:

import torch
import torch.nn as nn
from torch.nn import Parameter


class Layer_Norm(nn.Module):

    def __init__(self, dim):
        super(Layer_Norm, self).__init__()
        self.dim = dim
        # per-feature gain and bias
        self.g = Parameter(torch.zeros(1, dim))
        self.b = Parameter(torch.zeros(1, dim))
        self.init_weights()

    def forward(self, input):
        # per-sample mean over the feature dimension
        miu = torch.sum(input, 1).unsqueeze(1) / self.dim
        input_minus_miu = input - miu.expand_as(input)
        # per-sample standard deviation
        sigma = (torch.sum(input_minus_miu.pow(2), 1) / self.dim).sqrt().unsqueeze(1)
        # normalize, then apply gain and bias
        output = input_minus_miu * self.g.expand(input.size()) / sigma.expand_as(input) + self.b.expand(input.size())
        return output

    def init_weights(self):
        self.g.data.fill_(1)
        self.b.data.fill_(0)

After plugging this in before ReLU, the backward pass of gradient_penalty became much slower: 0.1149 s compared to the previous 0.0075 s.

I compiled the source code from the master branch, commit deb0aef30cdaa78f9840bfa4a919ad206e8e73a7, and also modified the ReLU source code before compiling according to your instructions.
I am wondering whether my implementation of layer normalization contains something unsuitable for double backward?

an error occurred

Hello, when I run gan_cifar10.py, this error occurs:

Traceback (most recent call last):
  File "gan_cifar10.py", line 187, in <module>
    _data = gen.next()
  File "gan_cifar10.py", line 170, in inf_train_gen
    for images, target in train_gen():
ValueError: too many values to unpack

I don't know how this mistake happened.
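One possible cause (an assumption based on the traceback, not the repo's exact code): train_gen() yields bare image batches rather than (images, target) pairs, so the two-variable unpacking fails. A hedged sketch of a tolerant infinite generator:

    def inf_train_gen(train_gen):
        """Yield image batches forever, whether the loader gives tuples or bare tensors."""
        while True:
            for batch in train_gen():
                if isinstance(batch, (tuple, list)):
                    images = batch[0]      # (images, target) style loader
                else:
                    images = batch         # loader that yields images only
                yield images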

"autograd" has no attribute 'grad'

Hello,
when I try to run gan_mnist.py, this error arises:

... ...
gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
AttributeError: 'module' object has no attribute 'grad'

How should I tackle this?
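For context, torch.autograd.grad is only available in sufficiently recent PyTorch builds, so this AttributeError usually means the installed version predates it. A quick hedged check:

    # If this prints False, the installed PyTorch is too old for autograd.grad
    # and needs to be upgraded (or built from a recent master).
    import torch
    print(torch.__version__)
    print(hasattr(torch.autograd, 'grad'))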

mone in generator

Hi,
In the GAN training code, since you already call gen_cost.backward(mone) for the generator, why do you still do gen_cost = -gen_cost?

i.e., should it not be

  • just gen_cost.backward(mone),
    or
  • gen_cost.backward() and then gen_cost = gen_cost*-1?
    Please correct me if I am wrong. (See the sketch below.)
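For what it's worth, the two variants are equivalent for a scalar cost. A small sketch (modern PyTorch, hypothetical cost) showing that backward(-1) and negating the cost first produce identical gradients, so the extra gen_cost = -gen_cost in the repo only changes the value that gets logged:

    import torch

    x = torch.randn(3, requires_grad=True)

    # Variant 1: weight the backward pass by -1 (the repo's `mone`).
    cost = x.sum()
    cost.backward(torch.tensor(-1.0))
    g1 = x.grad.clone()

    # Variant 2: negate the cost first, then a plain backward().
    x.grad = None
    (-x.sum()).backward()
    g2 = x.grad.clone()

    print(torch.allclose(g1, g2))   # True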

Some conflict between pytorch and tensorflow when using GPU at the same time

When I run gan_cifar10.py, this error occurs:
F tensorflow/stream_executor/cuda/cuda_driver.cc:334] current context was not created by the StreamExecutor cuda_driver API: 0x3af31a0; a CUDA runtime call was likely performed without using a StreamExecutor context

There may be a conflict when PyTorch and TensorFlow use the GPU at the same time. How can I fix this problem when I run the training code in PyTorch and evaluate the inception_score in TensorFlow?
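One workaround that is sometimes suggested (an assumption, not something this repo documents) is to stop TensorFlow from grabbing the whole GPU when the inception-score session is created, e.g. via the TensorFlow 1.x allow_growth option, or to run the inception-score evaluation in a separate process:

    import tensorflow as tf

    # Let the TF session used for the inception score allocate GPU memory lazily
    # instead of claiming the full device alongside PyTorch.
    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True
    sess = tf.Session(config=config)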

Why can't I train a model by gan_mnist.py?

When I run the command python gan_mnist.py, it shows the following error. How can I solve this problem?
Thanks a ton!

Traceback (most recent call last):
  File "gan_mnist.py", line 257, in <module>
    lib.plot.flush()
  File "/data2/wangjifeng/wgan-gp/tflib/plot.py", line 35, in flush
    plt.savefig(name.replace(' ', '_')+'.jpg')
  File "/root/anaconda2/lib/python2.7/site-packages/matplotlib/pyplot.py", line 697, in savefig
    res = fig.savefig(*args, **kwargs)
  File "/root/anaconda2/lib/python2.7/site-packages/matplotlib/figure.py", line 1573, in savefig
    self.canvas.print_figure(*args, **kwargs)
  File "/root/anaconda2/lib/python2.7/site-packages/matplotlib/backend_bases.py", line 2252, in print_figure
    **kwargs)
  File "/root/anaconda2/lib/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 610, in print_jpg
    return background.save(filename_or_obj, format='jpeg', **options)
  File "/root/anaconda2/lib/python2.7/site-packages/PIL/Image.py", line 1672, in save
    fp = builtins.open(filename, "wb")
IOError: [Errno 2] No such file or directory: 'tmp/mnist/wasserstein_distance.jpg'
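The traceback ends with matplotlib failing to open tmp/mnist/wasserstein_distance.jpg, which suggests the output directory simply does not exist. A hedged fix (path taken from the traceback) is to create it before training starts:

    import os

    # Create the directory the plotting/saving code writes into.
    if not os.path.exists('tmp/mnist'):
        os.makedirs('tmp/mnist')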

I installed pytorch 0.1.12+ac1c674 and it errors like this.

pytorch/pytorch@ac1c674

Traceback (most recent call last):
  File "gan_toy.py", line 270, in <module>
    gradient_penalty.backward()
  File "/home/yan/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 151, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/yan/anaconda3/lib/python3.6/site-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
  File "/home/yan/anaconda3/lib/python3.6/site-packages/torch/autograd/function.py", line 90, in apply
    return self._forward_cls.backward(self, *args)
  File "/home/yan/anaconda3/lib/python3.6/site-packages/torch/nn/_functions/linear.py", line 23, in backward
    grad_input = torch.mm(grad_output, weight)
  File "/home/yan/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 539, in mm
    return self._static_blas(Addmm, (output, 0, 1, self, matrix), False)
  File "/home/yan/anaconda3/lib/python3.6/site-packages/torch/autograd/variable.py", line 532, in _static_blas
    return cls.apply(*(args[:1] + args[-2:] + (alpha, beta, inplace)))
  File "/home/yan/anaconda3/lib/python3.6/site-packages/torch/autograd/_functions/blas.py", line 24, in forward
    matrix1, matrix2, out=output)
TypeError: torch.addmm received an invalid combination of arguments - got (int, torch.cuda.ByteTensor, int, torch.cuda.ByteTensor, torch.cuda.FloatTensor, out=torch.cuda.ByteTensor), but expected one of:
 * (torch.cuda.ByteTensor source, torch.cuda.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
 * (torch.cuda.ByteTensor source, torch.cuda.sparse.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
 * (int beta, torch.cuda.ByteTensor source, torch.cuda.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
 * (torch.cuda.ByteTensor source, int alpha, torch.cuda.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
 * (int beta, torch.cuda.ByteTensor source, torch.cuda.sparse.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
 * (torch.cuda.ByteTensor source, int alpha, torch.cuda.sparse.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
 * (int beta, torch.cuda.ByteTensor source, int alpha, torch.cuda.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.ByteTensor, int, torch.cuda.ByteTensor, torch.cuda.FloatTensor, out=torch.cuda.ByteTensor)
 * (int beta, torch.cuda.ByteTensor source, int alpha, torch.cuda.sparse.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.ByteTensor, int, torch.cuda.ByteTensor, torch.cuda.FloatTensor, out=torch.cuda.ByteTensor)

Is there any quick fix, or should I just wait for the milestone when stable double backprop has been implemented?
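For reference, the argument list in the error shows a torch.cuda.ByteTensor flowing through the linear layer's double backward, which usually means the input data reached the critic as uint8. A hedged sketch of the usual fix, casting the batch to float before it enters the network (names are stand-ins, not the repo's exact variables):

    import numpy as np
    import torch

    batch = np.random.randint(0, 256, size=(50, 2), dtype=np.uint8)  # stand-in uint8 minibatch
    real_data = torch.from_numpy(batch).float()   # ByteTensor -> FloatTensor before the critic
    # real_data = real_data.cuda()                # move to the GPU if use_cuda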

multi-GPU?

It seems the model cannot be trained on multiple GPUs. When I try, it reports the error "Scatter is not differentiable twice".

sometimes loss is negative during training model

I use TensorFlow to do the work, and my loss is:

    self.g_loss = -tf.reduce_mean(d_logits_fake)
    self.d_loss = tf.reduce_mean(d_logits_fake) - tf.reduce_mean(d_logits_real) + GP

But during training I find that g_loss is sometimes negative. I don't know why I get this. Can someone explain it?

about wgan-gp

I rewrote the caogang/wgan-gp code to generate 512*512 images; I added 2-3 conv2d layers to the D net and G net, but when I run that code the loss explodes to 30000000, and after training the model for 50k epochs with just one photo the generated photo is still not good. So I want to know whether it is the model or my learning rate (1e-4) that causes this.

Mode Collapse WGAN

Hey there,

After running a similar implementation, I still seem to be getting some degeneracy in the outputs. It seems that I am unable to model the diversity of the actual distribution. What hyperparameters do I need to set to avoid this?

How do I run this code?

I have no idea how to use this code. Should it be run as "cd /root...." and then "gan_cifar10.py"?

WGAN-gp loss keeps going large

Hello, I've run your code on my own dataset. However, the d_loss decreases from 10 (which equals lambda) to a very small negative number (like -10000), the Wasserstein distance keeps growing to the order of millions, and the gradient penalty changes from 10 to 0 and then grows to the order of thousands. I've worked on this problem for several days but I still can't solve it. Can anyone help me with this?
@caogang

Memory Leak

Hello,

I tried to run the gan_mnist.py file both with the most current master version of pytorch (0.2.0+75bb50b) and with an older commit (0.2.0+c62490b).

With both versions, the memory used by the code keeps increasing every iteration until the program ends with an out-of-memory error.

When I took only the code of the function calc_gradient_penalty() and integrated it into my code, it caused the same memory leak.

Surprisingly, when a friend took the exact same code and integrated it into CycleGAN, it did not cause a memory leak.

Do you know what is the problem, or of a specific git revision of pytorch where there is no memory leak?

Why "no required computing gradients"?

I used the same calc_GradientPenalty method as yours and the latest master branch of pytorch ('0.1.12+625850c'), but it gets stuck at penalty.backward() with the error

"RuntimeError: there are no graph nodes that require computing gradients"

I used requires_grad = True for the interpolates variable.
Thanks!
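For reference, a hedged sketch (modern PyTorch syntax, hypothetical critic) of the ordering that usually matters here: the interpolated points must require gradients before the critic's forward pass, otherwise nothing in the penalty's graph requires grad.

    import torch
    from torch import autograd

    critic = torch.nn.Linear(10, 1)              # hypothetical critic
    real, fake = torch.randn(4, 10), torch.randn(4, 10)
    alpha = torch.rand(4, 1)

    interpolates = alpha * real + (1 - alpha) * fake
    interpolates.requires_grad_(True)            # must happen BEFORE the forward pass
    disc_interpolates = critic(interpolates)

    gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
                              grad_outputs=torch.ones_like(disc_interpolates),
                              create_graph=True)[0]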

gradients.norm(2, dim=1), dim=1?

@caogang Thanks for your good code! But something confuses me in gan_cifar10.py.

(screenshot of the gradient-penalty line)

dim=1? Why is the norm taken only over the second axis? I think it should be taken over all axes except the batch axis.

Why d_loss is rising

I ran gan_mnist.py and it generated train_disc_cost.jpg, but why is d_loss rising? I see the code is:

    disc_train_op = tf.train.AdamOptimizer(
        learning_rate=2e-4,
        beta1=0.5
    ).minimize(disc_cost, var_list=disc_params)

The code minimizes d_loss, so I don't understand the result.

status?

I'm trying to implement the improved training algorithm and am basing my work on your code. I get the error "AttributeError: 'module' object has no attribute 'grad'". Is this because I am using the latest PyTorch release rather than PyTorch master? I could try to use master, but if it is really true that convnet layers are not yet supported, it is sort of pointless for now(?).

PyTorch version

I just wonder which PyTorch version this repo targets. I guess it's 0.2.0?

Also, will this code be upgraded for PyTorch 1.0?

Logic problem with calc_gradient_penalty in CNN case

Right now, you're getting the norm of the gradient in gan_mnist with gradient_penalty = ((gradients.norm(2, dim=1) - 1) ** 2).mean() * LAMBDA.

gradients is of shape [BATCH_SIZE, 1, 28, 28]. We want to calculate the norm of the gradient PER SAMPLE, and then use that as an error metric. That means we need to collapse gradients into one value per sample, i.e. to shape [BATCH_SIZE, 1] or just [BATCH_SIZE].

But, gradients.norm(dim=1) collapses it to size [BATCH_SIZE, 28, 28], which isn't right.

Instead, gradients needs to be reshaped to be flat before you take the norm.

I monitored the value of gradient_penalty, and doing it the way it is now, it explodes to around 10000 for gan_mnist, even when the network's gradients were reasonable, so formal reasons aside, I'm pretty sure there's a bug.

Great library by the way, it's made my life really easy. Thanks for posting it.

Want me to make a PR?

What do you think?
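A hedged sketch of the fix this issue proposes (LAMBDA and the gradient shape follow gan_mnist.py; the random tensor stands in for the output of autograd.grad): flatten each sample's gradient to a vector, then take one 2-norm per sample before squaring the deviation from 1.

    import torch

    LAMBDA = 10
    gradients = torch.randn(50, 1, 28, 28)            # stand-in for autograd.grad(...)[0]

    flat = gradients.view(gradients.size(0), -1)      # [BATCH_SIZE, 784]: one vector per sample
    gradient_penalty = ((flat.norm(2, dim=1) - 1) ** 2).mean() * LAMBDA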

why D_real.backward(one) and D_fake.backward(mone)?

Thanks for your excellent work. I have two points of confusion about the operations in the title.

1: What are one/mone doing here?

2: I have read some other GAN code; it first computes a loss, then does the backward pass (loss_D_real.backward() or loss_D_fake.backward()). What is the reason you are not using a loss? (See the sketch below.)
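For comparison, a hedged sketch (hypothetical tiny critic) of the single-scalar-loss style those other codebases use; it produces the same gradients as the paired D_real.backward(mone) / D_fake.backward(one) calls, where one = +1 and mone = -1 are just the scalar weights passed to backward().

    import torch

    netD = torch.nn.Linear(10, 1)                 # stand-in critic
    real, fake = torch.randn(4, 10), torch.randn(4, 10)

    D_real = netD(real).mean()
    D_fake = netD(fake).mean()

    # One scalar loss, one backward() call; the full WGAN-GP loss would also
    # add the gradient-penalty term here.
    d_loss = D_fake - D_real
    d_loss.backward()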

question of the code

Hi,
what does the 'grad_outputs' argument in line 131 of gan_cifar10.py stand for? Should the parameter be interpolates.size() instead of disc_interpolates.size()?

A question about Dcost

Excuse me, can you tell me how we can take the result from the discriminator and then multiply it by -1 as the cost? Especially for the MNIST dataset, we use a sigmoid as the last layer, so the cost is negative all the time. Thank you.

Data parallel problem (with multi GPUs)

I tried to implement your code with multiple GPUs, but the network hits the error

"RuntimeError : Scatter is not differentiable twice"

when I add torch.nn.DataParallel() to the Discriminator and Generator.
Following the suggestion of "gchanan", the error changes to

"RuntimeError : arguments are located on different GPUs at /opt/conda/conda-bld/pytorch_1503966894950/work/torch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:215"

How can I fix the problem? Thank you.

AttributeError: 'generator' object has no attribute 'next'

Hi, thanks for sharing the code.
Not sure what the source of the error is; I am using Python 3.6 and PyTorch 0.4.

The line where the error occurs is _data = gen.next(), in the file gan_cifar10.py:

    ###########################
    # (1) Update D network
    ###########################
    for p in netD.parameters():  # reset requires_grad
        p.requires_grad = True  # they are set to False below in netG update
    for i in range(CRITIC_ITERS):
        _data = gen.next()
        netD.zero_grad()
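A likely cause, given the Python version: generator objects lost their .next() method in Python 3, so the call has to go through the built-in next(), which works in both Python 2 and 3. A minimal hedged sketch (placeholder generator, not the repo's data pipeline):

    def inf_gen():
        while True:
            yield "batch"           # placeholder for a real minibatch

    gen = inf_gen()
    _data = next(gen)               # works on Python 2 and 3; gen.next() is Python 2 only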

An error occurred

Hello, your code is great, but an error occurred when I ran it:

Traceback (most recent call last):
  File "G:\desktop\myProject\WGAN_gp\gan_mnist.py", line 253, in <module>
    generate_image(iteration, netG)
  File "G:\desktop\myProject\WGAN_gp\gan_mnist.py", line 119, in generate_image
    'tmp/mnist/samples_{}.png'.format(frame)
  File "G:\desktop\myProject\WGAN_gp\tflib\save_images.py", line 36, in save_images
    img[j*h:j*h+h, i*w:i*w+w] = x
ValueError: could not broadcast input array from shape (28,28) into shape (26,28)

Why is this happening? I'm completely puzzled; any advice would be much appreciated!

Index [0] of grad tensor

Hi,

I am a little confused about taking only the zero index of the gradient tensor in the penalty function:

gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates, grad_outputs=torch.ones(disc_interpolates.size()), create_graph=True, retain_graph=True, only_inputs=True)[0]

Why is it not possible to take the whole grad tensor?

Best
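For reference, a small hedged demonstration (not the repo's code): autograd.grad always returns a tuple with one gradient per element of inputs, so when inputs is just interpolates, [0] simply unwraps that single full gradient tensor; nothing is being discarded.

    import torch
    from torch import autograd

    x = torch.randn(4, 3, requires_grad=True)
    y = (x * 2).sum()

    grads = autograd.grad(outputs=y, inputs=[x])
    print(type(grads), len(grads))   # <class 'tuple'> 1 -- one entry per input
    print(grads[0].shape)            # torch.Size([4, 3]) -- the whole gradient w.r.t. x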

shouldn't it be D_real.backward(one)?

```
D_real.backward(mone)

# train with fake
noise = torch.randn(BATCH_SIZE, 128)
if use_cuda:
    noise = noise.cuda()
noisev = autograd.Variable(noise, volatile=True)  # totally freeze netG
fake = autograd.Variable(netG(noisev).data)
inputv = fake
D_fake = netD(inputv)
D_fake = D_fake.mean()
D_fake.backward(one)
```

Issues with Python3 Version

When I try to run with Python 3, I face a lot of issues.
I have fixed almost all of them except the error in the plot.py file, in flush():

AxisError: axis -1 is out of bounds for array of dimension 0.

Can you kindly help with this?

def flush():
    prints = []

    for name, vals in _since_last_flush.items():
        prints.append("{}\t{}".format(name, np.mean(vals.values())))
        _since_beginning[name].update(vals)

        x_vals = np.sort(_since_beginning[name].keys())
        y_vals = [_since_beginning[name][x] for x in x_vals]

        plt.clf()
        plt.plot(x_vals, y_vals)
        plt.xlabel('iteration')
        plt.ylabel(name)
        plt.savefig(name.replace(' ', '_')+'.jpg')
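A hedged guess at the cause, based on the excerpt above: in Python 3, dict.values() and dict.keys() return view objects, so np.mean(vals.values()) sees a zero-dimensional object array (hence the axis error) and np.sort on keys() has the same problem. Materializing the views with list() is the usual fix for the two offending lines:

    # Python 3 fix sketch for the two offending lines in flush().
    prints.append("{}\t{}".format(name, np.mean(list(vals.values()))))
    x_vals = np.sort(list(_since_beginning[name].keys()))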

nans

When I run gan_toy.py I get nans for the discriminator cost, generator cost and wasserstein distance:

iter 99 tmp/8gaussians/disc cost nan tmp/8gaussians/gen cost nan tmp/8gaussians/wasserstein distance nan
/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/contour.py:1514: UserWarning: Warning: converting a masked element to nan.
self.zmax = float(z.max())
/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/contour.py:1515: UserWarning: Warning: converting a masked element to nan.
self.zmin = float(z.min())
/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/contour.py:1153: RuntimeWarning: invalid value encountered in greater
return lev[(lev > zmin) & (lev < zmax)]
/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/contour.py:1153: RuntimeWarning: invalid value encountered in less
return lev[(lev > zmin) & (lev < zmax)]
Traceback (most recent call last):
  File "gan_toy.py", line 308, in <module>
    generate_image(_data)
  File "gan_toy.py", line 123, in generate_image
    plt.contour(x, y, disc_map.reshape((len(x), len(y))).transpose())
  File "/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2853, in contour
    ret = ax.contour(*args, **kwargs)
  File "/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/__init__.py", line 1898, in inner
    return func(ax, *args, **kwargs)
  File "/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 5825, in contour
    contours = mcontour.QuadContourSet(self, *args, **kwargs)
  File "/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/contour.py", line 865, in __init__
    self._process_levels()
  File "/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/matplotlib/contour.py", line 1199, in _process_levels
    self.vmin = np.amin(self.levels)
  File "/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2352, in amin
    out=out, **kwargs)
  File "/home/eecs/shiry/anaconda2/lib/python2.7/site-packages/numpy/core/_methods.py", line 29, in _amin
    return umr_minimum(a, axis, None, out, keepdims)
ValueError: zero-size array to reduction operation minimum which has no identity

I am using the following commit of the pytorch master:

commit 329a2f7d27543ef21353c32b520980c666fa12cf
Author: Kai Arulkumaran [email protected]
Date: Sat Jun 17 16:13:03 2017 +0100

Prevent divide by zero in dropout with p=1

Have you seen this behaviour before?
I was also seeing nans in gan_mnist before you added conv2d.

Issues about running gan_toy.py

Hi,

I'm trying to run gan_toy.py without any modifications. I use the master version of pytorch after commit #1507. However, there are some errors when I run the code:

Traceback (most recent call last):
  File "gan_toy.py", line 270, in <module>
    gradient_penalty.backward()
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 145, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/__init__.py", line 98, in backward
    variables, grad_variables, retain_graph)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/function.py", line 90, in apply
    return self._forward_cls.backward(self, *args)
  File "/usr/local/lib/python2.7/dist-packages/torch/nn/_functions/linear.py", line 23, in backward
    grad_input = torch.mm(grad_output, weight)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 531, in mm
    return self._static_blas(Addmm, (output, 0, 1, self, matrix), False)
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/variable.py", line 524, in _static_blas
    return cls.apply(*(args[:1] + args[-2:] + (alpha, beta, inplace)))
  File "/usr/local/lib/python2.7/dist-packages/torch/autograd/_functions/blas.py", line 24, in forward
    matrix1, matrix2, out=output)
TypeError: torch.addmm received an invalid combination of arguments - got (int, torch.cuda.ByteTensor, int, torch.cuda.ByteTensor, torch.cuda.FloatTensor, out=torch.cuda.ByteTensor), but expected one of:

  • (torch.cuda.ByteTensor source, torch.cuda.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
  • (torch.cuda.ByteTensor source, torch.cuda.sparse.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
  • (int beta, torch.cuda.ByteTensor source, torch.cuda.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
  • (torch.cuda.ByteTensor source, int alpha, torch.cuda.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
  • (int beta, torch.cuda.ByteTensor source, torch.cuda.sparse.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
  • (torch.cuda.ByteTensor source, int alpha, torch.cuda.sparse.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
  • (int beta, torch.cuda.ByteTensor source, int alpha, torch.cuda.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
    didn't match because some of the arguments have invalid types: (int, torch.cuda.ByteTensor, int, torch.cuda.ByteTensor, torch.cuda.FloatTensor, out=torch.cuda.ByteTensor)
  • (int beta, torch.cuda.ByteTensor source, int alpha, torch.cuda.sparse.ByteTensor mat1, torch.cuda.ByteTensor mat2, *, torch.cuda.ByteTensor out)
    didn't match because some of the arguments have invalid types: (int, torch.cuda.ByteTensor, int, torch.cuda.ByteTensor, torch.cuda.FloatTensor, out=torch.cuda.ByteTensor)

I'm wondering whether you have any ideas about the causes of this problem.

Thanks.

RuntimeError: cuda runtime error (2) : out of memory at /py/conda-bld/pytorch_1493680494901/work/torch/lib/THC/generic/THCStorage.cu:66

I get this error trying to run the mnist example. I have a Titan X GPU so I don't think I should run out of memory on mnist. I'm using PyTorch version 0.1.12_2 and Python 3.

Generator (
  (block1): Sequential (
    (0): ConvTranspose2d(256, 128, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU (inplace)
  )
  (block2): Sequential (
    (0): ConvTranspose2d(128, 64, kernel_size=(5, 5), stride=(1, 1))
    (1): ReLU (inplace)
  )
  (deconv_out): ConvTranspose2d(64, 1, kernel_size=(8, 8), stride=(2, 2))
  (preprocess): Sequential (
    (0): Linear (128 -> 4096)
    (1): ReLU (inplace)
  )
  (sigmoid): Sigmoid ()
)
Discriminator (
  (main): Sequential (
    (0): Linear (784 -> 4096)
    (1): ReLU (inplace)
    (2): Linear (4096 -> 4096)
    (3): ReLU (inplace)
    (4): Linear (4096 -> 4096)
    (5): ReLU (inplace)
    (6): Linear (4096 -> 4096)
    (7): ReLU (inplace)
    (8): Linear (4096 -> 4096)
    (9): ReLU (inplace)
    (10): Linear (4096 -> 1)
  )
)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-21-c32a204873f5> in <module>()
      4 print(net_D)
      5 if use_cuda:
----> 6     net_D = net_D.cuda()
      7     net_G = net_G.cuda()
      8 opt_D = optim.Adam(net_D.parameters(), lr=1e04, betas=(0.5, 0.9))

/home/clu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in cuda(self, device_id)
    145                 copied to that device
    146         """
--> 147         return self._apply(lambda t: t.cuda(device_id))
    148 
    149     def cpu(self, device_id=None):

/home/clu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in _apply(self, fn)
    116     def _apply(self, fn):
    117         for module in self.children():
--> 118             module._apply(fn)
    119 
    120         for param in self._parameters.values():

/home/clu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in _apply(self, fn)
    116     def _apply(self, fn):
    117         for module in self.children():
--> 118             module._apply(fn)
    119 
    120         for param in self._parameters.values():

/home/clu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in _apply(self, fn)
    122                 # Variables stored in modules are graph leaves, and we don't
    123                 # want to create copy nodes, so we have to unpack the data.
--> 124                 param.data = fn(param.data)
    125                 if param._grad is not None:
    126                     param._grad.data = fn(param._grad.data)

/home/clu/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py in <lambda>(t)
    145                 copied to that device
    146         """
--> 147         return self._apply(lambda t: t.cuda(device_id))
    148 
    149     def cpu(self, device_id=None):

/home/clu/anaconda3/lib/python3.6/site-packages/torch/_utils.py in _cuda(self, device, async)
     63         else:
     64             new_type = getattr(torch.cuda, self.__class__.__name__)
---> 65             return new_type(self.size()).copy_(self, async)
     66 
     67 

Zero gradient

I'm trying to implement the gradient-penalty approach using this code, but this code block:

    gradients = autograd.grad(outputs=disc_interpolates, inputs=interpolates,
                              grad_outputs=torch.ones(disc_interpolates.size()).cuda(gpu) if use_cuda else torch.ones(
                                  disc_interpolates.size()),
                              create_graph=True, retain_graph=True, only_inputs=True)[0]

always returns gradients of 0.

Would this be caused by my using ConvTranspose2d units?
