tomlepaine / fast-wavenet

Speedy Wavenet generation using dynamic programming :zap:

License: GNU General Public License v3.0

Jupyter Notebook 20.07% Python 79.93%

deep-learning machine-learning tensorflow wavenet

fast-wavenet's Issues

Demo produces ValueError instantiating Model

Using OS X 10.11.6, Python 2.7, tensorflow 0.9.0, the line

model = Model(num_time_samples=num_time_samples,
              num_channels=num_channels,
              gpu_fraction=gpu_fraction)

in the "demo" code produces the following error and fails:

Traceback (most recent call last):
  File "/fast-wavenet/demo.py", line 17, in <module>
    gpu_fraction=gpu_fraction)
  File "/fast-wavenet/wavenet/models.py", line 47, in __init__
    outputs, targets)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 454, in sparse_softmax_cross_entropy_with_logits
    logits, labels, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 1450, in _sparse_softmax_cross_entropy_with_logits
    features=features, labels=labels, name=name)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 704, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2262, in create_op
    set_shapes_for_outputs(ret)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1702, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/ops/nn_ops.py", line 462, in _SparseSoftmaxCrossEntropyWithLogitsShape
    input_shape = logits_shape.with_rank(2)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/framework/tensor_shape.py", line 641, in with_rank
    raise ValueError("Shape %s must have rank %d" % (self, rank))
ValueError: Shape (?, 35315, 256) must have rank 2
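
The shape check in this TensorFlow release appears to accept only rank-2 logits, while the model passes a rank-3 tensor of shape (batch, time, classes). One possible workaround, offered as an assumption rather than the maintainer's fix, is to merge the batch and time axes before computing the loss. The pure-Python sketch below shows that flattening and the per-step loss; `flatten_time_axis` and `sparse_softmax_cross_entropy` are hypothetical helper names, and in TensorFlow the same reshaping would be done with `tf.reshape`.

```python
import math


def flatten_time_axis(logits, labels):
    """Merge the batch and time axes.

    logits: nested list shaped [batch][time][classes]
    labels: nested list shaped [batch][time]
    Returns rank-2 logits [batch*time][classes] and rank-1 labels [batch*time].
    """
    flat_logits = [step for sequence in logits for step in sequence]
    flat_labels = [label for sequence in labels for label in sequence]
    return flat_logits, flat_labels


def sparse_softmax_cross_entropy(flat_logits, flat_labels):
    """Per-row cross-entropy between softmax(logits) and an integer label."""
    losses = []
    for row, label in zip(flat_logits, flat_labels):
        peak = max(row)  # subtract the max for numerical stability
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        losses.append(log_norm - row[label])
    return losses


# One sequence, two time steps, two classes.
logits = [[[0.0, 0.0], [0.0, 0.0]]]
labels = [[0, 1]]
flat_logits, flat_labels = flatten_time_axis(logits, labels)
losses = sparse_softmax_cross_entropy(flat_logits, flat_labels)
```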

CUDA out of memory issue

Hi, I'm having an "out of memory" issue while running the demo.

Snippet:
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:04:00.0
Total memory: 11.91GiB
Free memory: 11.67GiB
(full log below)

I have tried to lower the model parameters, but nothing seems to work. Do you have any advice?
Why does the demo take so much GPU memory?
Thanks a lot,
Daniele

Full log:
python demo.py
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:119] Couldn't open CUDA library libcudnn.so. LD_LIBRARY_PATH:
I tensorflow/stream_executor/cuda/cuda_dnn.cc:3459] Unable to load cuDNN DSO
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:04:00.0
Total memory: 11.91GiB
Free memory: 11.67GiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x48c4140
I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
I tensorflow/core/common_runtime/gpu/gpu_device.cc:885] Found device 1 with properties:
name: GeForce GTX 980
major: 5 minor: 2 memoryClockRate (GHz) 1.342
pciBusID 0000:0a:00.0
Total memory: 3.94GiB
Free memory: 487.88MiB
W tensorflow/stream_executor/cuda/cuda_driver.cc:590] creating context when one is currently active; existing: 0x48c0320
E tensorflow/core/common_runtime/direct_session.cc:135] Internal: failed initializing StreamExecutor for CUDA device ordinal 2: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_OUT_OF_MEMORY; total memory reported: 18446744073648275456
Traceback (most recent call last):
  File "demo.py", line 16, in <module>
    gpu_fraction=gpu_fraction)
  File "/home/daniele/fast-wavenet-master/wavenet/models.py", line 54, in __init__
    sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))
  File "/home/daniele/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1186, in __init__
    super(Session, self).__init__(target, graph, config=config)
  File "/home/daniele/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 551, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/daniele/.local/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
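
The log suggests the failure happens while TensorFlow creates a CUDA context on every device it finds, before the model is built; the GTX 980 reports only 487.88MiB free. If that reading is right, shrinking the model would not help, and one thing to try is limiting the process to the TITAN X through the `CUDA_VISIBLE_DEVICES` environment variable. The sketch below is a minimal illustration; `restrict_visible_gpus` is a hypothetical helper, and it has to run before TensorFlow is imported.

```python
import os


def restrict_visible_gpus(device_ids):
    """Expose only the listed CUDA device indices to this process.

    Must be called before the framework initialises CUDA (i.e. before
    `import tensorflow`), because the variable is read at that point.
    """
    value = ",".join(str(index) for index in device_ids)
    os.environ["CUDA_VISIBLE_DEVICES"] = value
    return value


# Assumption: device 0 is the TITAN X, as listed in the log above.
visible = restrict_visible_gpus([0])
```

Passing an empty list sets the variable to an empty string, which hides every GPU from the process.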

Longer sample produces very noisy output

Yes, I'm fairly new to TF and still... training, so bear with me for my lame questions.
I'm experimenting with the demo. It trained and generated correctly with the very short audio sample provided with the code, so I wanted to try something different. I ran the demo on a short (about 20 seconds) sample from a well-known Beethoven symphony and then generated 300000 samples. Something strange happened: only the first half second is fine; the rest of the generated sound is extremely noisy and barely recognizable.
In the code, I changed only the path of the input audio and the duration of the generated audio.
What am I doing wrong? Thank you for your patience in reading my post (and answering, if possible!)

Training on multiple samples

Hi, I'm pretty new to machine learning, so maybe this is a silly question, but I was wondering how I would train this network using more than just one sample? Because it looks like the make_batch function only loads a single file. Also, I'm not sure if this is related, but would I need to train a separate network for each class of generator? Or is there a way to label the training data? Any help and tutoring is much appreciated!
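
One way to think about multi-file training is as a batch with one row per clip. Because the demo builds the model with a fixed `num_time_samples`, the clips would need a common length. The sketch below assumes each clip is already quantized to integer class indices (the real `make_batch` preprocessing is not reproduced here) and uses the usual next-sample setup, where the target is the input shifted by one step. `make_multi_batch` is a hypothetical name.

```python
def make_multi_batch(sequences):
    """Build next-sample-prediction inputs and targets from several clips.

    sequences: list of lists of quantized samples (integer class indices).
    All clips are cropped to the shortest one so the batch is rectangular.
    """
    length = min(len(clip) for clip in sequences)
    cropped = [clip[:length] for clip in sequences]
    inputs = [clip[:-1] for clip in cropped]   # samples 0 .. n-2
    targets = [clip[1:] for clip in cropped]   # samples 1 .. n-1
    return inputs, targets


inputs, targets = make_multi_batch([[1, 2, 3, 4], [5, 6, 7]])
```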

Generating novel audio

Ran your demo, and it worked fine. The generator can reproduce the training sample. Now I want to generate some new sounds, e.g., by changing the initial conditions. I tried changing the input sample to the generator, but that didn't change anything---it still reproduces the training excerpt. Any ideas? I'm wondering if the model has been overfit?
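
A possible factor: the generator's output op, visible in a traceback elsewhere on this page as `tf.argmax(tf.nn.softmax(outputs), 1)`, always picks the single most likely class, so a model that has memorised one clip will tend to replay it. Drawing from the predicted distribution instead is a common way to get variation. The sketch below contrasts the two choices in plain Python; the function names and the `temperature` parameter are illustrative assumptions, not part of the repository.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; higher temperature flattens them."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def pick_argmax(logits):
    """Deterministic choice: index of the largest logit."""
    return max(range(len(logits)), key=lambda index: logits[index])


def pick_sample(logits, rng, temperature=1.0):
    """Stochastic choice: draw an index according to the softmax."""
    probabilities = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probabilities)[0]


rng = random.Random(0)
greedy = pick_argmax([0.1, 2.0, 0.3])
sampled = pick_sample([0.1, 2.0, 0.3], rng)
```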

Error when running ipynb: No module named 'layers'

I use Google Colab (python 3).

I cloned the repo

!git clone https://github.com/tomlepaine/fast-wavenet.git
%cd fast-wavenet

Then I ran this:

from time import time

from wavenet.utils import make_batch
from wavenet.models import Model, Generator

from IPython.display import Audio

%matplotlib inline

And I got this error:

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-6-4dd95abb40c7> in <module>()
      2 
      3 from wavenet.utils import make_batch
----> 4 from wavenet.models import Model, Generator
      5 
      6 from IPython.display import Audio

/content/fast-wavenet/wavenet/models.py in <module>()
      2 import numpy as np
      3 import tensorflow as tf
----> 4 from layers import (_causal_linear, _output_linear, conv1d,
      5                     dilated_conv1d)
      6 

ModuleNotFoundError: No module named 'layers'

---------------------------------------------------------------------------
NOTE: If your import is failing due to a missing package, you can
manually install dependencies using either !pip or !apt.

To view examples of installing some common dependencies, click the
"Open Examples" button below.
---------------------------------------------------------------------------

Did I miss something?

generator = Generator(model) in demo produces error

When running the demo using TensorFlow 0.10, Python 3.5 (Anaconda), commit 20485a2, I get the following:

Make Generator.

TypeError Traceback (most recent call last)
in ()
----> 1 generator = Generator(model)

/home/denis/fast-wavenet/wavenet/models.py in __init__(self, model, batch_size, input_size)
99 count += 1
100
--> 101 outputs = _output_linear(h)
102
103 out_ops = [tf.argmax(tf.nn.softmax(outputs), 1)]

/home/denis/fast-wavenet/wavenet/layers.py in _output_linear(h, name)
170
171 def _output_linear(h, name=''):
--> 172 with tf.variable_scope(name, reuse=True):
173 w = tf.get_variable('w')[0, :, :]
174 b = tf.get_variable('b')

/home/denis/anaconda3/lib/python3.5/contextlib.py in __enter__(self)
57 def __enter__(self):
58 try:
---> 59 return next(self.gen)
60 except StopIteration:
61 raise RuntimeError("generator didn't yield") from None

/home/denis/anaconda3/lib/python3.5/site-packages/tensorflow/python/ops/variable_scope.py in variable_scope(name_or_scope, default_name, values, initializer, regularizer, caching_device, partitioner, custom_getter, reuse, dtype)
1350 """
1351 if default_name is None and not name_or_scope:
-> 1352 raise TypeError("If default_name is None then name_or_scope is required")
1353 if values is None:
1354 values = []

TypeError: If default_name is None then name_or_scope is required

[Google colab] TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64

I used Google colab (python 3 GPU)
How to reproduce the error:

Open new python3 project in Google colab
Run this code:

!git clone https://github.com/tomlepaine/fast-wavenet.git
%cd fast-wavenet
!mkdir wavenet/assets
!cp assets/voice.wav wavenet/assets/voice.wav
%cd wavenet

from time import time
from utils import make_batch
from models import Model, Generator
from IPython.display import Audio
%matplotlib inline

inputs, targets = make_batch('assets/voice.wav')
num_time_samples = inputs.shape[1]
num_channels = 1
gpu_fraction = 1.0


model = Model(num_time_samples=num_time_samples,
              num_channels=num_channels,
              gpu_fraction=gpu_fraction)

Audio(inputs.reshape(inputs.shape[1]), rate=44100)

And you will see this error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-11-afc93d99fff0> in <module>()
      2 model = Model(num_time_samples=num_time_samples,
      3               num_channels=num_channels,
----> 4               gpu_fraction=gpu_fraction)
      5 
      6 Audio(inputs.reshape(inputs.shape[1]), rate=44100)

/content/fast-wavenet/wavenet/models.py in __init__(self, num_time_samples, num_channels, num_classes, num_blocks, num_layers, num_hidden, gpu_fraction)
     34                 rate = 2**i
     35                 name = 'b{}-l{}'.format(b, i)
---> 36                 h = dilated_conv1d(h, num_hidden, rate=rate, name=name)
     37                 hs.append(h)
     38 

/content/fast-wavenet/wavenet/layers.py in dilated_conv1d(inputs, out_channels, filter_width, rate, padding, name, gain, activation)
    136     with tf.variable_scope(name):
    137         _, width, _ = inputs.get_shape().as_list()
--> 138         inputs_ = time_to_batch(inputs, rate=rate)
    139         outputs_ = conv1d(inputs_,
    140                           out_channels=out_channels,

/content/fast-wavenet/wavenet/layers.py in time_to_batch(inputs, rate)
     24     padded = tf.pad(inputs, [[0, 0], [pad_left, 0], [0, 0]])
     25     transposed = tf.transpose(padded, perm)
---> 26     reshaped = tf.reshape(transposed, shape)
     27     outputs = tf.transpose(reshaped, perm)
     28     return outputs

/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_array_ops.py in reshape(tensor, shape, name)
   6480   if _ctx is None or not _ctx._eager_context.is_eager:
   6481     _, _, _op = _op_def_lib._apply_op_helper(
-> 6482         "Reshape", tensor=tensor, shape=shape, name=name)
   6483     _result = _op.outputs[:]
   6484     _inputs_flat = _op.inputs

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py in _apply_op_helper(self, op_type_name, name, **keywords)
    607               _SatisfiesTypeConstraint(base_type,
    608                                        _Attr(op_def, input_arg.type_attr),
--> 609                                        param_name=input_name)
    610             attrs[input_arg.type_attr] = attr_value
    611             inferred_from[input_arg.type_attr] = input_name

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py in _SatisfiesTypeConstraint(dtype, attr_def, param_name)
     58           "allowed values: %s" %
     59           (param_name, dtypes.as_dtype(dtype).name,
---> 60            ", ".join(dtypes.as_dtype(x).name for x in allowed_list)))
     61 
     62 

TypeError: Value passed to parameter 'shape' has DataType float32 not in list of allowed values: int32, int64

What did I miss? How can I make it work in Google Colab?

Can someone please share a pretrained model

Preferably on LJSpeech dataset.

having some problems with running Demo

There is a problem when running the demo:

Even casting the value to an integer with int() in layers.py does not solve the problem.

I don't know whether the problem lies with Python and TensorFlow themselves, since some of the reported issues are caused by version incompatibilities.

I use Python 3.5.3 and TensorFlow 1.1.0.

a basic question: how to install fast-wavenet?

Can anyone help, please?

How long does it take to generate one second of audio? Is it possible to use it on android or ios client to do inference?

As is known, for a downsized model (4,000 Hz vs. 16,000 Hz sampling rate, 16 filters vs. 256, 2 stacks vs. ??):

A Tesla K80 needs about 4 minutes to generate one second of audio.
A recent MacBook Pro needs about 15 minutes. DeepMind has reported that generating one second of audio with their model takes about 90 minutes.
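
To put those figures on one scale, each can be expressed as a slowdown relative to real time, that is, seconds of compute per second of audio. The helper name below is hypothetical; the numbers are the ones quoted above.

```python
def slowdown_factor(minutes_per_audio_second):
    """Seconds of compute needed for one second of audio."""
    return minutes_per_audio_second * 60.0


# Minutes of compute per second of audio, as quoted in this issue.
reported_minutes = {
    "Tesla K80": 4,
    "MacBook Pro": 15,
    "DeepMind (reported)": 90,
}
slowdowns = {name: slowdown_factor(m) for name, m in reported_minutes.items()}
```

On these numbers the K80 is 240 times slower than real time, the MacBook Pro 900 times, and the reported DeepMind figure 5400 times.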

Can we have some comparative voice samples ?

Just to compare against Google's implementation.

Wavenet vs fast wavenet (which one is better)

Which model runs on consumer hardware? Can fast-wavenet do anything WaveNet can do? Besides generating output much faster and needing less memory, how much memory does it use? Is this repo based on the tensorflow-wavenet repo? Also, is fast-wavenet in any way a downgrade from WaveNet in output quality (e.g. for vocal synthesis)? Thanks!

More questions about how to use coming soon to a GitHub thread near you!

demo, time_to_batch: "TypeError: Expected int32"

Hey, would really love to try this out. Here are a couple things I find when trying to run the demo...
(This is with tensorflow 0.10.0 on Mac OS X El Capitan, 10.11.6.)

Traceback (most recent call last):
  File "demo.py", line 15, in <module>
    from wavenet.models import Model, Generator
  File "/Users/myusername/exercises/neural/fast-wavenet/wavenet/models.py", line 4, in <module>
    from layers import (_causal_linear, _output_linear, conv1d,
ImportError: No module named 'layers'

If you change line 4 of models.py so it reads "from wavenet.layers" instead of just "from layers", then this error goes away. That's easy.
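
The reason this works is that Python 3 no longer resolves a bare `from layers import ...` against the importing module's own package; the import has to be absolute (`from wavenet.layers`) or explicitly relative (`from .layers`). The sketch below reproduces the behaviour with a throwaway package written to a temporary directory. The package names and the `import_works` helper are made up for the demonstration, and it assumes no unrelated top-level module called `layers` is importable.

```python
import importlib
import os
import sys
import tempfile


def import_works(import_line, pkg_name):
    """Create <pkg_name>/{__init__,layers,models}.py and try importing models.

    `import_line` becomes the only statement in models.py; "{pkg}" in it is
    replaced by the package name.
    """
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, pkg_name)
        os.makedirs(pkg_dir)
        open(os.path.join(pkg_dir, "__init__.py"), "w").close()
        with open(os.path.join(pkg_dir, "layers.py"), "w") as handle:
            handle.write("def conv1d():\n    return 'ok'\n")
        with open(os.path.join(pkg_dir, "models.py"), "w") as handle:
            handle.write(import_line.format(pkg=pkg_name) + "\n")
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            importlib.import_module(pkg_name + ".models")
            return True
        except ImportError:
            return False
        finally:
            sys.path.remove(root)


implicit_ok = import_works("from layers import conv1d", "demo_pkg_implicit")
relative_ok = import_works("from .layers import conv1d", "demo_pkg_relative")
absolute_ok = import_works("from {pkg}.layers import conv1d", "demo_pkg_absolute")
```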

The next error, I don't know how to fix...

Traceback (most recent call last):
  File "demo.py", line 32, in <module>
    gpu_fraction=gpu_fraction)
  File "/Users/myusername/exercises/neural/fast-wavenet/wavenet/models.py", line 36, in __init__
    h = dilated_conv1d(h, num_hidden, rate=rate, name=name)
  File "/Users/myusername/exercises/neural/fast-wavenet/wavenet/layers.py", line 138, in dilated_conv1d
    inputs_ = time_to_batch(inputs, rate=rate)
  File "/Users/myusername/exercises/neural/fast-wavenet/wavenet/layers.py", line 26, in time_to_batch
    reshaped = tf.reshape(transposed, shape)
  File "/Users/myusername/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1383, in reshape
    name=name)
  File "/Users/myusername/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/op_def_library.py", line 455, in apply_op
    as_ref=input_arg.is_ref)
  File "/Users/myusername/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 620, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Users/myusername/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/constant_op.py", line 179, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/Users/myusername/anaconda/lib/python3.5/site-packages/tensorflow/python/ops/constant_op.py", line 162, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape))
  File "/Users/myusername/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 353, in make_tensor_proto
    _AssertCompatible(values, dtype)
  File "/Users/myusername/anaconda/lib/python3.5/site-packages/tensorflow/python/framework/tensor_util.py", line 290, in _AssertCompatible
    (dtype.name, repr(mismatch), type(mismatch).__name__))
TypeError: Expected int32, got 35316.0 of type 'float' instead.

How do we fix that? I can see that right before the call to reshape,
transposed = Tensor("b0-l0/transpose:0", shape=(35316, ?, 1), dtype=float32)
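
The `35316.0` in the error looks like the result of a `/` division: under Python 3 that operator always returns a float, even when both operands are integers, and `tf.reshape` only accepts integer dimensions. Using floor division (`//`) or wrapping the value in `int()` keeps the dimension an integer. The sketch below is a guess at the shape computation, not the repository's code; the padding formula is an assumption chosen because it reproduces the 35316 seen in the traceback.

```python
import math


def time_to_batch_shape(width, rate, num_channels):
    """Return an all-integer target shape for the reshape.

    Assumed padding rule: round (width + rate) up to a multiple of rate.
    """
    width_pad = rate * math.ceil((width + rate) / rate)
    # Floor division keeps the first dimension an int under Python 3.
    return (width_pad // rate, -1, num_channels)


shape = time_to_batch_shape(35315, 1, 1)
float_result = 35316 / 1   # true division: a float in Python 3
int_result = 35316 // 1    # floor division: stays an int
```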

Generation time does not seem to scale linearly

Without training, I ran generator.run directly. With dilation_num = 4 generation took 19 s, with dilation_num = 8 it took 29 s, and with dilation_num = 14 it took 43 s, so the generation time does not seem to scale linearly?

from time import time

from wavenet.utils import make_batch
from wavenet.models import Model, Generator

#from IPython.display import Audio


inputs, targets = make_batch('assets/voice.wav')
num_time_samples = inputs.shape[1]
num_channels = 1
gpu_fraction = 1.0

model = Model(num_time_samples=num_time_samples,
              num_channels=num_channels,
              gpu_fraction=gpu_fraction)

#Audio(inputs.reshape(inputs.shape[1]), rate=44100)

tic = time()
#model.train(inputs, targets)
toc = time()

print('Training took {} seconds.'.format(toc-tic))

generator = Generator(model)

# Get first sample of input
input_ = inputs[:, 0:1, 0]

tic = time()
predictions = generator.run(input_, 32000)
toc = time()
print('Generating took {} seconds.'.format(toc-tic))

#Audio(predictions, rate=44100)

layers=4

$python test.py
WARNING:tensorflow:From /home/xianning.lu/sing/tf-no-mkl/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py:553: calling conv1d (from tensorflow.python.ops.nn_ops) with data_format=NHWC is deprecated and will be removed in a future version.
Instructions for updating:
`NHWC` for data_format is deprecated, use `NWC` instead
2018-11-09 18:17:04.603404: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING:tensorflow:From /home/xianning.lu/sing/tf-no-mkl/lib/python2.7/site-packages/tensorflow/python/util/tf_should_use.py:189: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
Training took 0.0 seconds.
Make Generator.
Generating took 19.5070331097 seconds.

layers=8

$python test.py
WARNING:tensorflow:From /home/xianning.lu/sing/tf-no-mkl/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py:553: calling conv1d (from tensorflow.python.ops.nn_ops) with data_format=NHWC is deprecated and will be removed in a future version.
Instructions for updating:
`NHWC` for data_format is deprecated, use `NWC` instead
2018-11-09 18:17:51.435269: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING:tensorflow:From /home/xianning.lu/sing/tf-no-mkl/lib/python2.7/site-packages/tensorflow/python/util/tf_should_use.py:189: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
Training took 0.0 seconds.
Make Generator.
Generating took 29.0180389881 seconds.

layers=14

$python test.py
WARNING:tensorflow:From /home/xianning.lu/sing/tf-no-mkl/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py:553: calling conv1d (from tensorflow.python.ops.nn_ops) with data_format=NHWC is deprecated and will be removed in a future version.
Instructions for updating:
`NHWC` for data_format is deprecated, use `NWC` instead
2018-11-09 18:15:56.114761: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
WARNING:tensorflow:From /home/xianning.lu/sing/tf-no-mkl/lib/python2.7/site-packages/tensorflow/python/util/tf_should_use.py:189: initialize_all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Use `tf.global_variables_initializer` instead.
Training took 0.0 seconds.
Make Generator.
Generating took 43.2759749889 seconds.
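
A quick check on the three timings: the cost per added layer is almost identical between the 4-to-8 and 8-to-14 steps, so the numbers are consistent with a fixed overhead plus a constant per-layer cost. That would explain why doubling the layers does not double the time. The sketch below only does the arithmetic on the figures logged above; what the fixed overhead consists of is not visible in the logs.

```python
# Generation times in seconds, keyed by the number of layers, from the logs.
timings = {4: 19.5070331097, 8: 29.0180389881, 14: 43.2759749889}


def per_layer_cost(times, low, high):
    """Average extra seconds per layer between two configurations."""
    return (times[high] - times[low]) / (high - low)


slope_4_to_8 = per_layer_cost(timings, 4, 8)
slope_8_to_14 = per_layer_cost(timings, 8, 14)
# Extrapolate back to zero layers to estimate the fixed overhead.
fixed_overhead = timings[4] - 4 * slope_4_to_8
```

Both slopes come out at roughly 2.38 s per layer, with about 10 s of fixed overhead.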

also having issues with the demo

This is what I get on step 2. I tried working with the fixes that people suggested on the other posts, but have still not had any luck. Any suggestions would be greatly appreciated!

ValueError Traceback (most recent call last)
in ()
6 model = Model(num_time_samples=num_time_samples,
7 num_channels=num_channels,
----> 8 gpu_fraction=gpu_fraction)
9
10 Audio(inputs.reshape(inputs.shape[1]), rate=44100)

/root/shared/fast-wavenet-master/wavenet/models.py in __init__(self, num_time_samples, num_channels, num_classes, num_blocks, num_layers, num_hidden, gpu_fraction)
34 for i in range(num_layers):
35 rate = 2**i
---> 36 name = 'b{}-l{}'.format(b, i)
37 h = dilated_conv1d(h, num_hidden, rate=rate, name=name)
38 hs.append(h)

/root/shared/fast-wavenet-master/wavenet/layers.py in dilated_conv1d(inputs, out_channels, filter_width, rate, padding, name, gain, activation)
142 padding=padding,
143 gain=gain,
--> 144 activation=activation)
145 _, conv_out_width, _ = outputs.get_shape().as_list()
146 new_width = conv_out_width * rate

/root/shared/fast-wavenet-master/wavenet/layers.py in conv1d(inputs, out_channels, filter_width, stride, padding, data_format, gain, activation, bias)
89 w = tf.get_variable(name='w',
90 shape=(filter_width, in_channels, out_channels),
---> 91 initializer=w_init)
92
93 outputs = tf.nn.conv1d(inputs,

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(name, shape, dtype, initializer, regularizer, trainable, collections, caching_device, partitioner, validate_shape, custom_getter)
986 collections=collections, caching_device=caching_device,
987 partitioner=partitioner, validate_shape=validate_shape,
--> 988 custom_getter=custom_getter)
989 get_variable_or_local_docstring = (
990 """%s

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(self, var_store, name, shape, dtype, initializer, regularizer, trainable, collections, caching_device, partitioner, validate_shape, custom_getter)
888 collections=collections, caching_device=caching_device,
889 partitioner=partitioner, validate_shape=validate_shape,
--> 890 custom_getter=custom_getter)
891
892 def _get_partitioned_variable(self,

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in get_variable(self, name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape, custom_getter)
346 reuse=reuse, trainable=trainable, collections=collections,
347 caching_device=caching_device, partitioner=partitioner,
--> 348 validate_shape=validate_shape)
349
350 def _get_partitioned_variable(

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in _true_getter(name, shape, dtype, initializer, regularizer, reuse, trainable, collections, caching_device, partitioner, validate_shape)
331 initializer=initializer, regularizer=regularizer, reuse=reuse,
332 trainable=trainable, collections=collections,
--> 333 caching_device=caching_device, validate_shape=validate_shape)
334
335 if custom_getter is not None:

/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variable_scope.pyc in _get_single_variable(self, name, shape, dtype, initializer, regularizer, partition_info, reuse, trainable, collections, caching_device, validate_shape)
637 " Did you mean to set reuse=True in VarScope? "
638 "Originally defined at:\n\n%s" % (
--> 639 name, "".join(traceback.format_list(tb))))
640 found_var = self._vars[name]
641 if not shape.is_compatible_with(found_var.get_shape()):

ValueError: Variable b0-l0/w already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:

File "wavenet/layers.py", line 91, in conv1d
initializer=w_init)
File "wavenet/layers.py", line 144, in dilated_conv1d
activation=activation)
File "wavenet/models.py", line 36, in __init__
name = 'b{}-l{}'.format(b, i)

How do I turn text into audio?

In the prediction, the input is a single number. If I want to use this code for TTS, I have no idea how to start.

If filter width = 3, how to do fast inference?

In the new paper, Google uses filter width = 3 to increase the receptive field.

How can we then do inference with filter width 3?
My idea is to use a second queue. Because the dilation still doubles at each layer, the first queue stores the first half of the intermediate values and the second queue stores the second half.
The output of the first queue is then enqueued into the second queue.

For example:

        current_state = q.dequeue()
        push = q.enqueue([current_layer])
        init_ops.append(init)
        push_ops.append(push)

        pre_state = None
        if self.filter_width == 3:
            q2 = tf.FIFOQueue(
                 1,
                 dtypes=tf.float32,
                 shapes=(self.batch_size, self.quantization_channels))

            init2 = q2.enqueue_many(tf.zeros((1, self.batch_size, self.quantization_channels)))

            pre_state = q2.dequeue()
            push2 = q2.enqueue([current_state])

            init_ops2.append(init2)
            push_ops2.append(push2)

        if self.filter_width == 2:
            current_layer = self._generator_causal_layer(
                            current_layer, current_state)
        if self.filter_width == 3:
            current_layer = self._generator_causal_layer(
                            current_layer, current_state, pre_state)
 


...
        with tf.name_scope('dilated_stack'):
            for layer_index, dilation in enumerate(self.dilations):
                with tf.name_scope('layer{}'.format(layer_index)):

                    q = tf.FIFOQueue(
                        dilation,
                        dtypes=tf.float32,
                        shapes=(self.batch_size, self.residual_channels))
                    init = q.enqueue_many(
                        tf.zeros((dilation, self.batch_size,
                                  self.residual_channels)))

                    current_state = q.dequeue()
                    push = q.enqueue([current_layer])
                    init_ops.append(init)
                    push_ops.append(push)

                    pre_state = None
                    if self.filter_width == 3:
                        q2 = tf.FIFOQueue(
                             dilation,
                             dtypes=tf.float32,
                             shapes=(self.batch_size, self.residual_channels))

                        init2 = q2.enqueue_many(tf.zeros((dilation, self.batch_size, self.residual_channels)))

                        pre_state = q2.dequeue()
                        push2 = q2.enqueue([current_state])

                        init_ops2.append(init2)
                        push_ops2.append(push2)

                    output, current_layer = self._generator_dilation_layer(
                        current_layer, current_state, layer_index, dilation,
                        global_condition_batch, local_condition, pre_state)
                    outputs.append(output)

Does that make sense?
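
The chaining idea can be checked outside TensorFlow. With filter width 3 and dilation d, a layer at step t needs its input from steps t, t-d and t-2d; two FIFO queues of length d, where the second is fed by whatever the first releases, deliver exactly those two delayed values. The sketch below uses `collections.deque` as a stand-in for `tf.FIFOQueue`; the `TwoTapDelay` class is a hypothetical illustration and does not reproduce the code above.

```python
from collections import deque


class TwoTapDelay:
    """Two chained FIFO queues yielding x[t - d] and x[t - 2d] at step t."""

    def __init__(self, dilation, fill=0.0):
        # Both queues start full of zeros, like the enqueue_many init ops.
        self.first = deque([fill] * dilation)
        self.second = deque([fill] * dilation)

    def step(self, current):
        delayed = self.first.popleft()          # x[t - d]
        self.first.append(current)              # store x[t]
        delayed_twice = self.second.popleft()   # x[t - 2d]
        self.second.append(delayed)             # pass x[t - d] along
        return delayed, delayed_twice


delay = TwoTapDelay(dilation=2)
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
taps = [delay.step(value) for value in signal]
```

A single queue of length 2d with a read at its midpoint would give the same two values; the chained form simply maps more directly onto queues that only support enqueue and dequeue.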

Is this possible to run without GPU?

Hello,

I would like to run it without a GPU, and I tried 'sess = tf.Session(config=tf.ConfigProto(device_count={'gpu':0}))'. The training part succeeds. However, when I start generating, it does not seem to generate anything and exits.
Is it possible to run without a GPU?

how to use it?

Can we save the model?
How can it be used afterwards if it only generates the training phrase?

Why not upload your voice samples?

I recommend uploading the voice samples to a website such as SoundCloud in the US or Ximalaya in China, so that we can listen to the results of your demo online.