Comments (7)

paarthneekhara commented on May 16, 2024

For filter_width=3, dilation=1 => dilation/2 = 0 (integer division)
=> (filter_width - 1) * dilation/2 = 0. So the input will not be padded.

from bytenet-tensorflow.

zcyang commented on May 16, 2024

Sorry, it's a typo: dilation = 2.

zcyang commented on May 16, 2024

I just tested this.
For the decoder, if I change the original padding to
padding = [[0, 0], [(filter_width - 1) * dilation/2, (filter_width - 1) * dilation/2], [0, 0]]
the result looks normal.
But if I change it to
padding = [[0, 0], [0, 0], [0, 0]]
the loss is close to 0.

So for the encoder:
padding = [[0, 0], [0, 0], [0, 0]]
and for the decoder:
padding = [[0, 0], [(filter_width - 1) * dilation/2, (filter_width - 1) * dilation/2], [0, 0]]

Another way to correct it is to keep the explicit padding but change the padding mode in conv1d from SAME to VALID.
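
To make the SAME versus VALID suggestion concrete, here is a minimal sketch of the length arithmetic. It assumes a direct dilated convolution with stride 1, not the repository's actual ops, and the helper names `explicit_padding` and `output_length` are hypothetical.

```python
def explicit_padding(filter_width, dilation, causal):
    """Return (left, right) zero-padding along the time axis.

    Hypothetical helper: mirrors the two padding formulas quoted in
    this thread (symmetric for the encoder, left-only for the decoder).
    """
    if causal:
        return ((filter_width - 1) * dilation, 0)
    half = (filter_width - 1) * dilation // 2
    return (half, half)


def output_length(input_length, filter_width, dilation, pad, mode):
    """Output length of a stride-1 dilated convolution after explicit padding."""
    padded_length = input_length + pad[0] + pad[1]
    if mode == "SAME":
        # SAME adds its own zeros, so the already padded length is preserved.
        return padded_length
    if mode == "VALID":
        # VALID adds nothing; the filter spans (filter_width - 1) * dilation + 1 steps.
        return padded_length - (filter_width - 1) * dilation
    raise ValueError("mode must be 'SAME' or 'VALID'")


encoder_pad = explicit_padding(3, 2, causal=False)
decoder_pad = explicit_padding(3, 2, causal=True)
valid_len = output_length(6, 3, 2, encoder_pad, "VALID")
same_len = output_length(6, 3, 2, encoder_pad, "SAME")
```

Under these assumptions, explicit padding followed by VALID returns the original length of 6, while explicit padding followed by SAME keeps the padded length of 10, because SAME pads a second time on top of the explicit zeros.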

paarthneekhara commented on May 16, 2024

The padding setting for the decoder is correct.

padding = [[0, 0], [(filter_width - 1) * dilation, 0], [0, 0]]

This is done to preserve causality. Refer to the ByteNet decoder diagram for further clarity, and see the causal_conv function in:
https://github.com/ibab/tensorflow-wavenet/blob/master/wavenet/ops.py
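
As a rough illustration of why left-only padding preserves causality, the sketch below implements a dilated convolution in plain Python. The function name `causal_dilated_conv` and the sample weights are assumptions for illustration; this is not the code from either repository.

```python
def causal_dilated_conv(signal, weights, dilation):
    """Dilated 1-D convolution whose output at step t uses only inputs <= t."""
    filter_width = len(weights)
    # Pad only on the left, matching [(filter_width - 1) * dilation, 0].
    left_pad = (filter_width - 1) * dilation
    padded = [0.0] * left_pad + list(signal)
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(weights):
            # The last tap (k = filter_width - 1) lands on signal[t];
            # earlier taps reach back in steps of `dilation`.
            acc += w * padded[t + k * dilation]
        out.append(acc)
    return out


base = causal_dilated_conv([1, 2, 3, 4, 5, 6], [1, 1, 1], dilation=2)
# Changing only the last input must leave all earlier outputs untouched.
changed = causal_dilated_conv([1, 2, 3, 4, 5, 99], [1, 1, 1], dilation=2)
```

Here `base` and `changed` agree on every position except the last, which is the property the decoder needs: no output depends on a future input.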

For the encoder, the current code is fine as well. If you are using an odd filter width (1, 3, 5, ...), it shouldn't be a problem. I think having another look at the diagram might help. The first output of the encoder depends only on the first and the third element, which will be the case if:

0 1 3 5 0 
0 2 4 6 0

is the padded input for conv1d.

Let me know if you still think I may be wrong.

zcyang commented on May 16, 2024

You are ignoring the padding from `conv1d` itself. The input

```
0 1 3 5 0
0 2 4 6 0
```

is passed to `conv1d`, which pads it again with filter_width - 1 zeros under the SAME padding scheme:

```
0 0 1 3 5 0 0
0 0 2 4 6 0 0
```
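
The double padding described here can be reproduced with a small plain-Python sketch. It assumes the two rows come from interleaving the explicitly padded sequence by the dilation rate (as in the WaveNet-style time-to-batch trick), and that SAME with stride 1 adds filter_width - 1 zeros split across both ends. All function names are hypothetical.

```python
def pad_both_sides(seq, left, right):
    """Zero-pad a sequence on the left and right."""
    return [0] * left + list(seq) + [0] * right


def split_by_dilation(seq, dilation):
    """Row r holds elements r, r + dilation, r + 2 * dilation, ..."""
    return [seq[r::dilation] for r in range(dilation)]


def same_padding(row, filter_width):
    """Zeros that SAME adds for stride 1: filter_width - 1 in total."""
    total = filter_width - 1
    left = total // 2
    return pad_both_sides(row, left, total - left)


filter_width, dilation = 3, 2
signal = [1, 2, 3, 4, 5, 6]

# Explicit encoder padding from the thread: (filter_width - 1) * dilation / 2 per side.
explicit = (filter_width - 1) * dilation // 2
padded = pad_both_sides(signal, explicit, explicit)

# Interleaved rows that are handed to conv1d.
rows = split_by_dilation(padded, dilation)

# What conv1d effectively sees once SAME pads each row again.
double_padded = [same_padding(row, filter_width) for row in rows]
```

With filter_width = 3 and dilation = 2, `rows` matches the first diagram and `double_padded` matches the second, so each row carries one more zero per side than the explicit padding intended.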

paarthneekhara commented on May 16, 2024

Oh, I see. I just went through the documentation of tf.nn.conv1d. It seems that the input may be padded differently for different input lengths, which is not desirable. Changing it to VALID seems appropriate. I would appreciate it if you could make the required change and test machine translation to check whether it improves.
Do you think this change might also be required for the decoder? That implementation is the same as WaveNet's, though.

paarthneekhara commented on May 16, 2024

@zcyang Updated the ops. The model works correctly now. It was indeed an error in encoder padding. Check Model/ops for the new implementation.
