Comments (7)
For filter_width=3, dilation=1 => dilation/2 = 0
=> (filter_width - 1) * dilation/2 = 0. So the input will not be padded.
from bytenet-tensorflow.
Sorry, it's a typo: dilation = 2.
from bytenet-tensorflow.
I just tested. For the decoder, if I change the original padding to
padding = [[0, 0], [(filter_width - 1) * dilation/2, (filter_width - 1) * dilation/2], [0, 0]]
the result looks normal. But if I change it to
padding = [[0, 0], [0, 0], [0, 0]]
the loss is close to 0.
So for the encoder:
padding = [[0, 0], [0, 0], [0, 0]]
and for the decoder:
padding = [[0, 0], [(filter_width - 1) * dilation/2, (filter_width - 1) * dilation/2], [0, 0]]
Another way to correct it is to keep the explicit padding but change the padding mode in conv1d from SAME to VALID.
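To make the arithmetic concrete, here is a minimal sketch in plain Python (no TensorFlow) of how the explicit padding interacts with the two conv1d padding modes. The helper names are hypothetical, and it assumes a stride-1 dilated convolution where SAME keeps the output as long as its input and VALID adds no padding at all.

```python
def explicit_pad_per_side(filter_width, dilation):
    """Zeros added on each side by the symmetric padding expression (integer division)."""
    return (filter_width - 1) * dilation // 2


def conv_output_length(input_length, filter_width, dilation, mode):
    """Output length of a stride-1 dilated 1-D convolution (assumed model, not the repo's code)."""
    span = (filter_width - 1) * dilation  # positions the dilated filter covers beyond its first tap
    if mode == "SAME":
        return input_length           # SAME pads internally so the length is unchanged
    if mode == "VALID":
        return input_length - span    # VALID only keeps fully covered positions
    raise ValueError("mode must be 'SAME' or 'VALID'")


seq_len, filter_width, dilation = 10, 3, 2
pad = explicit_pad_per_side(filter_width, dilation)
padded_len = seq_len + 2 * pad

# Explicit padding followed by VALID restores the original sequence length.
valid_len = conv_output_length(padded_len, filter_width, dilation, "VALID")
# Explicit padding followed by SAME pads a second time, so the output is too long.
same_len = conv_output_length(padded_len, filter_width, dilation, "SAME")

print(pad, padded_len, valid_len, same_len)
```

Under these assumptions, only the explicit padding plus VALID combination returns an output of the original length; explicit padding plus SAME leaves the extra padded positions in the output.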
from bytenet-tensorflow.
The padding setting for the decoder is correct.
padding = [[0, 0], [(filter_width - 1) * dilation, 0], [0, 0]]
This is done to preserve causality. Refer to the ByteNet decoder diagram for further clarity. See the causal_conv function in:
https://github.com/ibab/tensorflow-wavenet/blob/master/wavenet/ops.py
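As an illustration of why left-only padding preserves causality, here is a minimal pure-Python sketch. It is not the causal_conv function from the linked file; causal_conv1d below is a hypothetical helper that left-pads by (filter_width - 1) * dilation and then applies an unpadded (VALID-style) dilated filter.

```python
def causal_conv1d(x, weights, dilation):
    """Dilated 1-D convolution whose output at step t uses only x[0..t]."""
    filter_width = len(weights)
    pad = (filter_width - 1) * dilation
    padded = [0] * pad + list(x)  # zeros on the left only, none on the right
    out = []
    for t in range(len(x)):
        # Tap k reads padded[t + k * dilation], i.e. x[t - (filter_width - 1 - k) * dilation].
        out.append(sum(weights[k] * padded[t + k * dilation]
                       for k in range(filter_width)))
    return out


x = [1, 2, 3, 4, 5, 6, 7, 8]
weights = [1, 1, 1]
y = causal_conv1d(x, weights, dilation=2)

# Changing a future input must not affect earlier outputs.
x_changed = list(x)
x_changed[5] = 100
y_changed = causal_conv1d(x_changed, weights, dilation=2)
unchanged_prefix = y[:5] == y_changed[:5]
print(y, unchanged_prefix)
```

The output has the same length as the input, and outputs before position 5 are unaffected by the change at position 5, which is the property the left-only padding is meant to guarantee.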
For the encoder, the current code is fine as well. If you are using an odd filter width (1, 3, 5, ...), it shouldn't be a problem. I think having another look at the diagram might help. The first output of the encoder depends only on the first and the third element, which will be the case if:
0 1 3 5 0
0 2 4 6 0
is the padded input for conv1d.
Let me know if you still think I may be wrong.
from bytenet-tensorflow.
You are ignoring the padding from conv1d itself. The padded input
0 1 3 5 0
0 2 4 6 0
becomes the input to conv1d, which pads it again with filter_width - 1 zeros under the SAME padding scheme:
0 0 1 3 5 0 0
0 0 2 4 6 0 0
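The double padding can be seen by listing which inputs each filter position reads. This is a small pure-Python sketch rather than the repository's code; windows is a hypothetical helper, and it assumes stride 1 and filter_width = 3, with SAME adding filter_width - 1 zeros in total.

```python
def windows(row, filter_width, mode):
    """Return the input tuples seen by each position of a stride-1 filter."""
    row = list(row)
    if mode == "SAME":
        total = filter_width - 1          # total zeros SAME adds at stride 1
        left = total // 2
        row = [0] * left + row + [0] * (total - left)
    return [tuple(row[i:i + filter_width])
            for i in range(len(row) - filter_width + 1)]


already_padded = [0, 1, 3, 5, 0]  # first row of the explicitly padded input above

valid_windows = windows(already_padded, 3, "VALID")
same_windows = windows(already_padded, 3, "SAME")

print(valid_windows)
print(same_windows)
```

With VALID, the first window is (0, 1, 3), so the first output depends on the first and third elements of the original sequence, and there are three outputs for three real inputs. With SAME, the row is padded again, the first window becomes (0, 0, 1), and there are five outputs.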
from bytenet-tensorflow.
Oh, I see. I just went through the documentation of tf.nn.conv1d. It seems that the input may be padded differently for different input lengths, which is not desirable. Changing it to VALID seems appropriate. I would appreciate it if you could make the required change and test machine translation to check whether it improves.
Do you think this change might also be required for the decoder? That implementation is the same as WaveNet's, though.
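For reference, the SAME padding amounts can be worked out by hand. The sketch below follows the formula given in the TensorFlow padding documentation as I understand it; same_padding is a hypothetical helper, not a TensorFlow function. Under that formula the padding is independent of the input length at stride 1 and only varies with length for larger strides.

```python
import math


def same_padding(input_length, filter_width, stride=1):
    """(left, right) zeros added by SAME padding along one dimension."""
    out_length = math.ceil(input_length / stride)
    total = max((out_length - 1) * stride + filter_width - input_length, 0)
    left = total // 2
    return left, total - left  # any odd zero goes on the right


# Stride 1: filter_width - 1 zeros in total, whatever the input length.
stride1 = [same_padding(n, 3, stride=1) for n in (5, 6, 7)]

# Stride 2: the split changes with the input length.
stride2 = [same_padding(n, 3, stride=2) for n in (5, 6)]

print(stride1, stride2)
```

Either way, SAME adds its zeros on top of any explicit padding applied beforehand, which is why switching to VALID removes the ambiguity.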
from bytenet-tensorflow.
@zcyang Updated the ops. The model works correctly now. It was indeed an error in the encoder padding. Check Model/ops for the new implementation.
from bytenet-tensorflow.