Understanding convolution kernels in dilation layers about tensorflow-wavenet HOT 4 OPEN

redwrasse commented on June 10, 2024

Understanding convolution kernels in dilation layers

from tensorflow-wavenet.

Comments (4)

cheind commented on June 10, 2024

@redwrasse, I agree that the original paper misses some details here and there. Take a look at (Gated) PixelCNN by WaveNet's main author (https://arxiv.org/pdf/1606.05328.pdf) and you will find that he "copies" the gated activation from there. Also, it seems they stacked the filter and gate convolutions along the output channel dimension to spare a conv1d.

For the latter, have a look here:
https://github.com/cheind/autoregressive/blob/e1f9b72b0f9764f9b4d6b6f65f028cd50db6940e/autoregressive/wave.py#L63
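To make the stacking idea concrete, here is a minimal pure-Python sketch. It is not the code from the linked file; all function names, shapes, and the zero left-padding are assumptions chosen for illustration. It checks that one dilated causal convolution with the filter and gate kernels stacked along the output channel axis, split afterwards, gives the same gated output as two separate convolutions.

```python
import math
import random


def causal_dilated_conv1d(x, weight, dilation):
    """Dilated causal 1D convolution (hypothetical helper).

    x:      [C_in][T] input, a list of channels.
    weight: [C_out][C_in][K] kernel taps.
    Samples before t = 0 are treated as zero, so the output length is T.
    """
    c_out, c_in, k = len(weight), len(weight[0]), len(weight[0][0])
    t_len = len(x[0])
    out = [[0.0] * t_len for _ in range(c_out)]
    for o in range(c_out):
        for t in range(t_len):
            acc = 0.0
            for i in range(c_in):
                for j in range(k):
                    # Tap j looks (k - 1 - j) * dilation steps into the past.
                    src = t - (k - 1 - j) * dilation
                    if src >= 0:
                        acc += weight[o][i][j] * x[i][src]
            out[o][t] = acc
    return out


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gate(f, g):
    """Gated activation: tanh(filter) * sigmoid(gate), element-wise."""
    return [[math.tanh(a) * sigmoid(b) for a, b in zip(f_row, g_row)]
            for f_row, g_row in zip(f, g)]


def gated_two_convs(x, w_f, w_g, dilation):
    """Two distinct dilated convolutions feeding the gated activation."""
    f = causal_dilated_conv1d(x, w_f, dilation)
    g = causal_dilated_conv1d(x, w_g, dilation)
    return gate(f, g)


def gated_stacked_conv(x, w_f, w_g, dilation):
    """One convolution with 2 * C_out output channels, split in half."""
    stacked = w_f + w_g  # concatenate along the output channel axis
    y = causal_dilated_conv1d(x, stacked, dilation)
    half = len(w_f)
    return gate(y[:half], y[half:])


random.seed(0)
C_IN, C_OUT, K, T = 2, 3, 2, 8  # arbitrary illustrative sizes


def rand_kernel():
    return [[[random.uniform(-1, 1) for _ in range(K)]
             for _ in range(C_IN)] for _ in range(C_OUT)]


w_f, w_g = rand_kernel(), rand_kernel()
x = [[random.uniform(-1, 1) for _ in range(T)] for _ in range(C_IN)]

z_two = gated_two_convs(x, w_f, w_g, dilation=2)
z_one = gated_stacked_conv(x, w_f, w_g, dilation=2)
```

The two variants are mathematically identical; the stacked version just replaces two convolution calls with one wider call, which is usually cheaper in a tensor framework.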


redwrasse commented on June 10, 2024

Answering this for myself after looking through the literature: yes, it looks like two distinct dilated convolutions are in fact passed to the 'gated activation unit'. The diagrams in the original WaveNet paper appear misleading.
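For reference, the gated activation unit as written in the WaveNet paper makes the two convolutions explicit. Here $*$ is a (dilated) convolution, $\odot$ is element-wise multiplication, $\sigma$ is the sigmoid, $k$ is the layer index, and $W_{f,k}$ and $W_{g,k}$ are the learnable filter and gate kernels:

```latex
\mathbf{z} = \tanh\left(W_{f,k} * \mathbf{x}\right) \odot \sigma\left(W_{g,k} * \mathbf{x}\right)
```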


redwrasse commented on June 10, 2024

Thanks @cheind, I'll take a look. A side project I'd like to get back into.


cheind commented on June 10, 2024

@redwrasse, same for me :) I just figured out that it works nicely on 2D images as well (without the special architecture of PixelCNN, just plain WaveNet with unrolled images). In addition, once you have the joint distribution the model estimates, you can start to query all kinds of things from the model (for example, given a WaveNet conditioned on the speaker id, what is the probability that this speech was spoken by speaker X).
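As a rough sketch of the speaker query, assuming the conditioned model can score a sequence as log p(x | speaker) by summing its per-step log-probabilities, Bayes' rule gives p(speaker | x) ∝ p(x | speaker) p(speaker). The function names and the per-step probabilities below are made up for illustration and do not come from any real model or from the linked repository.

```python
import math


def sequence_log_likelihood(step_probs):
    """Sum of log p(x_t | x_<t, speaker) over all time steps."""
    return sum(math.log(p) for p in step_probs)


def speaker_posterior(log_likelihoods, log_priors=None):
    """Posterior p(speaker | x) from per-speaker log-likelihoods.

    log_likelihoods: dict mapping speaker id -> log p(x | speaker).
    log_priors:      optional dict of log p(speaker); uniform if omitted.
    """
    speakers = list(log_likelihoods)
    if log_priors is None:
        log_priors = {s: -math.log(len(speakers)) for s in speakers}
    joint = {s: log_likelihoods[s] + log_priors[s] for s in speakers}
    # Log-sum-exp for numerical stability on long sequences.
    m = max(joint.values())
    log_evidence = m + math.log(sum(math.exp(v - m) for v in joint.values()))
    return {s: math.exp(v - log_evidence) for s, v in joint.items()}


# Made-up probabilities that a conditioned model might assign to the
# observed samples under each hypothetical speaker id.
step_probs = {
    "speaker_a": [0.5, 0.4, 0.6],
    "speaker_b": [0.2, 0.3, 0.1],
}
log_liks = {s: sequence_log_likelihood(p) for s, p in step_probs.items()}
posterior = speaker_posterior(log_liks)
```

With a uniform prior, the posterior simply normalizes the per-speaker sequence likelihoods, so the speaker under which the model finds the audio most probable gets the highest score.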

In case you are interested, I have a quite elaborate presentation + code here
https://github.com/cheind/autoregressive/tree/image-support

The branch will be closed soon and merged into main, so here is a permalink:
https://github.com/cheind/autoregressive/tree/23701bd503843a1de82c6a32ba5bd6e8ad6965a3
