Have you ever faced the problem that the two masks always become equal during training? (pit-speech-separation, closed)

snsun commented on August 13, 2024

Have you ever faced the problem that the two masks always become equal during training?

from pit-speech-separation.

Comments (5)

snsun commented on August 13, 2024

Hi, no, I haven't run into this problem when implementing PIT training. I think you may need to check your code again. The most likely cause is a bug in your PIT loss function implementation. Best

Xinyao(Alvin) Sun <[email protected]> wrote on Wednesday, November 28, 2018 at 10:33 AM:

…

Hi, this issue is not about your code. I am doing some research on speech separation and implemented a basic PIT model (CNN, RNN), but I always run into the problem that the two masks tend toward equal values during training, so separation does not work at all. I just wonder whether you ever faced a similar issue when you implemented the PIT algorithm? Thanks
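For anyone checking their own PIT loss against the suggestion above, here is a minimal sketch of utterance-level PIT with a mean-squared-error criterion. It is not the code from this repository: the names `mse` and `pit_loss` are made up for illustration, and plain Python lists stand in for the tensors a real model would use.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def pit_loss(estimates, targets):
    """Utterance-level PIT loss: lowest mean MSE over all speaker orderings.

    estimates, targets: lists of S flattened magnitude spectrograms.
    Returns (best_loss, best_perm), where best_perm[i] is the index of the
    target assigned to estimates[i].
    """
    n_src = len(estimates)
    best_loss, best_perm = float("inf"), None
    for perm in permutations(range(n_src)):
        # Average the per-speaker errors for this assignment of targets.
        loss = sum(mse(estimates[i], targets[p]) for i, p in enumerate(perm)) / n_src
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return best_loss, best_perm


# Toy check (hypothetical data): the estimates match the targets in swapped order.
targets = [[1.0, 0.0, 2.0], [0.0, 3.0, 1.0]]
swapped = [targets[1], targets[0]]
loss, perm = pit_loss(swapped, targets)
print(loss, perm)
```

A quick sanity test along these lines is to feed the targets back in as estimates, in both orders, and confirm the loss is zero with the expected permutation. Note also that if the two outputs are identical, every permutation gives the same loss, so the minimum over permutations no longer tells the outputs apart.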


Lucklyric commented on August 13, 2024

Thanks for your reply.
In fact, I am quite confused about what the problem is.
The main part of PIT is not complicated. I also tried replacing the PIT part with your code and got the same issue.
The main difference I am sure of is the way the data is mixed. I am using the VoxCeleb dataset (http://www.robots.ox.ac.uk/~vgg/data/voxceleb/). I create the input data with librosa's STFT and use only the log magnitude as the network input. I directly add the two speakers' log magnitudes to form the mixture.
I do not have much experience with audio analysis. Do you think the way I created the training data may cause problems?
Thanks
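One thing that may be worth checking in the mixing step described above: adding two log magnitudes gives the log of the product of the magnitudes, not the log magnitude of the mixed signal. The sketch below illustrates this for a single time-frequency bin. The complex values are made up for illustration, and it assumes the mixture is formed by adding the waveforms, which, because the STFT is linear, is the same as adding the complex STFT values.

```python
import math

# One hypothetical time-frequency bin from each speaker (complex STFT values).
s1 = complex(3.0, 4.0)
s2 = complex(1.0, -2.0)

# Mixing in the time domain is equivalent to adding the complex STFT values.
mix = s1 + s2
true_log_mag = math.log(abs(mix))

# What adding the two log magnitudes actually computes: log(|s1| * |s2|).
summed_log_mag = math.log(abs(s1)) + math.log(abs(s2))

print(f"log|s1 + s2|      = {true_log_mag:.3f}")
print(f"log|s1| + log|s2| = {summed_log_mag:.3f}")
```

If this is the cause, one option would be to mix the two waveforms first and only then take the STFT and the (log) magnitude of the mixture.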


snsun commented on August 13, 2024

Instead of using log magnitude, we used the magnitude directly. If you use log magnitude, you cannot multiply the mask by the noisy log magnitude directly. I recommend using magnitude. Best

Xinyao(Alvin) Sun <[email protected]> wrote on Wednesday, November 28, 2018 at 11:47 AM:

…

Thanks for your reply. In fact, I am quite confused about what the problem is. The main part of PIT is not complicated. I also tried replacing the PIT part with your code and got the same issue. The main difference I am sure of is the way the data is mixed. I am using the VoxCeleb dataset (http://www.robots.ox.ac.uk/~vgg/data/voxceleb/). I create the input data with librosa's STFT and use only the log magnitude as the network input. I directly add the two speakers' log magnitudes to form the mixture. I do not have much experience with audio analysis. Do you think the way I created the training data may cause problems? Thanks
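To make the point about masks and log magnitude concrete, here is a small sketch for one time-frequency bin. The magnitudes are made up, and the mixture magnitude is approximated by the sum of the source magnitudes, which ignores phase and is only an assumption for illustration.

```python
import math

# Hypothetical magnitudes of two speakers in one time-frequency bin.
mag1, mag2 = 4.0, 1.0
# Approximate the mixture magnitude by the sum (ignores phase; an assumption).
mix_mag = mag1 + mag2

# Ratio masks lie in [0, 1] and sum to 1 under this approximation.
mask1 = mag1 / mix_mag
mask2 = mag2 / mix_mag

# On linear magnitudes, mask * mixture recovers each source.
est1 = mask1 * mix_mag
est2 = mask2 * mix_mag

# On log magnitudes the same product does not recover log(mag1):
# scaling by a mask in the linear domain is an additive offset in the log domain.
wrong_log_est1 = mask1 * math.log(mix_mag)
right_log_est1 = math.log(mask1) + math.log(mix_mag)

print(est1, est2)
print(wrong_log_est1, right_log_est1, math.log(mag1))
```

So if the network input is a log magnitude, it would likely be safer to apply the mask to the linear magnitude of the mixture and compute the loss there.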


Lucklyric commented on August 13, 2024

Thanks,


snsun commented on August 13, 2024

I haven't looked at the distribution of the WSJ data, but I believe the distributions should be similar across datasets. From the information you have provided, I can't determine the cause. What do your training and development set losses look like? Best

Xinyao(Alvin) Sun <[email protected]> wrote on Wednesday, November 28, 2018 at 2:41 PM:

…

Thanks for your suggestion. The reason I took the log is that the raw amplitude is very unevenly distributed. I did try using the raw amplitude and it did not work. Because I do not have the WSJ dataset, could you tell me what the distribution of your raw amplitude data looks like? Mine looks like this: [image: image] <https://user-images.githubusercontent.com/7803070/49133382-1e7b8980-f29d-11e8-873f-64b028c00d0c.png> Sorry for bothering you so much, I just want to figure out why my implementation does not work. Thanks,
