metricgan's People

Contributors

jasonswfu

metricgan's Issues

Questions about MetricGAN

While studying MetricGAN, a few questions came up.

  1. What is the generative component of MetricGAN?
    For example, SEGAN [1] has a generative component: it concatenates random latent vectors to its encoder output.

  2. Do you classify this work as a generative or discriminative method?

  3. Have you tried a pre-trained discriminator (D) that is trained on the training data and then kept fixed while the generator (G) is trained?
    As you mentioned in the paper, D should be called an evaluator. The training procedure of MetricGAN is not adversarial; G and D help each other.
    I am therefore wondering which works better: training G with a pre-trained D, or training G and D together.

[1] Pascual, Santiago, Antonio Bonafonte, and Joan Serra. "SEGAN: Speech enhancement generative adversarial network." arXiv preprint arXiv:1703.09452 (2017).

Thank you very much for sharing your great work.

Is a pretrained SE model available?

Hi,
I am looking for a pretrained SE model so that I can use this method as a baseline in my work. I could not find one here; if one is available, could you please point me to the right place? Thanks a lot!

About PESQ measurement

I have a question about how performance was measured on the NSDTSEA (VCTK+DEMAND) dataset in the paper.

I think you measured MOS-LQO rather than PESQ. Many open-source implementations convert the wideband (WB) PESQ result into MOS-LQO and report that instead, so the range of values is not -0.5 to 4.5 as you wrote in your paper.

In my own measurements on the same dataset, PESQ is 1.68, while MOS-LQO is 1.97, the value shown in your paper. The remaining metrics (CSIG, CBAK, and COVL) came out the same as in the paper.

I would appreciate it if you could confirm whether your PESQ metric is correct.
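
The distinction raised in this question can be made concrete with a short sketch. It assumes the wideband mapping from raw PESQ to MOS-LQO given in ITU-T P.862.2; the function name `pesq_to_mos_lqo_wb` is hypothetical, and whether the paper's tooling applied this mapping is exactly what the question asks.

```python
import math

def pesq_to_mos_lqo_wb(raw_pesq):
    """Map a raw wideband PESQ score to MOS-LQO (assumed ITU-T P.862.2 mapping)."""
    return 0.999 + 4.0 / (1.0 + math.exp(-1.3669 * raw_pesq + 3.8224))

# Raw PESQ nominally spans -0.5 to 4.5; the mapped score covers a different range.
for raw in (-0.5, 4.5):
    print(raw, round(pesq_to_mos_lqo_wb(raw), 2))
```

Under this assumption the mapped score runs from roughly 1.04 to 4.64, which is why a tool that reports MOS-LQO cannot produce values across the full -0.5 to 4.5 range.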

A few questions on training schemes

Hi, Jason

Thank you very much for sharing the code. I really like your paper: the proposed scheme is a flexible and efficient way to optimize non-differentiable measures. I have several questions about the detailed training procedure:

(1) Did you train the enhancement model (G) using the PESQ measure (D) directly? That is, did you add an additional loss such as MSE, or did you use the PESQ measure alone during training?
(2) When training D, I found that you train it using a so-called "previous list". It seems optional; is this stage crucial for getting better results?
(3) In the released code, G and D are trained alternately for num_sampling=100 steps per epoch, with a batch size of 1. Are these the same hyperparameters you used to obtain the Table 2 results?
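
To make the schedule described in these questions easier to picture, here is a minimal sketch of one way such an alternating loop could look. It is not the released code: the function `train_metricgan_style`, the callbacks `g_step` and `d_step`, and the way the "previous list" is replayed are all assumptions for illustration.

```python
import random

def train_metricgan_style(files, g_step, d_step, epochs=2,
                          num_sampling=100, use_previous_list=True, seed=0):
    """Alternate G and D phases; all callbacks and names here are hypothetical."""
    rng = random.Random(seed)
    previous_list = []  # D training examples kept from earlier epochs
    for _ in range(epochs):
        # G phase: one utterance per step (batch size 1), num_sampling steps.
        sampled = rng.sample(files, min(num_sampling, len(files)))
        current_list = [g_step(path) for path in sampled]
        # D phase: train on the examples produced in this epoch.
        for example in current_list:
            d_step(example)
        # Optional replay of examples from earlier epochs (the "previous list").
        if use_previous_list and previous_list:
            k = min(num_sampling, len(previous_list))
            for example in rng.sample(previous_list, k):
                d_step(example)
        previous_list.extend(current_list)
    return previous_list

# Stub callbacks that only record calls, to show the step counts.
g_calls, d_calls = [], []
files = ["utt{}.wav".format(i) for i in range(300)]
history = train_metricgan_style(
    files,
    g_step=lambda p: g_calls.append(p) or p + "@enhanced",
    d_step=d_calls.append,
)
print(len(g_calls), len(d_calls), len(history))  # 200 300 200
```

With two epochs, G takes 100 steps per epoch, while D takes 100 steps in the first epoch and 200 in the second, once the previous list is non-empty. Setting `use_previous_list=False` removes the replay, which is the variant question (2) asks about.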

Sorry for asking so many questions. Thank you again, and I wish you all the best in your future work!

Batch Size = 1

Hi,
I am curious about the batch size: it is always 1 across the different implementations. How long does it take to train for 600 epochs? Why not use a much larger batch? Training takes a long time.

Running PESQ file

Hi, Jason

I am trying to run MetricGAN.py, but it fails while executing the read_pesq function:

def read_pesq(clean_root, enhanced_file, sr):
    f=enhanced_file.split('/')[-1]       
    wave_name=f.split('_')[-1].split('@')[0]
    clean_file=clean_root+'Train_'+wave_name+'.wav'
    
    cmd = PESQ_path+'/PESQ {} {} +{}'.format(clean_file, enhanced_file, sr)
    proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE)
    out = proc.communicate()
    pesq = float(out[0][-6:-1])
    return (pesq+0.5)/5.0

The particular error is (line 84):

    pesq = float(out[0][-6:-1])
ValueError: could not convert string to float:

I also tried running it directly in bash, in the form:

PESQ clean.wav enhanced.wav 16000

The error is: PESQ: command not found

Could you please share more details on how to use the PESQ file?

Thank you very much!
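
Both symptoms in this report point to the PESQ executable not being found or not printing a score. A shell answers "command not found" when the program is not on PATH, so a binary in the current directory usually has to be invoked as `./PESQ`. An empty string passed to `float` is what the slice `out[0][-6:-1]` yields when the tool prints nothing to stdout. The script also passes the sample rate as `+{}`, while the bash attempt used `16000` without the `+`. The sketch below is a more defensive variant under those assumptions. The helper `parse_last_score` and the `pesq_bin` parameter are hypothetical, and taking the last decimal number in the output is a guess that mirrors the original slice, not a documented output format.

```python
import re
import shutil
import subprocess

def parse_last_score(text):
    """Return the last decimal number in the tool's output, or raise a clear error."""
    matches = re.findall(r"-?\d+\.\d+", text)
    if not matches:
        raise ValueError("no score found in PESQ output: {!r}".format(text))
    return float(matches[-1])

def read_pesq(pesq_bin, clean_file, enhanced_file, sr):
    """Run the PESQ binary and return the score normalized to [0, 1]."""
    # Fail early if the executable cannot be found or is not executable.
    if shutil.which(pesq_bin) is None:
        raise FileNotFoundError("PESQ executable not found: " + pesq_bin)
    # Same argument order as the original command string, without shell=True.
    cmd = [pesq_bin, clean_file, enhanced_file, "+{}".format(sr)]
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    pesq = parse_last_score(proc.stdout.decode(errors="replace"))
    return (pesq + 0.5) / 5.0  # same normalization as the original function
```

Because stderr is merged into the captured output, the `ValueError` message shows whatever the tool actually printed. That makes failures such as a missing input file visible instead of surfacing as a bare conversion error.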

Evaluation metrics

Could you provide your evaluation metric code? Some of the metrics I measured are inconsistent with those in the paper.

Which CUDA and cuDNN versions did you use?

When I run ./MetricGAN.py, I get some numbers (maybe the loss) followed by an error, as shown below.
How do I fix this error? Which CUDA and cuDNN versions did you use? (My keras, librosa, and python versions are exactly the same as yours.)

0.765720826757873
2.1266
/home/users/woody/vocoder/speech_enhancement/metricGAN/local/lib/python2.7/site-packages/matplotlib/axes/_base.py:3152: UserWarning: Attempting to set identical left==right results
in singular transformations; automatically expanding.
left=1, right=1
'left=%s, right=%s') % (left, right))
Sample training data for discriminator training...
Discriminator training...
/home/users/woody/vocoder/speech_enhancement/metricGAN/local/lib/python2.7/site-packages/keras/engine/training.py:973: UserWarning: Discrepancy between trainable weights and collected trainable weights, did you set model.trainable without calling model.compile after ?
'Discrepancy between trainable weights and collected trainable'
Epoch 1/1
2019-09-03 18:42:10.063259: E tensorflow/stream_executor/cuda/cuda_dnn.cc:403] could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED
2019-09-03 18:42:10.063335: E tensorflow/stream_executor/cuda/cuda_dnn.cc:411] possibly insufficient driver version: 384.130.0
2019-09-03 18:42:10.063354: F tensorflow/core/kernels/conv_ops.cc:712] Check failed: stream->parent()->GetConvolveAlgorithms( conv_parameters.ShouldIncludeWinogradNonfusedAlgo(), &algorithms)
Aborted (core dumped)

~loss: Nan

Hi, Jason:

When I try to train with MetricGAN(table2).py, I run into two problems:
(1) At every epoch, for both G and D, the loss is nan, and only a warning is shown:

/student/home/yll/anaconda3/envs/MetricGAN/lib/python3.6/site-packages/keras/engine/training.py:973: UserWarning: Discrepancy between trainable weights and collected trainable weights, did you set `model.trainable` without calling `model.compile` after ?
  'Discrepancy between trainable weights and collected trainable'
Epoch 1/1
 - 1498s - loss: nan

(2) During the first epoch of discriminator training, the process takes 593 MB of GPU memory, but GPU utilization is 0%.

Data info: I use the same dataset as SEGAN, already downsampled to 16 kHz.
By the way: keras-gpu=2.1.2, tensorflow-gpu=1.10, librosa=0.5.1, python=3.6.

Do you know why this happens (loss nan)?
Thank you very much.

method

It operates in the T-F (time-frequency) domain, right?

About target metric

In the paper, MetricGAN uses PESQ or STOI as the target metric for training.
Besides PESQ and STOI, would it be possible to use metrics such as CSIG, CBAK, and COVL as target metrics?
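
One practical detail behind this question is scaling. The `read_pesq` function quoted in another issue on this page normalizes PESQ to [0, 1] with `(pesq+0.5)/5.0`. The sketch below generalizes that step to any bounded metric. The function `normalize_metric` is hypothetical, and the 1-to-5 range used for CSIG is an assumption about composite MOS-style measures; whether such targets train well is not answered here.

```python
def normalize_metric(score, lo, hi):
    """Linearly map a score from [lo, hi] to [0, 1], clipping values outside the range."""
    scaled = (score - lo) / float(hi - lo)
    return min(1.0, max(0.0, scaled))

# PESQ on its nominal -0.5..4.5 range, equivalent to (pesq + 0.5) / 5.0
print(normalize_metric(2.0, -0.5, 4.5))  # 0.5
# A composite measure such as CSIG, assumed here to lie on a 1..5 scale
print(normalize_metric(3.0, 1.0, 5.0))   # 0.5
```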

Multi GPU issue

Hello authors,

I'm trying to run MetricGAN on multiple GPUs. When I add a mirrored strategy to the MetricGAN model and run it, it still runs on a single GPU. Can you suggest how to resolve this issue?
