speech-enhancement's People

Contributors

vbelz

speech-enhancement's Issues

The lack of documentation

Hello,
I really need your help.
When I run main.py, I get the error: [Errno 2] No such file or directory: './Train/sound/noisy_voice_long.wav'
What are noisy_voice_long.wav, noise_long.wav, and voice.wav, and how do I get them?
Please answer.

Parser takes only first character from the filename and says “File not found”

I know it's very difficult to understand my issue, but I'll try my best to explain.
So I've cloned a repository from GitHub and am working on it.
When I run the program without any arguments, it works fine:
`python main.py --audio_input_prediction` works fine.
But when I try to pass my own file, it shows an error:
`python main.py --audio_input_prediction myaudio.wav` shows an error saying "FileNotFoundError: [Errno 2] No such file or directory: '<File Path/m'"
Notice how it only takes the first character of my argument?
In the code, the default is set up something like this:
(args.py) `parser.add_argument('--audio_input_prediction', default=['default_audio.wav'], type=list)` and it works fine.
So naturally, I tried adding '[]' around my file name:
`python main.py --audio_input_prediction [myaudio.wav]` shows an error too, which says "FileNotFoundError: [Errno 2] No such file or directory: '<File Path/['"
See, here it took only the first character of the argument I provided, i.e. '['.
And the file IS there. No spelling mistakes in file name either. Any help would be very much appreciated.

My conclusion is that the issue is in args.py, specifically in this line: `parser.add_argument('--audio_input_prediction', default=['noisy_voice_long_t2.wav'], type=list)`.
I even tried changing the type to 'str', but I still got the same error.
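
A likely cause, sketched with the standard-library argparse module: `type=list` makes argparse call `list()` on the raw string, so the file name is split into single characters, and a bare `type=str` yields one string whose first element is still a single character. The sketch assumes (this is not confirmed in the thread) that the downstream code indexes or iterates over the option as a list of file names; `nargs='+'` with `type=str` is one common way to get that. The `build_parser` helper is hypothetical and only mimics the option from args.py.

```python
import argparse


def build_parser(buggy: bool) -> argparse.ArgumentParser:
    """Build a parser mimicking the option from args.py (simplified)."""
    parser = argparse.ArgumentParser()
    if buggy:
        # type=list calls list("myaudio.wav"), which splits the string
        # into individual characters.
        parser.add_argument('--audio_input_prediction',
                            default=['default_audio.wav'], type=list)
    else:
        # nargs='+' collects one or more values into a list of strings.
        parser.add_argument('--audio_input_prediction',
                            default=['default_audio.wav'], type=str, nargs='+')
    return parser


buggy_args = build_parser(buggy=True).parse_args(
    ['--audio_input_prediction', 'myaudio.wav'])
fixed_args = build_parser(buggy=False).parse_args(
    ['--audio_input_prediction', 'myaudio.wav'])

print(buggy_args.audio_input_prediction[0])  # a single character
print(fixed_args.audio_input_prediction[0])  # the full file name
```

With the second form, several files can be passed as `--audio_input_prediction a.wav b.wav`, and no brackets are needed on the command line.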

Python 2 or 3 ?

Hi, should we use Python 2.x or Python 3.x to run this?
I ask because I am having some issues installing the requirements, and maybe it has to do with the Python version.

Thanks for your work!

General questions

Hi @vbelz,

First of all, thank you for your work. I have tried to denoise some audio and it worked very well, but I have a few questions.

Quoted from README:

Specify how many frames you want to create as nb_samples in args.py (or pass it as argument from the terminal) I let nb_samples=50 by default for the demo but for production I would recommend having 40 000 or more.

1. What exactly is nb_samples?

2. Were the weights you provided trained with nb_samples=50?

3. Should I resample audio to 8 kHz for denoising, or is that done inside the network? Also, should I do it for training?

4. I want to tweak it to be a better denoiser for background noise rather than specific sounds. What are your thoughts on this? I have a dataset with clean samples and background noise samples. Will it work if I train on it? Which hyperparameters should I use?

Thank you so much and sorry for bothering you!

global scaling

Hey @vbelz,
First, thank you for sharing this project, it helps me a lot!
There is one thing I didn't understand, and that's the global scaling of matrix_spec (and the inverse global scaling).
How did you choose the numbers for scaling, and why is there different scaling for X_in and X_ou?

How to train for a different audio sampling rate?

Hi vbelz,

I am wondering what changes need to be made in order to train for a different audio sampling rate, e.g. at 44100 Hz?

I assume both the model and some parameters in args.py need to be modified. Can you please share some insights on this?

Thanks,
Tony

[BUG]: Validation against test data

Training error

At line 60, as mentioned here, you're validating against test data while training. Isn't it supposed to be train data?

history = generator_nn.fit(X_train, y_train, epochs=epochs, batch_size=batch_size, shuffle=True, callbacks=[checkpoint], verbose=1, validation_data=(X_test, y_test))
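
For context, Keras evaluates `validation_data` at the end of each epoch and does not train on it. Still, if checkpoints or hyperparameters are chosen from that score, the same data is no longer an unbiased test set. A common remedy is a separate validation split. Below is a minimal sketch of a three-way index split using only the standard library; the function name, fractions, and seed are illustrative assumptions, not code from the repository.

```python
import random


def three_way_split(n_samples, val_frac=0.1, test_frac=0.1, seed=0):
    """Return disjoint (train, val, test) index lists covering range(n_samples)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    n_test = int(n_samples * test_frac)
    n_val = int(n_samples * val_frac)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = three_way_split(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

The validation indices would then select the arrays passed as `validation_data`, leaving the test indices untouched until the final evaluation.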

Error (tensorflow)

I am getting the following error from the tensorflow module:

ERROR:root:Internal Python error in the inspect module.
Below is the traceback from this internal error.

Please help me get rid of this error.

Thank you.

Is this project unsupervised learning?

Hi @vbelz, I have a question.
Is this project unsupervised learning? There are 10 kinds of noise collected in this project. I originally thought I would see 10 models in the weight folder, but I only saw two: model_best and model_unet.

Thanks.

Inference pipeline

Hello,

The model does a great job of removing the noise. However, I notice that the speech quality is degraded.

For testing, I changed X_denoise = m_amp_db_audio - inv_sca_X_pred[:,:,:,0] to X_denoise = m_amp_db_audio.

I was expecting to get the original audio file back. FYI, my input is a mono-channel 16000 Hz WAV file. Can you please help me? I am guessing I need to change some parameters other than the sample rate.

Question on Error of Invalid Instruction (core dumped)

Hey @vbelz, I had a question:
While running `python main.py --mode='data_creation'` I get the error Invalid Instruction (core dumped).
I guess it's because of the TensorFlow version (1.15.2), as my CPU does not support AVX.
It does not give an error if I use TensorFlow 1.5.

If I want to keep using TensorFlow 1.15.2, what could be an alternative?
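
For background: the official prebuilt TensorFlow binaries use AVX instructions from release 1.6 onward, which is consistent with 1.5 running and 1.15.2 crashing on a CPU without AVX. The usual alternatives are building TensorFlow from source without AVX or finding a wheel built without it. The sketch below checks for the flag by parsing `/proc/cpuinfo`; the helper name is hypothetical and the approach assumes a Linux x86 system.

```python
from pathlib import Path


def has_cpu_flag(cpuinfo_text: str, flag: str = "avx") -> bool:
    """Return True if `flag` appears in any 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


cpuinfo = Path("/proc/cpuinfo")
if cpuinfo.exists():  # only present on Linux
    print("AVX supported:", has_cpu_flag(cpuinfo.read_text()))
```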

MemoryError

Hello, I encountered an error: MemoryError: Unable to allocate array with shape (20000, 128, 128) and data. I changed 40000 to 20000, but the issue is still there. Is this because the training set is too large, or because nb_samples is too large?
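
A back-of-the-envelope check of that allocation: the shape (20000, 128, 128) holds 327,680,000 elements, so the footprint depends on the element type, which is cut off in the message above. The sketch computes the size for a few common NumPy dtypes. The byte widths are standard, but which dtype the script actually uses is an assumption left open here.

```python
from math import prod

shape = (20000, 128, 128)  # shape from the MemoryError message
n_elements = prod(shape)

# Bytes per element for some common dtypes (the real dtype is not known here).
itemsize = {"float32": 4, "float64": 8, "complex128": 16}

for dtype, nbytes in itemsize.items():
    gib = n_elements * nbytes / 2**30
    print(f"{dtype}: {gib:.2f} GiB")
```

Each figure scales linearly with nb_samples, and several arrays of this shape may be alive at the same time, so available RAM has to cover a multiple of a single array.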

Why are extracted windows slightly above 1 second?

First of all, thank you so much for this repository. I am doing some research in the speech domain, and this has been very helpful.

But I have a couple of questions about it.

  1. Why are the extracted windows slightly above 1 second and not exactly 1 second?
  2. Can this 1 second be increased to a longer duration? How would this affect the training?

Thanks in advance.
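
One plausible explanation, assuming the repository defaults are sample_rate=8000, frame_length=8064, n_fft=255, and hop_length_fft=63 (these values are not stated in this thread) and that the STFT is centered as in librosa's default: 8064 samples is the length that yields a square 128 x 128 spectrogram, whereas exactly 8000 samples would give only 127 time frames. The arithmetic can be sketched as:

```python
def stft_shape(n_samples: int, n_fft: int, hop_length: int) -> tuple:
    """(frequency bins, time frames) of a centered STFT.

    Assumes the signal is padded by n_fft // 2 samples on each side,
    as librosa.stft does with center=True.
    """
    freq_bins = 1 + n_fft // 2
    padded = n_samples + 2 * (n_fft // 2)
    time_frames = 1 + (padded - n_fft) // hop_length
    return freq_bins, time_frames


SAMPLE_RATE = 8000    # assumed default
FRAME_LENGTH = 8064   # assumed default
N_FFT = 255           # assumed default
HOP_LENGTH_FFT = 63   # assumed default

print(FRAME_LENGTH / SAMPLE_RATE)                       # window duration in seconds
print(stft_shape(FRAME_LENGTH, N_FFT, HOP_LENGTH_FFT))  # window slightly above 1 s
print(stft_shape(SAMPLE_RATE, N_FFT, HOP_LENGTH_FFT))   # window of exactly 1 s
```

Under the same assumptions, a longer window increases the number of time frames, so the network input shape and the STFT parameters would have to change together.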

Update requirements.txt, and no JSON file is output

Hello there!
So I had quite a journey installing all the dependencies and Python. It would be great if you could update the requirements.txt file! I am guessing that my problem is caused by the wrong version of TensorFlow.
My problem is that after training, no .json file is output in any folder.
It would be great if you could help me 😃

Question about the inputs and the outputs of the model

Hey,
First of all, your code is great! It worked for me and it is very simple and clear 👍
One question: in your model, Xin is spectrogram(noisy_voice) and Xout is spectrogram(noisy_voice) - spectrogram(voice). I didn't understand why you did the subtraction, so I tried setting Xout to spectrogram(voice), but then the loss showed underfitting. Do you know why that happens?

Thanks again!
Olga :)
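
The setup described above, where the network predicts the noise and the clean estimate is obtained by subtraction (the same form as `X_denoise = m_amp_db_audio - inv_sca_X_pred[:,:,:,0]` quoted in another issue), can be sketched with toy numbers. Plain nested lists stand in for spectrograms, and the helper name and values are hypothetical. This illustrates the arithmetic only and does not explain the underfitting.

```python
def subtract(a, b):
    """Element-wise a - b for two equally shaped 2-D lists."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Toy 2x2 "spectrograms" (values are made up for illustration).
noisy_voice = [[-20.0, -35.0], [-10.0, -42.0]]
voice = [[-25.0, -60.0], [-12.0, -70.0]]

# Training target as described in the question: noisy_voice - voice.
x_out = subtract(noisy_voice, voice)

# At prediction time the estimated noise is subtracted from the input.
# A perfect prediction is assumed here, so the clean voice is recovered.
predicted_noise = x_out
denoised = subtract(noisy_voice, predicted_noise)
print(denoised == voice)
```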

a little confusion

Hey!
I have some confusion about the processing. The input audio, whose size is 112501 KB, produces an output of 112486 KB. Could you explain the reason and what operations are applied to the audio during prediction?

Thank you
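
One possible reason, offered as an assumption rather than a confirmed answer: if prediction cuts the audio into fixed-length, non-overlapping windows, any trailing samples that do not fill a complete window are dropped, so the output is slightly shorter than the input. The sketch below uses a hypothetical window of 8064 samples and a made-up input length.

```python
def kept_samples(n_samples: int, frame_length: int, hop_length: int) -> int:
    """Samples covered when slicing windows at a fixed hop and dropping the remainder."""
    if n_samples < frame_length:
        return 0
    n_frames = 1 + (n_samples - frame_length) // hop_length
    return (n_frames - 1) * hop_length + frame_length


FRAME_LENGTH = 8064   # assumed window length in samples
n_in = 100_000        # made-up input length in samples
n_out = kept_samples(n_in, FRAME_LENGTH, hop_length=FRAME_LENGTH)
print(n_in - n_out)   # trailing samples that were dropped
```

A loss of less than one window would be consistent with the small size change reported above, though the file size also depends on the sample format that is written.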
