Comments (5)
@vbelz please help on this.
Hi @vbelz , Thank you for your kind and quick response.
I have been working on creating the data necessary to train it, and I have a few more questions (sorry for bothering you).
I have approx. 10h of audio, and when I try to create the dataset, I end up with the following error, raised in the function numpy_audio_to_matrix_spectrogram:
m_mag_db = np.zeros((nb_audio, dim_square_spec, dim_square_spec))
MemoryError: Unable to allocate array with shape (37028, 257, 257) and data type float64
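For scale, here is a rough sketch of how much memory that allocation asks for. The shape is the one reported in the error; array_nbytes is only an illustrative helper, not something from the repository.

```python
import numpy as np


def array_nbytes(shape, dtype):
    """Bytes needed for a dense array of the given shape and dtype."""
    n_elements = 1
    for dim in shape:
        n_elements *= int(dim)
    return n_elements * np.dtype(dtype).itemsize


# Shape taken from the MemoryError message above.
shape = (37028, 257, 257)

for dtype in ("float64", "float32"):
    gib = array_nbytes(shape, dtype) / 2**30
    print(f"{dtype}: {gib:.2f} GiB")
```

Switching to float32 halves the requirement, but the array may still not fit in RAM. Building the dataset in smaller chunks, or writing it to a disk-backed array such as numpy.memmap, are possible workarounds.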
Also, I have the following questions:
1.) I have used dimensions of 256x256, as I have downsampled the audio to 16 kHz instead of 8 kHz. The window I have used is 16128 samples, which is slightly more than one second. Do you think this is a correct approach? I mean, your window was 64 samples more than a second at 8 kHz, so I scaled it to 16 kHz. Also, the problem I am facing is that the size I get from preparing the dataset is 256x257 (the dimensions that librosa.stft returns). I don't know why it isn't 256x256, as my parameters are: hop_length_fft = 63, n_fft = 510, frame_length = 16128 and hop_length_frame = 16128. This gives 16128/63 = 256, so I don't know where the 257th column comes from.
2.) Why should the window be around one second? Would performance improve if it were smaller or bigger?
3.) Do you think there will be any major loss of performance from decreasing precision to 32 bit (i.e. numpy datatype = 'float32')?
4.) It looks like you are cropping every audio file, since the last (incomplete) window is not included. Therefore, I have added zero padding to the end of each audio file so that it fills a whole window. What do you think about this?
5.) I have concatenated the audio files one after another so they keep the audio structure. Is there a special reason to create a random order? You use the function blend_noise_randomly to do this.
6.) What's the difference between frame_length and hop_frame_length? I think they refer to the same parameter: the sliding window size for STFT, which is by definition the frame_length.
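On the 257 columns from question 1, here is a minimal sketch of the frame-count arithmetic. It assumes librosa.stft is called with its default center=True, which pads the signal by n_fft // 2 on each side. stft_shape is a hypothetical helper written for illustration; it only reproduces the shape formula and does not call librosa.

```python
def stft_shape(n_samples, n_fft, hop_length, center=True):
    """Expected (frequency bins, time frames) of an STFT output.

    Assumes the usual convention: with center=True the signal is padded
    by n_fft // 2 on both sides, so frames = 1 + n_samples // hop_length.
    """
    n_bins = 1 + n_fft // 2
    if center:
        n_frames = 1 + n_samples // hop_length
    else:
        n_frames = 1 + (n_samples - n_fft) // hop_length
    return n_bins, n_frames


# Parameters quoted in question 1.
print(stft_shape(n_samples=16128, n_fft=510, hop_length=63))
print(stft_shape(n_samples=16128, n_fft=510, hop_length=63, center=False))
```

Under that assumption, 16128/63 = 256 counts the hops, and the frame at position 0 adds one more column. Dropping the last column after the transform is one possible way to get a square 256x256 input.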
Thanks a lot for your time and effort
cheers.
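To make questions 4 and 6 concrete, here is a small sketch of framing with zero padding. It assumes frame_length is the size of each extracted window and hop_length_frame is the step between the starts of consecutive windows. frame_with_padding is an illustrative function, not the repository's implementation.

```python
import numpy as np


def frame_with_padding(audio, frame_length, hop_length_frame):
    """Split 1-D audio into frames, zero-padding the tail so no samples are dropped."""
    audio = np.asarray(audio, dtype=np.float32)
    if len(audio) <= frame_length:
        n_frames = 1
    else:
        # Enough frames for the last one to reach the end of the signal.
        n_frames = 1 + int(np.ceil((len(audio) - frame_length) / hop_length_frame))
    padded_len = (n_frames - 1) * hop_length_frame + frame_length
    # np.pad defaults to constant mode, i.e. it appends zeros.
    padded = np.pad(audio, (0, padded_len - len(audio)))
    starts = np.arange(n_frames) * hop_length_frame
    return np.stack([padded[s:s + frame_length] for s in starts])


signal = np.arange(10)
no_overlap = frame_with_padding(signal, frame_length=4, hop_length_frame=4)
half_overlap = frame_with_padding(signal, frame_length=4, hop_length_frame=2)
print(no_overlap.shape, half_overlap.shape)
```

When the two parameters are equal, as in the settings quoted above, the windows sit back to back without overlap. A smaller hop makes consecutive windows share samples.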
@vbelz please help on this.
Hello Vishesh,
As I am new to this project, could you guide me on how to get to the denoised output?
Note: I do not have GPUs on my computer