Comments (3)
-
The VAD and speech enhancement (SE) datasets must be different. SE assumes that the incoming signal is always noisy speech, never noise only. We also tried using the VAD dataset (the first approach you mentioned) to train the SE model but, as expected, training failed: the noise-only segments make the problem harder to solve. For the VAD dataset, in contrast, noise-only and noisy-speech segments should occur in almost equal proportion to prevent class imbalance. That is why we build the VAD and SE datasets in different ways. To make the VAD dataset, just follow the procedure you described: "As per your comment in one of the closed issue, you mentioned that you concatenate different sound effects to make one long sound wave containing noises and then pick a random speech utterance and add that speech utterance to noise files at various SNRs until the end of the long sound wave of noises." Additionally, YOU MUST VERIFY THE RATIO BETWEEN NOISE (label 0) AND SPEECH (label 1) SEGMENTS; ideally it is 1:1.
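The ratio check described above could be sketched roughly as follows. This is only an illustration in plain Python, not code from the toolkit: the function names `label_balance` and `is_balanced` and the 5% tolerance are assumptions, and the labels are assumed to be a flat sequence of 0 (noise) and 1 (speech) values.

```python
def label_balance(labels):
    """Count noise (0) and speech (1) labels and return the speech fraction."""
    speech = sum(1 for x in labels if x == 1)
    noise = sum(1 for x in labels if x == 0)
    total = speech + noise
    if total == 0:
        raise ValueError("no 0/1 labels given")
    return noise, speech, speech / total


def is_balanced(labels, tolerance=0.05):
    """True if the speech fraction is within `tolerance` of 0.5 (a 1:1 ratio).

    The tolerance value is an arbitrary choice for this sketch.
    """
    _, _, speech_fraction = label_balance(labels)
    return abs(speech_fraction - 0.5) <= tolerance


if __name__ == "__main__":
    labels = [0, 0, 1, 1, 1, 0, 1, 0]
    noise, speech, fraction = label_balance(labels)
    print(f"noise={noise} speech={speech} speech fraction={fraction:.2f}")
    print("balanced:", is_balanced(labels))
```

If the check fails, one option is to pad the long recording with more noise-only stretches (or more speech) before regenerating the labels.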
-
Both the FaNT tool and v_addnoise.m in voicebox (implemented in MATLAB) follow the ITU standard. In my experiments they showed no significant difference, so I prefer voicebox because it is easier to use. Use whichever you like. Note that make_train_noisy.m is ONLY for the speech enhancement toolkit.
from vad.
I have one more question and would be grateful for your help. Suppose I create a long noise file of 40 minutes and the speech utterance is 1 minute long. When I use v_addnoise, the output is just 1 minute of noisy speech: a noise interval is picked at random from the long noise file and added to the speech.
As I understand your explanation, however, we need to add the same 1-minute file 40 times to the long noise file at different SNRs. How do we do that with v_addnoise? Is it even possible with v_addnoise, or should I try the FaNT tool?
In any case, many thanks for clearing up my earlier question.
from vad.
The inputs to v_addnoise.m should be:
step 1
noise(1:length(speech1)), speech1
step 2
noise(length(speech1)+1 : length(speech1)+length(speech2)), speech2
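The two steps above amount to walking through the long noise recording from left to right and handing each utterance the next noise segment of matching length. A minimal sketch of that idea in plain Python follows. It is not the voicebox implementation: `mix_at_snr` and `mix_sequentially` are hypothetical names, signals are plain lists of floats, and the SNR is computed from simple mean power rather than the ITU active speech level that v_addnoise.m and FaNT are said to follow.

```python
import math


def power(x):
    """Mean power of a signal given as a list of floats."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so speech power / noise power matches snr_db, then add.

    Assumption: plain mean power is used for the level measurement.
    """
    p_speech, p_noise = power(speech), power(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("speech and noise must have non-zero power")
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def mix_sequentially(utterances, noise, snrs_db):
    """Give each utterance the next unused noise segment of the same length.

    Returns the list of mixed signals and the number of noise samples used.
    Stops early when the remaining noise is shorter than the next utterance.
    """
    mixed, start = [], 0
    for speech, snr_db in zip(utterances, snrs_db):
        end = start + len(speech)
        if end > len(noise):
            break
        mixed.append(mix_at_snr(speech, noise[start:end], snr_db))
        start = end
    return mixed, start


if __name__ == "__main__":
    noise = [0.5, -0.5] * 5                     # stand-in for the long noise file
    utterances = [[1.0, -1.0, 1.0], [0.2, 0.4, -0.2, -0.4]]
    mixed, used = mix_sequentially(utterances, noise, snrs_db=[0, 10])
    print(len(mixed), "utterances mixed,", used, "noise samples used")
```

In MATLAB the same loop would slice `noise` with the index ranges shown in the steps and pass each slice, together with its utterance, to the noise-adding function.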
from vad.