Comments (13)
MFCC settings in Python:

```python
# get the MFCC of the noisy voice
mfcc_feat = mfcc(sig, sample_rate, winlen=0.032, winstep=0.032 / 2,
                 numcep=20, nfilt=20, nfft=512, lowfreq=20, highfreq=8000,
                 winfunc=np.hanning, ceplifter=0, preemph=0, appendEnergy=True)
```
MFCC settings in main.c:

```c
// 20 features, 0 offset, 20 bands, 512-point FFT, 0 preemphasis, energy attached to band 0
mfcc_t *mfcc = mfcc_create(NUM_FEATURES, 0, NUM_FEATURES, 512, 0, true);

#define SAMP_FREQ     16000
#define MEL_LOW_FREQ  20
#define MEL_HIGH_FREQ 8000
```
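As a quick sanity check (my own arithmetic, not from the original comment), the Python window parameters map to the same sample counts the C side assumes:

```python
SAMP_FREQ = 16000  # sampling rate shared by both implementations

winlen_samples = int(0.032 * SAMP_FREQ)       # 0.032 s -> 512 samples, matching nfft=512
winstep_samples = int(0.032 / 2 * SAMP_FREQ)  # 0.016 s -> 256 samples (50% overlap)

assert winlen_samples == 512 and winstep_samples == 256
```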
---
Hi @liziru,
Please test both functions with real, nonzero values. All zeros simply means there is no energy in any band, so the first band will return its minimum value, caused by log(0). With a real signal (or just some random noise), you can plot both outputs, or compare them with a metric such as MSE or cosine similarity.
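To make that comparison concrete, here is a minimal sketch; feat_py and feat_c are placeholders for the two implementations' outputs on the same input:

```python
import numpy as np

# placeholders: in practice, load the MFCC frames produced by the Python
# and C implementations for the same nonzero input signal
feat_py = np.random.randn(20)
feat_c = feat_py.astype(np.float32) + 1e-3 * np.random.randn(20).astype(np.float32)

mse = np.mean((feat_py - feat_c) ** 2)
cos = np.dot(feat_py, feat_c) / (np.linalg.norm(feat_py) * np.linalg.norm(feat_c))
print(f"MSE: {mse:.6g}, cosine similarity: {cos:.6f}")
```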
Since we use the option appendEnergy=True (and, in main.c, mfcc_create(..., true)), the first band represents the total energy of the FFT. I believe Python uses 64-bit float arithmetic while C uses 32-bit floats, so this might be the cause of the difference. In any case, both -84 and -36 are their respective minimum values.
In both the Python and C code, the features are saturated to ±2^3 = ±8 (see nnom/examples/rnn-denoise/main_arm.c, line 210, and nnom/examples/rnn-denoise/main.py, line 269, at commit ec3afac). After these two quantisation/saturation steps, both values will be clipped to -8, so this energy difference will not affect anything.
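For illustration, a minimal sketch of that clip-then-quantise step, assuming the features end up in an 8-bit format with 4 fractional bits (the exact Q-format used in the example may differ):

```python
import numpy as np

def saturate_and_quantise(feat, limit=8.0, dec_bits=4):
    """Clip features to [-limit, limit], then quantise to signed 8-bit."""
    clipped = np.clip(feat, -limit, limit)   # both -84 and -36 saturate to -8 here
    q = np.round(clipped * (1 << dec_bits))  # scale by 2^dec_bits
    return np.clip(q, -128, 127).astype(np.int8)

print(saturate_and_quantise(np.array([-84.0, -36.0, 29.3225])))
# -84 and -36 map to the same quantised value (-128), so the difference vanishes
```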
---
@majianjia
Thank you for your reply. I ran the test following your advice and it works.
However, I found two more problems.
First, with the sample input (0-512, 512 samples), the result of the Python code is slightly different from that of the C code. As you said, Python using 64-bit float arithmetic versus 32-bit floats in C may explain this.
Before saturation to -8, the Python and C results for the same input are:

-8.0303, 29.3225, 6.7850, 7.4641, 3.6157, 4.1926, 2.4651, 2.8310, 1.8338, 2.0457, 1.3851, 1.5110, 1.0347, 1.0887, 0.7333, 0.7338, 0.4813, 0.4191, 0.2408, 0.1338
-4.5886, 30.0869, 7.2367, 7.8549, 3.8652, 4.3900, 2.5817, 2.9530, 1.8638, 2.1015, 1.4186, 1.5603, 1.0761, 1.1739, 0.7649, 0.7836, 0.5086, 0.5034, 0.2978, 0.1628
Second, with the sample input, the result of NNoM inference differs from that of the TF model.predict API.

Input features:

-8.0303, 29.3225, 6.7850, 7.4641, 3.6157, 4.1926, 2.4651, 2.8310, 1.8338, 2.0457, 1.3851, 1.5110, 1.0347, 1.0887, 0.7333, 0.7338, 0.4813, 0.4191, 0.2408, 0.1338
-8.0303, 29.3225, 6.7850, 7.4641, 3.6157, 4.1926, 2.4651, 2.8310, 1.8338, 2.0457
-8.0303, 29.3225, 6.7850, 7.4641, 3.6157, 4.1926, 2.4651, 2.8310, 1.8338, 2.0457

Results of NNoM inference and TF API inference:

0.4724, 0.7480, 0.8504, 0.8583, 0.8583, 0.8583, 0.8346, 0.8425, 0.8110, 0.8346, 0.8110, 0.8268, 0.8268, 0.8031, 0.8268, 0.8268, 0.8346, 0.8425, 0.8031, 0.8346
0.9275, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000

The first row is from NNoM inference; the second is from the TF API. Quantisation of the inference and the features can introduce some loss, but this loss seems rather large.
Is the loss acceptable? Do you have any advice for reducing it?
Looking forward to your reply!
---
As a footnote, my NN model is made up of four fully-connected layers, so there is no hidden state as in an RNN. Also, the output distributions of the two inference engines are almost the same.
---
The 8-bit resolution might not be good enough for regression applications. Please also check #104 to see if it is related.
I will look into the details when I am back.
---
Thank you very much. I checked my code and NNOM_TRUNCATE was already defined in nnom_port.h as you advised in #104, but I had not applied the following step, because I thought this operation would round the results:

```c
// change this line
#define NNOM_ROUND(out_shift) ( (0x1u << out_shift) >> 1 )
// to
#define NNOM_ROUND(out_shift) ((q31_t)( (0x1u << out_shift) >> 1 ))
```

But what about the ARM version? It is still not working.
Sadly, the loss is unchanged.
---
Rounding versus flooring doesn't really change the result, since it only affects it by 0.5/128.
In the denoise example, the output gains look like this, with columns representing the gain index (1-20) and rows representing timestamps. You can see that they reach 1 here after the hard_sigmoid() final output layer.
Did you try Conv or RNN layers? They might behave differently; dense layers do not work well when the two vectors differ hugely in size (e.g. 1000 input units vs. 2 output units).
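Regarding hard_sigmoid(): Keras defines it as a piecewise-linear approximation of the sigmoid, so any pre-activation above 2.5 pins the gain to exactly 1, which is why whole rows can saturate:

```python
import numpy as np

def hard_sigmoid(x):
    # Keras definition: 0 below -2.5, 1 above 2.5, linear (0.2*x + 0.5) in between
    return np.clip(0.2 * np.asarray(x) + 0.5, 0.0, 1.0)

print(hard_sigmoid([-3.0, 0.0, 2.5, 10.0]))  # [0.  0.5 1.  1. ]
```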
---
Thank you for your reply.
I have to use dense (fully-connected) layers in the rnn-denoise project due to some constraints. The input and output sizes are both 20, so dense should be fine. I use four dense layers with ReLU activations, except for the last layer, which uses sigmoid rather than hard-sigmoid (a sketch follows).
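For context, a minimal Keras sketch of the architecture described; the hidden-layer widths are my assumption, only the 20-unit input/output and the activations come from the comment above:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input

model = Sequential([
    Input(shape=(20,)),               # 20 MFCC features in
    Dense(64, activation='relu'),     # hidden widths assumed, not stated in the thread
    Dense(64, activation='relu'),
    Dense(64, activation='relu'),
    Dense(20, activation='sigmoid'),  # 20 gains out, sigmoid rather than hard_sigmoid
])
```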
I understand the gains table in the picture, but the discrepancy between the two inference engines is real, which leaves me sad and confused.
---
The RNN currently runs with 8-bit input/output data and 16-bit memory (state) data, which might preserve more information.
I am not sure what causes the loss you are seeing. Would you be able to validate the model on more data?
You may also try a Conv-based network; a TCN (consisting of Conv layers with dilation > 1) works completely fine in NNoM and can outperform RNN-type models.
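A minimal sketch of the kind of dilated-Conv (TCN-style) stack being suggested; all widths, kernel sizes, and dilation rates here are assumptions for illustration:

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv1D, Dense, Input

# causal convolutions with growing dilation build an RNN-like receptive field
model = Sequential([
    Input(shape=(None, 20)),  # (timesteps, features)
    Conv1D(32, 3, dilation_rate=1, padding='causal', activation='relu'),
    Conv1D(32, 3, dilation_rate=2, padding='causal', activation='relu'),
    Conv1D(32, 3, dilation_rate=4, padding='causal', activation='relu'),
    Dense(20, activation='sigmoid'),  # per-timestep gains
])
```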
---
OK, I think I am close to the answer. I found that the weights.h file differs depending on x_train. I have to admit that my x_train was generated randomly, and the same x_train is also used as the input when comparing against the NNoM inference in C. As a result, NNoM inference gives different results with weights.h files generated from different x_train data.

```python
# now generate the NNoM model
generate_model(model, x_train[:2048 * 4], name='weights.h')
```

Comparison of weights.h files generated with different x_train:
So, the discrepancy between the two inference engines may be caused by the settings in weights.h.
However, NNOM_TRUNCATE was already defined in nnom_port.h as you advised in #104, which I thought meant I was using floating-point computation now.
---
@majianjia After I set x_train in generate_model to the training x_train, as you did in the main.py example, the result of NNoM inference changes a lot but is still hugely different from that of the TF predict API.
The first row below is from NNoM inference; the second is from the TF API:

0.5827, 0.5197, 0.4409, 0.3937, 0.2992, 0.1654, 0.1102, 0.1181, 0.1260, 0.1417, 0.1575, 0.1575, 0.1654, 0.1732, 0.1811, 0.1969, 0.2126, 0.2126, 0.2047, 0.2047
0.9952, 0.9999, 0.9998, 0.9994, 0.9898, 0.9204, 0.6904, 0.6838, 0.7321, 0.8566, 0.8668, 0.8191, 0.7994, 0.8683, 0.8680, 0.9044, 0.9288, 0.9375, 0.9346, 0.9124
---
Forget about NNOM_TRUNCATE, since you don't use RNN layers. Also, this macro has nothing to do with floating point; NNoM currently runs only on 8-bit fixed-point data.
For the calibration step, generate_model(model, x_train[:2048 * 4], name='weights.h'), you should use real data. It can be training or testing data, but not random numbers. The data should also cover as many cases as possible; you can enlarge x_train[:2048 * 4] to see if it helps.
The numbers in your screenshot are generated by the calibration step and are determined by the output of each layer: the Q-format is chosen so that it can contain the maximum/minimum of each layer's outputs/weights. With a different calibration dataset, these bits/shifts are expected to change; however, calibrating with different real signals brings only small changes, while fake signals can change them quite a lot.
I suggest you run the example first. Once it is successful, modify the TF model and see if it still works.
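To illustrate the idea (a simplified sketch of my own, not NNoM's actual code): with 8-bit data, calibration essentially picks the number of fractional bits so that the largest observed layer output still fits.

```python
import numpy as np

def pick_dec_bits(layer_outputs, word_bits=8):
    """Choose fractional bits so max |output| fits in a signed word_bits value."""
    max_abs = np.max(np.abs(layer_outputs))
    int_bits = int(np.ceil(np.log2(max_abs))) if max_abs > 0 else 0
    return (word_bits - 1) - int_bits  # bits left after sign + integer part

# a wider calibration range costs fractional resolution:
print(pick_dec_bits(np.array([0.4, -0.9])))  # 7 fractional bits (Q0.7)
print(pick_dec_bits(np.array([3.2, -7.5])))  # 4 fractional bits (Q3.4)
```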
---
I am sorry to report that enlarging x_train[:2048 * 4], running the example first, and then modifying the TF model did not help. NNoM inference is a good project. Do you have a plan to support floating-point computation? I think developers in other areas would like this project very much.