xiph / rnnoise

Recurrent neural network for audio noise reduction

License: BSD 3-Clause "New" or "Revised" License

C 69.36% Python 24.32% Shell 1.04% Makefile 1.29% M4 3.98%

rnnoise's Introduction

RNNoise is a noise suppression library based on a recurrent neural network.
A description of the algorithm is provided in the following paper:

J.-M. Valin, A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech
Enhancement, Proceedings of IEEE Multimedia Signal Processing (MMSP) Workshop,
arXiv:1709.08243, 2018.
https://arxiv.org/pdf/1709.08243.pdf

An interactive demo is available at: https://jmvalin.ca/demo/rnnoise/

To compile, just type:
% ./autogen.sh
% ./configure
% make

Optionally:
% make install

It is recommended to either set -march= in the CFLAGS to an architecture
with AVX2 support or to add --enable-x86-rtcd to the configure script
so that AVX2 (or SSE4.1) can at least be used as an option.
Note that the autogen.sh script will automatically download the model files
from the Xiph.Org servers, since those are too large to put in Git.

While it is meant to be used as a library, a simple command-line tool is
provided as an example. It operates on RAW 16-bit (machine endian) mono
PCM files sampled at 48 kHz. It can be used as:

% ./examples/rnnoise_demo <noisy speech> <output denoised>

The output is also a 16-bit raw PCM file.
NOTE AGAIN, THE INPUT and OUTPUT ARE IN RAW FORMAT, NOT WAV.
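
Because the demo reads and writes headerless PCM, a WAV file has to lose its header before processing and regain one afterwards. The following is a minimal sketch using Python's standard wave module. It assumes the WAV file is already 16-bit mono at 48 kHz (it does not resample or downmix), and the function names wav_to_raw and raw_to_wav are illustrative, not part of RNNoise.

```python
import wave

def wav_to_raw(wav_path, raw_path):
    """Strip the WAV header, leaving headerless 16-bit PCM samples."""
    with wave.open(wav_path, "rb") as wav:
        fmt = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
        # rnnoise_demo expects 16-bit mono at 48 kHz; refuse anything else.
        if fmt != (1, 2, 48000):
            raise ValueError("expected 16-bit mono PCM at 48 kHz, got %r" % (fmt,))
        frames = wav.readframes(wav.getnframes())
    with open(raw_path, "wb") as raw:
        raw.write(frames)

def raw_to_wav(raw_path, wav_path):
    """Wrap headerless 16-bit mono 48 kHz samples in a WAV container."""
    with open(raw_path, "rb") as raw:
        frames = raw.read()
    with wave.open(wav_path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)      # 2 bytes per sample = 16 bits
        wav.setframerate(48000)
        wav.writeframes(frames)
```

A typical round trip would be wav_to_raw("in.wav", "in.pcm"), then running rnnoise_demo on in.pcm, then raw_to_wav("out.pcm", "out.wav").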

The latest version of the source is available from
https://gitlab.xiph.org/xiph/rnnoise . The GitHub repository
is a convenience copy.

== Training ==

The models distributed with RNNoise are now trained using only the publicly
available datasets listed below and using the training procedure described
here. Exact results will still depend on the exact mix of data used,
on how long the training is performed and on the various random seeds involved.

To train an RNNoise model, you need both clean speech data, and noise data.
Both need to be sampled at 48 kHz, in 16-bit PCM format (machine endian).
Clean speech data can be obtained from the datasets listed in the datasets.txt
file, or by downloading the already-concatenated version of those files from
https://media.xiph.org/rnnoise/data/tts_speech_48k.sw
For noise data, we suggest concatenating the 48 kHz noise data from DEMAND at
https://zenodo.org/records/1227121
with contrib_noise.sw and synthetic_noise.sw noise files from
https://media.xiph.org/rnnoise/data/
To balance out the data, we recommend using multiple (e.g. 5) copies of the
contrib_noise.sw and synthetic_noise.sw noise files.
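
Since raw PCM files have no header, they can be joined by plain byte concatenation. The sketch below builds a single noise file with the extra files repeated several times. It assumes every input is already raw 16-bit 48 kHz PCM (any WAV data would need converting first); the function name concat_noise and its parameters are illustrative only.

```python
import shutil

def concat_noise(output_path, demand_files, extra_files, copies=5):
    """Concatenate raw PCM files; extra_files are repeated `copies` times."""
    with open(output_path, "wb") as out:
        # DEMAND files once, then the smaller files `copies` times each.
        for path in list(demand_files) + list(extra_files) * copies:
            with open(path, "rb") as src:
                shutil.copyfileobj(src, out)
```

For example, concat_noise("noise.pcm", demand_paths, ["contrib_noise.sw", "synthetic_noise.sw"]) would write five copies of each of the two smaller files after the DEMAND data, where demand_paths is a list you supply.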

The first step is to take the speech and noise, and mix them in a variety of ways
to simulate real life conditions (including pauses, filtering and more).
Assuming the files are called speech.pcm and noise.pcm, start by generating
the training feature data with:

% ./dump_features speech.pcm noise.pcm features.f32 <count>
where <count> is the number of sequences to process. The number of sequences
should be at least 10000, but the more the better (200000 or more is recommended).

Optionally, training can also simulate reverberation, in which case room impulse
responses (RIR) are also needed. Limited RIR data is available at:
https://media.xiph.org/rnnoise/data/measured_rirs-v2.tar.gz
The format for those is raw 32-bit floating-point (files are little endian).
Assuming a list of all the RIR files is contained in a rir_list.txt file,
the training feature data can be generated with:

% ./dump_features -rir_list rir_list.txt speech.pcm noise.pcm features.f32 <count>

To make the feature generation faster, you can use the script provided in
script/dump_features_parallel.sh (you will need to modify the script if you
want to add RIR augmentation).

To use it:
% script/dump_features_parallel.sh ./dump_features speech.pcm noise.pcm features.f32 <count> <nb_processes>
which will run nb_processes processes, each for count sequences, and
concatenate the output to a single file.

Once the feature file is computed, you can start the training with:
% python3 train_rnnoise.py features.f32 output_directory

Choose a number of epochs (using --epochs) that leads to about 75000 weight
updates. The training will produce .pth files, e.g. rnnoise_50.pth .
The next step is to convert the model to C files using:

% python3 dump_rnnoise_weights.py --quantize rnnoise_50.pth rnnoise_c

which will produce the rnnoise_data.c and rnnoise_data.h files in the
rnnoise_c directory.

Copy these files to src/ and then build RNNoise using the instructions above.
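
The epoch count suggested above for train_rnnoise.py follows from simple arithmetic once the batch size is known. The sketch below assumes one weight update per batch and one training sequence per sequence written by dump_features; the batch size is a value you must take from the training script, and the 128 used in the note below is purely hypothetical.

```python
import math

def epochs_for_updates(num_sequences, batch_size, target_updates=75000):
    """Smallest number of epochs that reaches target_updates weight updates."""
    # Assumption: each batch triggers exactly one weight update.
    updates_per_epoch = math.ceil(num_sequences / batch_size)
    return math.ceil(target_updates / updates_per_epoch)
```

With 200000 sequences and a hypothetical batch size of 128, epochs_for_updates(200000, 128) gives 48.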

For slightly better results, a trained model can be used to remove any noise
from the "clean" training speech, before restarting the denoising process
again (no need to do that more than once).

== Loadable Models ==

The model format has changed since v0.1.1. Models now use a binary
"machine endian" format. To output a model in that format, build RNNoise
with that model and use the dump_weights_blob executable to output a
weights_blob.bin binary file. That file can then be used with the
rnnoise_model_from_file() API call. Note that the model object MUST NOT
be deleted while the RNNoise state is active and the file MUST NOT
be closed.

To avoid including the default model in the build (e.g. to reduce download
size) and rely only on model loading, add -DUSE_WEIGHTS_FILE to the CFLAGS.
To be able to load different models, the model size (and header file) needs
to match the size used during build. Otherwise the model will not load.
We provide a "little" model with half as an alternative. To use the smaller
model, rename rnnoise_data_little.c to rnnoise_data.c. It is possible
to build both the regular and little binary weights and load any of them
at run time since the little model has the same size as the regular one
(except for the increased sparsity).

rnnoise's Issues

Pitch estimation pseudo-interpolation bug

I might be wrong, but there could be a bug in the pitch pseudo-interpolation.
Luckily, it's not critical, but it would be better to fix it if it's an actual bug.

At the end of pitch_search():

   if (best_pitch[0]>0 && best_pitch[0]<(max_pitch>>1)-1)
   {
      opus_val32 a, b, c;
      a = xcorr[best_pitch[0]-1];
      b = xcorr[best_pitch[0]];
      c = xcorr[best_pitch[0]+1];
      if ((c-a) > MULT16_32_Q15(QCONST16(.7f,15),b-a))
         offset = 1;
      else if ((a-c) > MULT16_32_Q15(QCONST16(.7f,15),b-c))
         offset = -1;
      else
         offset = 0;
   } else {
      offset = 0;
   }
   *pitch = 2*best_pitch[0]-offset;

I think that the last line should be *pitch = 2*best_pitch[0]+offset; (plus instead of minus).
For instance, when c-a > .7*(b-a) is true, it means that xcorr[best_pitch[0]+1] is the strongest correlation coefficient (a and b are almost the same, while c is greater than a). Hence, I would return 2*best_pitch[0]+1.

Note that at the end of remove_doubling() the same is done, but in fact *T0_ = 2 * T + offset; is assigned (which has a +). I haven't sent a pull request since this code comes from Opus (and it's used in other projects).
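
To make the comparison in the quoted code easier to follow, here is a floating-point sketch of the same three-way test. It only shows which neighbour triggers which offset; it does not settle whether the final line should add or subtract, since that depends on how xcorr indices map to lags in the caller. The function name is illustrative.

```python
def pseudo_interp_offset(a, b, c):
    """Floating-point form of the quoted test.

    a, b, c are the correlations at indices best-1, best and best+1.
    """
    if (c - a) > 0.7 * (b - a):
        return 1      # right-hand neighbour is nearly as strong as the peak
    if (a - c) > 0.7 * (b - c):
        return -1     # left-hand neighbour is nearly as strong as the peak
    return 0
```

For example, pseudo_interp_offset(0.0, 1.0, 0.9) returns 1 and pseudo_interp_offset(0.9, 1.0, 0.0) returns -1.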

Output file has no profile?

I tried rnnoise_demo with my data, which is 16-bit mono audio converted from 24-bit mono using ffmpeg. RNNoise produced an output file, but I could not play it. What's wrong?
Could you give me a hint?

pitch_filter

Firstly, amazing work with this project!
I found that in pitch_filter(X, P, Ex, Ep, Exp, g); g is initialized inside the while() loop for every frame, which seems wrong to me.

Better understanding the feature extraction process

Hi, first of all thank you for your amazing work!

I've been looking at the code and I am interested in better understanding the feature extraction process.
I have some questions, and I will do my best to be as detailed as possible.

Let me start from compute_frame_features() in denoise.c.
The first step is computing the energy in the Opus bands (done in frame_analysis()) and then the pitch is estimated/tracked. pitch_downsample() performs the 20ms frames downsampling by halving the samples using a [0.25, 0.5, 0.25] kernel around the even samples. Is this a less expensive way to perform downsampling and low-pass filtering jointly to avoid aliasing?

Then _celt_autocorr() is called with a lag of just 4 samples on the downsampled sequence (hence, 24kHz). Why just 4? What is achieved exactly with such a low lag? And if I look at _celt_autocorr(), I don't understand why the autocorrelation computed by celt_pitch_xcorr() is modified afterwards by summing the autocorrelation for different lags.
After that, the autocorrelation is further modified once _celt_autocorr() returns (below the // Noise floor -40 dB comment). Why is that done? And finally _celt_lpc() is called and the LPC coefficients modified (I mean lpc2) and used to filter the downsampled sequence via celt_fir5().
This whole part is a bit obscure to me and it's also hard to understand where some constants come from (e.g., ac[0] *= 1.0001f; and ac[i] -= ac[i]*(.008f*i)*(.008f*i);) - I've found some possible mappings between the dB and the linear scale, but I'm not fully sure.
Overall, pitch_downsample() looks like a pre-processing step before the pitch is sought in pitch_search(). It would be great if you can share details on what is done.

My apologies if I am asking something that may be obvious to others. I'm to some extent familiar with LPC and auto-correlation, a little bit with pitch tracking. That's probably why I can't grasp all the details in the code.

Cheers,
Alessio
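
The two numerical points raised in this issue can be checked with a short sketch. It is written from the description above, not copied from pitch.c: it halves the rate with a [0.25, 0.5, 0.25] kernel centred on the even samples (treating the first sample as having no left neighbour, which is an assumption), and it converts the 1.0001 factor to decibels. Function and variable names are illustrative.

```python
import math

def downsample_by_two(x):
    """Halve the sample rate with a [0.25, 0.5, 0.25] kernel on even samples."""
    half = len(x) // 2
    out = [0.0] * half
    if half:
        # Assumed edge handling: the first sample has no left neighbour.
        out[0] = 0.5 * x[0] + 0.25 * x[1]
    for i in range(1, half):
        out[i] = 0.25 * x[2 * i - 1] + 0.5 * x[2 * i] + 0.25 * x[2 * i + 1]
    return out

# Scaling ac[0] by 1.0001 adds 1e-4 of the signal energy, expressed here in dB.
noise_floor_db = 10 * math.log10(1e-4)
```

A signal alternating between +1 and -1 (the highest representable frequency) comes out as zeros after the first sample, which is consistent with the kernel acting as a cheap low-pass filter before decimation. The noise_floor_db value evaluates to about -40, which agrees with the "Noise floor -40 dB" comment mentioned above.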

.so library

Hi, thanks for your nice work. Is it possible to provide a .so library with a .h file? In the code, the file that gets run is a shell script.

Decoding the model

When I tried to use the provided Python script dump_rnn.py to decode the newweights9i.hdf5 model, it did not work, so I changed it substantially to get it running. I am not sure my approach is right, so I want to share the changes here. If you have some time, please help me check them. I have used the modified script to decode the model I got. I added the following at the beginning:

from keras import backend as K  # backend ops used by the functions below
from keras.constraints import Constraint

def mean_squared_sqrt_error(y_true, y_pred):
    return K.mean(K.square(K.sqrt(y_pred) - K.sqrt(y_true)), axis=-1)

def my_crossentropy(y_true, y_pred):
    return K.mean(2*K.abs(y_true-0.5) * K.binary_crossentropy(y_pred, y_true), axis=-1)

def mymask(y_true):
    return K.minimum(y_true+1., 1.)

def msse(y_true, y_pred):
    return K.mean(mymask(y_true) * K.square(K.sqrt(y_pred) - K.sqrt(y_true)), axis=-1)

def mycost(y_true, y_pred):
    return K.mean(mymask(y_true) * (10*K.square(K.square(K.sqrt(y_pred) - K.sqrt(y_true))) + K.square(K.sqrt(y_pred) - K.sqrt(y_true)) + 0.01*K.binary_crossentropy(y_pred, y_true)), axis=-1)

def my_accuracy(y_true, y_pred):
    return K.mean(2*K.abs(y_true-0.5) * K.equal(y_true, K.round(y_pred)), axis=-1)

class WeightClip(Constraint):
    def __init__(self, c=2, name='WeightClip'):
        self.c = c

    def __call__(self, p):
        #return {'name': self.__class__.__name__, 'c': self.c}
        return K.clip(p, -self.c, self.c)

    def get_config(self):
        return {'name': self.__class__.__name__, 'c': self.c}

I added a name='WeightClip' argument to __init__

and change load_model from
model = load_model('./newweights9i.h5', custom_objects={'msse': mean_squared_sqrt_error, 'mean_squared_sqrt_error':mean_squared_sqrt_error, 'my_crossentropy':mean_squared_sqrt_error, 'mycost':mean_squared_sqrt_error, 'WeightClip':foo})

to
model = load_model(sys.argv[1], custom_objects={'msse':msse, 'mean_squared_sqrt_error': mean_squared_sqrt_error, 'my_crossentropy':my_crossentropy, 'mycost':mycost, 'WeightClip':WeightClip})

Unused training data

In the rnn_train.py file there are two arrays that are never used

noise_train = np.copy(all_data[:nb_sequences*window_size, 64:86])
noise_train = np.reshape(noise_train, (nb_sequences, window_size, 22))

What are they? Are they needed somewhere? The training data contained in those arrays is never used at all!

Clipping of desired signal present in output

Hi,

Firstly, amazing work with this project! I have verified that it is quite effective at de-noising speech samples out of the box. I have found, however, that feeding certain noisy speech samples through the example program results in de-noised output with the speech signal being clipped. For my use-case, it would be highly favorable to cancel slightly less noise to receive non-clipped output. Is this something that can be easily done, either out-of-box or via modification on my end?

I am also wondering if the clipping is a result of the samples themselves (which were recorded with a laptop mic) rather than your code, per se. I can't yet identify what about my samples causes some to get clipped and others not.

biquad

Hi, in the code, this function seems to filter the speech and the noise. Is there a formula behind it, and behind the related parameters -1.99599, 0.99600, -2, 1? Thanks.
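
One way to see what these four numbers do is to run them through a second-order (biquad) filter. The sketch below assumes that the first pair (-1.99599, 0.99600) are feedback coefficients, the second pair (-2, 1) are feedforward coefficients, and that the leading coefficient of both polynomials is 1; the transposed direct-form II structure is also an assumption, chosen for illustration rather than taken from the source.

```python
def biquad(x, b, a):
    """Filter x with H(z) = (1 + b[0]z^-1 + b[1]z^-2) / (1 + a[0]z^-1 + a[1]z^-2)."""
    mem0 = mem1 = 0.0          # filter state (transposed direct form II)
    y = []
    for xi in x:
        yi = xi + mem0
        mem0 = mem1 + (b[0] * xi - a[0] * yi)
        mem1 = b[1] * xi - a[1] * yi
        y.append(yi)
    return y

B_HP = (-2.0, 1.0)             # numerator 1 - 2z^-1 + z^-2: double zero at DC
A_HP = (-1.99599, 0.99600)     # poles just inside the unit circle, near DC

dc_response = biquad([1.0] * 20000, B_HP, A_HP)
```

Under those assumptions the numerator is zero at DC, so a constant input decays towards zero at the output. That is the behaviour of a high-pass filter with a very low cutoff, which would remove DC offset and rumble while leaving speech frequencies essentially untouched.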

Magic numbers

There currently are quite a few magic numbers in the codebase. I’m referring to this type:

Unique values with unexplained meaning or multiple occurrences which could (preferably) be replaced with named constants

compiling with emscripten

Can you tell me how to compile this rnnoise code using Emscripten toolkit.

WAV to PCM and back?

Could someone please tell the command to convert WAV to PCM for processing and then back to WAV?

When i use your demo to train my dataset, the trained rnnoise_demo has bad performance

I use speech (2 hours), noise(2 hours), generate 24 hours dataset to train rnnoise, but the trained demo has bad performance, can you give some advice? thanks~

Trained Weights

Does this repo have a checkpointed version of a working set of trained weights. Can't seem to find it.

Additionally, I see there is training code in the repo, and you mention the data format for the audio used for training here but how would I go about training a network with my own audio data?

Thanks!

Loud Macbook Pro fan and false positives

First of all, awesome stuff! Kudos for open sourcing this gem.

I'm messing around with RNNoise a bit and stumbled upon an interesting behavior. For as long as my Macbook Pro fans are quiet, VAD works perfectly as well as the actual noise suppression. Once my fans turn up and go all the way to ~6200RPM, weird things start to happen:

CPU usage starts to progressively increase, only the thread or CPU core capacity is the limit
when this happens, RNNoise starts misinterpreting my laptop fan noise as an actual voice with high confidence (> 0.95)
this doesn't stop until I actually say anything. At that point, RNNoise regains confidence in what human voice actually sounds like and starts behaving properly for the next 2-10 seconds (depending on how loud my voice was)
Rinse and repeat :)

I understand the temporal dynamic behavior of RNNs, but I wonder - could this be caused by insufficient data in the model (not covering my laptop fan properly), the NN layers structure or rather the code around it?

I'd like to understand better where to start digging. Thanks!

FRAME_SIZE should be accessible

It took me a while to see that FRAME_SIZE currently has to be hard-coded by the library user. There simply is no API for grabbing the internal value. Can we fix this?

downsample

Hi. What does the downsampling function in pitch.c do? As the name suggests, does it downsample the pitch buffer? Why is downsampling needed?

about adding noise

Hi, this is a great project. I have tried your pretrained model, and it performed well in real-world scenarios.
I collected about 50 hours of clean speech data and 3 hours of noise data myself. I used your code to generate noisy speech data and to extract input features and output labels. But whether I trained the model from scratch or started from your pretrained model, the performance of my trained model is worse than that of your model. So is there any trick to constructing the training data?
When listening to the constructed noisy speech, I found that the noise in some places is too strong to hear anything. Is that because the speech gain or noise gain was too big in those places? The speech gain was set to between 0.01 and 10, and the noise gain was set to between 0.03 and 10. It seemed that data overflow occurred. What do you think?
Looking forward to your reply. Thanks.

Fixed-point arithmetic

Did anyone test to run the code using fixed-point arithmetic?
How was the results? What Q number format did you use?

Best regards

Keras version

According to your web site https://people.xiph.org/~jm/demo/rnnoise/ the initial version was in Keras. Is it possible to make this version available? Thanks a lot

--enable-doc doesn't install documentation

It always installs only these doc files:

%%PORTDOCS%%%%DOCSDIR%%/AUTHORS
%%PORTDOCS%%%%DOCSDIR%%/COPYING
%%PORTDOCS%%%%DOCSDIR%%/README

About fixed music background noise

Given speech over a fixed music background, can the background noise be removed? I think rnnoise is very suitable for doing such a thing. How do I control the GRU switch? I think it's an exciting thing.

Adjustable amount of reduction

Hello Jean-Marc! This is probably something that you were already planning, but completely removing noise during low-SNR portions of the sound doesn't always sound good. It is easily fixable in my plugin (https://github.com/lucianodato/speech-denoiser) but something you should consider for the library.
By the way, I'm having trouble making it work in my plugin. Does the library support 32-bit floats or do I have to cast them down to 16 bits?

16kHz conversion

Thanks for the great project.!
I really get a good inspiration on it.

Now I'm trying to convert the code from a 48 kHz sampling rate to a 16 kHz sampling rate.

I changed some parameters as follows.

INPUT_FEATURE_LEN : 42 -> 38
OUTPUT_GAIN_LEN : 22 -> 18
It was changed in accordance with eband5ms table.
static const opus_int16 eband5ms[] = {
/*0 200 400 600 800 1k 1.2 1.4 1.6 2k 2.4 2.8 3.2 4k 4.8 5.6 6.8 8k 9.6 12k 15.6 20k*/
  0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 28, 34, 40, 48, 60, 78, 100
};

and also change pitch related factors.
#define PITCH_MIN_PERIOD 60 ->20
#define PITCH_MAX_PERIOD 768 ->256
#define PITCH_FRAME_SIZE 960->320

Is this correct?
I tried training the neural network after changing the above parameters,
but it does not seem to work
(the only effect is an overall gain reduction of the audio).

Is there anyone who can answer my question?
Thanks.
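
The arithmetic behind the numbers in this issue can be checked with a short sketch. It assumes, from the comment row of the table, that each table unit corresponds to 200 Hz, and that the pitch constants scale linearly with the sample rate. It only verifies the counts and ratios; it says nothing about whether a model retrained with these values will work. The helper names are illustrative.

```python
EBAND5MS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16,
            20, 24, 28, 34, 40, 48, 60, 78, 100]
BAND_UNIT_HZ = 200  # assumption read off the table's comment row

def band_edges_up_to(nyquist_hz):
    """Band edges that still fit below the given Nyquist frequency."""
    return [e for e in EBAND5MS if e * BAND_UNIT_HZ <= nyquist_hz]

def scale_pitch_constants(constants, old_rate, new_rate):
    """Scale sample-count constants in proportion to the sample rate."""
    return {name: value * new_rate // old_rate
            for name, value in constants.items()}

edges_16k = band_edges_up_to(16000 // 2)
pitch_16k = scale_pitch_constants(
    {"PITCH_MIN_PERIOD": 60, "PITCH_MAX_PERIOD": 768, "PITCH_FRAME_SIZE": 960},
    48000, 16000)
```

Under these assumptions, 18 of the 22 band edges lie at or below 8 kHz, which matches the change from 22 to 18 (and the drop of 4 in the feature length), and the scaled pitch constants come out as 20, 256 and 320, the same values listed above.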

Inference in Python

Very much appreciate this work! Not a Keras user here, but if I were to use the saved model "newweights9i.hdf5" directly from Python for inference (I know it's slow), do you have any code snippet I can follow along? Thanks!

How to improve the noise reduction through multi noise?

Only one noise file and one voice file can be used for one training session. So how could I improve the noise reduction through multi noise?

Clang doesn’t like `*(int*)0=0;`

The following code produces warnings in clang: *(int*)0=0;

Indirection of non-volatile null pointer will be deleted, not trap

It recommends the following:

Consider using __builtin_trap() or qualifying pointer with 'volatile'

Processing of 8KHz / 16KHz Sample rate audio signals

Although this question was posted earlier in the discussion threads, I could not find a definitive answer. I would like to understand whether the RNNoise demo can be used with 8 kHz / 16 kHz PCM samples.

Does the training depend on the sample rate?

Can anyone suggest what changes I should make to adapt the decoder to process 8 kHz / 16 kHz speech?

Training issue

When trying to execute denoise_training the resulting f32 file is empty.

By preprocessing denoise (gcc -E -DTRAINING=1 -Wall -W -O3 -g -I../include denoise.c -o denoise_training.E) it turns out the following statements have been commented out:

#if 0
    compute_rnn(&noisy->rnn, g, &vad_prob, features);
    interp_band_gain(gf, g);
#if 1
    for (i=0;i<FREQ_SIZE;i++) {
      X[i].r *= gf[i];
      X[i].i *= gf[i];
    }
#endif
    frame_synthesis(noisy, xn, X);

    for (i=0;i<FRAME_SIZE;i++) tmp[i] = xn[i];
    fwrite(tmp, sizeof(short), FRAME_SIZE, fout);
#endif

and nothing will be written into fout (.f32 file).

Is it working as expected? How to prepare the training data?
Thank you in advance

Randomize Training data

My training data set is a 50000000 x 87 matrix. On each iteration my model reads the data in the same order. It's recommended to shuffle training data at the beginning of each epoch so that the model generalizes better. Is it possible to shuffle the training data? Will it somehow make the data invalid?
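
A recurrent model depends on the order of the frames inside each sequence, so one safe option is to permute whole sequences and leave the frames within them untouched. The sketch below assumes the rows are grouped into consecutive windows of window_size frames, as in the reshape used by the training script; it returns row ranges in shuffled order and is illustrative only.

```python
import random

def shuffled_sequence_order(num_frames, window_size, seed=None):
    """Return (start, stop) row ranges, one per sequence, in random order."""
    nb_sequences = num_frames // window_size   # leftover frames are dropped
    order = list(range(nb_sequences))
    random.Random(seed).shuffle(order)         # permute sequences, not frames
    return [(i * window_size, (i + 1) * window_size) for i in order]
```

Each returned pair can be used to slice the feature matrix, so the frames of a sequence stay contiguous and in their original order while the sequences themselves are visited in a different order each epoch.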

Code In Python

Online it says this was written in Python and then converted to C. Is there any way I could get access to that code?

The mix gain setting question

Hi,
Thanks for the terrific work.
Here is a question about mix gain, in denoise.c.
I notice the denoising effect is too strong, and speech_gain and noise_gain are set as shown below. Could this cause distortion when the two signals are mixed, or when processing real-world audio?

    speech_gain = pow(10., (-40+(rand()%60))/20.);  // 0.01~10
     noise_gain = pow(10., (-30+(rand()%50))/20.);  // 0.06~10
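
The ranges of the two expressions quoted in this issue can be evaluated directly. The sketch below assumes rand() % span takes every integer from 0 to span - 1 and that the division by 20. is a floating-point division; the helper name is illustrative.

```python
def gain_range(base_db, span):
    """Range of pow(10, (base_db + rand() % span) / 20) over all remainders."""
    lowest = 10 ** (base_db / 20)
    highest = 10 ** ((base_db + span - 1) / 20)
    return lowest, highest

speech_lo, speech_hi = gain_range(-40, 60)   # -40 dB .. +19 dB
noise_lo, noise_hi = gain_range(-30, 50)     # -30 dB .. +19 dB
```

Under these assumptions the speech gain spans roughly 0.01 to 8.9 and the noise gain roughly 0.03 to 8.9, so the lower bound for the noise gain is about half of the 0.06 given in the comment. A gain near 8.9 applied to a full-scale 16-bit sample exceeds the 16-bit range, so clipping is possible wherever the mixed signal is stored back as 16-bit integers without limiting.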

Can rnnoise support 8 kHz 16-bit (little-endian int16) PCM? 48 kHz does not fit an embedded MCU

I want to test the effect of rnnoise, but if it only supports a 48 kHz sample rate, it is not suitable for an embedded MCU.

Significant delay in real-time mic input

@JanX2 @jmvalin When doing real-time denoising from a mic, there is a significant delay of 1-2 seconds. How can this be improved with the current pre-trained model?

librnnoise.so.0: cannot open shared object file: No such file or directory

I have installed it as per the README, but I am facing the issue below when running
./examples/rnnoise_demo input.pcm output.pcm

/home/navin/Documents/Speech/speech/speech-engine/toolkit/rnnoise/examples/.libs/lt-rnnoise_demo: error while loading shared libraries: librnnoise.so.0: cannot open shared object file: No such file or directory
What package did I miss?

missing files

First of all, this is a great work.
The code for preparing denoise_data9.h5 and the python code for denoise using the trained model are missing. Could you please share them too?

Training the model from scratch

I am looking at the training code. Is there a script to generate 'denoise_data9.h5' from the raw audio + noise examples?

Input conversion

Hi, thanks for your job !

I try to convert my wav file with this command :
sox $file -b 16 -r 48000 ${file%-*}-48000.wav

But when I apply the network, I cannot play the output, it seems to have 0s of audio.
Can you share a correct command to convert a wav to the right format pls ?

Thanks

Segmentation fault: 11

I get Segmentation fault: 11 when running $ ./examples/rnnoise_demo input.pcm output.pcm

Stream microphone to speakers

How would I test this by streaming the microphone to speakers in realtime?

Discussion about one class of noise

Hi,
I wonder: if we train using noise from only one place, could we get a noise reducer for that place that leaves noise from other places untouched?
Has anyone thought about or tried this?

paper

“A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement”
I want to ask which journal this paper was published in. I can't find the answer on the Internet.

Processing live audio stream, possibly by using rnnoise VST library

Is it possible to process live audio? Maybe something like VST effect.

Is rnnoise on microphone array VocalFusion XVF3000 feasible?

I was wondering if it would be possible to integrate rnnoise into something like XMOS VocalFusion XVF3000. Microphone arrays with this chip are quite readily available and affordable.

IDE available
docs available
XMOS is extending C standard with some features which they call xC language

At first sight it looks like good fit.

I am not sure

how much rnnoise depends on OS/platform it runs on
performance required
what else to consider

Training

Hi,xiph:
This is not an issue, just a question; sorry to put it here:
1. If I want to train from scratch (using your training data, resampled to 16 kHz), what should I change in the code?
2. Where can I get your training data? I could not find it.
3. How do I get the C code after training with Keras?
Thanks!

Makefile Error

Hi, I am trying to make the build for rnnoise, but I keep running into this error:

ld.exe: cannot find -link

I have searched extensively and cannot find a solution to this problem.

Here's the full error message:
$ make
make all-am
make[1]: Entering directory '/cygdrive/c/Users/goldw/Downloads/rnnoise-master/rnnoise-master'
CCLD librnnoise.la
C:/Program Files/mingw-w64/x86_64-7.2.0-posix-seh-rt_v5-rev1/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/7.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: cannot find -link
collect2.exe: error: ld returned 1 exit status
make[1]: *** [Makefile:460: librnnoise.la] Error 1
make[1]: Leaving directory '/cygdrive/c/Users/goldw/Downloads/rnnoise-master/rnnoise-master'
make: *** [Makefile:354: all] Error 2

Training process is not documented enough and generally not smooth.

This comment should be a part of README or another document;
The working dump_rnn.py should be in training/, not just some link.
Either denoise_training should use fout again instead of stdout or fout should be removed.
Generated header uses float-format numbers like #define VAD_GRU_SIZE 24.0.
There should be a script to execute all steps automatically, going from voice.wav and noise.wav to rnn_data.h and rnn_data.c.