deepbeam's Introduction

Deepbeam

Deep learning based Speech Beamforming

Requirements

tensorflow, scipy, fftw, h5py

Train Wavenet-based enhancement model

Noisy input data filename: noisy_train.mat

Dimension: [24570, NUM_TOKENS]

Content: noisy waveforms

Clean output data filename: target_train.mat

Dimension: [16384, NUM_TOKENS]

Content: 256 mu-law quantized bin index of clean waveforms

Both files become numpy arrays once loaded into Python. You can generate your own training data and modify the model architecture accordingly.
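As a rough illustration of the target format described above, here is a minimal sketch of 256-level mu-law quantization in numpy. The function names `mu_law_encode` and `mu_law_decode` are hypothetical, and the exact companding and rounding convention used to build target_train.mat is not stated in this README, so treat this as one common variant rather than the repository's own preprocessing.

```python
import numpy as np


def mu_law_encode(x, num_bins=256):
    """Map waveforms in [-1, 1] to integer bin indices in [0, num_bins - 1].

    Assumption: standard mu-law companding with mu = num_bins - 1.
    """
    mu = num_bins - 1
    x = np.clip(np.asarray(x, dtype=np.float64), -1.0, 1.0)
    # Compress the amplitude logarithmically; the result stays in [-1, 1].
    compressed = np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)
    # Shift to [0, mu] and round to the nearest integer bin.
    return np.floor((compressed + 1.0) / 2.0 * mu + 0.5).astype(np.int64)


def mu_law_decode(idx, num_bins=256):
    """Approximate inverse of mu_law_encode; returns values in [-1, 1]."""
    mu = num_bins - 1
    compressed = 2.0 * (np.asarray(idx, dtype=np.float64) / mu) - 1.0
    return np.sign(compressed) * np.expm1(np.abs(compressed) * np.log1p(mu)) / mu


# Toy stand-in for clean waveforms shaped [16384, NUM_TOKENS] (NUM_TOKENS = 3 here).
rng = np.random.default_rng(0)
clean = rng.uniform(-1.0, 1.0, size=(16384, 3))
targets = mu_law_encode(clean)
```

Here `targets` has the same shape as `clean` and holds integer bin indices, which matches the "256 mu-law quantized bin index" description of the target data.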

To train the enhancement model, place the data in the same directory as the training code, then execute the following:

python bawn_sp_multi_gpu_train_v2.py /logdir NUM_GPUS

Demo using pre-trained model

A pre-trained enhancement model, trained on 109 speakers and 100 noises, is available in assets.

"demo.ipynb" contains a complete enhancement and beamforming workflow using a short 8-channel noisy speech recording.

The noisy input is a [length, NUM_CHANNELS] matrix, where each column is a channel. A second input containing only the multi-channel noise is required in order to compute the signal-to-noise ratio.
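To make the role of the noise-only input concrete, the following is a minimal sketch of a per-channel SNR computation. It assumes the noise is additive, so the speech component can be recovered as the noisy input minus the noise-only input; the function name `snr_db` is hypothetical and the demo notebook may compute SNR differently.

```python
import numpy as np


def snr_db(noisy, noise):
    """Per-channel SNR in dB for [length, NUM_CHANNELS] inputs.

    Assumption: noisy = speech + noise, sample-aligned, channel by channel.
    """
    noisy = np.asarray(noisy, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if noisy.shape != noise.shape:
        raise ValueError("noisy and noise must have the same shape")
    speech = noisy - noise
    # Sum energy over time (axis 0) so each column yields one value.
    signal_power = np.sum(speech ** 2, axis=0)
    noise_power = np.sum(noise ** 2, axis=0)
    return 10.0 * np.log10(signal_power / noise_power)


# Toy 8-channel example: unit-amplitude "speech" plus noise at one tenth the amplitude.
speech = np.ones((100, 8))
noise = 0.1 * np.ones((100, 8))
snr = snr_db(speech + noise, noise)
```

With these toy inputs, `snr` contains one value per channel, each close to 20 dB.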

deepbeam's People

Contributors

auspicious3000


deepbeam's Issues

details about the noisy_train.mat and target_train.mat

Thank you for sharing!
However, I would like to know the details of how to construct "noisy_train" and "target_train".
What is the meaning of "NUM_TOKENS"?
What about NUM_CHANNELS in "noisy_train" and "target_train"?

input/output dimension & model complexity

Thank you for sharing!
I can successfully run the demo code and get the denoised and beamformed audio =]

However, I wonder whether the input dimension is fixed, or must at least be longer than a pre-defined length.
It failed when len_input and len_pad were set to 512 and 0, respectively.

I also measured the run time of each inference iteration (the input was a 3-sec audio clip).
The PMF prediction step took about 2 to 3 sec,
and executing the update_beamform_coef_with_weights function took about 40 to 50 sec.

Is it possible for the input and output to be a short frame (20~30 ms),
so that the run time can be decreased to meet real-time criteria?

TensorFlow Version

Hi, which TensorFlow version did you use? I got the error "TypeError: softmax() got an unexpected keyword argument 'axis'" when using TensorFlow 1.6. I also tried other versions, such as 1.4, but it still doesn't work.

Error : No variables to save in the line model.saver.restore(model.sess, 'assets/bawn_sp.ckpt-399125')

Hello, when I run the demo code, I encounter an error in the line model.saver.restore(model.sess, 'assets/bawn_sp.ckpt-399125'). I'm using TensorFlow 2.14.0, and the error message is:

Exception has occurred: ValueError
No variables to save
File "C:\Users\pc\AppData\Local\Programs\Python\Python311\Lib\site-packages\tensorflow\python\training\saver.py", line 971, in _build
raise ValueError("No variables to save")
File "C:\Users\pc\AppData\Local\Programs\Python\Python311\Lib\site-packages\tensorflow\python\training\saver.py", line 949, in _build_eager
self._build(
File "C:\Users\pc\AppData\Local\Programs\Python\Python311\Lib\site-packages\tensorflow\python\training\saver.py", line 1416, in restore
self._build_eager(save_path, build_save=False, build_restore=True)
File "C:\Users\pc\Desktop\AI-Beamformers\deepbeam\deepbeam\test.py", line 20, in
model.saver.restore(model.sess, 'assets/bawn_sp.ckpt-399125')
File "C:\Users\pc\AppData\Local\Programs\Python\Python311\Lib\runpy.py", line 88, in _run_code
exec(code, run_globals)
File "C:\Users\pc\AppData\Local\Programs\Python\Python311\Lib\runpy.py", line 198, in _run_module_as_main (Current frame)
return _run_code(code, main_globals, None,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: No variables to save
Do you have any experience with this? Thank you in advance!

Problem with bawn.py

Hello, I have a simple question about the bawn.py script. The script uses a variable called bins in a few places, but it is not defined anywhere in the script. I would like to know whether it refers to some other variable or one of the arrays, or whether I just need to define bins as a number. Thank you in advance.
