voice-engine / voice-engine

building blocks to create voice interface applications

License: GNU General Public License v3.0

Python 100.00%

voice-engine's Introduction

Voice Engine

The library is used to create voice interface applications. It includes building blocks such as KWS (keyword spotting) and DOA (direction of arrival). There are also elements to measure RMS level (in dBFS or dB(A)).
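As a rough illustration of what an RMS measurement in dBFS involves, here is a minimal sketch. It is not the library's own element: the function name `rms_dbfs` and the constant `FULL_SCALE` are made up for this example, and it assumes signed 16-bit samples with the level referenced to the full-scale peak value of 32768. dB(A) additionally requires A-weighting and a calibrated microphone, which this sketch does not cover.

```python
import math

# Assumption: signed 16-bit audio, level referenced to the full-scale peak value.
FULL_SCALE = 32768.0


def rms_dbfs(samples):
    """Return the RMS level of 16-bit samples in dBFS (-inf for silence or no data)."""
    if not samples:
        return float('-inf')
    mean_square = sum(s * s for s in samples) / float(len(samples))
    if mean_square == 0:
        return float('-inf')
    return 20.0 * math.log10(math.sqrt(mean_square) / FULL_SCALE)


# A signal at half of full scale sits about 6 dB below full scale.
half_scale_level = rms_dbfs([16384, -16384] * 50)
```

With this reference, a square wave at half of full scale measures 20 * log10(0.5), roughly -6.02 dBFS.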

Requirements

pyaudio
numpy
snowboy

Installation

Install pyaudio, numpy and snowboy. The steps below use virtualenv to create a virtual Python environment.

sudo apt install python-pyaudio python-numpy python-virtualenv
sudo apt-get install swig python-dev libatlas-base-dev build-essential make
git clone --depth 1 https://github.com/Kitt-AI/snowboy.git
cd snowboy
virtualenv --system-site-packages env
source env/bin/activate
python setup.py build
python setup.py bdist_wheel
pip install dist/snowboy*.whl
cd ..
git clone https://github.com/voice-engine/voice-engine.git
cd voice-engine
python setup.py bdist_wheel
pip install dist/*.whl

Get started

To record audio and search for the keyword "snowboy" (see also kws_snowboy.py):

import time
from voice_engine.kws import KWS
from voice_engine.source import Source

src = Source()
kws = KWS()
src.link(kws)

def on_detected(keyword):
    print('found {}'.format(keyword))
kws.on_detected = on_detected

kws.start()
src.start()
while True:
    try:
        time.sleep(1)
    except KeyboardInterrupt:
        break
kws.stop()
src.stop()

Building blocks

The library uses gstreamer-like elements, which can be linked together to form an audio pipeline. One element can be connected to more than one other element.

The topology can be:

Source --> ChannelPicker --> KWS          Source --> ChannelPicker --> KWS --> Alexa
  |                          /\
  V                        /   \
 DOA                   Alexa   Google Assistant
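The fan-out idea in the diagram can be sketched with a few stand-in classes. This is only an illustration of the linking pattern, not the library's implementation: `Element`, `PickChannel` and `Collector` are hypothetical names defined here, and only the `link` call mirrors the Get started example above. The sketch assumes interleaved multi-channel samples.

```python
class Element(object):
    """Hypothetical pipeline element: forwards every chunk to all linked sinks."""

    def __init__(self):
        self.sinks = []

    def link(self, sink):
        self.sinks.append(sink)
        return sink  # returning the sink allows chained link() calls

    def put(self, data):
        for sink in self.sinks:
            sink.put(data)


class PickChannel(Element):
    """Hypothetical stand-in for a channel picker on interleaved samples."""

    def __init__(self, channels, pick):
        super(PickChannel, self).__init__()
        self.channels = channels
        self.pick = pick

    def put(self, data):
        # Keep every channels-th sample, starting at the picked channel.
        super(PickChannel, self).put(data[self.pick::self.channels])


class Collector(Element):
    """Hypothetical sink that records what it receives (stands in for KWS or DOA)."""

    def __init__(self):
        super(Collector, self).__init__()
        self.received = []

    def put(self, data):
        self.received.append(data)


source = Element()
picker = PickChannel(channels=2, pick=0)
kws_stub = Collector()
doa_stub = Collector()

source.link(picker).link(kws_stub)  # Source --> ChannelPicker --> KWS
source.link(doa_stub)               # Source --> DOA (second branch from the same source)

source.put([1, 10, 2, 20, 3, 30])   # two interleaved channels
```

After the `put` call, the KWS stand-in has seen only channel 0 while the DOA stand-in has seen all channels, which is the split shown in the left-hand topology.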

voice-engine's Issues

Playing audio after kws

Hey guys, any idea how to play audio after KWS? I have the full code of a personal assistant I built using the Microsoft tech stack. The thing is, whenever I try to play any audio it gives:
aplay: main:788: audio open error: No such file or directory
I tried the sample code on the main page and added audio playback via os.system, and that won't work either. Any ideas?

import time
import os
from voice_engine.kws import KWS
from voice_engine.source import Source

src = Source()
kws = KWS()
src.link(kws)

def on_detected(keyword):
    print('found {}'.format(keyword))
    os.system("aplay -c3 sample.wav")

kws.on_detected = on_detected

kws.start()
src.start()
while True:
    try:
        time.sleep(1)
    except KeyboardInterrupt:
        break
kws.stop()
src.stop()

Would you change the WakeWord?

Hello

I am now using kws_doa.py, and it works.

But I want to change the wake word to something other than "snowboy".

I'd appreciate it if you could tell me how.

Illegal instruction

I keep getting this error: "Illegal instruction", and nothing else while trying to run any of the examples from here: https://github.com/respeaker/respeaker_for_raspberrypi

I narrowed it down to this command:
ns = NS(rate=src.rate, channels=1)

Anything?

how to replace Alexa UMDL

Hello Team,

I want to trigger on my own voice .imdl file along with Alexa. Could you please advise where to change the code and how to add my voice file?

Regards
vithal

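KWS detection channel question

In kws_doa_alexa_respeaker_v2.py and many of the other examples, why is only one channel (for example ch0) used for keyword spotting, instead of using all channels or running keyword spotting on every channel?

Also, besides snowboy, I would like to use another keyword spotter (such as Porcupine) together with respeakerd. After studying respeakerd, though, it seems that if I start respeakerd with "manual_without_kws", beamforming on mic0 is enabled right from the start? That makes it hard for me to listen for the keyword from all directions (using Porcupine), then, once triggered, get the DOA and call src.on_set_direction(dir) to beamform. Is there a way to solve this?

If I use the "from voice_engine.source" approach, there seems to be no beamforming, and the first question above still applies.

One more question: with the "from respeakerd_source import RespeakerdSource" approach, how is beamforming cancelled when Alexa starts or finishes speaking? Is that done inside librespeaker? Is there a way to cancel it manually from Python? In other words, I want to set the beamforming angle from Python and then cancel it from Python once speaking has finished.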

Dependence on the geometry of the mic-array

We came across your DOA code implemented for 2, 4 and 6 microphones. We would like to ask whether it is geometry-dependent, i.e. whether the microphones must form a specific geometric shape in order for it to work properly.

Does your DOA code for 4- and 6-microphone arrays assume a certain geometry of the mic array (e.g. the 6 microphones are located on an equilateral hexagon)?

Do you run it on a specific mic-array card? If so, what is its model and how are the microphones arranged (e.g. in an equilateral hexagon)? What adaptations or changes do I need to make if I want to run it with my own customized mic array?

Does your DOA cover the full 360 degrees (in the mic-array plane)?

Can you explain how the algorithm infers the global DOA angle from the local angles of each pair of microphones?

Thanks in advance.

Missing pixel_ring file

Hi,

The code in voice-engine/examples/ds_kws_doa_for_respeaker_4mic_array.py refers to pixel_ring, but this is not in the voice_engine directory nor is it a python requirement.

Guess it's just a missing file?

Thanks,
Akram

the demo program ns_kws_doa.py is not responding

Dear Voice Engine developers,
Does anyone know how to fix this problem, which occurs after running the command python ns_kws_doa.py as shown below?
pi@raspberrypi:~/respeaker/2mic $ python ns_kws_doa.py
['arecord', '-t', 'raw', '-f', 'S16_LE', '-c', '2', '-r', '16000', '-D', 'default', '-q']
The program does not respond to anything, even though I have already set up a default sound card (ReSpeaker 2-Mics Pi HAT).
Thank you so much for your assistance!

ImportError: No module named _webrtc_audio_processing

I got the following ImportError when I ran the example "example/ns_kws.py" on the ReSpeaker Core V2.0:

Traceback (most recent call last):
  File "ns_kws.py", line 9, in <module>
    from voice_engine.ns import NS
  File "/usr/local/lib/python2.7/dist-packages/voice_engine/ns.py", line 7, in <module>
    from webrtc_audio_processing import AP
  File "/home/respeaker/.local/lib/python2.7/site-packages/webrtc_audio_processing/__init__.py", line 2, in <module>
    from .webrtc_audio_processing import AudioProcessingModule
  File "/home/respeaker/.local/lib/python2.7/site-packages/webrtc_audio_processing/webrtc_audio_processing.py", line 17, in <module>
    _webrtc_audio_processing = swig_import_helper()
  File "/home/respeaker/.local/lib/python2.7/site-packages/webrtc_audio_processing/webrtc_audio_processing.py", line 16, in swig_import_helper
    return importlib.import_module('_webrtc_audio_processing')
  File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
ImportError: No module named _webrtc_audio_processing

I already installed the voice-engine and python-webrtc-audio-processing packages, and the aec_kws.py and kws_doa.py examples worked fine.

Any ideas on the above issue?
Thanks a lot!

Is there a C++ library for this?

We want to integrate this library into a C++ application. Do you have a C++ implementation?

Multiple keywords

Is there an option to detect multiple keywords as well as the direction of arrival?

voice to text

Team,
Is there any function in voice_engine that I can use to convert voice to text?

I need to search for a word.

Why use argmin(tau) for DoA on 6-mic array?

Dear Voice Engine developers,

I find your voice engine module for KWS and DOA detections particularly useful, so first of all, thank you for providing this important building block for many voice related applications.

While reading the code for the 6-mic circular array DOA, I found this on line 45 of doa_respeaker_v2_6mic_array.py:

min_index = np.argmin(np.abs(tau))

I was wondering why you take only one pair of mics, rather than all 6 mics together, to compute the angle of arrival. And why specifically is the pair with the minimum tau used to compute the 'best_guess'?

It would be much appreciated if you could kindly elaborate on this.

All the best
Rui
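One possible reading of the question above, offered as an assumption and not confirmed by the maintainers: if each pair's angle is obtained as asin(tau / max_tau), the same delay error moves the estimated angle least when tau is close to zero, so the pair with the smallest |tau| would be the least sensitive one. The sketch below only illustrates that arithmetic. The helper names, the 9 cm spacing and the 2% delay error are hypothetical values chosen for the example, not taken from doa_respeaker_v2_6mic_array.py.

```python
import math


def pair_angle(tau, max_tau):
    """Angle in degrees relative to broadside for one microphone pair (far-field model)."""
    ratio = max(-1.0, min(1.0, tau / max_tau))  # clamp to the valid asin domain
    return math.degrees(math.asin(ratio))


def angle_shift(tau, max_tau, delay_error):
    """How far the estimated angle moves when the measured delay is off by delay_error."""
    return abs(pair_angle(tau + delay_error, max_tau) - pair_angle(tau, max_tau))


max_tau = 0.09 / 343.0        # hypothetical: 9 cm spacing, speed of sound 343 m/s
delay_error = 0.02 * max_tau  # hypothetical measurement error

shift_small_tau = angle_shift(0.0, max_tau, delay_error)
shift_large_tau = angle_shift(0.9 * max_tau, max_tau, delay_error)
print(shift_small_tau < shift_large_tau)
```

Under these assumptions the comparison holds because the slope of asin grows as its argument approaches 1, so a pair whose delay is near max_tau turns small timing errors into larger angle errors.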

ImportError: No module named speexdsp and webrtc_audio_processing

I bought a ReSpeaker Core V2.0 and followed the exact instructions in the README here. DOA is working fine and KWS is working fine, but the NS and AEC examples give the following when run:

ImportError: No module named speexdsp
ImportError: No module named webrtc_audio_processing

I tried to install speexdsp using pip. It installs fine, but the examples still don't work!

Any ideas?

OS: Respeaker V2.0 latest image

record audio

I want to record audio for 5 seconds after the keyword is detected and save it as a file. How can I do that?

Snowboy doa precision 4 mic array

Hey,

We set up snowboy on a Raspberry Pi with a ReSpeaker 4-mic array. We were wondering if we can increase the precision of the DOA supplied by voice-engine. By default it is set up so that the directions returned by doa.direction() correspond directly to the LED positions.

How to print the text after we say alexa

Team,

I need to print the text of what I say when I speak or call Alexa.

How does this function get triggered when I only say "alexa"?

def on_detected(keyword):
    direction = doa.get_direction()
    print('detected {} at direction {}'.format(keyword, direction))
    alexa.listen()
    pixel_ring.wakeup(direction)