watson-developer-cloud / speech-to-text-websockets-python

This project forked from daniel-bolanos/speech-to-text-websockets-python

Python client that interacts with the IBM Watson Speech To Text service through its WebSockets interface

Home Page: http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/speech-to-text.html

Python 100.00%

speech-to-text-websockets-python's Introduction

This sample has been deprecated. Please use the official Watson Python SDK.

Synopsis

This project consists of a Python client that interacts with the IBM Watson Speech To Text (STT) service through its WebSockets interface. The client streams audio to the STT service and receives recognition hypotheses in real time. It can run N simultaneous recognition sessions.

Installation

This script needs several dependencies installed before it will work. It is advisable to install the required packages in a separate virtual environment, because certain packages have been observed to conflict with this script's requirements; in particular, the package nose conflicts with them. To interact with the STT service via WebSockets, install pip, then run the following command:

pip install -r requirements.txt

You may also need to run this command:

$ apt-get install build-essential python-dev

If you are creating an environment using Anaconda, use the pip command above to install the packages. Do not use conda to install the requirements, as conda will install nose as a dependency.

Examples

The example below will run the default 10 WAV files through the WebSockets interface of the Speech To Text (STT) service and will dump the recognition hypotheses to a file under the "./output" directory.

$ python ./sttClient.py -credentials <username>:<password> -model en-US_BroadbandModel

The example below performs the same task much faster by opening 10 simultaneous recognition sessions (WebSocket connections) against the STT service.

$ python ./sttClient.py -credentials <username>:<password> -model en-US_BroadbandModel -threads 10

Options

To see the list of available options type:

$ python sttClient.py -h

Motivation

This script has been created by Daniel Bolanos in order to facilitate and promote the utilization of the IBM Watson Speech To Text service.

speech-to-text-websockets-python's Issues

is there a limit around 20 minutes?

Hi,
first of all, thank you for your script. It really helped me a lot.
I noticed that when I try to transcribe a large file, like for example one of 60 minutes, it stops transcribing around minute 22-23. I tried using several audio files, and it always finishes around that time.
Are there any time constraints, or is there some configuration that I am missing?
Are you aware of this limitation?

Thank you very much

json file formatting

It seems that the json file that this outputs is actually made up of multiple json objects. This makes it very difficult to parse.
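
One possible way to read such a file, assuming it holds JSON objects written back to back (optionally separated by whitespace), is to decode them one at a time with the standard library's `json.JSONDecoder.raw_decode`. The helper name `iter_json_objects` is hypothetical and is not part of sttClient.py.

```python
import json


def iter_json_objects(text):
    """Yield each JSON value from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            break
        # raw_decode returns the decoded value and the index where it ended.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

Calling `list(iter_json_objects(open(path).read()))` on an output file would then give one Python object per JSON document, provided the file really has that layout.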

Basic pep8 compliance

As a sample python example program it would be really good if this followed python pep8 best practices so that it matched with other python code. For instance:

4 space indents
80 column wraps

I'm happy to make the changes and push a pull request if this is welcomed.

Option to Name Output File

First of all thank you for the hard work that you have done on this project. It has certainly worked much better with large files than the SESSIONS / cURL method we were using before.

Again, I do not have much experience with Python so customizing the Python script is very challenging for me. Currently the script outputs to a 0.txt or 0.json file, and I was wondering if you might be willing to add a feature (or show me how to customize the script) to name the output file as we'd like or, at the very least, have the output file be given the same filename (obviously not extension) as the input file (or first file of multiple, etc).

I'm sure you guys are very busy but if you ever get a little bit of time to add such a feature it'd be greatly appreciated! My bread and butter is PHP but that doesn't do much good on a project like this. I suppose you could also show me how to set the output filename variable from the command line if that's easier. Thank you!
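
A minimal sketch of the second idea, deriving the output name from the input file name. The function `output_path_for` and its parameters are hypothetical; sttClient.py does not necessarily expose such a hook, so the result would have to be wired into wherever the script opens its output file.

```python
import os


def output_path_for(input_path, output_dir="./output", extension=".json"):
    """Build an output path that reuses the input file's base name."""
    # "./recordings/0001.wav" -> "0001"
    base = os.path.splitext(os.path.basename(input_path))[0]
    return os.path.join(output_dir, base + extension)
```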

Some issues with Anaconda env Python 3.6.3

Hi Daniel

I have some issues with my environment in Anaconda (Python 3.6.3 |Anaconda, Inc.| (default, Nov 8 2017, 15:10:56) [MSC v.1900 64 bit (AMD64)] on win32). I think these issues are probably because I am just starting with Python and Watson. I'm also using the PyCharm IDE. I would appreciate your help getting the code to work.

1. No module named Queue

I solved it by adding "from multiprocessing"

I would like to know if that was the best way to solve it.

2. Unresolved reference raw_input

I solved it by putting "raw_input = input" into the code

I would like to know if that was the best way to solve it.

3. End of statement expected

I solved it by adding "(" and ")"

I would like to know if that was the best way to solve it.

4. Unresolved reference 'status'

I don't know how to resolve this issue. Could you help me?

Thanks
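
The first three errors look like ordinary Python 2 versus Python 3 differences. A common compatibility pattern, shown here as a sketch rather than as the fix the maintainers adopted, is to try the Python 3 name and fall back to the Python 2 one. The alias `read_line` is a hypothetical name chosen for illustration.

```python
# Python 3 renamed the Queue module to queue; fall back for Python 2.
try:
    import queue
except ImportError:  # Python 2
    import Queue as queue

# Python 3 renamed raw_input() to input().
try:
    read_line = raw_input  # Python 2
except NameError:
    read_line = input  # Python 3

q = queue.Queue()
q.put("./recordings/0001.wav")
# print is a function in Python 3, so the parentheses are required.
print(q.get())
```

Note that `queue.Queue` is a thread-safe queue, while `multiprocessing.Queue` is meant for passing data between processes, so the two are not interchangeable in every program.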

execution in proxy environment

Hi, thank you for the script.

It is very helpful; however, I can't run it in a proxy environment.
If possible, could I have some advice on how to modify the script so that it runs behind a proxy?

Thank you very much

Undefined name 'status' in sttClient.py

Should this be value['status'] instead?

flake8 testing of https://github.com/watson-developer-cloud/speech-to-text-websockets-python on Python 3.7.1

$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics

./sttClient.py:401:37: F821 undefined name 'status'
            print(fmt.format(key, **status))
                                    ^
1     F821 undefined name 'status'
1

Issue regarding Session Timed out

The library is awesome and is helping me a lot, but I need help with this issue. Most of my audio files are transcribed properly, but some contain stretches of silence, and I think that is why I run into "Session timed out". Is there any way to solve this?
I tried setting Inactivity_timeout=-1, but it did not help:

2018-08-22 04:06:05-0700 [-] Log opened.
2018-08-22 04:06:05-0700 [-] ./recordings/UXArmy_record.wav
2018-08-22 04:06:05-0700 [-]
2018-08-22 04:06:05-0700 [-] {'Authorization': 'Basic ZjIzNmUwOTAtYTIwNC00MzEyLTg5ZDktZGIwMGM1ZmFjNjk2OnQxZWtITHlTVDhjWg=='}
2018-08-22 04:06:05-0700 [-] wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize?model=en-US_BroadbandModel
2018-08-22 04:06:05-0700 [-] Starting factory <__main__.WSInterfaceFactory object at 0x7f8b77c92c10>
2018-08-22 04:06:06-0700 [-] ./output
2018-08-22 04:06:06-0700 [-] contentType: audio/wav queueSize: 1
2018-08-22 04:06:09-0700 [-] onConnect, server connected: tcp4:169.48.115.62:443
2018-08-22 04:06:09-0700 [-] onOpen
2018-08-22 04:06:09-0700 [-] sendMessage(init)
2018-08-22 04:06:09-0700 [-] ./recordings/UXArmy_record.wav
2018-08-22 04:06:10-0700 [-] onOpen ends
2018-08-22 04:06:16-0700 [-] Text message received: {
2018-08-22 04:06:16-0700 [-] "state": "listening"
2018-08-22 04:06:16-0700 [-] }
2018-08-22 04:06:44-0700 [-] Text message received: {
2018-08-22 04:06:44-0700 [-] "error": "Session timed out."
2018-08-22 04:06:44-0700 [-] }
2018-08-22 04:09:49-0700 [-] onClose
2018-08-22 04:09:49-0700 [-] WebSocket connection closed: see the previous message for the error details., code: 1011, clean: True, reason: see the previous message for the error details.
2018-08-22 04:09:49-0700 [-] ./output
2018-08-22 04:09:49-0700 [-] contentType: audio/wav queueSize: 0
2018-08-22 04:09:50-0700 [-] onConnect, server connected: tcp4:169.48.115.62:443
2018-08-22 04:09:50-0700 [-] onOpen
2018-08-22 04:09:50-0700 [-] sendMessage(init)

Cannot Run Example

Hi, I keep getting this error:

'module' object has no attribute 'OP_NO_TLSv1_1'

Full Error is here:
/usr/bin/python "/Users/vikrambaid/Desktop/projects/IBM Watson/speech-to-text-continuous-websockets/stt-watson/test.py"
Traceback (most recent call last):
  File "/Users/vikrambaid/Desktop/projects/IBM Watson/speech-to-text-continuous-websockets/stt-watson/test.py", line 1, in <module>
    from stt_watson.SttWatson import SttWatson
  File "/Users/vikrambaid/Desktop/projects/IBM Watson/speech-to-text-continuous-websockets/stt-watson/stt_watson/SttWatson.py", line 9, in <module>
    from watson_client.Client import Client
  File "/Users/vikrambaid/Desktop/projects/IBM Watson/speech-to-text-continuous-websockets/stt-watson/watson_client/Client.py", line 24, in <module>
    from watson_client.websocket.WSInterfaceProtocol import WSInterfaceProtocol
  File "/Users/vikrambaid/Desktop/projects/IBM Watson/speech-to-text-continuous-websockets/stt-watson/watson_client/websocket/WSInterfaceProtocol.py", line 6, in <module>
    from twisted.internet import ssl
  File "/Library/Python/2.7/site-packages/twisted/internet/ssl.py", line 65, in <module>
    from twisted.internet import tcp, interfaces
  File "/Library/Python/2.7/site-packages/twisted/internet/tcp.py", line 28, in <module>
    from twisted.internet._newtls import (
  File "/Library/Python/2.7/site-packages/twisted/internet/_newtls.py", line 21, in <module>
    from twisted.protocols.tls import TLSMemoryBIOFactory, TLSMemoryBIOProtocol
  File "/Library/Python/2.7/site-packages/twisted/protocols/tls.py", line 63, in <module>
    from twisted.internet._sslverify import _setAcceptableProtocols
  File "/Library/Python/2.7/site-packages/twisted/internet/_sslverify.py", line 38, in <module>
    TLSVersion.TLSv1_1: SSL.OP_NO_TLSv1_1,
AttributeError: 'module' object has no attribute 'OP_NO_TLSv1_1'

Process finished with exit code 1

Unable to get example working

Hello, after cloning the project and installing the requirements, I get the following output when I run the example command given ($ python ./sttClient.py -credentials <username>:<password> -model en-US_BroadbandModel):

the output directory "./output" already exists, overwrite? (y/n)? y
2016-04-11 08:44:33-0500 [-] Log opened.
2016-04-11 08:44:33-0500 [-] ./recordings/0001.wav
2016-04-11 08:44:33-0500 [-] ./recordings/0002.wav
2016-04-11 08:44:33-0500 [-] ./recordings/0003.wav
2016-04-11 08:44:33-0500 [-] ./recordings/0004.wav
2016-04-11 08:44:33-0500 [-] ./recordings/0005.wav
2016-04-11 08:44:33-0500 [-] ./recordings/0006.wav
2016-04-11 08:44:33-0500 [-] ./recordings/0007.wav
2016-04-11 08:44:33-0500 [-] ./recordings/0008.wav
2016-04-11 08:44:33-0500 [-] ./recordings/0009.wav
2016-04-11 08:44:33-0500 [-] ./recordings/00010.wav
2016-04-11 08:44:33-0500 [-] Traceback (most recent call last):
2016-04-11 08:44:33-0500 [-]   File "./sttClient.py", line 298, in <module>
2016-04-11 08:44:33-0500 [-]     factory = WSInterfaceFactory(q, summary, args.dirOutput, args.contentType, args.model, url, headers, debug=False)
2016-04-11 08:44:33-0500 [-]   File "./sttClient.py", line 55, in __init__
2016-04-11 08:44:33-0500 [-]     WebSocketClientFactory.__init__(self, url=url, headers=headers, debug=debug)   
2016-04-11 08:44:33-0500 [-]   File "/Users/eric.bunch/anaconda2/lib/python2.7/site-packages/autobahn/twisted/websocket.py", line 278, in __init__
2016-04-11 08:44:33-0500 [-]     protocol.WebSocketClientFactory.__init__(self, *args, **kwargs)
2016-04-11 08:44:33-0500 [-] TypeError: __init__() got an unexpected keyword argument 'debug'

Any ideas on anything I'm doing wrong or how to correct this? I'm running python 2.7.11.

extend this feature to general stt

Hi, I wanted to understand whether this script can be used to listen to speech from the user's microphone and transcribe the spoken text.

Passing Credentials

I'm trying to run the sample but having trouble with the credentials. The Readme suggests we should pass in the credentials on the commandline as such:
python ./sttClient.py -credentials <username>:<password>
Should we be using the Bluemix account username/password or the credentials from the service (which looks like the below)?
{
"credentials": {
"url": "https://stream.watsonplatform.net/speech-to-text/api",
"password": "xxx",
"username": "xxxxx-xxxxx-xxxxxx-xxxxxx-xxxxx"
}
}

I tried both and always get 401 - Unauthorized:

2016-08-29 16:21:12+0300 [-] ./recordings/0009.wav
2016-08-29 16:21:12+0300 [-] ./recordings/0010.wav
2016-08-29 16:21:12+0300 [-] {'Authorization': 'Basic abcd.....=='}
2016-08-29 16:21:12+0300 [-] Starting factory <__main__.WSInterfaceFactory object at 0x104057610>
2016-08-29 16:21:12+0300 [-] ./output
2016-08-29 16:21:12+0300 [-] contentType: audio/wav queueSize: 9
2016-08-29 16:21:13+0300 [-] failing WebSocket opening handshake ('WebSocket connection upgrade failed (401 - Unauthorized)')
2016-08-29 16:21:13+0300 [-] onClose

Support for speaker labels?

Thanks for the great repo! I was able to get the test example working easily.

I was wondering if future updates will include the possibility of outputting speaker_labels?

I have a problem with TypeError: a bytes-like object is required, not 'str'

I've got this error at line 350

auth = args.credentials[0] + ":" + args.credentials[1]
headers["Authorization"] = "Basic " + base64.b64encode(auth)

how should I solve it?
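
In Python 3, `base64.b64encode` accepts bytes and returns bytes, so the string has to be encoded first and the result decoded before it is concatenated with "Basic ". A minimal sketch, assuming UTF-8 credentials; the function name `basic_auth_header` is hypothetical and does not appear in sttClient.py.

```python
import base64


def basic_auth_header(username, password):
    """Return the value of an HTTP Basic Authorization header."""
    auth = "{}:{}".format(username, password)
    # b64encode needs bytes in and gives bytes out under Python 3.
    token = base64.b64encode(auth.encode("utf-8")).decode("ascii")
    return "Basic " + token
```

Applied to the two lines quoted above, this would amount to replacing `base64.b64encode(auth)` with `base64.b64encode(auth.encode("utf-8")).decode("ascii")`.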

Possible to Get Output File (JSON) That Includes "Results"->"Alternatives"->"Timestamps", etc?

I know that it's in streaming mode and converting the JSON output to an enumerated result in the output TXT file, but I was wondering if there's any way to get it to output all of the accumulated JSON for the audio into a single JSON file.

Unfortunately I don't know Python so I'm not able to edit the code to modify the format of the output to JSON, but I know at the very least having the timestamps available would be useful for people trying to create subtitles and other similar features.
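
As a rough sketch of what such accumulation could look like, assume each message from the service has already been decoded into a Python dict with a "results" list, and that each entry carries a "final" flag alongside its "alternatives" (the layout the issue title suggests). The function `collect_final_results` and the sample messages are hypothetical illustrations, not code or data from sttClient.py.

```python
import json


def collect_final_results(messages):
    """Merge the final results from a sequence of decoded service messages."""
    merged = []
    for message in messages:
        # Messages without "results" (for example state updates) are skipped.
        for result in message.get("results", []):
            if result.get("final"):
                merged.append(result)
    return {"results": merged}


# Made-up sample messages, for illustration only.
messages = [
    {"results": [{"final": False, "alternatives": [{"transcript": "hel"}]}]},
    {"results": [{"final": True, "alternatives": [
        {"transcript": "hello", "timestamps": [["hello", 0.0, 0.4]]}]}]},
    {"state": "listening"},
]
combined = collect_final_results(messages)
print(json.dumps(combined, indent=2))
```

Writing `combined` with a single `json.dump` call yields one valid JSON document, with whatever timestamps the service returned kept inside each alternative.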

No module named cryptography

I am using python 3.4.0 on windows-7(64 bit) and getting "No Module Named Cryptography" error.
I have attached the screenshot to get the detailed error.

I have also googled this error but wasn't able to resolve it.
How can I resolve this issue?

add support for pip

Adding requirements.txt: https://pip.readthedocs.org/en/1.1/requirements.html
