
speech-android-sdk's Introduction

Watson Speech Android SDK

The Watson Speech SDK for the Android platform enables easy and lightweight interaction with IBM's Watson Speech-To-Text (STT) and Text-To-Speech (TTS) services in Bluemix. The SDK includes support for recording and streaming audio in real time to the STT service while receiving a transcript of the audio as you speak. This project includes an example application that showcases interaction with both the STT and TTS Watson services in the cloud.

The current version of the SDK uses a minSdkVersion of 9, while the example application uses a minSdkVersion of 16.

Table of Contents

Installation

Using the library

  1. Download the speech-android-wrapper.aar
  2. Drag the speech-android-wrapper.aar file into your Android Studio project view under the libs folder.
  3. Go to the build.gradle file of your app, then set the dependencies as below:
    dependencies {
        compile fileTree(dir: 'libs', include: ['*.jar'])
        compile(name: 'speech-android-wrapper', ext: 'aar')
        compile 'com.android.support:appcompat-v7:22.0.0'
    }
    repositories {
        flatDir {
            dirs 'libs'
        }
    }
  4. Clean and run the Android Studio project

Getting credentials

  1. Create an account on Bluemix if you have not already.
  2. Follow instructions at Service credentials for Watson services to get service credentials.

A Quick Start Guide

To get started, you can also take a look at a quick start guide created by @KeyOnTech.

Speech To Text

Implement the ISpeechDelegate and SpeechRecorderDelegate interfaces in the MainActivity

These interfaces define the callbacks invoked when a response from the server is received or when the recorder sends back audio data. SpeechRecorderDelegate is optional.

   public class MainActivity extends Activity implements ISpeechDelegate{}

Or with SpeechRecorderDelegate

   public class MainActivity extends Activity implements ISpeechDelegate, SpeechRecorderDelegate{}

Instantiate the SpeechToText instance

   SpeechToText.sharedInstance().initWithContext(new URI("wss://stream.watsonplatform.net/speech-to-text/api"), this.getApplicationContext(), new SpeechConfiguration());

Enabling audio compression

By default, audio sent to the server is uncompressed PCM-encoded data. Compressed audio using the Opus codec can be enabled instead:

   SpeechToText.sharedInstance().initWithContext(this.getHost(STT_URL), this.getApplicationContext(), new SpeechConfiguration(SpeechConfiguration.AUDIO_FORMAT_OGGOPUS));

Or this way:

    // Configuration
    SpeechConfiguration sConfig = new SpeechConfiguration(SpeechConfiguration.AUDIO_FORMAT_OGGOPUS);
    // STT
    SpeechToText.sharedInstance().initWithContext(this.getHost(STT_URL), this.getApplicationContext(), sConfig);

Set the Credentials and the delegate

   SpeechToText.sharedInstance().setCredentials(this.USERNAME,this.PASSWORD);
   SpeechToText.sharedInstance().setDelegate(this);

Alternatively, pass a token factory object to be used by the SDK to retrieve authentication tokens for the STT service:

   SpeechToText.sharedInstance().setTokenProvider(new MyTokenProvider(this.strSTTTokenFactoryURL));
   SpeechToText.sharedInstance().setDelegate(this);
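The SDK does not ship a concrete token factory; MyTokenProvider above is application code. Below is a minimal sketch, assuming the wrapper's TokenProvider interface exposes a single getToken() callback and that your token factory URL returns a raw token string; check the interface in the wrapper source for the exact signature.

    // Hypothetical sketch: assumes TokenProvider declares a single
    // String getToken() method; verify against the wrapper source.
    import android.util.Log;

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class MyTokenProvider implements TokenProvider {
        private final String tokenFactoryURL;

        public MyTokenProvider(String tokenFactoryURL) {
            this.tokenFactoryURL = tokenFactoryURL;
        }

        @Override
        public String getToken() {
            try {
                // Ask your own server (which holds the service credentials)
                // for a short-lived authentication token.
                HttpURLConnection conn =
                        (HttpURLConnection) new URL(tokenFactoryURL).openConnection();
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(conn.getInputStream()));
                String token = reader.readLine();
                reader.close();
                return token;
            } catch (IOException e) {
                Log.e("MyTokenProvider", "Failed to fetch token", e);
                return null;
            }
        }
    }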

Get a list of models supported by the service

   JSONObject models = getModels();

Get details of a particular model

   JSONObject model = getModelInfo("en-US_BroadbandModel");
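Both calls return raw JSON. Assuming the payload matches the public STT REST API (a top-level models array whose entries carry name, language, and rate fields), listing the models might look like this sketch:

    // Illustrative only: field names assume the public STT API's
    // GET /v1/models response shape. Uses org.json and android.util.Log.
    try {
        JSONObject models = getModels();
        JSONArray list = models.getJSONArray("models");
        for (int i = 0; i < list.length(); i++) {
            JSONObject m = list.getJSONObject(i);
            Log.d("STT", m.getString("name") + " (" + m.getString("language")
                    + ", " + m.getInt("rate") + " Hz)");
        }
    } catch (JSONException e) {
        Log.e("STT", "Could not parse model list", e);
    }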

Pick the model to be used

   SpeechToText.sharedInstance().setModel("en-US_BroadbandModel");

Start Audio Transcription

   SpeechToText.sharedInstance().recognize();

If you implemented SpeechRecorderDelegate and need to process the audio data as it is recorded, you can also set the recorder delegate:

   SpeechToText.sharedInstance().recognize();
   SpeechToText.sharedInstance().setRecorderDelegate(this);
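SpeechRecorderDelegate itself is not documented in this README; the sketch below is hypothetical, assuming the delegate exposes one callback that hands you each raw audio buffer as it is recorded (the real method name and signature are in the wrapper source):

    // Hypothetical callback: the name onRecord(byte[]) is an assumption,
    // not confirmed by this README; check SpeechRecorderDelegate's source.
    private FileOutputStream recordingFile; // opened elsewhere by the app

    public void onRecord(byte[] rawAudioData) {
        try {
            // Mirror the microphone data to a local file while the SDK
            // keeps streaming it to the STT service.
            recordingFile.write(rawAudioData);
        } catch (IOException e) {
            Log.e("Recorder", "Failed to save audio chunk", e);
        }
    }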

Delegate methods to receive messages from the SDK

    public void onOpen() {
        // the connection to the STT service was opened successfully
    }

    public void onError(String error) {
        // an error occurred while interacting with the STT service
    }

    public void onClose(int code, String reason, boolean remote) {
        // the connection with the STT service was just closed
    }

    public void onMessage(String message) {
        // a message with recognition results arrives from the STT service
    }
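onMessage hands you the service's raw JSON. As a sketch, here is one way to pull the best transcript out of each result, assuming the results/alternatives/transcript layout of the STT API (the same shape quoted in the issues further down this page):

    // Uses org.json.JSONObject/JSONArray/JSONException and android.util.Log.
    public void onMessage(String message) {
        try {
            JSONObject response = new JSONObject(message);
            if (!response.has("results")) {
                return; // e.g. the initial {"state": "listening"} message
            }
            JSONArray results = response.getJSONArray("results");
            for (int i = 0; i < results.length(); i++) {
                JSONObject result = results.getJSONObject(i);
                String transcript = result.getJSONArray("alternatives")
                        .getJSONObject(0).getString("transcript");
                boolean isFinal = result.optBoolean("final", false);
                Log.d("STT", (isFinal ? "final: " : "interim: ") + transcript);
            }
        } catch (JSONException e) {
            Log.e("STT", "Unparseable message: " + message, e);
        }
    }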

End Audio Transcription

   SpeechToText.sharedInstance().stopRecording();

Receive speech power levels during recognition

The amplitude is calculated from the audio data buffer, and the volume (in dB) is derived from it.

    @Override
    public void onAmplitude(double amplitude, double volume) {
        // your code here
    }
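If you prefer to derive a decibel figure yourself from the raw amplitude, a common approximation (illustrative; not necessarily the SDK's exact formula) is 20·log10 of the amplitude relative to full scale:

    // Illustrative dB-relative-to-full-scale conversion; the SDK's own
    // volume value may use a different reference level.
    private double toDecibels(double amplitude) {
        final double FULL_SCALE = 32767.0; // max amplitude for 16-bit PCM
        if (amplitude <= 0) {
            return Double.NEGATIVE_INFINITY; // silence
        }
        return 20.0 * Math.log10(amplitude / FULL_SCALE);
    }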

Text To Speech

Instantiate the TextToSpeech instance

   TextToSpeech.sharedInstance().initWithContext(this.getHost(TTS_URL));

Set the Credentials

   TextToSpeech.sharedInstance().setCredentials(this.USERNAME,this.PASSWORD);

Alternatively, pass a token factory object to be used by the SDK to retrieve authentication tokens for the TTS service:

   TextToSpeech.sharedInstance().setTokenProvider(new MyTokenProvider(this.strTTSTokenFactoryURL));

Get a list of voices supported by the service

   TextToSpeech.sharedInstance().voices();

Pick the voice to be used

   TextToSpeech.sharedInstance().setVoice("en-US_MichaelVoice");

Generate and play audio

  TextToSpeech.sharedInstance().synthesize(ttsText);
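Putting the TTS steps together, a minimal end-to-end sequence (same placeholder credentials and URLs as the examples above) looks like:

    // Minimal TTS sequence, combining the individual steps above.
    TextToSpeech.sharedInstance().initWithContext(
            this.getHost("https://stream.watsonplatform.net/text-to-speech/api"));
    TextToSpeech.sharedInstance().setCredentials(this.USERNAME, this.PASSWORD);
    TextToSpeech.sharedInstance().setVoice("en-US_MichaelVoice");
    TextToSpeech.sharedInstance().synthesize("Welcome to Watson Text to Speech");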

Common issues


speech-android-sdk's Issues

AndroidManifest.xml merge conflict when added to cordova plugin

I am not an expert in Android, but somehow I have managed to add speech-android-sdk to a Cordova plugin (after some changes).

When attempting to build for the Android platform, it fails in the :mergeDebugAssets step, saying that android:theme is present in two manifests (the app's AndroidManifest.xml and speech-android-wrapper's AndroidManifest.xml) and cannot be merged. The CLI suggests using 'tools:replace=...' to solve the conflict.

However, I think this aar does not need an <application> tag since it is meant to be added as a library. The one that should have an <application> tag in its AndroidManifest.xml is the host.

I think the lines below should be removed to avoid extra settings in the host application.

https://github.com/watson-developer-cloud/speech-android-sdk/blob/master/speech-android-wrapper/src/main/AndroidManifest.xml#L10-L15

<application
    android:allowBackup="true"
    android:icon="@mipmap/ic_launcher"
    android:label="@string/app_name"
    android:theme="@style/AppTheme"
    android:name="com.ibm.watson.developer_cloud.android.speech_to_text.v1.WatsonSDK">
</application>
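Until the <application> tag is removed from the library, the merge error can also be silenced from the host app's manifest with the tools:replace attribute the build output suggests; a sketch (adjust the attribute list to whatever your merge error names):

    <!-- Host app's AndroidManifest.xml (sketch): override the attributes
         that conflict with the library manifest. -->
    <manifest xmlns:android="http://schemas.android.com/apk/res/android"
        xmlns:tools="http://schemas.android.com/tools"
        package="com.example.host">
        <application
            android:theme="@style/AppTheme"
            tools:replace="android:theme">
        </application>
    </manifest>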

New Project Quick Start Guide

This is an extension to my original posting

#7

explaining how to get the demo project provided on the GitHub page working.

https://github.com/watson-developer-cloud/speech-android-sdk

This post is helping users who want to start a new project and use the SDK; these steps are as of 3-2-2016. I am writing them as I am doing them...

Download the project zip file

https://github.com/watson-developer-cloud/speech-android-sdk

Open Android Studio
Create a New Project
Name it
min SDK API 15 or anything above...
select Blank Activity WITH Fragment

  • Whichever you pick, you will need to do the Watson SDK code in a Fragment

Once it finishes the setup
Select File > Import Module > for source directory find the folder you unzipped in your downloads folder, within that folder select the folder "speech-android-wrapper" > leave the default name for Module name or call it what you like... > Finish

This will copy the folder into your project and project directory

Now make it accessible to the project

Right click your project on the top left (not the newly added library; the "app" folder or whatever it is called on the top left...) > Select Open Module Settings > select "app", click on Dependencies > click + to add a Module Dependency > select the speech-android-wrapper project module > OK

Now go to the project folder "app" > select your fragment activity file

Add the ISpeechDelegate interface to the Fragment
public class Fragment1 extends Fragment implements ISpeechDelegate

Implement all the Override methods required by the interface.
The onMessage method will be used to capture the speech to text results from the SDK; the other methods are pretty self-explanatory...

Under the onCreateView method add
SpeechConfiguration sConfig = new SpeechConfiguration ( SpeechConfiguration.AUDIO_FORMAT_OGGOPUS );

Now run and confirm it Builds Successfully

That's it for the setup; now you can use the library

Now you can add a sharedInstance.initWithContext( ...) ...
set the setCredentials
set the setModel
set the setDelegate ( this )
...

This will set up the app and prepare you to follow what is in the startup guide

https://github.com/watson-developer-cloud/speech-android-sdk

Hope this helps new users get started.

UnsatisfiedLinkError libjnidispatch.so

I got this error when starting the speech to text module:

           SpeechToText    recognize
                           ws connecting with Basic Authentication
audio.WebSocketUploader    New WebSocketUploader: wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize?model=en-US_BroadbandModel
                           wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize?model=en-US_BroadbandModel
                           Connecting...
     AudioCaptureThread    recording started!
         AndroidRuntime    FATAL EXCEPTION: initStreamToServerThread
                           Process: com.skios.elizatourism, PID: 22797
                           java.lang.UnsatisfiedLinkError: Native library (com/sun/jna/android-arm/libjnidispatch.so) not found in resource path (.)
                               at com.sun.jna.Native.loadNativeDispatchLibraryFromClasspath(Native.java:786)
                               at com.sun.jna.Native.loadNativeDispatchLibrary(Native.java:746)
                               at com.sun.jna.Native.<clinit>(Native.java:135)
                               at com.sun.jna.NativeLibrary.<clinit>(NativeLibrary.java:82)
                               at com.sun.jna.NativeLibrary.getInstance(NativeLibrary.java:337)
                               at com.ibm.watson.developer_cloud.android.speech_to_text.v1.opus.JNAOpus.<clinit>(JNAOpus.java:42)
                               at com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.OggOpusEnc.initEncoderWithUploader(OggOpusEnc.java:53)
                               at com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader.initStreamAudioToServer(WebSocketUploader.java:113)
                               at com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader.access$000(WebSocketUploader.java:46)
                               at com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader$2.run(WebSocketUploader.java:175)

Deprecate speech-android-sdk

@germanattanasio

With the shift to making the java-sdk more consumable on Android, we are deprecating this sdk. The only thing this sdk provides that is not already provided by the java-sdk is the ability to explicitly end audio transcription. So, in order to deprecate this sdk, we need to provide that functionality within the java-sdk.

  • Provide a function to explicitly end transcription within the java-sdk
  • Add OPUS encoding as an option within the android-sdk

stop currently playing audio of Text-to-Speech

Hi,

We are using IBM Watson in our production build. We have Siri-like functionality in our app.

Is there any way to stop the audio that is currently being played by the Text to Speech service?

SSL == Null

When I press record on the S-T-T page, the text shows up in the textView, no problem, but when I press stop the text turns into "ssl == null".
I can't figure out what the problem is.
I only get this error if the message is shorter than a few seconds.

inconsistencies with the AAR which is not updated

I got an email with useful feedback regarding the discrepancy between our README and the classes that we provide in the sdk.

In instantiating the SpeechToText instance, the README uses new SpeechConfiguration(). This is also what the example application does. However, the AAR does not contain the SpeechConfiguration class. The method signature implied by the README is initWithContext(URI, Context, SpeechConfiguration).

The method signature in the AAR is initWithContext(URI, Context, boolean). The boolean here represents isUsingVad but the documentation makes no mention of what "Vad" is, how it should be used, etc.

This is one of the inconsistencies I've found in the SDK. The very next step in the README is setCredentials(username, password), but the AAR does not contain this method; it is instead broken out into two methods, one for the username and one for the password.

It goes on and on like this (ISpeechDelegate vs SpeechDelegate, setModel(String) which does not exist at all in the AAR, etc.).

The SDK failed to run on Emulator

I encountered an INSTALL_FAILED_NO_MATCHING_ABI error when running the app on the Windows Android Studio emulator, but it works fine when deployed on an actual mobile device. My guess is that the SDK only provides armeabi but not x86? Has anyone managed to run the SDK on the emulator?

android.media.AudioTrack: getMinBufferSize(): 1948401012 Hz is not a supported sample rate.

Hi. I've got this error when calling the synthesize method.

I found someone on the IBM Developer site who got the same error, but there is no solution yet.
https://developer.ibm.com/answers/questions/230127/android-text-to-speech-returns-2-buffer-size.html

LOGCAT :

07-08 21:33:04.060 28864-28864/ D/TextToSpeech: synthesize called: https://stream.watsonplatform.net/text-to-speech/api/v1/synthesize
07-08 21:33:04.060 28864-28864/ I/com.ibm.watson.developer_cloud.android.text_to_speech.v1.TTSUtility: Start requesting TTS... (audio/wav)
07-08 21:33:04.070 28864-29059/ D/com.ibm.watson.developer_cloud.android.text_to_speech.v1.TTSUtility: using basic authentication
07-08 21:33:06.280 28864-29059/ E/android.media.AudioTrack: getMinBufferSize(): 1948401012 Hz is not a supported sample rate.
07-08 21:33:06.280 28864-29059/ W/System.err: java.lang.IllegalArgumentException: 1948401012Hz is not a supported sample rate.
07-08 21:33:06.290 28864-29059/ W/System.err: at android.media.AudioTrack.audioParamCheck(AudioTrack.java:380)
07-08 21:33:06.290 28864-29059/ W/System.err: at android.media.AudioTrack.<init>(AudioTrack.java:315)
07-08 21:33:06.290 28864-29059/ W/System.err: at android.media.AudioTrack.<init>(AudioTrack.java:265)
07-08 21:33:06.290 28864-29059/ W/System.err: at com.ibm.watson.developer_cloud.android.text_to_speech.v1.TTSUtility.initPlayer(TTSUtility.java:129)
07-08 21:33:06.300 28864-29059/ W/System.err: at com.ibm.watson.developer_cloud.android.text_to_speech.v1.TTSUtility.access$800(TTSUtility.java:52)
07-08 21:33:06.300 28864-29059/ W/System.err: at com.ibm.watson.developer_cloud.android.text_to_speech.v1.TTSUtility$TTSThread.run(TTSUtility.java:220)
07-08 21:33:06.300 28864-29059/ I/com.ibm.watson.developer_cloud.android.text_to_speech.v1.TTSUtility: Stopping audioTrack...

Codes:

    protected void procTTS(View view) throws JSONException {
        textAnswer = (TextView) findViewById(R.id.txtAnswer);
        String ttsText = textAnswer.getText().toString();
        TextToSpeech.sharedInstance().synthesize(ttsText);
    }

    private boolean initTTS() {
        String username = getString(R.string.TTSUsername);
        String password = getString(R.string.TTSPassword);
        String serviceURL = "https://stream.watsonplatform.net/text-to-speech/api";
        TextToSpeech.sharedInstance().initWithContext(getHost(serviceURL));
        TextToSpeech.sharedInstance().setCredentials(username, password);
        TextToSpeech.sharedInstance().setVoice(getString(R.string.voiceDefault));
        return true;
    }

Response different than java-sdk

I have been using the speech-android-sdk in my app and have integrated it correctly.
It's working perfectly and is correctly giving me the recording results.
My question is: the response received in the java-sdk, as documented in the Watson STT API reference on IBM, is of the format

{
  "results": [
    {
      "word_alternatives": [
        {
          "start_time": 0.03,
          "alternatives": [
            {
              "confidence": 0.9496,
              "word": "the"
            }
          ],
          "end_time": 0.09
        },
        . . .
        {
          "start_time": 0.86,
          "alternatives": [
            {
              "confidence": 0.9471,
              "word": "report"
            }
          ],
          "end_time": 1.51
        }
      ],
      "alternatives": [
        {
          "timestamps": [
            [
              "the",
              0.03,
              0.09
            ],
            . . .
            [
              "report",
              0.86,
              1.51
            ]
          ],
          "confidence": 0.985,
          "transcript": "the latest weather report "
        }
      ],
      "final": true
    }
  ],
  "result_index": 0
}

In the Android SDK, the response is received by the onResponse(String s) method implemented by SpeechDelegate. The response received by this method as String s is of the format:

{  "results": [ {
                      "alternatives": [
                                              {
                                               "confidence": 0.766, 
                                               "transcript": "okay asking again "
                                               }
                                                ], 
                                                "final": true
                       }
                     ], 
"result_index": 0
 }

If I want to access the "timestamps" and "word_alternatives" entries in the response, how should I go about it?
P.S. I have implemented the SDK the same way it's mentioned in the example project and it's working perfectly.
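Two notes may help here. First, the service only includes "timestamps" and "word_alternatives" when the recognition request asks for them (the WebSocket start message supports "timestamps": true and a "word_alternatives_threshold"); the start message this SDK sends, visible in the logs elsewhere on this page, does not set them, so they will not appear unless the SDK is modified to request them. Second, once they are present, reading them is plain org.json traversal; a sketch assuming the response shape quoted above:

    // Sketch: assumes the response contains "timestamps" (only returned
    // when the recognition request enabled them). Uses org.json classes.
    try {
        JSONObject result = new JSONObject(s)
                .getJSONArray("results").getJSONObject(0);
        JSONArray timestamps = result.getJSONArray("alternatives")
                .getJSONObject(0).getJSONArray("timestamps");
        for (int i = 0; i < timestamps.length(); i++) {
            JSONArray word = timestamps.getJSONArray(i); // ["the", 0.03, 0.09]
            Log.d("STT", word.getString(0) + ": " + word.getDouble(1)
                    + "s - " + word.getDouble(2) + "s");
        }
    } catch (JSONException e) {
        Log.e("STT", "No timestamps in response", e);
    }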

SpeechRecorderDelegate documentation or source code

Hi,

The readme states that it is possible to process or use the recorded audio via SpeechRecorderDelegate and setRecorderDelegate. Where is the API reference for this delegate?

How can I use the audio stream/file that is streamed?

Thx

crash when microphone permissions are not granted

I got the following crash (in the emulator) when trying to use STT while microphone permissions were not granted.

07-21 23:09:42.622 23782-23831/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader: wss://stream.watsonplatform.net/speech-to-text/api/v1/recognize?model=ja-JP_BroadbandModel
07-21 23:09:42.623 23782-24744/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader: Connecting...
07-21 23:09:42.626 1209-1601/? E/audio_hw_generic: Error opening input stream format 1, channel_mask 0010, sample_rate 16000
07-21 23:09:42.626 1209-1815/? W/ServiceManager: Permission failure: android.permission.RECORD_AUDIO from uid=10060 pid=23782

                                                 [ 07-21 23:09:42.626  1209: 1815 E/         ]
                                                 Request requires android.permission.RECORD_AUDIO
07-21 23:09:42.626 1209-1815/? E/AudioFlinger: openRecord() permission denied: recording not allowed
07-21 23:09:42.627 23782-24745/com.example.hello E/AudioRecord: AudioFlinger could not create record track, status: -1
07-21 23:09:42.627 1209-24747/? I/AudioFlinger: AudioFlinger's thread 0xb3c00000 ready to run
07-21 23:09:42.627 23782-24745/com.example.hello E/AudioRecord-JNI: Error creating AudioRecord instance: initialization check failed with status -1.
07-21 23:09:42.627 23782-24745/com.example.hello E/android.media.AudioRecord: Error code -20 when initializing native AudioRecord object.
07-21 23:09:42.627 23782-24745/com.example.hello E/AudioCaptureThread: Error reading voice audio
                                                                       java.lang.IllegalStateException: startRecording() called on an uninitialized AudioRecord.
                                                                           at android.media.AudioRecord.startRecording(AudioRecord.java:885)
                                                                           at com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.AudioCaptureThread.run(AudioCaptureThread.java:62)


                                                                       --------- beginning of crash
07-21 23:09:42.628 23782-24745/com.example.hello E/AndroidRuntime: FATAL EXCEPTION: Thread-224
                                                                   Process: com.example.hello, PID: 23782
                                                                   java.lang.IllegalStateException: stop() called on an uninitialized AudioRecord.
                                                                       at android.media.AudioRecord.stop(AudioRecord.java:928)
                                                                       at com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.AudioCaptureThread.run(AudioCaptureThread.java:92)


                                                                   --------- beginning of system
07-21 23:09:42.628 1535-2000/system_process W/ActivityManager:   Force finishing activity com.example.hello/com.ibm.jp.tmnf.watson.WatsonDialogAndSpeech.WatsonActivity
07-21 23:09:42.657 1535-1580/system_process I/OpenGLRenderer: Initialized EGL, version 1.4
07-21 23:09:42.658 2176-24749/com.google.android.gms D/DropBoxEntryAddedChimeraService: User is not opted-in to Usage & Diagnostics.
07-21 23:09:43.131 1535-1549/system_process W/ActivityManager: Activity pause timeout for ActivityRecord{dfc5505 u0 com.example.hello/com.ibm.jp.tmnf.watson.WatsonDialogAndSpeech.WatsonActivity t97 f}
07-21 23:09:43.341 23782-24750/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader: WS connection opened successfully
07-21 23:09:43.341 23782-24744/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader: Connected
07-21 23:09:43.345 23782-24744/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader: Sending init message: {"action":"start","content-type":"audio\/ogg;codecs=opus","interim_results":true,"continuous":true,"inactivity_timeout":300}
07-21 23:09:43.345 23782-24744/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader: WebSocket Connection established
07-21 23:09:43.626 23782-24750/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploaderonMessage: {
                                                                                                                                                 "state": "listening"
                                                                                                                                              }
07-21 23:09:52.631 1535-1549/system_process W/ActivityManager: Launch timeout has expired, giving up wake lock!
07-21 23:10:06.626 1535-1546/system_process I/AccountManagerService: getTypesVisibleToCaller: isPermitted? true
07-21 23:10:06.630 1209-1815/? D/WVCdm: Instantiating CDM.
07-21 23:10:06.630 1209-1602/? I/WVCdm: CdmEngine::OpenSession
07-21 23:10:06.630 1209-1602/? I/WVCdm: Level3 Library Sep 28 2015 13:08:28
07-21 23:10:06.631 1209-1602/? W/WVCdm: Could not read /data/mediadrm/IDM1013/ay64.dat2: No such file or directory
07-21 23:10:06.631 1209-1602/? W/WVCdm: Could not load liboemcrypto.so. Falling back to L3.  dlopen failed: library "liboemcrypto.so" not found
07-21 23:10:06.634 1209-1602/? I/WVCdm: CdmEngine::QueryKeyControlInfo
07-21 23:10:06.634 1209-1209/? W/WVCdm: BufferReader::Read<T> : Failure during parse: Not enough bytes (4)
07-21 23:10:06.634 1209-1209/? W/WVCdm: CdmEngine::ExtractWidevinePssh: Unable to read atom size.
07-21 23:10:06.634 1209-1209/? I/WVCdm: CdmEngine::GenerateKeyRequest
07-21 23:10:06.635 1209-1209/? D/WVCdm: PrepareKeyRequest: nonce=1743666901
07-21 23:10:06.688 1209-1601/? I/WVCdm: CdmEngine::CloseSession
07-21 23:10:06.997 4711-4711/? W/Binder_2: type=1400 audit(0.0:10): avc: denied { read } for name="/" dev="tmpfs" ino=3073 scontext=u:r:untrusted_app:s0:c512,c768 tcontext=u:object_r:device:s0 tclass=dir permissive=0
07-21 23:10:07.200 1868-25107/com.google.android.gms.persistent W/GLSUser: [AppCertManager] IOException while requesting key: 
                                                                           java.io.IOException: Invalid device key response.
                                                                               at exx.a(SourceFile:267)
                                                                               at exx.a(SourceFile:2235)
                                                                               at exw.a(SourceFile:46)
                                                                               at exq.a(SourceFile:55)
                                                                               at exp.a(SourceFile:113)
                                                                               at com.google.android.gms.auth.account.be.legacy.AuthCronChimeraService.a(SourceFile:1054)
                                                                               at ovi.run(SourceFile:179)
07-21 23:10:13.840 23782-24750/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploaderonMessage: {
                                                                                                                                                 "error": "Session timed out due to inactivity after 30 seconds."
                                                                                                                                              }
07-21 23:10:13.840 23782-24750/com.example.hello D/WatsonJSInterface: No results found. {
                                                                         "error": "Session timed out due to inactivity after 30 seconds."
                                                                      }
07-21 23:10:13.843 23782-24750/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader: WebSocket closed
07-21 23:10:13.843 23782-24750/com.example.hello D/com.ibm.watson.developer_cloud.android.speech_to_text.v1.audio.WebSocketUploader: ### Code: 1011 reason: see the previous message for the error details. remote: true
07-21 23:10:13.843 23782-24750/com.example.hello D/WatsonJSInterface: onClose: code:1011 reason:see the previous message for the error details. remote:true

I solved the problem by going to "Settings" -> "Apps" -> "MyApp" -> "Permissions" and toggling the microphone to "ON", although it would be better if the app did not crash in such a situation.
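On Android 6.0 and above the host app also has to request the microphone permission at runtime before starting recognition. A standard guard using the support library (plain Android APIs, nothing SDK-specific; REQUEST_RECORD_AUDIO is an app-defined request code):

    // Standard runtime-permission guard (API 23+); not part of this SDK.
    private static final int REQUEST_RECORD_AUDIO = 1; // app-defined code

    private void startRecognitionSafely() {
        if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
                != PackageManager.PERMISSION_GRANTED) {
            ActivityCompat.requestPermissions(this,
                    new String[]{Manifest.permission.RECORD_AUDIO},
                    REQUEST_RECORD_AUDIO);
        } else {
            SpeechToText.sharedInstance().recognize();
        }
    }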

Quick Start Guide

Quick Guide to getting started with IBM Bluemix Watson Android SDK

I am writing this to help anyone who is going through or went through what I did to get the example up and running.

To get the Android SPEECH TO TEXT and TEXT TO SPEECH working, follow these instructions. I am writing them as I am doing them, as of 2-1-2016.

First create an IBM BlueMix account
https://console.ng.bluemix.net

Once created make the page full screen…

Go to the dashboard
https://console.ng.bluemix.net/
and click dashboard on the top right
Click Create a Space on top left
name it serviceTEST ( whatever you want… )
Once completed click Create App
Web App
SDK Node Js App
click Continue
name it serviceTESTApp1 (whatever you want…). If you get a weird error message, that probably means someone is already using that name; just add a number or 2 to the end… serviceTESTApp12345 etc…
click Finish

Hit the back or left arrow on the top left of the page
You will see 1 CF App
Now we need to make our services; you CANNOT, AS OF THIS WRITING, make a service without an app…

Click on Services
click Use Services APIs
click on Speech to Text
on top right click on the App drop down and you will see your new app serviceTESTApp1
Select it and click Create

It will ask to ReStage, say YES

hit the back arrow on the top left
You will now see Services has 1

Click Services and click Speech to Text
click Service Credentials on the left
click add Credentials on the right
name it serviceTESTApp1-STT (whatever you want or just leave the default…)

Now you will see your credentials, you need these for the app

Open a notepad session on Windows or a Notes session on Mac and copy this whole block of text; you will need it for the Android app

{
  "credentials": {
    "url": "https://stream.watsonplatform.net/speech-to-text/api",
    "username": "yyyyyyyyyyyy",
    "password": "XXXXXXX"
  }
}

Now let's make the Text to Speech service; same steps…

Hit the back or left arrow on the top left of the page

Click on Services
click Use Services APIs
click on Text to Speech
on top right click on the App drop down and you will see your new app serviceTESTApp1
Select it and click Create

It will ask to ReStage, say YES

hit the back arrow on the top left
You will now see Services has 2

Click Services and click Text to Speech
click Service Credentials on the left
click add Credentials on the right
name it serviceTESTApp1-TTS (whatever you want or just leave the default…)

Now you will see your credentials, you need these for the app

Open a notepad session on Windows or a Notes session on Mac and copy this whole block of text; you will need it for the Android app

{
  "credentials": {
    "url": "https://stream.watsonplatform.net/text-to-speech/api",
    "username": "yyyyyyyyyyyy",
    "password": "wwwwwwwww"
  }
}

Now let's test your credentials before we move on.
Take the url above (the STT and TTS urls are different):
"url": "https://stream.watsonplatform.net/text-to-speech/api",
Copy and paste the https…./api into your browser and use the username and password Bluemix provided you above.

You will know you got in: if it didn't work it will say invalid credentials, and if it did work you will see a JSON output and documentation recommendations…

Now we are done with Bluemix

go here and download the Android SDK (which you probably already have…)

https://github.com/watson-developer-cloud/speech-android-sdk

click download zip on top right

unzip it, and in the folder you will see 2 project folders and a gradle folder

Open Android Studio
Click open an existing project
Select the folder you just downloaded with all the 3 folders and files in it…
or
Click File > Open > and Select the folder you just downloaded with all the 3 folders and files in it…

Now let's put the credentials we just tested in place

Click on Project on the left side of the window if you don’t see any folders or code…

go to Examples > res > values > strings.xml

where you see the default credentials… delete that block and replace it with the below…

    <!-- STT default credentials -->
    <string name="STTdefaultUsername">yyyyyyyy</string>
    <string name="STTdefaultPassword">xxxxxxxx</string>
    <string name="STTdefaultTokenFactory">https://stream.watsonplatform.net/speech-to-text/api</string>

    <!-- TTS default credentials -->
    <string name="TTSdefaultUsername">yyyyyyyy</string>
    <string name="TTSdefaultPassword">xxxxxxx</string>
    <string name="TTSdefaultTokenFactory">https://stream.watsonplatform.net/text-to-speech/api</string>

Now fill in the URL for the Token Factory, the Username, and the Password you received

We are done with this file

Now open Example project > Java > com….. > MainActivity

find the method initSTT, comment it out (don't delete it…), then copy and paste this version of the method

 private boolean initSTT() {
     // initialize the connection to the Watson STT service
     String username = getString(R.string.STTdefaultUsername);
     String password = getString(R.string.STTdefaultPassword);
     String tokenFactoryURL = getString(R.string.STTdefaultTokenFactory);
     String serviceURL = "wss://stream.watsonplatform.net/speech-to-text/api";
     SpeechConfiguration sConfig = new SpeechConfiguration(SpeechConfiguration.AUDIO_FORMAT_OGGOPUS);
     SpeechToText.sharedInstance().initWithContext(this.getHost(serviceURL), getActivity().getApplicationContext(), sConfig);
     // Basic Authentication
     SpeechToText.sharedInstance().setCredentials(username, password);
     SpeechToText.sharedInstance().setModel(getString(R.string.modelDefault));
     SpeechToText.sharedInstance().setDelegate(this);
     return true;
 }

find the method initTTS, comment it out (don't delete it…), then copy and paste this version of the method

 private boolean initTTS() {
     String username = getString(R.string.TTSdefaultUsername);
     String password = getString(R.string.TTSdefaultPassword);
     String tokenFactoryURL = getString(R.string.TTSdefaultTokenFactory);
     String serviceURL = "https://stream.watsonplatform.net/text-to-speech/api";
     TextToSpeech.sharedInstance().initWithContext(this.getHost(serviceURL));
     TextToSpeech.sharedInstance().setCredentials(username, password);
     TextToSpeech.sharedInstance().setVoice(getString(R.string.voiceDefault));
     return true;
 }

We are done with this file

Now run the app on a test device

If you see any "please check internet connection" messages, that simply means your credentials are not working. Test them again in your browser, and also make sure the URLs for Speech to Text and Text to Speech are correct; they are different urls and the credentials will NOT work on both.

Now, from this point, if anyone has any resources on how to make the REST speech to text services work with Android, please reply or message me so we can collaborate.

thanks

No watsonsdk.aar

This might be a dumb question but I'm totally stuck on step two of installation.

  1. Download the speech-android-wrapper.aar
  2. Once unzipped drag the watsonsdk.aar file into your Android Studio project view under the libs folder.

I downloaded the speech-android-wrapper.aar and there is no watsonsdk.aar inside of it. Where do I get the watsonsdk.aar?

Kind regards

Websocket ssl error

I followed the instructions in the quick start guide but was unable to run the example; I get the error below:

com.ibm.watson.developer_cloud.android.examples E: WebSocket error
com.ibm.watson.developer_cloud.android.examples E: ssl == null
com.ibm.watson.developer_cloud.android.examples E: ssl == null

I read here https://developer.ibm.com/answers/questions/174846/using-websockets-in-speech-to-text.html that there needs to be a cert installed for wss, but I could not find instructions related to it. Can you point me to where the instructions are?

documentation is awful mate...

You really don't maintain this... and you just make developers look for other solutions -> IBM loses money -> why should they pay you if you can't write some basic documentation? ...
