
cognitive-speech-stt-windows's Introduction

Samples for Microsoft C# speech client library

September 2018: New Microsoft Cognitive Services Speech SDK available

We released a new Speech SDK supporting the new Speech Service. The new Speech SDK comes with support for Windows, Android, Linux, JavaScript, and iOS.

Please check out Microsoft Cognitive Services Speech SDK for documentation, links to the download pages, and the samples.

NOTE: The content of this repository supports the Bing Speech Service, not the new Speech Service. The Bing Speech Service has been deprecated; please use the new Speech Service.

This repository contains samples for transcribing speech to text using Microsoft Speech Service, an offering within Microsoft Cognitive Services, formerly known as Project Oxford.

The C# client library

The C# client library is provided as NuGet packages. Two NuGet packages are available at nuget.org.

The sample

This sample is a Windows WPF application to demonstrate the use of Speech-to-Text with Microsoft Speech API. It demonstrates the following features using a wav file or external microphone input:

  • Short-form recognition
  • Long-form dictation
  • Recognition with intent

Build the sample

  1. Start Microsoft Visual Studio 2015 and select File > Open > Project/Solution.
  2. Navigate to the folder where you cloned the repository.
  3. Double-click the Visual Studio 2015 Solution file SpeechToText-WPF-Sample.sln.
  4. There are 2 projects in the solution. One is for the x86 platform, and the other is for the x64 platform. Choose the build platform (x86 or x64) and configuration (Debug or Release) of your preference, and build the project.
  5. For intent recognition to work, you need to sign up for the Language Understanding Intelligent Service (LUIS). Put the endpoint URL of your LUIS app in the app.config file in the samples/SpeechRecognitionServiceExample folder. For more information on the endpoint URL of a LUIS app, see Publish LUIS App. You must replace each & character in the LUIS endpoint URL with &amp; so that the XML parser interprets the URL correctly.
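
The escaping in step 5 can be checked in code before the URL is pasted into app.config. The following is a minimal sketch, not part of the sample: the URL is a placeholder, the key name LuisEndpointUrl is hypothetical, and SecurityElement.Escape is just one of several .NET ways to escape XML-reserved characters.

```csharp
using System;
using System.Security;
using System.Xml.Linq;

public static class LuisUrlEscaping
{
    // Placeholder URL: the host, app ID, and key are made up for illustration.
    public const string SampleUrl =
        "https://example.invalid/luis/v2.0/apps/APP_ID?subscription-key=KEY&verbose=true&q=";

    // Escapes XML-reserved characters (&, <, >, ", ') so the value is safe inside an attribute.
    public static string EscapeForAppConfig(string url)
    {
        return SecurityElement.Escape(url);
    }

    public static void Main()
    {
        string escaped = EscapeForAppConfig(SampleUrl);
        Console.WriteLine(escaped);

        // Round trip: an XML parser turns &amp; back into & when it reads the attribute.
        // "LuisEndpointUrl" is a hypothetical key name, not necessarily the one the sample uses.
        XElement element = XElement.Parse(
            "<add key=\"LuisEndpointUrl\" value=\"" + escaped + "\" />");
        Console.WriteLine(element.Attribute("value").Value == SampleUrl);
    }
}
```

The round trip at the end is the point of the exercise: the application reads back the original URL, so only the text stored in the config file carries the &amp; form.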

Run the sample

Before running the sample, you must first have a subscription key. You can get free trial subscription keys from the Cognitive Services Subscription page. After you select the Speech API, click Get API Key to get the key. It returns a primary and secondary key. Both keys are tied to the same quota, so you may use either key.

Paste your subscription key into the text box labeled "Paste your subscription key here to start" in the top right corner. To persist the key on your machine, click the "Save Key" button. To remove the saved key from your machine, click "Delete Key".

Microsoft will receive the audio you upload and may use it to improve the Speech API and related services. By submitting audio, you confirm that you have consent from everyone in it.

Contributing

We welcome contributions. Feel free to file issues and pull requests on this repository and we'll address them as we can. Learn more about how you can help on our Contribution Rules & Guidelines.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

All Microsoft Cognitive Services SDKs and samples are licensed with the MIT License. For more details, see LICENSE.

Sample images are licensed separately; please refer to LICENSE-IMAGE.

Developer Code of Conduct

Developers using Cognitive Services, including this client library and sample, are expected to follow the "Developer Code of Conduct for Microsoft Cognitive Services", found at http://go.microsoft.com/fwlink/?LinkId=698895.

cognitive-speech-stt-windows's People

Contributors

delfu, jasonvomsft, lightfrenzy, msftgits, priyaravi20, stevchin, supernova-eng, wolfma61, wsturman, zhouwangzw

cognitive-speech-stt-windows's Issues

SpeechClient.dll frequently crashes

We recently upgraded from version 0.4.10.1 to 1.0.0.2 (the current one), hoping it would be more stable.
With the old version we had 2 crashes:

  1. When going online after being offline.
  2. When deleting the DataRecognitionClient object after calling EndAudio, because callbacks were still streaming results (we use managed C++).

With the new version we have 2 crashes of which the first is catastrophic:

  1. Very frequently: less than a second after audio is sent to the transcription server, the SpeechClient.dll worker thread crashes.
  2. As in #2 above.

Stack traces for the new crashes:

  1. ExceptionAddressSymbolInfo=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2D6783, module offset(base): 0x46783(0x7ffc0f290000)
    StackTrace00=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2D6783, module offset(base): 0x46783(0x7ffc0f290000)
    StackTrace01=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2D744E, module offset(base): 0x4744e(0x7ffc0f290000)
    StackTrace02=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2D8A95, module offset(base): 0x48a95(0x7ffc0f290000)
    StackTrace03=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2D7DD9, module offset(base): 0x47dd9(0x7ffc0f290000)
    StackTrace04=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2D3A8D, module offset(base): 0x43a8d(0x7ffc0f290000)
    StackTrace05=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2D31C8, module offset(base): 0x431c8(0x7ffc0f290000)
    StackTrace06=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2C18DA, module offset(base): 0x318da(0x7ffc0f290000)
    StackTrace07=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2FF087, module offset(base): 0x6f087(0x7ffc0f290000)
    StackTrace08=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFC0F2FEC4D, module offset(base): 0x6ec4d(0x7ffc0f290000)

  2. ExceptionAddressSymbolInfo=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFDC4E30209, module offset(base): 0x60209(0x7ffdc4dd0000)
    StackTrace00=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFDC4E30209, module offset(base): 0x60209(0x7ffdc4dd0000)
    StackTrace01=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFDC4E30398, module offset(base): 0x60398(0x7ffdc4dd0000)
    StackTrace02=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFDC4DFDE90, module offset(base): 0x2de90(0x7ffdc4dd0000)
    StackTrace03=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFDC4DFF499, module offset(base): 0x2f499(0x7ffdc4dd0000)
    StackTrace04=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFDC4E3F04C, module offset(base): 0x6f04c(0x7ffdc4dd0000)
    StackTrace05=C:\Program Files\Screenovate\BluePhone\SpeechClient.dll! DllUnregisterServer - 0x00007FFDC4E3EC4D, module offset(base): 0x6ec4d(0x7ffdc4dd0000)

Need urgent help with intents.

Hi guys,

I seem to be having a problem with CreateMicrophoneClientWithIntent. I get microphone recognitions, but intents don't seem to be returned. I'm attending a hackathon event hosted by Microsoft and I'm not getting the answers I need.

Where do I get my LUIS Subscription ID for use with your solution? Where can I find my APP Id?

I've currently been using https://www.luis.ai to obtain these keys. I have tried using the API key for the luisSubscriptionId variable and the app ID for the luisAppId variable but the intent handler never gets called.

Sorry to label this as urgent, but I'm attending a hackathon where I intend to implement this, so time is somewhat limited.

GetAuthorizationTokenAsync is going in an infinite loop

When the method is called, it generates a token, but even though a token has been issued it keeps generating a new one: GetAuthorizationTokenAsync is called again and again without end. Kindly help me out with this issue!

The remote server returned an error: (401) Unauthorized

This is similar to #35, but in my case the very first call fails rather than a later one.

I am getting this error. I have generated multiple keys from my two MSDN accounts that came with my two different Visual Studio accounts. Same error.

I tried the following, as suggested by zhouwangzw. It gives the same 401:
$FetchTokenHeader = @{
'Content-type'='application/x-www-form-urlencoded';
'Content-Length'= '0';
'Ocp-Apim-Subscription-Key' = 'YOUR_API_KEY'
}

Invoke-RestMethod -Method POST -Uri https://api.cognitive.microsoft.com/sts/v1.0/issueToken -Headers $FetchTokenHeader

My resource group being used has location "South Central US", my location is Colorado, my subscription is "Visual Studio Enterprise".

Is there anything else I have to do AFTER generating the key and BEFORE using it?

Thanks
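
For readers who would rather reproduce the request from C# than from PowerShell, the sketch below builds the same POST. The endpoint URL and the Ocp-Apim-Subscription-Key header are copied from the snippet above. The class and method names are made up for illustration, and nothing here is claimed to resolve the 401.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class TokenRequest
{
    // Endpoint and header name are taken from the PowerShell snippet in this post.
    public const string IssueTokenUrl = "https://api.cognitive.microsoft.com/sts/v1.0/issueToken";
    public const string KeyHeader = "Ocp-Apim-Subscription-Key";

    // Builds the POST with an empty form-encoded body (Content-Length: 0).
    public static HttpRequestMessage Build(string subscriptionKey)
    {
        var request = new HttpRequestMessage(HttpMethod.Post, IssueTokenUrl);
        request.Headers.Add(KeyHeader, subscriptionKey);
        request.Content = new ByteArrayContent(new byte[0]);
        request.Content.Headers.ContentType =
            new MediaTypeHeaderValue("application/x-www-form-urlencoded");
        return request;
    }

    // Sends the request and returns the response body as text; throws with the status on failure.
    public static async Task<string> SendAsync(HttpClient client, string subscriptionKey)
    {
        using (HttpRequestMessage request = Build(subscriptionKey))
        using (HttpResponseMessage response = await client.SendAsync(request))
        {
            string body = await response.Content.ReadAsStringAsync();
            if (!response.IsSuccessStatusCode)
            {
                throw new HttpRequestException(
                    (int)response.StatusCode + " " + response.ReasonPhrase + ": " + body);
            }
            return body;
        }
    }

    public static void Main()
    {
        // Only builds and prints the request; call SendAsync with a real key to hit the service.
        using (HttpRequestMessage request = Build("YOUR_API_KEY"))
        {
            Console.WriteLine(request.Method + " " + request.RequestUri);
            Console.WriteLine(request.Headers.Contains(KeyHeader));
        }
    }
}
```

Building the request separately from sending it makes it easy to log exactly what goes over the wire when comparing a failing call against a working one.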

LoginFailed, transport error

I followed the steps listed in the instructions, ran the sample and got the following error:
Error code: LoginFailed

Also, what value does the sample need for the AuthenticationUri key in app.config?

Infinite Error Message Loop

Hi, yesterday I noticed that my API key had expired.
(I use the Microsoft.ProjectOxford.SpeechRecognition-x86.1.0.0.6 NuGet version.)

Now, calling <<microphoneRecognitionClient.StartMicAndRecognition()>> causes an infinite loop of error messages that alternates between <> and <>.

So I tried calling <<microphoneRecognitionClient.EndMicAndRecognition()>> in the <<MicrophoneRecognitionClient_OnConversationError>> event handler, and I noticed that the infinite loop ended after about 6 or 7 calls of the error event.

I think this is abnormal behavior of the error event caller inside the library.

Speech Client Dll failing in Windows 7 32 bit

When we run the application on 32-bit Windows 7, we get the exception below.

System.IO.FileLoadException: 'A procedure imported by 'SpeechClient.dll' could not be loaded.'

Can you please let us know whether the library is compatible with Windows 7?

DataRecognitionClient No phrase response is available

I am trying to use the DataRecognitionClient to send audio data for speech-to-text recognition.
The MicrophoneClient works fine, but the DataRecognitionClient does not yield any results at all. I tried my own wave file (mono, 16-bit, 16 kHz) and the two sample .wav files (batman and whatstheweatherlike) provided in this repo, in both ShortPhrase and LongDictation mode.

I use the SpeechRecognitionServiceFactory to create the client instances and connect the OnResponseReceived and OnConversationError events. I noticed that with my own .wav file, which has some initial silence, OnResponseReceived is called before EndAudio() is called. However, the RecognitionStatus is not "InitialSilenceTimeout" but always "None".

With the sample wav files, OnResponseReceived is called after EndAudio, but the RecognitionStatus is still "None" and e.PhraseResponse.Results.Length equals zero. OnConversationError is never called, except when I provide an invalid key when initializing the client.

Could you direct me where I'm going wrong or fix this issue, please?

Abnormal dispose time after StartRec

Hi,
With the latest library update (1.0.0.6, NuGet), I noticed that after calling -StartMicAndRecognition();-
Dispose() takes about 2 to 3 seconds.
This is odd, because if you dispose the -MicrophoneRecognitionClient- right after constructing it, it takes only a few milliseconds.
I followed the example and noticed that it happens there too.
I am available to provide any other information.
Thanks.

DataRecognitionClient/SpeechClient cannot handle subsequent recognitions

Hi,

I'm having trouble trying to use DataRecognitionClient processing anything more than one audio request in an app session. It can work, but you seem to need to wait a few moments for it to be "ready" for the next streaming attempt.

For example:

// Create Client
var client = SpeechRecognitionServiceFactory.CreateDataClient(SpeechRecognitionMode.LongDictation, "en-US", "KEY");

// Prepare callback
client.OnResponseReceived += (sender, args) => ...; // Capture transcript here

// Send audio
client.SendAudio(...);

// No further processing
client.EndAudio();

If I attempt to run this code multiple times, even with a new client instance, the second instance will wait for a while/block/cause a crash.

I notice the SpeechClient.dll uses background threads, unsafe callbacks, etc, perhaps there is a race/threading issue.

.net Core 2.0

Hi,

Will the C# client library be made available for .NET Core 2.0 in the future? Is there any information about this?

thanks

Error on Execution of "Start Recognition"

I have built and retrieved a LUIS endpoint. I have a Speech Services Key and it is stored in the app. The app builds, but as soon as I click "Start Recognition", I get repetitive error handling:

--- Error received by OnConversationErrorHandler() ---
Error code: LoginFailed
Error text: Transport error

What am I missing?

How to complete a recognize loop

I use the Kinect v2 as the microphone input device, and I want to feed its audio stream into continuous automatic recognition, like the Kinect SDK v2 sample "Speech Basics": I just want to speak, without pressing a Start button.

Call WebSocket service for Long-Running in .net core?

Hey,

I am trying to call the Websocket version of the service in .Net Core, but notice that there is no support for this library on .Net Core. I feel I should be able to implement the service myself, by making the right request via a ClientWebSocket, but I can't get it to work. I came here hoping for answers, but it seems that the C# library which makes these calls isn't open source? Is there source available for this library?

If not, can you help me? I'm trying to connect to WebSocket implementation of the Bing Speech API because I need to send multiple large files for transcription. It needs to be .net core because the service is running on Linux.

I've tried a number of things along the following lines:

    var cws = new ClientWebSocket();
    cws.Options.SetRequestHeader("X-ConnectionId", Guid.NewGuid().ToString("N"));
    cws.Options.SetRequestHeader("Authorization", "Bearer " + token);
    await cws.ConnectAsync(new Uri(@"wss://speech.platform.bing.com/speech/recognize/interactive/cognitiveservices/v1"), CancellationToken.None);

Problem calling Dispose

I'm facing a problem when I call the form's Close() method inside the OnResponseReceived event handler. When the FormClosing event is raised...
private void MainForm_FormClosing(object sender, FormClosingEventArgs e)
{
    if (micClient != null)
    {
        micClient.Dispose(); // hangs for several seconds
    }

    base.OnClosing(e);
}
MicrophoneRecognitionClient.Dispose() hangs for several seconds. It can even take 10 to 15 seconds.
Hope someone can help me.

Client Shut Down after some silences

I have had this problem at least twice now. In the first instance, the audio had 30 seconds of silence at the beginning and the client simply did nothing. When I removed the silence at the beginning, it worked.

In the second instance, the client apparently decided that the transcription should end at around 1 minute 45 seconds, and all the remaining transcripts are missing.

I wonder what the policy is here and what I can do to prevent this.
Thanks,

Melvin

using this sample code behind proxy

Hi,
I selected "ShortPhrase mode" and clicked Start, and the following error message appeared:

--- Error received by OnConversationErrorHandler() ---
Error code: NameNotFound
Error text: Transport error

I think the cause of this problem is the proxy, because I'm behind one.
How can I make this sample code work behind a proxy?

Thank you!

Azure Function app support

I am unable to run this library in an Azure Function app. I am getting this error:

Could not load file or assembly 'SpeechClient, Version=0.0.0.0, Culture=neutral, PublicKeyToken=720c26922c782415'. This assembly was compiled for a different processor.

I guess it has something to do with .NET Framework support. I am getting a similar error in Visual Studio for Mac as well.

A little direction for Mac/Azure Function developers would be appreciated. Thanks

Unable to install SpeechRecognition from Nuget

Have tried clearing all packages, uninstalling and reinstall nuget, restarting VS, running VS as administrator. Not sure if it's just a NuGet error, since I can install the Newtonsoft.Json package fine

VS2015, Windows10

PM> Install-Package Microsoft.ProjectOxford.SpeechRecognition-x86
Attempting to gather dependency information for package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2' with respect to project 'Kinect8', targeting 'Windows,Version=v8.1'
Attempting to resolve dependencies for package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2' with DependencyBehavior 'Lowest'
Resolving actions to install package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2'
Resolved actions to install package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2'
Adding package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2' to folder '<path_1>'
Added package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2' to folder '<path_1>'
Install failed. Rolling back...
Package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2' does not exist in project 'Kinect8'
Removing package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2' from folder '<path_1>'
Removed package 'Microsoft.ProjectOxford.SpeechRecognition-x86.0.4.10.2' from folder '<path_1>'
Install-Package : Failed to add reference to 'SpeechClient'.
At line:1 char:1
+ Install-Package Microsoft.ProjectOxford.SpeechRecognition-x86
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [Install-Package], Exception
    + FullyQualifiedErrorId : NuGetCmdletUnhandledException,NuGet.PackageManagement.PowerShellCmdlets.InstallPackageCommand

SpeechClient.dll crashed as transcription started.

Build: 1.0.0.3
Dump: https://drive.google.com/file/d/0BwtlfDyTIqVmblV1Ymk2VjJSSGM/view?usp=sharing
Crashing thread is 6044.
Note: I always use 2 instances of DataRecognitionClient concurrently.
Stacktrace:
00=SpeechClient.dll! DllUnregisterServer - 0x00007FF8986E6783, module offset(base): 0x46783(0x7ff8986a0000)
01=SpeechClient.dll! DllUnregisterServer - 0x00007FF8986E744E, module offset(base): 0x4744e(0x7ff8986a0000)
02=SpeechClient.dll! DllUnregisterServer - 0x00007FF8986E8A95, module offset(base): 0x48a95(0x7ff8986a0000)
03=SpeechClient.dll! DllUnregisterServer - 0x00007FF8986E7DD9, module offset(base): 0x47dd9(0x7ff8986a0000)
04=SpeechClient.dll! DllUnregisterServer - 0x00007FF8986E3A8D, module offset(base): 0x43a8d(0x7ff8986a0000)
05=SpeechClient.dll! DllUnregisterServer - 0x00007FF8986E31C8, module offset(base): 0x431c8(0x7ff8986a0000)
06=SpeechClient.dll! DllUnregisterServer - 0x00007FF8986D18DA, module offset(base): 0x318da(0x7ff8986a0000)
07=SpeechClient.dll! DllUnregisterServer - 0x00007FF89870F087, module offset(base): 0x6f087(0x7ff8986a0000)
08=SpeechClient.dll! DllUnregisterServer - 0x00007FF89870EC4D, module offset(base): 0x6ec4d(0x7ff8986a0000)
09=???! ??? - 0x00007FF845A0E802
10=???! ??? - 0x000001CA4FE69BD0
11=???! ??? - 0x000001CA32B56CC0
12=???! ??? - 0x000001CA4B71F76C
13=C:\Windows\Microsoft.NET\Framework64\v4.0.30319\clr.dll! ??? - 0x00007FF8A4D646B5, module offset(base): 0x46b5(0x7ff8a4d60000)
14=???! ??? - 0x00007FF845A0E1A6
15=???! ??? - 0x000001CA4FE69BD0
16=???! ??? - 0x0000003D5BEFE8C0
17=???! ??? - 0x000001CA4B71F26C
18=???! ??? - 0x000001CA4B71F9EC
19=???! ??? - 0x000001CA4B71F5DC
20=???! ??? - 0x000001CA4B71F76C
21=???! ??? - 0x000001CA4B71F62C
22=???! ??? - 0x0000003D5BEFE880
23=speechClient.dll! DllUnregisterServer - 0x00007FF8987613A8, module offset(base): 0xc13a8(0x7ff8986a0000)
24=???! ??? - 0x000001CA30A973E0
25=???! ??? - 0x000001CA4FE69BD0
26=???! ??? - 0x000001CA32A6EFC8
27=???! ??? - 0x000001CA32B56CC0
28=???! ??? - 0x000001CA4FBB4E50
29=???! ??? - 0x000001CA32B57098
30=???! ??? - 0x000001CA32CA5DB8
31=???! ??? - 0x0000003D5BEFE9B0
32=???! ??? - 0x00007FF845A0DACF
33=???! ??? - 0x000001CA32B56CC0
34=???! ??? - 0x000001CA32B56D58
35=???! ??? - 0x000060525372E2B2

SpeechClient.dll for UWP apps?

Hi, is it possible to get a build of the SpeechClient.dll targeted at .NETCore or a Portable Library so it can be used in a UWP app?

I want to use Audiograph in conjunction with the Speech API.

Thanks!

SpeechClient does not have a strong name

--created a specific thread for this issue as it was nested in an unrelated post.

I'm receiving the following warning after installing the latest version of 'Microsoft.ProjectOxford.SpeechRecognition; version 1.0.0.6'; it suggests that the assembly is not signed; where can I locate the signed version of this assembly?
CS8002: Referenced assembly 'SpeechClient, Version=0.0.0.0, Culture=neutral, PublicKeyToken=null' does not have a strong name

Unfortunately this is a blocker for me, preventing me from implementing any features that reference it.

Can we please have another update -> when we can expect to see this issue resolved?

Can not load SpeechClient.dll

Hi,

I use the SDK library to build an application with the Speech Client. It works well on every machine except one, which shows the issue below. I tried building the project for x64 and for x86, but nothing changes. Could you help me with this issue? Thank you.

Warning when build
1>C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\MSBuild\15.0\Bin\Microsoft.Common.CurrentVersion.targets(2052,5): warning MSB3270: There was a mismatch between the processor architecture of the project being built "MSIL" and the processor architecture of the reference "SpeechClient, Version=0.0.0.0, Culture=neutral, PublicKeyToken=720c26922c782415, processorArchitecture=AMD64", "AMD64". This mismatch may cause runtime failures. Please consider changing the targeted processor architecture of your project through the Configuration Manager so as to align the processor architectures between your project and references, or take a dependency on references with a processor architecture that matches the targeted processor architecture of your project.
Runtime error
Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly 'SpeechClient.dll' or one of its dependencies. The specified module could not be found.

How to get the required keys

According to the instructions, I need three keys: my LUIS App ID and Subscription ID plus a Speech API subscription key.

I have created a LUIS account, but the only key that offers me is a Programmatic API Key, which is possibly what the instructions mean by the Speech API subscription key. I can see no obvious way of getting the others.

This is my first foray into MS cognitive services and I just want to get this sample running to get a feel for the speech recognition API. I've never worked with LUIS before.

Where can I find those keys?

Multiple Clients Causes Premature Microphone Termination

Hello,

When I use the Cognitive-Speech-STT-Windows client library with one instance running on our network, everything works fine. However, when multiple instances are running on separate computers, we get an intermittent issue where the speech recognition stops recording within 300 ms.

Does this have anything to do with the fact that we are initializing the microphone client the same way on each instance that is running the service?

We are using this with Cognitive Services Custom Speech Service. Each client uses the same primary and secondary key and the same custom speech authentication URI.

Does anyone think that the instances of the microphone client running on different machines are somehow conflicting with each other?

Any feedback or guidance would greatly be appreciated

Kind Regards,

Jason

Want to add your own .wav files?

This is a terrific sample, but I wanted to use my own .wav files and found that I needed to do all 3 of the following:

  1. Add the .wav file(s) to the samples/SpeechRecognitionServiceExample folder
  2. In the app.config file, change the value of the "LongWaveFile" key to the name of the .wav file
  3. Add the .wav file(s) to the Debug folder (..bin/x64/Debug). If you fail to do this, the app will crash with 'System.IO.FileNotFoundException occurred'
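
Step 3 fails late, with an exception at the moment the file is opened. A small guard can report the missing file up front instead. This is a sketch under assumptions: the helper name and the file name mydictation.wav are made up, and it presumes the sample looks for the file next to the executable, which is what step 3 suggests.

```csharp
using System;
using System.IO;

public static class WavFileLocator
{
    // Returns the full path if fileName exists in baseDirectory, otherwise null.
    public static string FindNextToExecutable(string fileName, string baseDirectory)
    {
        string path = Path.Combine(baseDirectory, fileName);
        return File.Exists(path) ? path : null;
    }

    public static void Main()
    {
        // BaseDirectory is the folder the application was started from, e.g. the Debug output folder.
        string baseDirectory = AppDomain.CurrentDomain.BaseDirectory;

        // "mydictation.wav" is a placeholder for the value stored under the "LongWaveFile" key.
        string path = FindNextToExecutable("mydictation.wav", baseDirectory);
        Console.WriteLine(path ?? "mydictation.wav is missing from " + baseDirectory);
    }
}
```

In Visual Studio, setting the file's "Copy to Output Directory" property would copy it into the output folder on every build, which avoids the manual copy in step 3.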

Login Failed & Transport error

I followed every step to build this app. When I try to use it, it shows the following error message. I am using my key from Bing Speech - Preview. Any suggestions on what I should try next?

error message:
--- Start speech recognition using short wav file with ShortPhrase mode in en-US language ----

--- Error received by OnConversationErrorHandler() ---
Error code: LoginFailed
Error text: Transport error

The remote server returned an error: (401) Unauthorized

Today I started receiving "The remote server returned an error: (401) Unauthorized". My app has been working since Cognitive Services was Project Oxford.
Now I'm getting this error, and I can't find where I must log in to get a new API key.

Everything I found on the internet redirects me to Azure. I never signed up for Azure to do this.
How do I fix this?

Not getting any phrase response in Bing Speech API client library

Hello all !
I have been exploring the sample for the Cognitive Services Bing Speech API for about 2 days. Yesterday it worked perfectly, but when I started it again today the application did not pick up any speech input. I have already tested my microphone and it works fine with "Cortana", but it does not work with this sample. I also tried reinstalling the audio drivers and tested again, but nothing changed.

What could be the possible reasons for this problem and what can I do to solve them ?

Requires methods or properties for authentication tokens

When speech is recognized continuously, recognition eventually stops.
I think the reason is that the token expired. Please add a property that accepts a token supplied from outside, or a method to re-fetch the token.

error CS1061: 'DataRecognitionClient' does not contain a definition for 'WaitForFinalResponse' and no extension method 'WaitForFinalResponse'

From @kbpachauri on June 29, 2016 18:2

I am trying to call WaitForFinalResponse but get this error. According to the SDK documentation, this method is supported (https://staging.www.projectoxford.ai/Files/Doc/Speech/Windows/html/ff4169e5-6142-558a-92af-48618f592f39.htm)
this.dataClient.WaitForFinalResponse();
Kindly help. Thanks

Copied from original issue: microsoft/ProjectOxford-ClientSDK#82

C# example does not work

I'm getting errors while running the WPF application.


--- Start speech recognition using long wav file with LongDictation mode in en-US language ----

--- Error received by OnConversationErrorHandler() ---
Error code: LoginFailed
Error text: Transport error

--- Error received by OnConversationErrorHandler() ---
Error code: LoginFailed
Error text: Transport error

repeated....

It looks like the app is able to upload the wav file; the SendAudioHelper method completes its work.

Here is my config.

Redacted deployment info -
image

And my redacted app.config

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <startup>
    <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5"/>
  </startup>
  <appSettings>
    <add key="luisAppID" value="75..." />
    <add key="luisSubscriptionID" value="50..." />
    <add key="ShortWaveFile" value="whatstheweatherlike.wav" />
    <add key="LongWaveFile" value="batman.wav" />
    <!-- 
     Enter an optional authentication Uri such as: 
     https://westus.api.cognitive.microsoft.com/sts/v1.0/issuetoken
    -->
    <add key="AuthenticationUri" value="" />
  </appSettings>
</configuration>

And in the app there are repeated errors -

image

Response is looped

Started today: the OnResponseReceived event delivers the same response over and over again after I have provided the first input through the microphone. It happens when I call StartMicAndRecognition() to receive the next input.

I have not changed any code, so is anybody experiencing the same?

The only thing I did yesterday was to remote to my computer, could that cause the problem?

Cannot enroll more than one phrase at a time.

We are asked to enroll three voice samples for a single profile ID. But after enrolling the first voice sample, I am unable to add the other two.

I am getting an invalid phrase exception.

Please help me out.

issue when running the sample SpeechToText _WPF_sample

Hi,
When we try running the sample SpeechToText_WPF_sample (with a free subscription key), sometimes it works, but frequently we are blocked at:
--- Start speech recognition using microphone with ShortPhrase mode in en-US language ----
--- Microphone status change received by OnMicrophoneStatus() ---
********* Microphone status: True *********
Please start speaking.

and after a few minutes (maybe 1 or 2 minutes) we get this message (and all radio buttons are disabled):
--- Microphone status change received by OnMicrophoneStatus() ---
********* Microphone status: False *********

I don't think it's a network problem, because when I test with a bad subscription key I instantly get this error message:
--- Error received by OnConversationErrorHandler() ---
Error code: LoginFailed
Error text: Transport error

Could you help us, please?

Regards,
Wajdi

There isn't any response after Results.Length == 0

private void Recognizer_OnResponseReceived(object sender, SpeechResponseEventArgs e)
{
    if (e.PhraseResponse.Results.Length == 0)
    {
        Console.WriteLine("No phrase response is available.");
    }
    else
    {
        Console.WriteLine("********* Final n-BEST Results *********");
        for (int i = 0; i < e.PhraseResponse.Results.Length; i++)
        {
            Console.WriteLine("[{0}] Confidence={1}, Text=\"{2}\"", i, e.PhraseResponse.Results[i].Confidence, e.PhraseResponse.Results[i].DisplayText);
        }
        Console.WriteLine();
    }
}

After the console prints "No phrase response is available.", I speak but get no recognition result.

Must the audio always be a .wav file?

When using DataRecognitionClient, do we have to send only .wav audio files, or can we send other audio formats? The lack of documentation makes this hard to figure out. How can we use the SpeechAudioFormat type?
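
This does not answer which formats the service accepts, but it can help to know what a given .wav file actually contains before sending it. The sketch below reads the channel count, sample rate, and bit depth from a canonical 44-byte RIFF/WAVE PCM header. It does not use the speech library, the names are made up, and it assumes the "fmt " chunk comes right after the RIFF header, which is not true of every .wav file. The 16 kHz, 16-bit mono example mirrors the file described in an earlier report on this page.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavHeader
{
    public struct Info
    {
        public short Channels;
        public int SampleRate;
        public short BitsPerSample;
    }

    // Reads a canonical PCM header: "RIFF" <size> "WAVE" "fmt " <16> <format fields>.
    public static Info Read(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.ASCII, true))
        {
            string riff = Encoding.ASCII.GetString(reader.ReadBytes(4));
            reader.ReadInt32(); // RIFF chunk size
            string wave = Encoding.ASCII.GetString(reader.ReadBytes(4));
            string fmt = Encoding.ASCII.GetString(reader.ReadBytes(4));
            if (riff != "RIFF" || wave != "WAVE" || fmt != "fmt ")
            {
                throw new InvalidDataException("Not a canonical RIFF/WAVE header.");
            }
            reader.ReadInt32(); // fmt chunk size
            reader.ReadInt16(); // audio format tag (1 = PCM)
            var info = new Info();
            info.Channels = reader.ReadInt16();
            info.SampleRate = reader.ReadInt32();
            reader.ReadInt32(); // byte rate
            reader.ReadInt16(); // block align
            info.BitsPerSample = reader.ReadInt16();
            return info;
        }
    }

    // Builds a 44-byte PCM header so the reader can be tried without a real file.
    public static byte[] BuildPcmHeader(short channels, int sampleRate, short bits, int dataBytes)
    {
        var buffer = new MemoryStream();
        using (var writer = new BinaryWriter(buffer, Encoding.ASCII))
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + dataBytes);
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);
            writer.Write((short)1);
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(sampleRate * channels * bits / 8);
            writer.Write((short)(channels * bits / 8));
            writer.Write(bits);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(dataBytes);
        }
        return buffer.ToArray();
    }

    public static void Main()
    {
        byte[] header = BuildPcmHeader(1, 16000, 16, 0);
        Info info = Read(new MemoryStream(header));
        Console.WriteLine(info.Channels + " channel(s), " + info.SampleRate + " Hz, "
            + info.BitsPerSample + "-bit");
    }
}
```

To inspect a real file, pass File.OpenRead(path) to WavHeader.Read in place of the in-memory header.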
