Google Cloud Speech Node with Socket Playground

License: MIT

An easy-to-set-up playground for cross-device, real-time Google Speech Recognition with a Node server and socket.io. Phew.

Run Local

  1. Get a free test key from Google.
  2. Place the key file in the src folder and update its path in the .env file.
  3. Open a terminal and go to the src folder.
  4. Run npm install.
  5. Run node app.js, or with nodemon: nodemon app.
  6. Go to http://127.0.0.1:1337/ in your browser.
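
The .env step above could be consumed on the server roughly as follows. This is a minimal sketch, not the project's actual app.js: the helper name loadConfig and the variable name PORT are assumptions for illustration, while GOOGLE_APPLICATION_CREDENTIALS is the variable Google's client libraries conventionally read for a key file path. The default of 1337 mirrors the URL in step 6.

```javascript
// Hypothetical helper: reads the key path and port from an environment object.
// The name PORT and the 1337 default are assumptions based on the steps above.
function loadConfig(env) {
  const keyPath = env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!keyPath) {
    throw new Error('Set GOOGLE_APPLICATION_CREDENTIALS to the path of your key file');
  }

  // Fall back to 1337 when no port is configured.
  const port = Number.parseInt(env.PORT ?? '1337', 10);
  if (Number.isNaN(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  return { keyPath, port };
}

// Usage: pass process.env once the .env file has been loaded.
// const { keyPath, port } = loadConfig(process.env);
```

Failing early on a missing key path tends to be easier to debug than an authentication error on the first recognition request.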

Run on Server

Follow steps 1-4 of Run Local, then:

  1. Set the port in the .env file to one that you've opened on the server. I'm using 1337 here, too.
  2. Go to your server address.

I recommend using pm2 or something similar to keep the process running after you close the terminal connection.

Examples

Config

You can set a recognition context (hint words that are often misunderstood) in the request params in app.js to get better recognition results. For more details on the configuration, see the Google Cloud Speech-to-Text documentation.

For languages other than English, look up your language code in the list of supported languages.
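
As an illustration of the request params described above, here is a minimal sketch. The helper name buildRequest and the encoding and sample rate values are assumptions, not necessarily what app.js uses; the field names (config, languageCode, speechContexts, phrases, interimResults) are those of the Google Cloud Speech streaming request.

```javascript
// Hypothetical helper that builds the streaming request params.
// The encoding and sample rate are assumed values; match them to your audio.
function buildRequest({ languageCode = 'en-US', phrases = [] } = {}) {
  return {
    config: {
      encoding: 'LINEAR16',          // raw 16-bit PCM (assumed input format)
      sampleRateHertz: 16000,        // assumed sample rate
      languageCode,                  // BCP-47 code, for example 'de-DE'
      speechContexts: [{ phrases }], // words or phrases to favor during recognition
    },
    interimResults: true,            // also deliver results before a sentence is final
  };
}

// Usage: German recognition with two hint words.
const request = buildRequest({ languageCode: 'de-DE', phrases: ['socket.io', 'nodemon'] });
```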

How Does the Client Process the Stream?

Google Cloud sends intermittent responses to the uploaded audio stream. Each response from Google Cloud contains the current estimation of the full sentence for the streamed audio.

When Google Cloud senses that the audio has reached an end of sentence, it will issue a response with an isFinal flag set to true. Once this flag is issued, the client will finalize the sentence and write it to the document.

This process is repeated until the user ends the recording.
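
The interim/final handling described above can be sketched as a small state update. This is an assumed illustration rather than the playground's actual client code: the names createTranscript and applyResponse are hypothetical, and the response shape (results, alternatives, transcript, isFinal) follows the Google Cloud Speech streaming response.

```javascript
// Hypothetical client-side state: finalized sentences plus the current estimate.
function createTranscript() {
  return { finalized: [], interim: '' };
}

// Applies one streaming response and returns the new state.
function applyResponse(state, response) {
  const result = response.results && response.results[0];
  if (!result || !result.alternatives || result.alternatives.length === 0) {
    return state; // nothing usable in this response
  }

  const text = result.alternatives[0].transcript.trim();
  if (result.isFinal) {
    // End of sentence: commit the text and clear the interim estimate.
    return { finalized: [...state.finalized, text], interim: '' };
  }
  // Interim result: replace the current estimate of the full sentence.
  return { finalized: state.finalized, interim: text };
}
```

Because each interim response carries the estimate for the whole sentence so far, the sketch replaces the interim text instead of appending to it.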

Interim Natural Language Processing

The client application highlights different parts of speech, such as nouns and verbs, using a natural language processing library.

Socket Connection

The client communicates with the server using Socket.io.
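
A rough sketch of how the server side of such a connection might be wired is shown below. The event names (startStream, audioChunk, endStream, speechData), the helper wireSocket, and the createRecognizer factory are all hypothetical and not taken from the project; the only thing assumed about the socket is that it exposes on and emit, as a Socket.io socket does.

```javascript
// Hypothetical wiring between one connected socket and a recognizer.
// `socket` is any object with on/emit (a Socket.io socket fits this shape).
// `createRecognizer(onData)` is an assumed factory returning { write, end };
// it calls onData with each recognition result.
function wireSocket(socket, createRecognizer) {
  let recognizer = null;

  // Client pressed "start": open a new recognition stream.
  socket.on('startStream', () => {
    recognizer = createRecognizer((data) => socket.emit('speechData', data));
  });

  // Client sent an audio chunk: forward it if a stream is open.
  socket.on('audioChunk', (chunk) => {
    if (recognizer) recognizer.write(chunk);
  });

  // Client pressed "stop": close the stream and forget it.
  socket.on('endStream', () => {
    if (recognizer) recognizer.end();
    recognizer = null;
  });
}
```

With Socket.io, wireSocket would typically be called once per client inside the server's connection handler.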

Troubleshooting

  • If calls are delayed, check whether IPv6 is disabled on your server.

Super Reduced Version for Devs

There is now a super reduced, log-only version. It shows only two buttons, logs the results to the console, and has no NLP. Use this if you want to implement it somewhere else.

Made by Vinzenz Aubry
