Giter VIP home page Giter VIP logo

asterisk-external-media's Introduction

Asterisk External Media Sample

This package demonstrates how to use the ARI External Media feature to transcribe the audio from a bridge using the Google Speech APIs.

Installation

Prerequisites

  • A functional Asterisk 16.6.0+ installation.
  • A conference bridge or phone configured.
  • Node.JS version 10 or greater.
  • Google Speech API credentials set in environment variable GOOGLE_APPLICATION_CREDENTIALS.
    See https://cloud.google.com/speech-to-text/docs/ for more information.

Run npm install from the top of the source tree. This will install the required npm packages including node-ari-client and @google-cloud/speech. You can then run the transcriber as bin/ari-transcriber. If you add the -g option to npm install to install system wide, you can just run ari-transcriber.

Usage

$ ari-transcriber --help
ari-transcriber [options] <dialstring>

Start the transcription server

Positionals:
  dialstring  Extension to dial such as "Local/1234"

Incoming Audio Server
  --format, -f        Asterisk format/codec                                         [string] [choices: "ulaw", "slin16"] [default: "ulaw"]
  --listenServer, -l  Address and port on which to listen for audio from Asterisk                     [string] [default: "127.0.0.1:9999"]

Speech
  --speechModel         Google Speech API model                  [string] [choices: "phone_call", "video", "default"] [default: "default"]
  --speechLang          BCP-47 Language code.  en-US, fr-CA, etc.                  [string] [choices: "en-US", "fr-CA"] [default: "en-US"]
  --speakerDiarization  Outputs words associated to speaker index to the console.                               [boolean] [default: false]

ARI
  --ariServerUrl, -a  The URL for the Asterisk instance                                        [string] [default: "http://127.0.0.1:8088"]
  --ariUser, -u       The user configured in ari.conf                                                       [string] [default: "asterisk"]
  --ariPassword, -p   The password for the user configured in ari.conf                                      [string] [default: "asterisk"]

Transcription WebSocket
  --sslCert  WebSocket secure server (wss) certificate. If omitted, a non-secures (ws) websocket server will be created           [string]
  --sslKey   WebSocket secure server key. Must be supplied if sslCert is ssplied                                                  [string]
  --wssPort  WebSocket server port.  If omitted, no websocket server will be started                                              [number]

Options:
  --version          Show version number                                                                                         [boolean]
  --help             Show help                                                                                                   [boolean]
  --audioOutput, -o  A file into which the raw audio from Asterisk can be written                                                 [string]

Examples:
  ari-transcriber --format=slin16 --sslCert=/etc/letsencrypt/live/myserver/fullchain.pem
  --sslKey=/etc/letsencrypt/live/myserver/privkey.pem --wssPort=39990 --speakerDiarization 'Local/1234'

The ari-transcriber performs several tasks:

  • Creates an ari-client instance
  • Creates a WebSocket server from which the live transcription can be accessed
  • Starts an audio server to receive the audio from Asterisk
  • Creates an instance of Google Speech Provider that takes the audio from the server, transcribes it, and sends the transcription out the websocket
  • Uses the ARI instance to:
    • Create a mixing bridge
    • Create a Local channel which dials the conference bridge to be monitored
    • Place the local channel into the mixing bridge
    • Create an External Media channel that directs the audio to the audio server
    • Place the External Media channel into the mixing bridge

Why the Local channel and bridge? To keep the sample as simple as possible, it's assumes that a conference bridge is already available. Since that bridge wouldn't be controlled by ARI/Stasis, we can't just add the External Media channel directly to it. Instead we have to create a Local channel that dials the conference bridge, then bridge that channel with the External Media channel. If you were ading External Media capabilities to your own application, chances are that your app would already control the participant bridge and you wouldn't need the Local channel and mixing bridge.

Try It!

You don't need the WebSocket transcription server to try this. Just a phone to call.

$ export GOOGLE_APPLICATION_CREDENTIALS=<path to Google API credentials>
$ ari-transcriber --format=slin16 'Local/1234'

asterisk-external-media's People

Contributors

dependabot[bot] avatar gtjoseph avatar sh0g0cub3 avatar tanakomtalawat avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.