
revai-python-sdk's Introduction

Rev AI Python SDK


Documentation

See the API docs for more information about the API and more Python examples.

Installation

You don't need this source code unless you want to modify the package. If you just want to use the package, run:

pip install --upgrade rev_ai

Install from source with:

python setup.py install

Requirements

  • Python 3.8+

Usage

All you need to get started is your Access Token, which can be generated on your Access Token Settings page. Create a client with the generated Access Token:

from rev_ai import apiclient

# create your client
client = apiclient.RevAiAPIClient("ACCESS TOKEN")

Sending a file

Once you've set up your client with your Access Token, sending a file is easy!

# you can send a local file
job = client.submit_job_local_file("FILE PATH")

# or send a link to the file you want transcribed
job = client.submit_job_url("https://example.com/file-to-transcribe.mp3")

job will contain all the information normally found in a successful response from our Submit Job endpoint.

If you want to get fancy, both submit job methods take metadata, notification_config, skip_diarization, skip_punctuation, speaker_channels_count, custom_vocabularies, filter_profanity, remove_disfluencies, delete_after_seconds, language, and custom_vocabulary_id as optional parameters.
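
For example, a submission that sets a few of these might look like the following (a minimal sketch; the parameter values are illustrative only):

# submit a job with a handful of the optional parameters set
job = client.submit_job_url("https://example.com/file-to-transcribe.mp3",
    metadata="my first job",
    skip_diarization=True,
    filter_profanity=True,
    language='en')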

The url submission option also supports authentication headers by using the source_config option.
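
A hedged sketch of that option is below; the helper class name used here, CustomerUrlData, is an assumption on my part, so check rev_ai.models for the actual source configuration model:

# assumption: a source configuration helper (called CustomerUrlData here) wraps
# the media URL together with the authentication headers to send with it
from rev_ai.models import CustomerUrlData

job = client.submit_job_url(
    source_config=CustomerUrlData(
        "https://example.com/file-to-transcribe.mp3",
        {"Authorization": "Bearer <token>"}))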

You can request a transcript summary.

# submitting a job with summarization
# (assumption: these option classes are exported from rev_ai.models)
from rev_ai.models import SummarizationOptions, SummarizationFormattingOptions

job = client.submit_job_url("https://example.com/file-to-transcribe.mp3",
    language='en',
    summarization_config=SummarizationOptions(
        formatting_type=SummarizationFormattingOptions.BULLETS
    ))

You can request transcript translation into up to five languages.

# (assumption: the translation option classes are exported from rev_ai.models)
from rev_ai.models import TranslationOptions, TranslationLanguageOptions, TranslationModel

job = client.submit_job_url("https://example.com/file-to-transcribe.mp3",
    language='en',
    translation_config=TranslationOptions(
        target_languages=[
            TranslationLanguageOptions("es", TranslationModel.PREMIUM),
            TranslationLanguageOptions("de")
        ]
    ))

All options are described in the request body of the Submit Job endpoint.

Human Transcription

If you want transcription to be performed by a human, both methods allow you to submit human transcription jobs by using transcriber='human', with verbatim, rush, segments_to_transcribe, and test_mode as optional parameters. Check out our documentation for Human Transcription for more details.

# submitting a human transcription job
job = client.submit_job_url("https://example.com/file-to-transcribe.mp3",
    transcriber='human',
    verbatim=False,
    rush=False,
    test_mode=True,
    segments_to_transcribe=[{
        'start': 2.0,
        'end': 4.5
    }])

Checking your file's status

You can check the status of your transcription job using its id

job_details = client.get_job_details(job.id)

job_details will contain all information normally found in a successful response from our Get Job endpoint
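
If you need to block until the job finishes, a simple polling loop over get_job_details works. This is a rough sketch; the JobStatus import path is an assumption, so check rev_ai.models for its exact location:

import time
# assumption: the job status enum is exported as rev_ai.models.JobStatus
from rev_ai.models import JobStatus

while True:
    job_details = client.get_job_details(job.id)
    if job_details.status != JobStatus.IN_PROGRESS:
        break
    time.sleep(5)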

Checking multiple files

You can retrieve a list of transcription jobs with optional parameters

jobs = client.get_list_of_jobs()

# limit the number of retrieved jobs
jobs = client.get_list_of_jobs(limit=3)

# get jobs starting after a certain job id
jobs = client.get_list_of_jobs(starting_after='Umx5c6F7pH7r')

jobs will contain a list of job details having all information normally found in a successful response from our Get List of Jobs endpoint

Deleting a job

You can delete a transcription job using its id

client.delete_job(job.id)

All data related to the job, such as input media and transcript, will be permanently deleted. A job can only be deleted once it's completed (either with success or failure).

Getting your transcript

Once your file is transcribed, you can get your transcript in a few different forms:

# as text
transcript_text = client.get_transcript_text(job.id)

# as json
transcript_json = client.get_transcript_json(job.id)

# or as a python object
transcript_object = client.get_transcript_object(job.id)

# or if you requested transcript translation(s)
transcript_object = client.get_translated_transcript_object(job.id, 'es')

Both the json and object forms contain all the information outlined in the response of the Get Transcript endpoint when using the json response schema, while the text output is a string containing just the text of your transcript.
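
For instance, you could pull just the words out of the object form like this (a sketch that assumes the object mirrors the json schema, with monologues made up of elements):

# join the value of every element in every monologue into one string
words = " ".join(
    element.value
    for monologue in transcript_object.monologues
    for element in monologue.elements)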

Getting transcript summary

If you requested a transcript summary, you can retrieve it as plain text, JSON, or a Python object:

# as text
summary = client.get_transcript_summary_text(job.id)

# as json
summary = client.get_transcript_summary_json(job.id)

# or as a python object
summary = client.get_transcript_summary_object(job.id)

Getting captions output

You can also get captions output from the SDK. We offer both SRT and VTT caption formats. If you submitted your job as speaker channel audio, then you must also provide a channel_id to be captioned:

# (assumption: CaptionType is importable from rev_ai.models)
captions = client.get_captions(job.id, content_type=CaptionType.SRT, channel_id=None)

# or if you requested transcript translation(s)
captions = client.get_translated_captions(job.id, 'es')

Streamed outputs

Any output format can be retrieved as a stream. In these cases we return the raw HTTP response to you. The output can be retrieved via response.content, response.iter_lines(), or response.iter_content().

text_stream = client.get_transcript_text_as_stream(job.id)

json_stream = client.get_transcript_json_as_stream(job.id)

captions_stream = client.get_captions_as_stream(job.id)
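
For example, a streamed output can be written to disk chunk by chunk without loading the whole payload into memory (a minimal sketch; the streams are plain requests responses, as noted above):

# write the streamed captions to a file in 8 KB chunks
with open("captions.srt", "wb") as f:
    for chunk in captions_stream.iter_content(chunk_size=8192):
        f.write(chunk)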

Streaming audio

In order to stream audio, you will need to set up a streaming client and a media configuration for the audio you will be sending.

from rev_ai.streamingclient import RevAiStreamingClient
from rev_ai.models import MediaConfig

# expected callback signatures:
# on_error(error)
# on_close(code, reason)
# on_connected(id)

config = MediaConfig()
streaming_client = RevAiStreamingClient("ACCESS TOKEN",
                                        config,
                                        on_error=ERRORFUNC,
                                        on_close=CLOSEFUNC,
                                        on_connected=CONNECTEDFUNC)
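
The no-argument MediaConfig() above uses the default configuration. For raw audio you can pass the content type and audio parameters explicitly, as in the repository's microphone streaming example:

# 44.1 kHz, 16-bit little-endian, interleaved, single-channel raw audio
config = MediaConfig('audio/x-raw', 'interleaved', 44100, 'S16LE', 1)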

on_error, on_close, and on_connected are optional parameters that are functions to be called when the websocket errors, closes, and connects respectively. The default on_error raises the error, on_close prints out the code and reason for closing, and on_connected prints out the job ID. If passing in custom functions, make sure you provide the right parameters. See the sample code for the parameters.
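
As a sketch, custom handlers matching those parameter lists might look like this (they would be passed in place of ERRORFUNC, CLOSEFUNC, and CONNECTEDFUNC above):

def on_error(error):
    # re-raise so websocket errors surface to the caller
    raise error

def on_close(code, reason):
    print("Connection closed. Code: {}; Reason: {}".format(code, reason))

def on_connected(job_id):
    print("Connected. Job id: {}".format(job_id))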

Once you have a streaming client set up with a MediaConfig and access token, you can obtain a transcription generator of your audio. You can also use a custom vocabulary with your streaming job by supplying the optional custom_vocabulary_id when starting a connection!

More optional parameters can be supplied when starting a connection: metadata, filter_profanity, remove_disfluencies, delete_after_seconds, and detailed_partials. For a description of these optional parameters, look at our streaming documentation.

response_generator = streaming_client.start(AUDIO_GENERATOR, custom_vocabulary_id="CUSTOM VOCAB ID")

response_generator is a generator object that yields the transcription results of the audio, including partial and final transcriptions. The start method creates a thread that sends audio pieces from the AUDIO_GENERATOR to our streaming endpoint.
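
Consuming the generator is a simple iteration (a minimal sketch):

# print each partial and final transcription result as it arrives
for response in response_generator:
    print(response)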

If you want to end the connection early, you can!

streaming_client.end()

Otherwise, the connection will end when the server obtains an "EOS" message.

Submitting custom vocabularies

In addition to passing custom vocabularies as parameters in the async API client, you can create and submit your custom vocabularies independently and directly to the custom vocabularies API, as well as check on their progress.

Primarily, the custom vocabularies client allows you to submit and preprocess vocabularies for use with the streaming client, in order to have streaming jobs with custom vocabularies!

In this example you see how to construct custom vocabulary objects, submit them to the API, and check on their progress and metadata!

from rev_ai import custom_vocabularies_client
from rev_ai.models import CustomVocabulary

# Create a client
client = custom_vocabularies_client.RevAiCustomVocabulariesClient("ACCESS TOKEN")

# Construct a CustomVocabulary object using your desired phrases
custom_vocabulary = CustomVocabulary(["Patrick Henry Winston", "Robert C Berwick", "Noam Chomsky"])

# Submit the CustomVocabulary
custom_vocabularies_job = client.submit_custom_vocabularies([custom_vocabulary])

# View the job's progress
job_state = client.get_custom_vocabularies_information(custom_vocabularies_job['id'])

# Get list of previously submitted custom vocabularies
custom_vocabularies_jobs = client.get_list_of_custom_vocabularies()

# Delete the CustomVocabulary
client.delete_custom_vocabulary(custom_vocabularies_job['id'])
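
Once the vocabulary job has completed, its id can be reused with the streaming client described above:

# reuse the processed vocabulary for a streaming job
response_generator = streaming_client.start(AUDIO_GENERATOR,
    custom_vocabulary_id=custom_vocabularies_job['id'])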

For more details, check out the custom vocabularies example in our examples.

For Rev AI Python SDK Developers

Remember to follow the PEP8 style guide in your development. Your code editor likely has Python PEP8 linting packages which can assist you.

Local testing instructions

Prerequisites: virtualenv, tox

To test locally, use the following commands from the repo root:

virtualenv ./sdk-test
. ./sdk-test/bin/activate
tox

This runs the test suite locally and saves significant dev time over waiting for the CI tool to pick it up.


revai-python-sdk's Issues

Microphone example - How to?

In your microphone example,

example_mc = MediaConfig('audio/x-raw', 'interleaved', 44100, 'S16LE', 1)
streamclient = RevAiStreamingClient(access_token, example_mc)

How do I pass my speech? Can you give a sample example where there's a GUI to do it?

AttributeError: 'RevAiAPIClient' object has no attribute 'send_job_local_file'

When running:

client = apiclient.RevAiAPIClient(config.revai_access_token)
job = client.submit_job_local_file(file)

I get this error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/rev_ai/apiclient.py", line 135, in submit_job_local_file
    response.raise_for_status()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://api.rev.ai/speechtotext/v1/jobs

Connection Closed. Code : 1009; Reason : Max frame length of 96200 has been exceeded.

Hi, on Windows 10 + Python 3.11.9, if I use the "generator_streaming_example.py" example as is, it runs fine, all my raw and flac files work.

Once I change

response_generator = streamclient.start(MEDIA_GENERATOR)

to

response_generator = streamclient.start(MEDIA_GENERATOR, language="de")

nothing is returned, no partials, no finals, and I quickly run into "Connection Closed. Code : 1009; Reason : Max frame length of 96200 has been exceeded."

Is v1 not supported anymore? Cheers

Language mismatch in transcription

This is more of a question than an issue.

If we set the language parameter to "en" and the media language is "es" (or any other language; the conversation takes place in that language alone), do we get poor confidences or an actual error?

Maybe one from this list?
"internal_processing" "download_failure" "duration_exceeded" "duration_too_short" "invalid_media" "empty_media" "transcription" "insufficient_balance" "invoicing_limit_exceeded"

Thanks!

Backend also updated?

Since you guys did the API cleanup, I have not been able to submit a job with my old code using 0.0.3.

It fails during json decoding. (It expected something, but there was nothing.)

I also tried cloning your current version (even though that's not what you have on PyPI), and after fixing a formatting issue in apiclient.py I did a submission, but it fails in the json decoder just like it does with the 0.0.3 version.

BTW, I think that submit_job_local_file() and submit_job_url() would be better if you kept options as optional with sensible defaults instead of forcing the construction of an options class.

Example Scripts Throw SSL Error

Hi,

If you take "generator_streaming_example.py" and run it with a valid access token, it throws an exception on line 42:

response_generator = streamclient.start(MEDIA_GENERATOR)

with the below error trace.

Any ideas? I get the same error when running "microphone_streaming_example.py" with the same command (line 102).

/Users/nlee/PycharmProjects/AudioRead/env/bin/python3.7 /Users/nlee/PycharmProjects/AudioRead/revai-python-sdk-develop/examples/generator_streaming_example.py
Traceback (most recent call last):
File "/Users/nlee/PycharmProjects/AudioRead/revai-python-sdk-develop/examples/generator_streaming_example.py", line 42, in
response_generator = streamclient.start(MEDIA_GENERATOR)
File "/Users/nlee/PycharmProjects/AudioRead/env/lib/python3.7/site-packages/rev_ai/streamingclient.py", line 94, in start
self.on_error(e)
File "/Users/nlee/PycharmProjects/AudioRead/env/lib/python3.7/site-packages/rev_ai/streamingclient.py", line 16, in on_error
raise error
File "/Users/nlee/PycharmProjects/AudioRead/env/lib/python3.7/site-packages/rev_ai/streamingclient.py", line 92, in start
self.client.connect(url)
File "/Users/nlee/PycharmProjects/AudioRead/env/lib/python3.7/site-packages/websocket/_core.py", line 223, in connect
options.pop('socket', None))
File "/Users/nlee/PycharmProjects/AudioRead/env/lib/python3.7/site-packages/websocket/_http.py", line 126, in connect
sock = _ssl_socket(sock, options.sslopt, hostname)
File "/Users/nlee/PycharmProjects/AudioRead/env/lib/python3.7/site-packages/websocket/_http.py", line 260, in _ssl_socket
sock = _wrap_sni_socket(sock, sslopt, hostname, check_hostname)
File "/Users/nlee/PycharmProjects/AudioRead/env/lib/python3.7/site-packages/websocket/_http.py", line 239, in _wrap_sni_socket
server_hostname=hostname,
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 412, in wrap_socket
session=session
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 850, in _create
self.do_handshake()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/ssl.py", line 1108, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1045)

Process finished with exit code 1

cleanup

rm HISTORY.md
update long_description
update MANIFEST

How to specify transcription with v1?

In certain cases, I find the v1 transcriptions to be preferable to the v2 transcriptions. How do I specify which version to use in python? Currently, I submit a job with this:

    token = 'TOKEN'
    client = apiclient.RevAiAPIClient(token)
    # initiate request
    job = client.submit_job_local_file(audio_path)
    x = 1
    # Create waiting mechanism for rev.ai to process files.
    while x == 1:
        try:
            # get transcript as json
            transcript_json = client.get_transcript_json(job.id)
            # save output
            rev_json_saved_path = f'rev_json/{audio_id}.json' 
            with open(rev_json_saved_path, 'w') as f:
                json.dump(transcript_json, f)
            x -= 1
        except:
            time.sleep(3)

I see in the docs below that I can set 'transcriber' to 'machine', 'machine_v2', or 'human'.
https://docs.rev.ai/api/asynchronous/transcribers/

First, where do I enter the 'transcriber' parameter in python code?

Second, if v2 is now default, what is the parameter for v1?

Thank you.

`remove_atmospherics` is not supported

When remove_atmospherics parameter is passed in functions submit_job_url() / submit_job_local_file(), below error is encountered.

TypeError: submit_job_XXX() got an unexpected keyword argument 'remove_atmospherics'

Unexpected item in the results

Hi, I'm using Rev_AI for speech transcription,

from rev_ai import apiclient
client = apiclient.RevAiAPIClient("...")
job = client.submit_job_local_file("...", skip_diarization=True, skip_punctuation=True)
job_details = client.get_job_details(job.id)
transcript_text = client.get_transcript_text(job.id)

and the result is: "Speaker 0 00:00:00 Um if um capturing um interrupted um words and specific differences um is crucial for um your application"
Is there any method for ignoring 'Speaker 0' and '00:00:00', and outputting only the content of the speech?

Broken Settings Page link to Access Token

The Settings Page link in this statement of your README is broken:
All you need to get started is your Access Token, which can be generated on your Settings Page. Create a client with the given Access Token:

Will you please let me know how to get an access token?

machine_v2 option for local files?

Using the Python API, I get the following error when running an async job specifying 'transcriber'.

File "", line 8, in
transcriber = 'machine_v2')

TypeError: submit_job_local_file() got an unexpected keyword argument 'transcriber'

job = client.submit_job_local_file( filename = outputaudiofilename, skip_diarization = True, skip_punctuation = True, remove_disfluencies = True, speaker_channels_count = 1, language ="en", transcriber = "machine_v2")

Pip version conflicts due to pinned requirements

Hi there, something I wanted to flag real quick because I'm bumping into it right now as a paying customer :-)

The revai-python-sdk library specifies exact (==) version matches in its setup.py dependencies via requirements.txt:

requests==2.21.0
enum34==1.1.6
six==1.12.0
websocket-client==0.56.0
mock==3.0.5

(Source)

Since many other libraries depend on commonly used dependencies like six, it's easy to run into version conflicts in larger projects when six (and other dependencies) are pinned to exact version numbers with the == version specifier.

It would be preferable if the dependencies for revai-python-sdk could be made more flexible.

For example, see how Google handle this in their Python API SDK: https://github.com/googleapis/google-api-python-client/blob/a527de24cda7c7d62a371203158fd5f617a1c08c/setup.py#L37-L44

(Also, it looks like mock should be moved to requirements_dev.txt since it's only used in the tests as far as I can tell.)

Thanks for looking into it!

missed words

Are there plans to add a switch to return IDK or "null" on low-confidence matches? When the engine fails to return a word, but knows speech is present, it would be good to record this.

[question] How to properly submit S3-hosted files to REV.ai ?

I would like to ask a question about submitting media files via the media_url parameter.

I am trying to submit the following s3_url

https://interview-tool-bucket.s3.amazonaws.com/static/4444443543545354342543.mp4?AWSAccessKeyId=AKIAX5MNXLROC24ZXCEI&Signature=ZMdHn4tTmf9azM6vwNtM7LJD6Oc%3D&Expires=1617361746

Please note that it will expire after some time, but right now it is valid.

I receive the following error in response (even though the URL is not yet expired):
{"parameters":{"media_url":["Provided media_url is not a valid fully-qualified http, or https URL."]}

I would like to know how to properly submit S3-hosted files to REV.ai

I don't want to make them public. That is why I am generating a URL with temporary access. But it looks like REV.ai doesn't like those URLs.
