conrad

conrad is a reboot of mscstts. Instead of httr, which is superseded and not recommended, we use httr2 to perform HTTP requests to the Microsoft Cognitive Services Text to Speech REST API.

conrad serves as a client to the Microsoft Cognitive Services Text to Speech REST API. The Text to Speech REST API supports neural text to speech voices, which support specific languages and dialects that are identified by locale. Each available endpoint is associated with a region.

Before you use the Text to Speech REST API, you must register an account with Microsoft Azure Cognitive Services and obtain an API key. Without an API key, this package will not work.

Installation

Install the CRAN version:

install.packages("conrad")

Or install the development version from GitHub:

# install.packages("devtools")
devtools::install_github("fhdsl/conrad")

Getting an API key

  1. Sign in/Create an Azure account on Microsoft Azure Cognitive Services.
  2. Click + Create a resource (below “Azure services” or click on the Hamburger button)
  3. Search for “Speech” and Click Create -> Speech
  4. Create a Resource group and a “Name”.
  5. Choose Pricing tier (you can choose the free version with Free F0)
  6. Click Review + create, review the Terms, and click Create.

If the deployment was successful, you should see ✅ Your deployment is complete on the next page.

  1. Under Next steps, click Go to resource
  2. Look on the left sidebar and under Resource Management, click Keys and Endpoint
  3. Copy either KEY 1 or KEY 2 to clipboard. Only one key is necessary to make an API call.

Once you complete these steps, you have successfully retrieved your API keys to access the API.

⚠️ Note your Location/Region, because you need it to make calls to the API. Specifying a different region will lead to an HTTP 403 Forbidden response.

For more detailed information on each step, refer to the API Key vignette.

Setting your API key

You can set your API key in a number of ways:

  1. Edit ~/.Renviron and set MS_TTS_API_KEY = "YOUR_API_KEY"
  2. In R, use options(ms_tts_key = "YOUR_API_KEY").
  3. Set export MS_TTS_API_KEY=YOUR_API_KEY in .bash_profile/.bashrc if you’re using R in the terminal.
  4. Pass api_key = "YOUR_API_KEY" as an argument to functions, for example ms_list_voice(api_key = "YOUR_API_KEY").
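
As a rough illustration of how these options relate, the sketch below looks for a key in the three R-visible places listed above. The helper find_api_key() is hypothetical and not part of conrad, and the lookup order (argument, then option, then environment variable) is an assumption made for this example; the package may resolve the key differently.

```r
# Hypothetical helper (not part of conrad): find a key in the places listed above.
# The lookup order is an assumption made for this sketch.
find_api_key <- function(api_key = NULL) {
  # 1. An explicitly passed argument wins
  if (!is.null(api_key) && nzchar(api_key)) {
    return(api_key)
  }
  # 2. Then the R option set with options(ms_tts_key = ...)
  key <- getOption("ms_tts_key", default = "")
  if (nzchar(key)) {
    return(key)
  }
  # 3. Then the environment variable from ~/.Renviron or the shell
  key <- Sys.getenv("MS_TTS_API_KEY", unset = "")
  if (nzchar(key)) {
    return(key)
  }
  stop("No API key found. Set MS_TTS_API_KEY or pass api_key.", call. = FALSE)
}

# Set the variable for the current session only, then look it up
Sys.setenv(MS_TTS_API_KEY = "YOUR_API_KEY")
find_api_key()
```

Sys.setenv() only affects the running R session; editing ~/.Renviron makes the setting persist across sessions.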

Get a list of voices

ms_list_voice() gets the full list of voices available in a specific region. It prepends the region to the tts.speech.microsoft.com/cognitiveservices/voices/list endpoint and sends the request there.

For example, to get a list of all the voices for the westus region, it uses the https://westus.tts.speech.microsoft.com/cognitiveservices/voices/list endpoint.
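
The way the region is attached to the endpoint can be sketched in base R. The function voices_list_url() is a hypothetical name used only for this example; it is not a conrad function.

```r
# Hypothetical helper (not part of conrad): build the regional voices/list URL
voices_list_url <- function(region) {
  paste0(
    "https://", region,
    ".tts.speech.microsoft.com/cognitiveservices/voices/list"
  )
}

# Prints the westus URL quoted in the paragraph above
voices_list_url("westus")
```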

⚠️ Be sure to specify the Speech resource region that corresponds to your API Key.

ms_list_voice(api_key = "YOUR_API_KEY", region = "westus")

Convert text to speech

ms_synthesize() uses the tts.speech.microsoft.com/cognitiveservices/v1 endpoint to convert text to speech. The endpoint requires Speech Synthesis Markup Language (SSML) to specify the language, gender, and full voice name.
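
To give a feel for what such an SSML document looks like, here is a minimal sketch that assembles one as a string. build_ssml() is a hypothetical helper, not conrad's implementation, and the voice name en-US-GuyNeural is only an illustrative choice; check the voice list for your own region before relying on it.

```r
# Hypothetical helper (not part of conrad): assemble a minimal SSML document
build_ssml <- function(script,
                       voice = "en-US-GuyNeural", # illustrative voice name
                       language = "en-US",
                       gender = "Male") {
  # Escape characters that are special in XML so the script cannot break the markup
  script <- gsub("&", "&amp;", script, fixed = TRUE)
  script <- gsub("<", "&lt;", script, fixed = TRUE)
  script <- gsub(">", "&gt;", script, fixed = TRUE)
  sprintf(
    "<speak version='1.0' xml:lang='%s'><voice xml:lang='%s' xml:gender='%s' name='%s'>%s</voice></speak>",
    language, language, gender, voice, script
  )
}

ssml <- build_ssml("Hello world, this is a talking computer")
cat(ssml, "\n")
```

The language, gender, and full voice name all end up as attributes of the speak and voice elements, which is why the endpoint needs all three.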

⚠️ Be sure to specify the Speech resource region that corresponds to your API Key.

# Convert text to speech
res <- ms_synthesize(script = "Hello world, this is a talking computer", region = "westus", gender = "Male")
# res holds the binary audio data, which R prints as hexadecimal bytes

# Create file to store audio output
output_path <- tempfile(fileext = ".wav")
# Write binary data to output path
writeBin(res, con = output_path)
# Play audio in browser
play_audio(audio = output_path)
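
The write step can be tried without an API key. The sketch below uses a four-byte stand-in instead of real audio (the bytes spell "RIFF", the marker that WAV files start with) and assumes that the synthesized result is a raw vector, as the comments above suggest.

```r
# Stand-in for synthesized audio: the four bytes "RIFF" as a raw vector
fake_audio <- as.raw(c(0x52, 0x49, 0x46, 0x46))

# Write the bytes to a temporary file, exactly as done for real audio above
path <- tempfile(fileext = ".wav")
writeBin(fake_audio, con = path)

# Read the file back and confirm that the bytes survived the round trip
read_back <- readBin(path, what = "raw", n = file.size(path))
identical(read_back, fake_audio)
```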

If you want more examples of different voices with different scripts, refer to the Introduction to conrad vignette.

Get an access token

ms_get_token() makes a request to the issueToken endpoint to get an access token. The function requires an API key and a region as inputs. The access token is then used to authenticate requests to the API.

⚠️ Be sure to specify the Speech resource region that corresponds to your API Key.

ms_get_token(api_key = "YOUR_API_KEY", region = "westus")
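
As a sketch of the pieces involved, the code below builds the regional issueToken URL and the header in which a token is later sent. Both helpers are hypothetical names for this example, and the URL pattern follows Microsoft's REST documentation rather than conrad's source, so treat it as an assumption about what the package does internally.

```r
# Hypothetical helper (not part of conrad): regional issueToken URL
issue_token_url <- function(region) {
  paste0("https://", region, ".api.cognitive.microsoft.com/sts/v1.0/issueToken")
}

# Hypothetical helper: the Authorization header that carries an access token
bearer_header <- function(token) {
  c(Authorization = paste("Bearer", token))
}

issue_token_url("westus")
bearer_header("YOUR_ACCESS_TOKEN")
```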

Major differences to mscstts

  • To make the package more reliable, we switched from httr to httr2 for HTTP requests to the Text to Speech REST API. httr is no longer actively developed and receives only the updates needed to stay on CRAN, whereas httr2 is a modern reimagining of httr and is the recommended replacement.
  • conrad resolves the HTTP 403 Forbidden issue. An HTTP 403 Forbidden status code means that the server understood the request but refused to authorize it. In mscstts::ms_synthesize(), the problem arose because the HTTP request used a voice that was not valid for the chosen region. For instance, the SSML might have contained a voice name that was not supported in the westus region, so the server rejected the request.
  • We have made significant improvements to the documentation across the entire package. These enhancements include simpler function names, commented functions for clarity, removal of unnecessary functions and arguments, and URLs directing users to pages that explain text-to-speech jargon.
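
The region and voice mismatch behind the 403 error can be guarded against by checking a voice against the region's voice list before synthesizing. The sketch below is one possible check, not conrad's actual fix: the data frame is a hand-written stand-in for a voice list, and the column name ShortName and the helper voice_is_available() are assumptions for this example.

```r
# Stand-in for a region's voice list; in practice this would come from the API.
# The column name ShortName is an assumption for this sketch.
voices <- data.frame(
  ShortName = c("en-US-GuyNeural", "en-US-JennyNeural"),
  Locale = c("en-US", "en-US"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of conrad): is the voice offered in this region?
voice_is_available <- function(voice, voices) {
  voice %in% voices$ShortName
}

voice_is_available("en-US-GuyNeural", voices)    # TRUE
voice_is_available("xx-XX-MadeUpNeural", voices) # FALSE
```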

We believe that these improvements make the package easier to use and more reliable in the long term.

Acknowledgements

conrad wouldn’t be possible without prior work on mscstts by John Muschelli and httr2 by Hadley Wickham.

conrad's Issues

Release conrad 1.0.0

First release:

Prepare for release:

  • git pull
  • Check if any deprecation processes should be advanced, as described in Gradual deprecation
  • devtools::build_readme()
  • urlchecker::url_check()
  • devtools::check(remote = TRUE, manual = TRUE)
  • devtools::check_win_devel()
  • rhub::check_for_cran()
  • git push
  • Draft blog post

Submit to CRAN:

  • usethis::use_version('major')
  • devtools::submit_cran()
  • Approve email

Wait for CRAN...

  • Accepted 🎉
  • git push
  • usethis::use_github_release()
  • usethis::use_dev_version()
  • usethis::use_news_md()
  • git push
  • Finish blog post
  • Tweet
  • Add link to blog post in pkgdown news menu
