speechmarkdown / speechmarkdown-js

Speech Markdown grammar, parser, and formatters for use with JavaScript.

License: MIT License

TypeScript 99.89% JavaScript 0.11%

speechmarkdown-js's Issues

Ampersand leads to InvalidSsmlException

Converting the following string

Hallo Wie gehts? & Was machst Du hier?

leads to the following InvalidSsmlException when trying to pass it to AWS Polly:

InvalidSsmlException: Invalid SSML request
    at Object.extractError (/app/node_modules/aws-sdk/lib/protocol/json.js:52:27)
    at Request.extractError (/app/node_modules/aws-sdk/lib/protocol/rest_json.js:55:8)
    at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/app/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/app/node_modules/aws-sdk/lib/request.js:688:14)
    at Request.transition (/app/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/app/node_modules/aws-sdk/lib/state_machine.js:14:12)

Should Speech Markdown escape "dangerous" characters such as &, or should the developer sanitize the input before passing it to Speech Markdown?
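
Until that question is settled, one option on the caller's side is to escape XML-significant characters before conversion. The sketch below is not part of speechmarkdown-js: `escapeForSsml` is a hypothetical helper, and it assumes the library passes existing entities such as `&amp;` through to the SSML unchanged.

```typescript
// Hypothetical helper, not part of speechmarkdown-js.
// Escapes characters that are significant in XML so the text is safe inside SSML.
// Quotes are left alone because Speech Markdown modifiers rely on them.
function escapeForSsml(text: string): string {
  return text
    // Escape "&" unless it already starts an entity such as &amp; or &#38;.
    .replace(/&(?!(?:amp|lt|gt|quot|apos|#\d+|#x[0-9a-fA-F]+);)/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

console.log(escapeForSsml("Hallo Wie gehts? & Was machst Du hier?"));
// Hallo Wie gehts? &amp; Was machst Du hier?
```

The negative lookahead keeps already-escaped input from being escaped twice.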

Grammar - telephone not parsing parentheses or dashes correctly

Various tests in telephone-standard.spec.ts are commented out.
The grammar needs to be fixed to accept values with parentheses or dashes. See SpeechMarkdownGrammar.ts.

Support All US IPA Chars

The full list of IPA characters supported by Alexa is shown here:
https://developer.amazon.com/docs/custom-skills/speech-synthesis-markup-language-ssml-reference.html#phoneme

This includes the following chars: d͡ʒ and t͡ʃ

The file SpeechMarkdownGrammar.ts has been updated to include those characters, but the tests in ipa-standard-alphabet-us.spec.ts fail with errors when they are used.

Solution: add the characters to the tests and get them to pass.

Support umlauts and other languages

It seems that the Myna parser and its 'letters' range do not support languages other than English. I noticed that in German it removes umlauts.
For example:
würdest du lieber
after parsing becomes
wrdest du lieber.

I have tested other umlauts with the same outcome. I expect other languages, such as French, have the same issue.

I see two ways to fix this: fix it in the Myna parser, which is not actively maintained, so support may be hard to get; or add these special letters alongside the other special characters.
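
The second option can be illustrated with plain regular expressions. This is only a sketch of the idea: the real grammar is built from Myna rules, and the constant names and the chosen Unicode ranges below are assumptions, not taken from the project.

```typescript
// Hypothetical character classes; the real grammar is defined with Myna rules.
const ASCII_LETTERS = "a-zA-Z";
// Latin-1 Supplement and Latin Extended-A/B letters (umlauts, accents, cedillas),
// skipping the multiplication (U+00D7) and division (U+00F7) signs.
const EXTENDED_LETTERS = ASCII_LETTERS + "\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u024F";

// Drops every character that is neither whitespace nor in the given letter class.
function keepLetters(text: string, letterClass: string): string {
  return text.replace(new RegExp("[^\\s" + letterClass + "]", "g"), "");
}

// "\u00fc" is the precomposed letter u with diaeresis.
console.log(keepLetters("w\u00fcrdest du lieber", ASCII_LETTERS));    // wrdest du lieber
console.log(keepLetters("w\u00fcrdest du lieber", EXTENDED_LETTERS)); // würdest du lieber
```

An ASCII-only letter class reproduces the reported behaviour, while the extended class keeps the umlaut. Letters written as a base character plus a combining mark would need the combining range added as well.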

Grammar - plainText does not recognize minus sign

Text that is the target of a modifier does not parse correctly if it contains a minus sign: (x-soft)[vol:'x-soft']. Fix the grammar so that a wide range of characters can be used in the text.

Grammar - emphasis short format

There is an issue with the short emphasis format in this parser. I think short-format emphasis should be picked up by the parser only when it is separated by whitespace on both sides.
Currently, words separated by hyphens get wrapped in an emphasis tag.

loop-the-loop

becomes

loop<emphasis level="none">the</emphasis>loop

I think in this case it should not be picked up. It should only be wrapped in emphasis when the hyphen-delimited part is set off by whitespace:

loop -the- loop

The same happens with the other short-format emphasis wrappers. They are less likely to appear inside real-world words, but hyphenated words and sayings are quite common and should not be affected.
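
The proposed rule can be sketched as a regular expression. This is not the project's Myna grammar; `findHyphenEmphasis` is a hypothetical helper, and allowing trailing punctuation after the closing hyphen is an added assumption.

```typescript
// Hypothetical helper: returns the text of every hyphen-delimited emphasis span
// that is set off by whitespace (or the string boundaries) on both sides.
function findHyphenEmphasis(text: string): string[] {
  // (^|\s)        opening hyphen must follow start of text or whitespace
  // [^\s-]...     emphasized text must not start or end with a space or hyphen
  // (?=...)       closing hyphen must precede whitespace, punctuation, or end of text
  const pattern = /(^|\s)-([^\s-](?:[^-]*[^\s-])?)-(?=\s|[.,;:!?]|$)/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    found.push(match[2] || "");
  }
  return found;
}

console.log(findHyphenEmphasis("loop-the-loop"));   // []
console.log(findHyphenEmphasis("loop -the- loop")); // [ 'the' ]
```

With this rule, hyphenated words and dash-separated numbers such as 2020-10-10 are left alone.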

Amazon Polly support

Hi,

I'm using Amazon Polly via its API, and it seems that the Alexa specification does not match what Polly supports.

I'd like to contribute some fixes, but first I'd like to ask whether there is any preference or coding standard for the following:

How to handle different feature support for the standard and neural engines?
How to handle features exclusive to a few voices, like newscaster:

Matthew or Joanna voices, which are available only in American English (en-US), Lupe, in US Spanish (es-US) and Amy, in British English (en-GB). It is only supported when using Neural format.

How to return voice information, given that it is passed not as SSML but as options when calling the API?

Full feature information: https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html
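
One possible shape for the first two questions is a capability table that a formatter consults before emitting a tag. Everything below is a hypothetical sketch: the type and function names are not from speechmarkdown-js, and only the newscaster restriction quoted above is filled in.

```typescript
type PollyEngine = "standard" | "neural";

// Hypothetical capability entry: which engines and voices a feature requires.
interface FeatureSupport {
  engines: PollyEngine[];
  voices?: string[]; // omitted means every voice
}

// Only the newscaster restriction quoted above is listed; other features
// would have to be added from the Polly documentation.
const POLLY_FEATURES: Record<string, FeatureSupport> = {
  newscaster: { engines: ["neural"], voices: ["Matthew", "Joanna", "Lupe", "Amy"] },
};

// Returns true only when the feature is known and both engine and voice allow it.
function isSupported(feature: string, engine: PollyEngine, voice: string): boolean {
  const support = POLLY_FEATURES[feature];
  if (!support) return false;
  if (support.engines.indexOf(engine) === -1) return false;
  return support.voices === undefined || support.voices.indexOf(voice) !== -1;
}

console.log(isSupported("newscaster", "neural", "Joanna"));   // true
console.log(isSupported("newscaster", "standard", "Joanna")); // false
```

A formatter could fall back to plain text when `isSupported` returns false.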

Failing test on Polly:

const fails = {
  "audio-standard.spec.ts": `!["https://sample-dev.s3.amazonaws.com/path/18327803923/f7fb4173-4eab-46fc-80ec-020204a615f9.mp3?AWSAccessKeyId=AKXXXXXXXXXXXXXXXXXX&Expires=1596986208&Signature=VL6q9pYc8NTjf6gKVqN0Cem0WTA="]
  Announcing Speech Markdown.`,
  "disappointed-section.spec.ts": `#[disappointed] Hey there, nice to meet you`,
  "disappointed-standard.spec.ts": `We can switch (from disappointed)[disappointed] to (really disappointed)[disappointed:"pizza"].`,
  "dj-section.spec.ts": `#[dj] Hey there, nice to meet you`,
  "emphasis-standard.spec.ts": `A (reduced)[emphasis:"reduced"] level`,
  "excited-section.spec.ts": `#[excited] Hey there, nice to meet you`,
  "excited-standard.spec.ts": `We can switch (from excited)[excited] to (really excited)[excited:"pizza"].`,
  "interjection-standard.spec.ts": `(Wow)[interjection], I didn't see that coming.`,
  "multiple-modifiers-same-text.spec.ts": `Your balance is: (12345)[number;emphasis:"strong";whisper;pitch:"high"].`,
  "pitch-standard.spec.ts": `A (high)[pitch:"high"] pitch
  A (high)[pitch:'high'] pitch`,
  "prosody-multiple-modifiers.spec.ts": `Multiple modifiers on same (text)[vol;pitch;rate]`,
  "sections-standard.spec.ts": `#[voice:'Brian'] Hey there, nice to meet you`,
  "voice-standard.spec.ts": `Why do you keep switching voices (from one)[voice:"device"] to (the other)[voice:"kendra"]?
  Why do you keep switching voices (from one)[voice:'device'] to (the other)[voice:'kendra']?`,
};

Grammar - multiple modifiers for the same text

When you have more than one modifier on the same text:

Your balance is: (12345)[number;emphasis:"strong"].

The resulting SSML should return nested tags:

      <speak>
      Your balance is: <say-as interpret-as="number"><emphasis level="strong">12345</emphasis></say-as>.
      </speak>

Various tests in multiple-modifiers-same-text.spec.ts have been skipped. The formatters for Alexa and Google Assistant need to be fixed to handle multiple tags.

The solution should include a list of valid SSML tags with a sortId so that the rendering of the SSML is consistent regardless of the order of the modifiers in Speech Markdown.
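
A minimal sketch of the sortId idea follows. The interface, the function names, and the concrete sortId values are assumptions for illustration, not the project's implementation.

```typescript
interface SsmlTag {
  name: string;
  attrs: Record<string, string>;
}

// Hypothetical sort order: a lower sortId renders further out.
const SORT_ID: Record<string, number> = { "say-as": 1, emphasis: 2, prosody: 3 };

// Unknown tags get a large sortId so they render innermost.
function sortIdOf(name: string): number {
  const id = SORT_ID[name];
  return id === undefined ? 1000 : id;
}

// Wraps the text in all tags, ordered by sortId rather than by input order.
function renderNested(text: string, tags: SsmlTag[]): string {
  const ordered = tags.slice().sort((a, b) => sortIdOf(a.name) - sortIdOf(b.name));
  // Start from the innermost tag (last in the order) and wrap outwards.
  return ordered.reduceRight((inner, tag) => {
    const attrs = Object.keys(tag.attrs)
      .map((key) => ` ${key}="${tag.attrs[key]}"`)
      .join("");
    return `<${tag.name}${attrs}>${inner}</${tag.name}>`;
  }, text);
}

const sayAs: SsmlTag = { name: "say-as", attrs: { "interpret-as": "number" } };
const emphasis: SsmlTag = { name: "emphasis", attrs: { level: "strong" } };
console.log(renderNested("12345", [emphasis, sayAs]));
// <say-as interpret-as="number"><emphasis level="strong">12345</emphasis></say-as>
```

Because the tags are sorted before rendering, [number;emphasis:"strong"] and [emphasis:"strong";number] would produce the same SSML.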

The .ast suffix on a rule in SpeechMarkdownGrammar.ts specifies that the rule will appear as a node in the Abstract Syntax Tree (AST). Currently textModifier, textModifierText, and textModifierKey have .ast. Another rule may need .ast so that each key/value pair appears together.

Use the following to see a text rendering of the AST:

const smd = require('speechmarkdown-js');

const markdown = `Your balance is: (12345)[number;emphasis:"strong"].`;
const options = {
    platform: 'amazon-alexa'
};

const speech = new smd.SpeechMarkdown();
const tree = speech.toASTString(markdown);
// Print the text rendering of the AST
console.log(tree);

The result will be something like:

(document: (paragraph: (simpleLine: (plainText: Your balance is: ) (textModifier: (plainText: 12345) (textModifierKey: number) (textModifierKey: emphasis) (textModifierText: strong)) (plainText: .) (lineEnd: ))))

Grammar - ipa does not recognize characters

Tests for ipa are skipped.

When using markdown such as (pecan)[ipa:"ˈpi.kæn"] these characters cause an error: ".", "ˈ", and "æ". There are likely others. Need to make sure the grammar plainText is expanded to all characters supported by ipa.

Grammar - date not parsing slashes correctly

Various tests in date-standard.spec.ts are commented out.
Need to fix grammar to accept values with slashes. See SpeechMarkdownGrammar.ts

Formatters - sections adding extra line breaks in SSML & Plain Text

The tests in sections-standard.spec.ts render sections in the Alexa formatter and nothing in the Google and Plain Text formatters.

Extra line breaks/new lines are rendered in the SSML and Plain Text output.

How to use speechmarkdown.min.js?

@rmtuckerphx
I ran build:minify and created the single file browser/speechmarkdown.min.js. How do I use it?
Please give some examples, thanks!

Fix "sub" tag so that it allows for words with spaces

Allow the following as a sub:

Visit our website at (www.speechmarkdown.org)[sub:"speech mark down dot org"].
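
The intended behaviour can be sketched with a regular expression that accepts any character except the closing quote in the value. `renderSub` is a hypothetical stand-in for the grammar rule and formatter, not the library's code; the output uses the standard SSML sub element with its alias attribute.

```typescript
// Hypothetical stand-in for the grammar rule and formatter: the quoted value
// may contain any character except the closing double quote, including spaces.
function renderSub(markdown: string): string {
  return markdown.replace(
    /\(([^()]+)\)\[sub:"([^"]*)"\]/g,
    (_match, text, alias) => `<sub alias="${alias}">${text}</sub>`
  );
}

console.log(renderSub('Visit our website at (www.speechmarkdown.org)[sub:"speech mark down dot org"].'));
// Visit our website at <sub alias="speech mark down dot org">www.speechmarkdown.org</sub>.
```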

Modify output so code can run in a browser

The TypeScript should be able to be used in Node.js running either JavaScript or TypeScript.

The code should also be runnable in a browser, for example in a component that accepts Speech Markdown in a text area and translates it to Plain Text or SSML in a read-only text area.

Add dictionary to options for sub and phoneme

The SpeechMarkdown class or toSSML method accepts an options object. Add a dictionary element to options to allow for a dictionary of words or phrases that will be converted to a phoneme SSML tag if the platform supports it (Alexa) or a sub SSML tag otherwise (Google).

The structure would be something like this:

[
  {
    "text": "potato",
    "ipa": "pəˈteɪtəʊ",
    "sub": "poteytoh",
    "section": "food"
  }
]

See https://github.com/cellular/jovo-plugin-ssml

Need to decide whether to:

check any text in the string against each item in the dictionary, or
support a dictionary Speech Markdown tag so that only tagged words are checked: (potato)[dictionary]

A section could also be specified for each word so that only a subsection of the dictionary is checked: (potato)[dictionary:"food"]
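
The tagged-word variant could look roughly like the sketch below. `DictionaryEntry` mirrors the structure proposed above, while `renderDictionaryWord` and its parameters are hypothetical. The only platform value it relies on is 'amazon-alexa'; any other value falls back to sub.

```typescript
// Mirrors the dictionary structure proposed above.
interface DictionaryEntry {
  text: string;
  ipa?: string;
  sub?: string;
  section?: string;
}

// Hypothetical lookup: renders one tagged word as phoneme (Alexa) or sub (otherwise).
function renderDictionaryWord(
  word: string,
  dictionary: DictionaryEntry[],
  platform: string,
  section?: string
): string {
  // When a section is given, only entries from that section are considered.
  const entry = dictionary.filter(
    (candidate) =>
      candidate.text === word && (section === undefined || candidate.section === section)
  )[0];
  if (!entry) return word;
  if (platform === "amazon-alexa" && entry.ipa) {
    return `<phoneme alphabet="ipa" ph="${entry.ipa}">${word}</phoneme>`;
  }
  return entry.sub ? `<sub alias="${entry.sub}">${word}</sub>` : word;
}

const dictionary: DictionaryEntry[] = [
  { text: "potato", ipa: "pəˈteɪtəʊ", sub: "poteytoh", section: "food" },
];
console.log(renderDictionaryWord("potato", dictionary, "amazon-alexa", "food"));
// <phoneme alphabet="ipa" ph="pəˈteɪtəʊ">potato</phoneme>
```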

Grammar - fraction not parsing plus or slash correctly

Various tests in fraction-standard.spec.ts are commented out.
The grammar needs to be fixed to accept values with plus signs or slashes. See SpeechMarkdownGrammar.ts.

Special characters are ignored by the formatters in the output

The markdown includes special characters that help form constructs such as:

Text modifier - (text)[key:'value';key;key:'value']
Header - #[ key:'value'; key:'value']
Audio - !['url']

When these special characters appear in non-markdown text or in the text portion of a text modifier, they are not rendered even though they should be.

For example, the following Speech Markdown:

This is text with (parens) but this and other special characters: []()*~`@#\\_!+- are ignored

Is currently being rendered as:

<speak>
This is text with parens but this and other special characters:  are ignored
</speak>

When instead it should be rendered as:

<speak>
This is text with (parens) but this and other special characters: []()*~`@#\\_!+- are ignored
</speak>

How to remove just Speechmarkdown markup but leave Markdown (i.e. NOT back to plain text)?

Hi,
Do you have a formatter or technique for removing only the Speech Markdown (SM) markup from a Markdown-based document, or for letting external Markdown interpreters filter or lint the SM markup?

The idea is that I'd rather not maintain a duplicate of a Markdown document's text just to add SM markup. Instead, I'd like to use a single SM+MD text to drive both the UI/web and speech.

Any ideas how I could do this?
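
One rough approach is a pre-processing pass that removes only the Speech Markdown constructs and leaves the rest for a regular Markdown renderer. The helper below is a hypothetical sketch, not a feature of speechmarkdown-js. It only covers section markers, text modifiers, and timed breaks; the short emphasis forms and audio tags are not handled.

```typescript
// Hypothetical helper: strips Speech Markdown constructs, keeps ordinary Markdown.
function stripSpeechMarkdown(text: string): string {
  return text
    // Section markers at the start of a line: #[voice:'Brian'] Hey -> Hey
    .replace(/^#\[[^\[\]]*\][ \t]*/gm, "")
    // Text modifiers: (text)[key:"value"] -> text. The modifier must start with a
    // letter and must not be followed by "(", so adjacent Markdown links survive.
    .replace(/\(([^()]*)\)\[[a-zA-Z][^\[\]]*\](?!\()/g, "$1")
    // Timed breaks such as [250ms] or [0.2s], together with one leading space.
    .replace(/ ?\[\d+(?:\.\d+)?(?:ms|s)\]/g, "");
}

console.log(stripSpeechMarkdown('A (high)[pitch:"high"] pitch [250ms] and **bold** [link](https://example.com).'));
// A high pitch and **bold** [link](https://example.com).
```

Plain regular expressions cannot tell every Markdown construct apart from Speech Markdown, so a parser-based formatter would be more robust.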

Add support of Microsoft Azure SSML

Based on the documentation at https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup

It would be useful, and I think Azure SSML is mostly compatible with Alexa's, with small differences. The platform-specific features are of course not shared.

I'm drafting an implementation at https://github.com/boltomli/speechmarkdown-js/tree/add-microsoft-azure-ssml

Limitation to Node engine below 11

The package.json limits the Node.js engine to below Node 11. This has some impact on our CI / CD pipelines as some of them are already running on Node 11 / 12. Is there any need for that limitation?

How to use in python?

@rmtuckerphx
How can this be used from Python?
Thanks

Dashes in dates get treated as emphasis

When a date is formatted with dashes, the dashes are treated as emphasis markers.
As a result, toText strips the dashes.

So 2020-10-10 becomes 20201010, which is quite unexpected.

Note that this only affects numbers. Normal words are fine; Mother-in-law, for instance, stays untouched.

Would it be an option to treat digits as word characters as well?

Nested formatters break outside formatters

There is a bug: when Speech Markdown formatters are nested inside one another, only the inner formatters are parsed, while the outer formatters are left as plain text.
Example:

(break after this [0.2s] another break after this [0.2s])[rate:"slow"]

is parsed to:

break after this <break time="0.2s"/> another break after this <break time="0.2s"/>rate:"slow"

As you can see, the prosody tag is not added and the modifier text remains in the output. If I remove the breaks from the text:

(break after this another break after this)[rate:"slow"]

it resolves to

<prosody rate="slow">break after this another break after this</prosody>

On its own, prosody works.

I also tried nesting formatters other than breaks, and they did not work correctly either.
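
The output I would expect can be sketched by rendering the inner text before wrapping it in the outer tag. The two functions below are hypothetical regular-expression stand-ins that handle only breaks and rate. They are not the library's parser.

```typescript
// Hypothetical stand-in: renders timed breaks such as [0.2s] or [250ms].
function renderBreaks(text: string): string {
  return text.replace(/\[(\d+(?:\.\d+)?(?:ms|s))\]/g, '<break time="$1"/>');
}

// Hypothetical stand-in: renders (text)[rate:"value"] and processes the inner
// text for breaks before wrapping it in the prosody tag.
function renderRate(markdown: string): string {
  return markdown.replace(
    /\(([^()]*)\)\[rate:"([^"]*)"\]/g,
    (_match, inner, rate) => `<prosody rate="${rate}">${renderBreaks(inner)}</prosody>`
  );
}

console.log(renderRate('(break after this [0.2s] another break after this [0.2s])[rate:"slow"]'));
// <prosody rate="slow">break after this <break time="0.2s"/> another break after this <break time="0.2s"/></prosody>
```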

Support both double and single quotes

The audio tag supports both double and single quotes:

!['https://www.speechmarkdown.org/test.mp3']  // YES
!["https://www.speechmarkdown.org/test.mp3"]  // YES

But other tags, such as emphasis, only support double quotes:

(text)[emphasis:"strong"]  // YES
(text)[emphasis:'strong']  // NO
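
Accepting either quote style while still rejecting mismatched quotes can be done with a backreference. The sketch below uses a regular expression rather than the project's grammar, and `parseModifier` is a hypothetical helper.

```typescript
interface Modifier {
  key: string;
  value?: string;
}

// Hypothetical helper: parses one modifier such as emphasis:"strong" or whisper.
// The backreference \2 forces the closing quote to match the opening quote.
function parseModifier(source: string): Modifier | null {
  const match = /^([a-zA-Z-]+)(?::(["'])(.*)\2)?$/.exec(source);
  if (!match) return null;
  const key = match[1] || "";
  const value = match[3];
  return value === undefined ? { key } : { key, value };
}

console.log(parseModifier('emphasis:"strong"')); // { key: 'emphasis', value: 'strong' }
console.log(parseModifier("emphasis:'strong'")); // { key: 'emphasis', value: 'strong' }
console.log(parseModifier("emphasis:\"strong'")); // null
```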