
speechmarkdown-js's People

Contributors

afirstenberg, arjan, boltomli, dependabot[bot], derekpung, drtheuns, fleker, guillaumegarcia13, johniwasz, rmtuckerphx, rokasvaitkevicius, sirfredrick

speechmarkdown-js's Issues

Ampersand leads to InvalidSsmlException

Converting the following string

Hallo Wie gehts? & Was machst Du hier?

leads to the following InvalidSsmlException when trying to pass it to AWS Polly:

InvalidSsmlException: Invalid SSML request
    at Object.extractError (/app/node_modules/aws-sdk/lib/protocol/json.js:52:27)
    at Request.extractError (/app/node_modules/aws-sdk/lib/protocol/rest_json.js:55:8)
    at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
    at Request.emit (/app/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
    at Request.emit (/app/node_modules/aws-sdk/lib/request.js:688:14)
    at Request.transition (/app/node_modules/aws-sdk/lib/request.js:22:10)
    at AcceptorStateMachine.runTo (/app/node_modules/aws-sdk/lib/state_machine.js:14:12)

Is escaping "dangerous" characters like & something Speechmarkdown should deal with? Is this something the developer should sanitize prior to passing it to Speechmarkdown?
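As one possible workaround on the developer side, the five XML special characters can be escaped in plain text before it ends up inside SSML. This is only a sketch: `escapeXml` is a hypothetical helper, not part of speechmarkdown-js, and whether escaping should happen before or after the Speech Markdown conversion is exactly the open question in this issue.

```typescript
// Hypothetical helper (not part of speechmarkdown-js): escape the five
// characters that have special meaning in XML, and therefore in SSML.
const XML_ENTITIES: { [ch: string]: string } = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&apos;",
};

function escapeXml(text: string): string {
  // Replace each special character with its predefined XML entity.
  return text.replace(/[&<>"']/g, (ch: string) => XML_ENTITIES[ch]);
}

const escaped = escapeXml("Hallo Wie gehts? & Was machst Du hier?");
console.log(escaped); // Hallo Wie gehts? &amp; Was machst Du hier?
```

Note that this is meant for plain text only. Applied to a string that already contains Speech Markdown modifiers, it would also escape the quotes inside them.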

Support umlauts and other languages

It seems that the Myna parser and its 'letters' range do not support languages other than English. I noticed that it removes umlauts from German text.
For example:
würdest du lieber
after parsing becomes
wrdest du lieber

I have tested other umlauts, with the same outcome. I guess other languages, such as French, would have the same issue.

To fix this, I see two options: fix it in the Myna parser, which is not actively maintained, so I am afraid it would be hard to get support there; or try adding these special letters alongside the other special characters.
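To illustrate the difference between an ASCII-only letter range and a Unicode-aware one, here is a small standalone sketch. It uses plain JavaScript regular expressions rather than Myna, so it only demonstrates the idea. The two patterns are assumptions made for illustration and are not taken from the grammar.

```typescript
// ASCII-only class, similar in spirit to a 'letters' range limited to a-z and A-Z.
// Anything that is not an ASCII letter or a space is dropped.
const asciiOnly = new RegExp("[^a-zA-Z ]", "g");

// Unicode-aware class: \p{L} matches a letter in any script and \p{M} matches
// combining marks. Unicode property escapes require the 'u' flag.
const unicodeAware = new RegExp("[^\\p{L}\\p{M} ]", "gu");

const input = "würdest du lieber";

const asciiResult = input.replace(asciiOnly, "");
const unicodeResult = input.replace(unicodeAware, "");

console.log(asciiResult);   // wrdest du lieber
console.log(unicodeResult); // würdest du lieber
```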

Grammar - emphasis short format

There is an issue with the short emphasis format in this parser. I think short-format emphasis should be separated by whitespace on both sides in order to be picked up by the parser.
Currently, words separated by hyphens get wrapped in an emphasis tag.

loop-the-loop

becomes

loop<emphasis level="none">the</emphasis>loop

I think it should not be picked up in this case. It should only be wrapped in emphasis when the hyphens are set off by whitespace:

loop -the- loop

The same happens with the other short-format emphasis wrappers. I agree that they are unlikely to appear inside a real-world word, but words and sayings with hyphens are quite common and should not be affected.
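The proposed rule can be sketched with a regular expression that only accepts the hyphen form when it is bounded by whitespace or by the start or end of the string. This is an illustration only: `findShortEmphasis` is a hypothetical function and the actual grammar is written with Myna rules, not with this pattern.

```typescript
// Hypothetical sketch: find hyphen-delimited short emphasis only when the
// opening hyphen follows whitespace (or the start of the string) and the
// closing hyphen is followed by whitespace (or the end of the string).
function findShortEmphasis(text: string): string[] {
  const pattern = /(^|\s)-(\S(?:[^-]*\S)?)-(?=\s|$)/g;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    found.push(match[2]); // group 2 is the emphasized text
  }
  return found;
}

console.log(findShortEmphasis("loop-the-loop"));   // []
console.log(findShortEmphasis("loop -the- loop")); // [ 'the' ]
```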

Amazon Polly support

Hi,

I'm using Amazon Polly via its API, and it seems that the Alexa specification does not match what Polly supports.

I'd like to contribute some fixes, but first I'd like to ask whether there is any preference or coding standard for the following:

  • How to handle different feature support for standard|neural?
  • How to handle features exclusive to a few voices, like newscaster:

Matthew or Joanna voices, which are available only in American English (en-US), Lupe, in US Spanish (es-US) and Amy, in British English (en-GB). It is only supported when using Neural format.

  • How to return voice information, given that it is passed as an option to the API call rather than in the SSML?

Full feature information: https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html
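One way to model the engine and voice restrictions could be a small capability check, sketched below. The names `PollyEngine`, `NEWSCASTER_VOICES`, and `supportsNewscaster` are hypothetical, and the voice list simply restates the quote above, so it should be checked against the linked Polly documentation before being relied on.

```typescript
// Hypothetical capability check for the newscaster style, assuming the
// restrictions quoted above: a few voices only, and only the neural engine.
type PollyEngine = "standard" | "neural";

const NEWSCASTER_VOICES: ReadonlyArray<string> = ["Matthew", "Joanna", "Lupe", "Amy"];

function supportsNewscaster(voice: string, engine: PollyEngine): boolean {
  return engine === "neural" && NEWSCASTER_VOICES.indexOf(voice) !== -1;
}

console.log(supportsNewscaster("Joanna", "neural"));   // true
console.log(supportsNewscaster("Joanna", "standard")); // false
console.log(supportsNewscaster("Brian", "neural"));    // false
```

A formatter could consult such a check and fall back to plain text when the requested style is not available for the chosen voice and engine.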

Failing test on Polly:

const fails = {
  "audio-standard.spec.ts": `!["https://sample-dev.s3.amazonaws.com/path/18327803923/f7fb4173-4eab-46fc-80ec-020204a615f9.mp3?AWSAccessKeyId=AKXXXXXXXXXXXXXXXXXX&Expires=1596986208&Signature=VL6q9pYc8NTjf6gKVqN0Cem0WTA="]
  Announcing Speech Markdown.`,
  "disappointed-section.spec.ts": `#[disappointed] Hey there, nice to meet you`,
  "disappointed-standard.spec.ts": `We can switch (from disappointed)[disappointed] to (really disappointed)[disappointed:"pizza"].`,
  "dj-section.spec.ts": `#[dj] Hey there, nice to meet you`,
  "emphasis-standard.spec.ts": `A (reduced)[emphasis:"reduced"] level`,
  "excited-section.spec.ts": `#[excited] Hey there, nice to meet you`,
  "excited-standard.spec.ts": `We can switch (from excited)[excited] to (really excited)[excited:"pizza"].`,
  "interjection-standard.spec.ts": `(Wow)[interjection], I didn't see that coming.`,
  "multiple-modifiers-same-text.spec.ts": `Your balance is: (12345)[number;emphasis:"strong";whisper;pitch:"high"].`,
  "pitch-standard.spec.ts": `A (high)[pitch:"high"] pitch
  A (high)[pitch:'high'] pitch`,
  "prosody-multiple-modifiers.spec.ts": `Multiple modifiers on same (text)[vol;pitch;rate]`,
  "sections-standard.spec.ts": `#[voice:'Brian'] Hey there, nice to meet you`,
  "voice-standard.spec.ts": `Why do you keep switching voices (from one)[voice:"device"] to (the other)[voice:"kendra"]?
  Why do you keep switching voices (from one)[voice:'device'] to (the other)[voice:'kendra']?`,
};

Grammar - multiple modifiers for the same text

When you have more than one modifier on the same text:

Your balance is: (12345)[number;emphasis:"strong"].

The resulting SSML should return nested tags:

      <speak>
      Your balance is: <say-as interpret-as="number"><emphasis level="strong">12345</emphasis></say-as>.
      </speak>

Various tests in multiple-modifiers-same-text.spec.ts have been skipped. Need to fix the formatters for Alexa and Google Assistant to handle processing multiple tags.

The solution should include a list of valid SSML tags with a sortId so that the rendering of the SSML is consistent regardless of the order of the modifiers in Speech Markdown.
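A minimal sketch of the sortId idea follows. The `TAG_RULES` table, its sortId values, and `renderNested` are assumptions made for illustration; they are not the library's formatter code. The point is that sorting by sortId before wrapping gives the same nesting regardless of modifier order.

```typescript
// Hypothetical sketch: each known modifier maps to an SSML tag with a sortId.
// Lower sortId means the tag ends up further out in the nesting.
interface Modifier { key: string; value?: string; }
interface TagRule { sortId: number; open: (value?: string) => string; close: string; }

const TAG_RULES: { [key: string]: TagRule } = {
  number: { sortId: 1, open: () => '<say-as interpret-as="number">', close: "</say-as>" },
  emphasis: { sortId: 2, open: (v) => `<emphasis level="${v || "moderate"}">`, close: "</emphasis>" },
};

function renderNested(text: string, modifiers: Modifier[]): string {
  // Keep only modifiers we have a rule for, then sort by sortId (ascending).
  const sorted = modifiers
    .filter((m) => Object.prototype.hasOwnProperty.call(TAG_RULES, m.key))
    .sort((a, b) => TAG_RULES[a.key].sortId - TAG_RULES[b.key].sortId);

  // Wrap from the innermost tag (highest sortId) outwards.
  let result = text;
  for (let i = sorted.length - 1; i >= 0; i--) {
    const rule = TAG_RULES[sorted[i].key];
    result = rule.open(sorted[i].value) + result + rule.close;
  }
  return result;
}

const ssml = renderNested("12345", [{ key: "emphasis", value: "strong" }, { key: "number" }]);
console.log(ssml);
// <say-as interpret-as="number"><emphasis level="strong">12345</emphasis></say-as>
```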

The .ast suffix on rules in SpeechMarkdownGrammar.ts specifies which rules appear as nodes in the Abstract Syntax Tree (AST). Currently textModifier, textModifierText, and textModifierKey have .ast. Possibly another rule needs .ast so that each key/value pair appears together.

Use the following to see a text rendering of the AST:

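const smd = require('speechmarkdown-js');

const markdown = `Your balance is: (12345)[number;emphasis:"strong"].`;
// Platform options as passed to toSSML; not used by the AST dump below
const options = {
    platform: 'amazon-alexa'
};

const speech = new smd.SpeechMarkdown();

// Render the parsed AST as text and print it
const tree = speech.toASTString(markdown);
console.log(tree);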

The result will be something like:

(document: (paragraph: (simpleLine: (plainText: Your balance is: ) (textModifier: (plainText: 12345) (textModifierKey: number) (textModifierKey: emphasis) (textModifierText: strong)) (plainText: .) (lineEnd: ))))

Grammar - ipa does not recognize characters

Tests for ipa are skipped.

When using markdown such as (pecan)[ipa:"ˈpi.kæn"] these characters cause an error: ".", "ˈ", and "æ". There are likely others. Need to make sure the grammar plainText is expanded to all characters supported by ipa.

Modify output so code can run in a browser

The TypeScript code should be usable from Node.js in both JavaScript and TypeScript projects.

We also want the code to be runnable in a browser, for example in a component that accepts Speech Markdown in a text area and translates it to plain text or SSML in a read-only text area.

Add dictionary to options for sub and phoneme

The SpeechMarkdown class or toSSML method accepts an options object. Add a dictionary element to options to allow for a dictionary of words or phrases that will be converted to a phoneme SSML tag if the platform supports it (Alexa) or a sub SSML tag otherwise (Google).

The structure would be something like this:

[
  {
    "text": "potato",
    "ipa": "pəˈteɪtəʊ",
    "sub": "poteytoh",
    "section": "food"
  }
]

See https://github.com/cellular/jovo-plugin-ssml

Need to decide if:

  • any text in the string will be checked against each item in the dictionary
  • support a dictionary Speech Markdown tag so only tagged words would be checked: (potato)[dictionary]

Could also specify a section for each word so only a subsection of the dictionary will be checked: (potato)[dictionary:"food"]
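The tagged variant could work roughly as sketched below. Everything here is hypothetical: `DictionaryEntry`, `applyDictionary`, and the regex-based tag matching are illustrations of the proposal, not existing library features, and the platform name 'google-assistant' is assumed.

```typescript
// Hypothetical sketch of the proposed dictionary option.
interface DictionaryEntry { text: string; ipa?: string; sub?: string; section?: string; }
type Platform = "amazon-alexa" | "google-assistant";

function applyDictionary(markdown: string, dictionary: DictionaryEntry[], platform: Platform): string {
  // Matches (word)[dictionary] and (word)[dictionary:"section"].
  const tag = /\(([^()]+)\)\[dictionary(?::"([^"]+)")?\]/g;
  return markdown.replace(tag, (_match: string, word: string, section: string | undefined) => {
    // Look the word up, optionally restricted to one section.
    const entry = dictionary.filter(
      (e) => e.text === word && (section === undefined || e.section === section)
    )[0];
    if (!entry) return word; // unknown word: leave plain text
    if (platform === "amazon-alexa" && entry.ipa) {
      return `<phoneme alphabet="ipa" ph="${entry.ipa}">${word}</phoneme>`;
    }
    return entry.sub ? `<sub alias="${entry.sub}">${word}</sub>` : word;
  });
}

const dictionary: DictionaryEntry[] = [
  { text: "potato", ipa: "pəˈteɪtəʊ", sub: "poteytoh", section: "food" },
];

console.log(applyDictionary('I like (potato)[dictionary:"food"].', dictionary, "amazon-alexa"));
// I like <phoneme alphabet="ipa" ph="pəˈteɪtəʊ">potato</phoneme>.
console.log(applyDictionary("I like (potato)[dictionary].", dictionary, "google-assistant"));
// I like <sub alias="poteytoh">potato</sub>.
```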

Special characters are ignored by the formatters in the output

The markdown includes special characters that help form constructs such as:

Text modifier - (text)[key:'value';key;key:'value']
Header - #[key:'value';key:'value']
Audio - !['url']

When these special characters appear in non-markdown text or in the text portion of a text modifier, they are not rendered even though they should be.

For example, the following Speech Markdown:

This is text with (parens) but this and other special characters: []()*~`@#\\_!+- are ignored

Is currently being rendered as:

<speak>
This is text with parens but this and other special characters:  are ignored
</speak>

When instead it should be rendered as:

<speak>
This is text with (parens) but this and other special characters: []()*~`@#\\_!+- are ignored
</speak>

How to remove just Speechmarkdown markup but leave Markdown (i.e. NOT back to plain text)?

Hi,
Do you have a formatter or technique either for removing only the Speech Markdown (SM) markup from a Markdown-based document, or for letting external Markdown interpreters filter or lint the SM markup?

The idea is that I'd rather not duplicate the text of a Markdown document. Instead, I'd include the SM markup within it and use the single SM+MD text to drive both the UI/web and speech.

Any ideas how I could do this?
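As a rough starting point, the two most common Speech Markdown constructs could be stripped with regular expressions while leaving ordinary Markdown alone. This is a sketch under assumptions: `stripSpeechMarkdown` is hypothetical, it only handles the `(text)[modifiers]` form and timed breaks such as `[250ms]`, and it ignores sections, audio tags, and the short emphasis formats.

```typescript
// Hypothetical sketch: remove a subset of Speech Markdown markup and keep
// regular Markdown (bold, links, and so on) untouched.
function stripSpeechMarkdown(text: string): string {
  return text
    // (text)[modifiers] -> text. Markdown links use the opposite order,
    // [text](url), so they are not matched by this pattern.
    .replace(/\(([^()]*)\)\[[^\]]*\]/g, "$1")
    // Timed breaks such as [250ms] or [0.2s], with one preceding space.
    .replace(/ ?\[\d+(?:\.\d+)?(?:ms|s)\]/g, "");
}

const mixed = 'A **bold** (word)[emphasis:"strong"] and a pause [250ms] with a [link](https://example.com).';
console.log(stripSpeechMarkdown(mixed));
// A **bold** word and a pause with a [link](https://example.com).
```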

Limitation to Node engine below 11

The package.json limits the Node.js engine to below Node 11. This has some impact on our CI / CD pipelines as some of them are already running on Node 11 / 12. Is there any need for that limitation?

Dashes in dates get treated as emphasis

When a date is formatted with dashes, the dashes are treated as emphasis markers.
So when using toText, the dashes get stripped.

So 2020-10-10 becomes 20201010. This is quite unexpected.

Note that this is only an issue for numbers. Normal words are fine; Mother-in-law, for instance, stays untouched.

Would it be an option to treat digits as word characters as well?

Nested formatters break outside formatters

There is a bug: when Speech Markdown formatters are nested in one another, only the inner formatters get parsed, while the outer formatters are left as plain text.
Example:

(break after this [0.2s] another break after this [0.2s])[rate:"slow"]

is parsed to:

break after this <break time="0.2s"/> another break after this <break time="0.2s"/>rate:"slow"

As you can see, the prosody tag is not added and the formatter text stays in the output. If I remove the breaks from the text:

(break after this another break after this)[rate:"slow"]

it resolves to

<prosody rate="slow">break after this another break after this</prosody>

On its own, prosody works.

I also tried nesting other formatters, not only breaks, inside a formatter, and they did not work correctly either.
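The expected behaviour can be sketched as "render the inner constructs first, then wrap the result in the outer tag". The code below is a standalone illustration with hypothetical helpers (`renderBreaks`, `renderRate`) that only handles timed breaks inside a rate modifier; it is not how the Myna-based parser is implemented.

```typescript
// Hypothetical sketch: timed breaks such as [0.2s] become <break/> tags.
function renderBreaks(text: string): string {
  return text.replace(/\[(\d+(?:\.\d+)?(?:ms|s))\]/g, '<break time="$1"/>');
}

// The text inside the parentheses may itself contain breaks, so it is
// rendered first and then wrapped in the prosody tag.
function renderRate(markdown: string): string {
  const withProsody = markdown.replace(
    /\(([^()]*)\)\[rate:"([a-z-]+)"\]/g,
    (_match: string, inner: string, rate: string) =>
      `<prosody rate="${rate}">${renderBreaks(inner)}</prosody>`
  );
  // Render any remaining breaks outside of modifiers.
  return renderBreaks(withProsody);
}

const nested = '(break after this [0.2s] another break after this [0.2s])[rate:"slow"]';
console.log(renderRate(nested));
// <prosody rate="slow">break after this <break time="0.2s"/> another break after this <break time="0.2s"/></prosody>
```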

Support both double and single quotes

The audio tag supports both double and single quotes:

!['https://www.speechmarkdown.org/test.mp3']  // YES
!["https://www.speechmarkdown.org/test.mp3"]  // YES

But other tags, such as emphasis, only support double quotes:

(text)[emphasis:"strong"]  // YES
(text)[emphasis:'strong']  // NO
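Accepting either quote style while still rejecting mismatched quotes can be expressed with a backreference, as in the sketch below. `parseEmphasisLevel` is a hypothetical function for illustration; the real fix would go into the grammar rules.

```typescript
// Hypothetical sketch: accept "value" or 'value', but require the closing
// quote to match the opening one via the \1 backreference.
function parseEmphasisLevel(modifier: string): string | null {
  const match = /^\[emphasis:(["'])([a-z-]+)\1\]$/.exec(modifier);
  return match ? match[2] : null;
}

console.log(parseEmphasisLevel('[emphasis:"strong"]')); // strong
console.log(parseEmphasisLevel("[emphasis:'strong']")); // strong
console.log(parseEmphasisLevel("[emphasis:\"strong']")); // null
```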
