speechmarkdown / speechmarkdown-js
Speech Markdown grammar, parser, and formatters for use with JavaScript.
License: MIT License
Converting the following string
Hallo Wie gehts? & Was machst Du hier?
leads to the following InvalidSsmlException
when trying to pass it to AWS Polly:
InvalidSsmlException: Invalid SSML request
at Object.extractError (/app/node_modules/aws-sdk/lib/protocol/json.js:52:27)
at Request.extractError (/app/node_modules/aws-sdk/lib/protocol/rest_json.js:55:8)
at Request.callListeners (/app/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
at Request.emit (/app/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
at Request.emit (/app/node_modules/aws-sdk/lib/request.js:688:14)
at Request.transition (/app/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/app/node_modules/aws-sdk/lib/state_machine.js:14:12)
Is escaping "dangerous" characters like &
something Speech Markdown should handle, or should the developer sanitize the input before passing it to Speech Markdown?
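Until that is decided, one possible workaround is to escape XML-significant characters in plain text before it ends up inside a speak element. The sketch below is only an illustration: escapeXml is a hypothetical helper, not part of speechmarkdown-js, and it assumes the input contains no Speech Markdown modifiers. Whether the parser keeps an entity such as &amp;amp; intact has not been verified here.

```javascript
// Hypothetical helper (not part of speechmarkdown-js): escape the characters
// that are not allowed to appear raw in XML text content.
function escapeXml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };
  return text.replace(/[&<>]/g, (ch) => entities[ch]);
}

const raw = 'Hallo Wie gehts? & Was machst Du hier?';

// Assumption: the text is plain (no modifiers), so it can be wrapped directly.
const ssml = `<speak>${escapeXml(raw)}</speak>`;
console.log(ssml);
```

Quotes are deliberately left alone, because escaping them would break modifier values such as [emphasis:"strong"] if the same text were later run through the parser.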
Further references
Various tests in telephone-standard.spec.ts
are commented out.
Need to fix grammar to accept values with parentheses or dashes. See SpeechMarkdownGrammar.ts
The full list of IPA characters supported by Alexa is shown here:
https://developer.amazon.com/docs/custom-skills/speech-synthesis-markup-language-ssml-reference.html#phoneme
This includes the following chars: d͡ʒ and t͡ʃ
The file SpeechMarkdownGrammar.ts
has been updated to include those chars, but when testing in ipa-standard-alphabet-us.spec.ts
the tests error.
Solution: add the chars to the tests and get them to pass.
It seems that the Myna parser and its 'letters' range don't support languages other than English. I noticed that in German it removes umlauts.
For example:
würdest du lieber
after parsing becomes
wrdest du lieber
.
I have tested other umlauts, with the same outcome. I guess other languages, such as French, would have the same issue.
To fix this, I see two options: fix it in the Myna parser, which is not actively maintained, so I am afraid it would be hard to get support there; or add these special letters alongside the other special characters in the grammar.
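The effect can be reproduced without the parser. The sketch below does not use Myna; it only contrasts an ASCII-only letter class with a Unicode-aware one, on the assumption that the 'letters' range behaves like the former. The function names are made up for illustration.

```javascript
// "würdest du lieber", with the umlaut written as an escape so the
// precomposed code point U+00FC is used.
const input = 'w\u00fcrdest du lieber';

// ASCII-only letters: anything outside A-Z / a-z is silently dropped.
const keepAsciiLetters = (s) => (s.match(/[A-Za-z ]/g) || []).join('');

// Unicode-aware letters: \p{L} matches letters in any script (needs the u flag).
const keepUnicodeLetters = (s) => (s.match(/[\p{L} ]/gu) || []).join('');

console.log(keepAsciiLetters(input));   // umlaut is lost
console.log(keepUnicodeLetters(input)); // umlaut is kept
```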
Text that is the focus of a modifier does not parse correctly if it contains a minus sign: (x-soft)[vol:'x-soft'].
Fix the grammar so that a wide range of characters can be used in the text.
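As a rough illustration of the intended behavior (not the actual Myna grammar), the text portion could be defined by exclusion: accept anything except the delimiter characters. TEXT_MODIFIER and parseTextModifier are hypothetical names used only in this sketch.

```javascript
// Sketch: the text part accepts every character except the delimiters
// themselves, so hyphens, slashes and digits pass through.
const TEXT_MODIFIER = /\(([^()\[\]]+)\)\[([^\[\]]+)\]/;

function parseTextModifier(markdown) {
  const match = TEXT_MODIFIER.exec(markdown);
  return match ? { text: match[1], modifiers: match[2] } : null;
}

console.log(parseTextModifier("(x-soft)[vol:'x-soft']"));
```

Text that itself contains parentheses, such as a telephone number with an area code in parentheses, is not handled by this simple pattern and would need a real grammar rule.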
There is an issue with the short emphasis format in this parser. I think short-format emphasis should be separated by whitespace on both sides in order to be picked up by the parser.
Currently, words separated by hyphens get wrapped in an emphasis tag.
loop-the-loop
becomes
loop<emphasis level="none">the</emphasis>loop
I think in this case it should not be picked up. It should only be wrapped with emphasis when the hyphens are set off by whitespace:
loop -the- loop
The same happens with the other short-format emphasis wrappers. I agree that they might not appear in real-world words, but words and sayings with hyphens are quite common and should not be affected.
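The proposed rule can be sketched with a regular expression that requires whitespace (or the start/end of the string) outside the hyphens. This is only an illustration of the rule, not the Myna grammar; SHORT_EMPHASIS and findShortEmphasis are hypothetical names.

```javascript
// Sketch: "-text-" only counts as short emphasis when the opening hyphen
// follows whitespace/start and the closing hyphen precedes whitespace/end.
const SHORT_EMPHASIS = /(?<=^|\s)-([^\s-](?:[^-]*[^\s-])?)-(?=\s|$)/g;

function findShortEmphasis(text) {
  return [...text.matchAll(SHORT_EMPHASIS)].map((m) => m[1]);
}

console.log(findShortEmphasis('loop-the-loop'));   // no emphasis found
console.log(findShortEmphasis('loop -the- loop')); // finds "the"
```

A closing hyphen followed directly by punctuation would not match under this rule, so a real fix would have to decide which trailing characters are allowed.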
Hi,
I'm using Amazon Polly via its API, and it seems the Alexa specification does not match.
I'd like to contribute some fixes but would like to ask if there's any preference or coding standard for?
standard|neural
?Matthew or Joanna voices, which are available only in American English (en-US), Lupe, in US Spanish (es-US) and Amy, in British English (en-GB). It is only supported when using Neural format.
Voice
information as it's not passed as SSML but as options when calling API?
Full feature information: https://docs.aws.amazon.com/polly/latest/dg/supportedtags.html
const fails = {
"audio-standard.spec.ts": `!["https://sample-dev.s3.amazonaws.com/path/18327803923/f7fb4173-4eab-46fc-80ec-020204a615f9.mp3?AWSAccessKeyId=AKXXXXXXXXXXXXXXXXXX&Expires=1596986208&Signature=VL6q9pYc8NTjf6gKVqN0Cem0WTA="]
Announcing Speech Markdown.`,
"disappointed-section.spec.ts": `#[disappointed] Hey there, nice to meet you`,
"disappointed-standard.spec.ts": `We can switch (from disappointed)[disappointed] to (really disappointed)[disappointed:"pizza"].`,
"dj-section.spec.ts": `#[dj] Hey there, nice to meet you`,
"emphasis-standard.spec.ts": `A (reduced)[emphasis:"reduced"] level`,
"excited-section.spec.ts": `#[excited] Hey there, nice to meet you`,
"excited-standard.spec.ts": `We can switch (from excited)[excited] to (really excited)[excited:"pizza"].`,
"interjection-standard.spec.ts": `(Wow)[interjection], I didn't see that coming.`,
"multiple-modifiers-same-text.spec.ts": `Your balance is: (12345)[number;emphasis:"strong";whisper;pitch:"high"].`,
"pitch-standard.spec.ts": `A (high)[pitch:"high"] pitch
A (high)[pitch:'high'] pitch`,
"prosody-multiple-modifiers.spec.ts": `Multiple modifiers on same (text)[vol;pitch;rate]`,
"sections-standard.spec.ts": `#[voice:'Brian'] Hey there, nice to meet you`,
"voice-standard.spec.ts": `Why do you keep switching voices (from one)[voice:"device"] to (the other)[voice:"kendra"]?
Why do you keep switching voices (from one)[voice:'device'] to (the other)[voice:'kendra']?`,
};
When you have more than one modifier on the same text:
Your balance is: (12345)[number;emphasis:"strong"].
The resulting SSML should return nested tags:
<speak>
Your balance is: <say-as interpret-as="number"><emphasis level="strong">12345</emphasis></say-as>.
</speak>
Various tests in multiple-modifiers-same-text.spec.ts
have been skipped. Need to fix the formatters for Alexa and Google Assistant to handle processing multiple tags.
The solution should include a list of valid SSML tags with a sortId so that the rendering of the SSML is consistent regardless of the order of the modifiers in Speech Markdown.
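A minimal sketch of the sortId idea follows. The table, the sortId values, and the renderNested function are assumptions for illustration; they are not taken from the existing formatters. A lower sortId means the tag is rendered further out.

```javascript
// Hypothetical tag table: sortId fixes the nesting order (lower = outer).
const TAGS = {
  number:   { sortId: 1, tag: 'say-as',   attrs: () => ({ 'interpret-as': 'number' }) },
  emphasis: { sortId: 2, tag: 'emphasis', attrs: (value) => ({ level: value }) },
};

function renderNested(text, modifiers) {
  // Wrap from the innermost tag outwards: highest sortId first.
  const ordered = modifiers
    .filter((m) => TAGS[m.key])
    .sort((a, b) => TAGS[b.key].sortId - TAGS[a.key].sortId);

  return ordered.reduce((inner, m) => {
    const { tag, attrs } = TAGS[m.key];
    const attrText = Object.entries(attrs(m.value))
      .map(([name, value]) => ` ${name}="${value}"`)
      .join('');
    return `<${tag}${attrText}>${inner}</${tag}>`;
  }, text);
}

// Both modifier orders produce the same nesting.
console.log(renderNested('12345', [{ key: 'emphasis', value: 'strong' }, { key: 'number' }]));
```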
The .ast
ending on the rules in SpeechMarkdownGrammar.ts
specifies which rules appear as nodes in the Abstract Syntax Tree (AST). Currently textModifier, textModifierText, and textModifierKey have .ast. Possibly another rule needs .ast so that each key/value pair appears together.
Use the following to see a text rendering of the AST:
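const smd = require('speechmarkdown-js');
const markdown = `Your balance is: (12345)[number;emphasis:"strong"].`;
const options = {
  platform: 'amazon-alexa'
};
// Pass the options to the constructor so they are not left unused.
const speech = new smd.SpeechMarkdown(options);
const tree = speech.toASTString(markdown);
// Print the text rendering of the AST.
console.log(tree);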
The result will be something like:
(document: (paragraph: (simpleLine: (plainText: Your balance is: ) (textModifier: (plainText: 12345) (textModifierKey: number) (textModifierKey: emphasis) (textModifierText: strong)) (plainText: .) (lineEnd: ))))
Tests for ipa are skipped.
When using markdown such as (pecan)[ipa:"ˈpi.kæn"]
these characters cause an error: ".", "ˈ", and "æ". There are likely others. Need to make sure the grammar plainText
is expanded to all characters supported by ipa.
Various tests in date-standard.spec.ts
are commented out.
Need to fix grammar to accept values with slashes. See SpeechMarkdownGrammar.ts
The tests in sections-standard.spec.ts
render sections in the Alexa formatter and nothing in the Google and Plain Text formatters.
Extra line breaks/new lines are rendered in the SSML and Plain Text output.
@rmtuckerphx
How do I use speechmarkdown.min.js?
Using build:minify,
I have created the single file browser/speechmarkdown.min.js.
Please give some examples, thanks!
Allow the following as a sub:
Visit our website at (www.speechmarkdown.org)[sub:"speech mark down dot org"].
The TypeScript code should be usable in Node.js from either JavaScript or TypeScript.
We also want to make the code runnable in a browser. Use the code in a component that accepts Speech Markdown in a text area and translates it to Plain Text or SSML in a read-only text area.
The SpeechMarkdown class or toSSML method accepts an options object. Add a dictionary element to options to allow for a dictionary of words or phrases that will be converted to a phoneme SSML tag if the platform supports it (Alexa) or a sub SSML tag otherwise (Google).
The structure would be something like this:
[
  {
    "text": "potato",
    "ipa": "pəˈteɪtəʊ",
    "sub": "poteytoh",
    "section": "food"
  }
]
See https://github.com/cellular/jovo-plugin-ssml
Need to decide if:
Could also specify a section for each word so only a subsection of the dictionary will be checked: (potato)[dictionary:"food"]
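To make the proposal concrete, here is a rough sketch of how such a dictionary might be applied. Nothing here exists in the library: applyDictionary, the supportsPhoneme flag, and the section option are hypothetical, and the matching is a plain whole-word replacement rather than anything grammar-aware.

```javascript
// Example dictionary in the structure proposed above.
const dictionary = [
  { text: 'potato', ipa: 'pəˈteɪtəʊ', sub: 'poteytoh', section: 'food' },
];

// Escape characters that have a special meaning in a regular expression.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Hypothetical: supportsPhoneme would be derived from the platform option.
function applyDictionary(text, entries, { supportsPhoneme = false, section } = {}) {
  return entries
    .filter((entry) => !section || entry.section === section)
    .reduce((output, entry) => {
      const tag = supportsPhoneme
        ? `<phoneme alphabet="ipa" ph="${entry.ipa}">${entry.text}</phoneme>`
        : `<sub alias="${entry.sub}">${entry.text}</sub>`;
      const wholeWord = new RegExp(`\\b${escapeRegExp(entry.text)}\\b`, 'g');
      // A replacer function avoids "$" in the tag being read as a pattern.
      return output.replace(wholeWord, () => tag);
    }, text);
}

console.log(applyDictionary('I like potato soup', dictionary, { supportsPhoneme: true }));
console.log(applyDictionary('I like potato soup', dictionary, { section: 'food' }));
```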
Various tests in fraction-standard.spec.ts
are commented out.
Need to fix grammar to accept values with slashes. See SpeechMarkdownGrammar.ts
The markdown includes special characters that help form constructs such as:
Text modifier - (text)[key:'value';key;key:'value']
Header - #[ key:'value'; key:'value']
Audio - !['url']
When these special characters appear in non-markdown text or in the text portion of a text modifier, they are not rendered even though they should be.
For example, the following Speech Markdown:
This is text with (parens) but this and other special characters: []()*~`@#\\_!+- are ignored
Is currently being rendered as:
<speak>
This is text with parens but this and other special characters: are ignored
</speak>
When instead it should be rendered as:
<speak>
This is text with (parens) but this and other special characters: []()*~`@#\\_!+- are ignored
</speak>
Hi,
Do you have a formatter or technique for either removing only the Speech Markdown (SM) markup from a Markdown-based document, or for letting external Markdown interpreters filter or lint the SM markup?
The idea is that I'd rather not duplicate the text of a Markdown document just to add SM markup. Instead, I'd like to use a single SM+MD text to drive both the UI/web and speech.
Any ideas how I could do this?
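One possible direction, sketched under heavy assumptions: strip only the (text)[modifier] construct with a regular expression before handing the document to a Markdown renderer. stripTextModifiers is a hypothetical helper, not a library function, and it ignores section headers, audio tags, and breaks.

```javascript
// Sketch: replace "(text)[modifiers]" with just "text". Markdown links use
// the opposite order, "[text](url)", so they are left untouched.
function stripTextModifiers(document) {
  return document.replace(/\(([^()]+)\)\[[^\[\]]+\]/g, '$1');
}

const mixed =
  'Visit (www.speechmarkdown.org)[sub:"speech mark down dot org"] or read the [docs](https://example.com).';
console.log(stripTextModifiers(mixed));
```

A parenthesized phrase directly followed by a bracketed Markdown reference would be misread by this pattern, so it is a starting point rather than a safe filter.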
Based on document https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup
I think it's useful and mostly compatible with Alexa, with small differences. The platform-specific features are excluded, of course.
I'm trying to draft in https://github.com/boltomli/speechmarkdown-js/tree/add-microsoft-azure-ssml
The package.json
limits the Node.js engine to below Node 11. This has some impact on our CI / CD pipelines as some of them are already running on Node 11 / 12. Is there any need for that limitation?
@rmtuckerphx
How can I use this in Python?
Thanks
When a date is formatted with dashes, the dashes are still treated as emphasis markers.
So when using toText
the dashes get stripped.
So 2020-10-10
becomes 20201010
, which is quite unexpected.
Note that this is only an issue for numbers; normal words are fine. Mother-in-law
for instance stays untouched.
Would it be an option to treat digits as word characters as well?
There is a bug: when Speech Markdown formatters are nested inside one another, only the inner formatters get parsed, while the outer formatters are left as plain text.
Example:
(break after this [0.2s] another break after this [0.2s])[rate:"slow"]
is parsed to:
break after this <break time="0.2s"/> another break after this <break time="0.2s"/>rate:"slow"
As you can see, the prosody
tag is not added and the formatter text remains. If I remove the breaks from the text:
(break after this another break after this)[rate:"slow"]
it resolves to
<prosody rate="slow">break after this another break after this</prosody>
When used alone, prosody works.
I tried putting other formatters, not only breaks, inside formatters, and they didn't work correctly either.
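The expected result can be illustrated with a two-pass sketch: render the inner breaks first, then the outer modifier. This is not how the parser works internally; renderNestedRate is a hypothetical function that only understands time breaks and the rate modifier.

```javascript
// Sketch: inner constructs first, outer construct second.
function renderNestedRate(markdown) {
  // Pass 1: "[0.2s]" or "[200ms]" becomes a break tag.
  const withBreaks = markdown.replace(
    /\[(\d+(?:\.\d+)?(?:ms|s))\]/g,
    '<break time="$1"/>'
  );
  // Pass 2: "(text)[rate:"value"]" becomes a prosody tag around the text.
  return withBreaks.replace(
    /\(([^()]*)\)\[rate:"([^"]+)"\]/g,
    '<prosody rate="$2">$1</prosody>'
  );
}

const nested = '(break after this [0.2s] another break after this [0.2s])[rate:"slow"]';
console.log(renderNestedRate(nested));
```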
The audio tag supports both double and single quotes:
!['https://www.speechmarkdown.org/test.mp3'] // YES
!["https://www.speechmarkdown.org/test.mp3"] // YES
But other tags such as emphasis, only support double quotes:
(text)[emphasis:"strong"] // YES
(text)[emphasis:'strong'] // NO
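For consistency, the value rule could accept either quote style as long as the opening and closing quotes match. The sketch below shows that idea with a backreference; KEY_VALUE and parseModifier are hypothetical names and this is not the Myna grammar.

```javascript
// Sketch: key:"value" or key:'value'; \2 forces the closing quote to be
// the same character as the opening one.
const KEY_VALUE = /^(\w+):(["'])(.*?)\2$/;

function parseModifier(modifier) {
  const match = KEY_VALUE.exec(modifier);
  return match ? { key: match[1], value: match[3] } : null;
}

console.log(parseModifier('emphasis:"strong"'));
console.log(parseModifier("emphasis:'strong'"));
```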