jankapunkt / easy-speech

🔊 Cross browser Speech Synthesis also known as Text to speech or TTS; no dependencies; uses Web Speech API

Home Page: https://jankapunkt.github.io/easy-speech/

JavaScript 100.00%

web text-to-speech browser cross-browser web-speech-api hacktoberfest javascript tts js speech-synthesis

easy-speech's Introduction

Hey there 👋

My name is Jan Küster and I am all about JavaScript and Meteor

I love to work on projects that are about education. I currently work on lea.online, a diagnostic and learning platform for German adults with low to very low literacy skills. The target group is employed adults who are limited in their career paths due to low fundamental literacy skills.

I am also a constant contributor around Meteor and its package ecosystem. Recently I joined up with Storyteller CZ and Alim S. Gafar for the weekly Meteor.js Community Podcast on YouTube.

Co-/maintaining Orgs

Big thanks to my sponsors!

Social

Stack

OS
Tools
Backend
Frontend
Testing
Mobile

Read

These are curated lists of some of my articles on dev.to, the platform where I regularly publish insights, summaries and findings from my recent development work.

Articles on Meteor

Beginners / Fundamentals

Why choose Meteor (or not) for your next project?

Meteor 5 minutes setup for newcomers

Async Meteor Method calls

Write isomorphic code in MeteorJs

Reviving an ancient Meteor.js project in 10 minutes 🦖

Meteor Blaze - single state helper for ReactiveDict

Tutorials and Guides

Gradually upgrading a Meteor.js project to 3.0

Prepare your Meteor.js project for the big 3.0 release!

Meteor and React Native - Create a native mobile app

Easy CRUD setup with Meteor in only 4 steps from scratch

Transform any Meteor App into a PWA

Meteor and standard lint

Bootstrapping an Admin account in Meteor

Advanced

Plugin architecture with Meteor

Microservices with Meteor

Meteor browser bundle and Node-Stubs - beware what you import

MeteorJs and Meteor-Up MongoDB migration to a new major version 🛠️

General articles on JavaScript

Fundamentals

How to mess up your JavaScript code like a boss (THIS IS A MUST READ FOR ANY JS DEV!)

All your JavaScript code is polluted

Prevent infinite loops in JavaScript

ES6 classes with private members

My opinionated JavaScript package template repository - zero config, start immediately

How to write a great switch statement in JavaScript

How stringify Proxy to JSON

Stubbing jQuery with sinon

Create a JavaScript Promise with timeout

Text, Fonts

Make text fit its parent size using JavaScript

Converting my OTF font into multiple web fonts with this bash script

Coding Puzzles

Let's solve a one-liner code puzzle

One-Liner Puzzle - Fill Array with indices

JavaScript puzzle oneliner - get unique values from list

JavaScript code puzzle - can you call this function?

Support my work and articles

GitHub Sponsors: https://github.com/sponsors/jankapunkt
PayPal: https://paypal.me/kuesterjan
Books I want to read (Amazon wishlist): https://www.amazon.de/hz/wishlist/ls/12YMIY0QNH9TK?ref_=wl_share

easy-speech's People

Contributors

Stargazers

Watchers

Forkers

nomiscientist jj449 ellukesmith rachitgarg-12 gopal2021025 andre-dietrich azizulhasan k2m5t2 sajtempler pierophp boogh46 channyeintun hoangbinh alexlana mhsadique ivoryguard

easy-speech's Issues

FAQ Section

Is your feature request related to a problem? Please describe.

We don't have a clear and understandable FAQ

Describe the solution you'd like

We should have some FAQ.md that is in the top-level of the repository. It should explain some important concepts and provide answers to common questions.

Most importantly, it should clearly show what is OS-level (and thus cannot be influenced by this library), what is browser-specific, and what can be influenced by this library.

Describe alternatives you've considered

Using the wiki, but that is problematic as it's not shipped with the NPM package, while the FAQ can be shipped and is offline-readable.

We should also link any issue that is related to this in the FAQ, no matter whether it's open or closed.

Additional context

General topics

why does this library exist?
can I load my own voices?
my voices sound terrible
there is no voice spoken
can I include a cloud service with this library?

OS-specific topics

These are common issues that may arise but are locked on the OS-level so we can't do a thing about it.

no voices at all (Chromium on some Linux)
specific voices missing although installed on OS-level (iOS 16)
voices gone or changed after update (macOS, iOS)
voices sound bad; bad quality (often in Firefox and Chromium on Linux systems)

Good practices to tackle them are

implement a fallback mechanism using certain voices (for example, Chrome always ships some Google voices in multiple languages with acceptable quality)
fail on missing voices and display messages to users with guides on how to install the voices
what else?
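
The first two practices could be combined roughly as follows. This is only a minimal sketch and not part of easy-speech: `resolveVoice` and the preference list are hypothetical, and the voices are assumed to be objects with a `name` property, as returned by `speechSynthesis.getVoices()`.

```javascript
// Hypothetical helper (not an easy-speech API): return the first preferred
// voice that is actually installed, otherwise fail with a message that can
// be shown to the user together with an installation guide.
function resolveVoice (voices, preferredNames) {
  for (const name of preferredNames) {
    const match = voices.find(voice => voice.name === name)
    if (match) return match
  }
  throw new Error(
    'None of the expected voices is installed: ' + preferredNames.join(', ') +
    '. Please install a voice in your operating system settings.'
  )
}

// Possible usage in a browser (the voice names are examples only):
// const voice = resolveVoice(
//   window.speechSynthesis.getVoices(),
//   ['Google UK English Male', 'Google US English']
// )
```

Matching by exact name keeps the fallback order explicit; the caller decides how to present the error.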

Browser-specific topics

These are issues that are specific to the browser and may only be changed by the vendor

speechSynthesis or other crucial classes not available in window (for example in Opera)
speech is automatically cancelled after x seconds or words (Chrome)
pause and stop do the same thing (Android)
what else?

Voice-specific topics

Google voices fail when the text exceeds a certain length (#227)

Provide a pause() and resume() workaround for Android Browsers

Hi @jankapunkt thanks for your great work here in this project.

The issue I'm filing here is more about speechSynthesis itself than about the EasySpeech library. I'd just like to know whether you or anyone else has a clue about what can be done in this particular case.

Problem is that the following snippet works well on Desktop Chrome but not on Mobile Chrome (Android):

var to_speak = "The text we want browser to pronounce, long enough to be able to pause it.."
var utterance = new SpeechSynthesisUtterance(to_speak)
utterance.lang = 'en-US'
window.speechSynthesis.speak(utterance)

// Then pause and resume:

window.speechSynthesis.pause()
window.speechSynthesis.resume()

According to this comment:

easy-speech/index.js

Line 797 in 045f6aa

// exec pause on Android causes speech to end but not to fire end-event

..it looks like this is a known issue.

But even using window.speechSynthesis.cancel() I could not work around that.
I'm fairly convinced that, once this can be worked around, it will be possible to support pause and resume through the EasySpeech.pause() and EasySpeech.resume() abstractions.

Speech Synthesis on Mobile Edge Browser Broken

[x] I have read the FAQ section and have checked
that none of it describes the issue I'm about to report.

Describe the bug
Since the goal of this library is cross-platform and cross-browser speech synthesis, I was expecting it to work on mobile MS Edge as well. (All other tested browsers are just fine: mobile Chrome and Firefox; desktop Chrome, Firefox and Edge.)

However, even the demo sample does not (or no longer?) work with mobile Edge since no voices are found.

To Reproduce
Steps to reproduce the behavior:

Go to the live demo
Try to select a voice => not possible
Try to speak any text => not possible due to no voice
See error log on the right hand side and status

Expected behavior
Working speech synthesis on mobile Edge.
This might very well also be a browser issue - maybe you have more insights on that?

Screenshots
N/A

Smartphone (please complete the following information):

Device: Samsung Galaxy A52
OS: One UI 6 / Android 14
Browser: Microsoft Edge on Android
Version: 123.0.2420.74

Additional context
Yes, I read the FAQ.
Yes, I installed voices on Android system level.
Still, Edge does not respect these settings.

I found at least one more Reddit thread addressing the very same issue - it used to work in the past but is broken now:
https://www.reddit.com/r/edge/comments/1498gcn/any_idea_why_on_android_edge_the_voices_from/

MDN example is down as well:
https://mdn.github.io/web-speech-api/speak-easy-synthesis/

I really hope to get more insights here. Thanks in advance!

Is there any way of setting the language?

First, cool project!

Second, after reading about it, I was expecting something like

EasySpeech.speak({ text: "Bonjour mon ami", language: "fr" })

to work out of the box, but it does not seem to be a feature.

Or am I missing something?

I was really attracted by this: "It will not only detect, which features are available but also loads an optimal default voice, based on a few heuristics" that I read here: https://dev.to/jankapunkt/cross-browser-speech-synthesis-the-hard-way-and-the-easy-way-353
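
Until such an option exists, one way to get the same effect is to pick a voice by its language tag and pass it as `voice`. The following is a sketch under assumptions, not the library's own behaviour: `findVoiceByLanguage` is a hypothetical helper, and `voice.lang` is assumed to hold a tag such as `fr-FR` (underscore variants such as `fr_FR` are normalized as a precaution).

```javascript
// Hypothetical helper (not an easy-speech API): find a voice for a language.
function findVoiceByLanguage (voices, language) {
  const normalize = tag => tag.replace('_', '-').toLowerCase()
  const wanted = normalize(language)
  const base = tag => normalize(tag).split('-')[0]
  // exact tag first ("fr-fr"), then any voice sharing the base language ("fr")
  return voices.find(voice => normalize(voice.lang) === wanted) ||
    voices.find(voice => base(voice.lang) === base(wanted)) ||
    null
}

// Possible usage in a browser, passing the result as the `voice` option:
// const voice = findVoiceByLanguage(window.speechSynthesis.getVoices(), 'fr')
// if (voice) EasySpeech.speak({ text: 'Bonjour mon ami', voice })
```

Returning `null` instead of throwing leaves it to the caller whether to fall back to a default voice or to report the missing language.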

`boundary` event doesn't fire on Chrome.

[x] I have read the FAQ section and have checked
that none of it describes the issue I'm about to report.

Describe the bug
The boundary event doesn't fire on Chrome, whereas it works perfectly on Edge.

To Reproduce

EasySpeech.speak({
  text: myText,
  voice: myVoice, 
  pitch: 1,
  rate: 1,
  volume: 1,
  // there are more events, see the API for supported events
  boundary: event => {
    console.log(event);
  },
  end: event => {
    this.startIndex = null;
    this.endIndex = null;
  }
});

Expected behavior
It should work properly across browsers.

Desktop (please complete the following information):

OS: Windows
Browser: Chrome
Version: 122.0.6261.95

Find a way for robots/crawlers to not cause an error due to init fails

Robots/crawlers do not seem to support the Web Speech API and thus may raise errors. We need a solution to get around this, either via code or via documentation. A user-side implementation is favoured over changes to the internals.
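
A user-side guard could look roughly like the sketch below. It is an illustration only: `supportsSpeechSynthesis` is a hypothetical name, and the check deliberately ignores vendor-prefixed variants of the API.

```javascript
// Hypothetical guard (not an easy-speech API): true only if the given global
// scope exposes the parts of the Web Speech API that synthesis relies on.
function supportsSpeechSynthesis (scope) {
  return Boolean(
    scope &&
    typeof scope.speechSynthesis === 'object' &&
    scope.speechSynthesis !== null &&
    typeof scope.speechSynthesis.speak === 'function' &&
    typeof scope.SpeechSynthesisUtterance === 'function'
  )
}

// Possible usage: only initialise speech when the check passes, so that
// crawlers without speech support never reach the failing code path.
// if (supportsSpeechSynthesis(globalThis)) { /* initialise speech here */ }
```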

Add header

Add a header to your site:
The header includes the name of the site, like the ones given below.

Does not work on KaiOS

Hi, this is a really, really, really great project. We use it now in LiaScript to generate the default TTS output for courses. I tested it on KaiOS, which seems to have TTS support with the standard API and good voices (the demo from MDN works fine). However, easy-speech does not support this OS/browser.

Maybe it would be possible to add another system type, "unknown", next to android, safari, etc., that allows access to the API whenever the browser supports TTS and also offers some voices?

I know, why support edge-case feature phones? But they have a larger market share in India than the iPhone ...

https://www.kaiostech.com/developers/

It would be great to add this support to easy-speech ...

Kind regards ...

Any voices with "(Natural)" fail to speak

Describe the bug
Any voices with "(Natural)" in their name fail to speak.

To Reproduce
Steps to reproduce the behavior:
In your online demo, pick any voice with "(Natural)", such as "Maisie". Nothing is spoken, while the other, non-natural voices work.

I tested on Microsoft Edge v.121.

Expected behavior
All voices can be used.

Desktop (please complete the following information):

OS: Windows 11
Browser: Edge
Version: 121

I tried the following

// Get the available voices (the list may be empty until the browser has loaded them)
const voices = window.speechSynthesis.getVoices();

const textToSpeak = "This project was created, because it's always a struggle to get the synthesis part of Web Speech API running on most major browsers.";
// Create a new SpeechSynthesisUtterance object
const utterance = new SpeechSynthesisUtterance(textToSpeak);
// Find the voice whose name includes "Maisie", or fall back to the first available voice
const maisieVoice = voices.find(voice => voice.name.includes("Maisie")) ?? voices[0];

// Set the voice and other parameters
utterance.voice = maisieVoice;
utterance.pitch = 1;
utterance.rate = 1;

// Speak the text
window.speechSynthesis.speak(utterance);

in https://playcode.io/1699188. The voice works well.

So this must be a new bug, because a few months ago every voice worked.

`onend` event doesn't work on Safari

Is your feature request related to a problem? Please describe.
The end event is messy in Safari.

Describe the solution you'd like
By adding window. before the variable utterance it works perfectly in Safari.

I worked on it in a fork; if you think it can be helpful, I would love to contribute.

Add Type Declarations to the project

I use TypeScript.

It would be nice to add type declaration files (.d.ts) to the project.

I'll open a PR with that.

Max Length for Text to read in some voices?

I noticed (even in your demo) that the Google voices (e.g. Google UK English Male) can't read very long text.

When I truncate my text to 4053 characters it works. With 4054 it doesn't.

Here's the text: (4054 characters long. To make it work, just remove one character)

How forward-thinking marketing leaders are setting the course for growth 600+ senior marketers shared their take on the state of marketing today. We’ve distilled their insights here. Deep dive into emerging trends shaping the market and consumer behavior, and the strategies set to leverage them for growth. Uncover how AI and machine learning can play into marketing, the importance of prescriptive analytics and data-driven decision-making, how consumers’ drive towards sustainability interacts with brand purpose, and when to prioritize long-term vs short-term investments. Reasons to be cheerful: marketing’s emerging optimism in the face of adversity Macroeconomic and geopolitical risks continue to present a challenging scenario for businesses. But emerging technologies, diverse revenue streams and changes in consumer behavior all present opportunities for positive change. GfK’s new research shows that 78% of CMOs are optimistic about the future. And progressive, forward-thinking marketing leaders have every reason to be upbeat. Leading-edge technology is elevating the function’s customer-facing and back-office operations. Real-time insights and prescriptive data analytics are gaining traction. Investment in AI and related innovative technology is growing as use cases proliferate. And as ESG concerns move to the forefront of board discussions, CMOs have a chance to become champions of the consumer-critical sustainability agenda. Each holds importance in the boardroom and offers CMOs a platform to increase their strategic influence. Indeed, they have a unique opportunity to position themselves ever closer to the heart of their organization and instigate positive change. “I see a massive correlation among CMOs that embrace technology and those that are optimists,” says Gonzalo Garcia Villanueva, global CMO at NielsenIQ. 
“We’re also seeing companies better understand the need to strategically invest in their brand for long-term growth – and this extends to sustainability too. Purpose and the bottom-line are converging and this is a positive for marketers.” 78% of senior marketers say they are optimistic about the future Optimism across markets Certain sectors continue to prove their resilience despite ongoing economic uncertainty. The technology and durables market is representative of a new post-Covid, digital buying cycle, likewise the retail sector. Both are bouncing back as consumers look to make purchases postponed during the pandemic. “Consumers are making more time to evaluate what they spend their money on,” reports one senior marketing leader at L’Oréal. “This is an opportunity for ecommerce because it enables consumers to survey their spending. As a brand, we have a duty to ensure that they have all the information they need.” Market leaders have adjusted by investing in digital-first activations and campaigns. This strategy builds closer connections with consumers in the virtual arenas they increasingly inhabit. It also offers better customer experiences that deliver long-term value. These can be further optimized through data insights and generative AI that underpin greater personalization and automation. “Marketers are increasingly leading the charge towards data maturity, becoming ever more sophisticated in how data is used,” says Gonzalo Garcia Villanueva, global CMO at NielsenIQ. “Advanced data capabilities that help organizations make better decisions to prepare for whatever comes next will be a key marker of success in the future.” CMOs that can make the business case for adopting analytics and more innovative technology - and then make the most of it - will be in an enviable position. Those that can’t, will find themselves trapped in a cycle of reactive, short-term activity that erodes their margins and brand value over time. 
Advanced data capabilities that help organizations make better decisions to prepare for whatever comes next will be a key marker of success in the future. Most marketers show impressive confidence in their ability to justify marketing investment

Is this a known thing?
Is there any variable I can check to get this max length?
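
A common way to stay below such a limit is to split the text and speak the parts one after another. The sketch below is an assumption-based illustration: `chunkText` is a hypothetical helper, and the 4053-character threshold reported above is an observation from this report, not a documented constant, so a much smaller `maxLength` is probably the safer choice.

```javascript
// Hypothetical helper (not an easy-speech API): split text into chunks of at
// most maxLength characters, preferring sentence ends, then spaces.
function chunkText (text, maxLength) {
  const chunks = []
  let rest = text.trim()
  while (rest.length > maxLength) {
    const head = rest.slice(0, maxLength)
    // position right after the last sentence-ending punctuation, or 0 if none
    let cut = Math.max(
      head.lastIndexOf('. '),
      head.lastIndexOf('? '),
      head.lastIndexOf('! ')
    ) + 1
    if (cut <= 0) cut = head.lastIndexOf(' ') // fall back to the last space
    if (cut <= 0) cut = maxLength // no whitespace at all: hard cut
    chunks.push(rest.slice(0, cut).trim())
    rest = rest.slice(cut).trim()
  }
  if (rest.length > 0) chunks.push(rest)
  return chunks
}

// Possible usage: speak the chunks sequentially, starting each one only
// after the previous utterance has ended.
// const parts = chunkText(longText, 1000)
```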

DEMO - add option to list voices from all languages

The demo should include the option to list voices from all languages

DEMO Website - sliders don't apply

On the DEMO users can apply sliders to change rate, pitch and volume. It seems they don't work correctly.
Demo code is in the docs folder.

resumeInfinity fix breaks speech in Chrome on macOS

I'm currently running Chrome version 105.0.5195.102 on macOS, and the resumeInfinity fix causes certain strings to hold up the queue. Disabling the fix seems to allow it to run correctly, with both shorter and longer strings.

I'm not sure on the reason, besides the fact that a quick pause and resume seems to prevent any 'end' events from firing, preventing the next item in the queue from starting. It also seemed to be more of an issue with slightly longer strings - I assume resumeInfinity only gets called when processing a longer string?

How can I add this library as a script tag?

I want to add this library as a script tag in the HTML; I don't use npm. How can I do it? Something like
<script src="./easy-speech.min.js"></script>
Thanks!

MacOS Firefox - all loaded, events dispatch but no sound can be heard

Safari needs help if speech is delayed after user interaction

Interesting discovery with Safari, and possibly other browsers on iPad. It seems that if the speech is delayed after the user interaction (i.e. click), the speech won't work.
For example, if the first action is some audio from the Web Audio API, followed by speech, the speech won't play even if it is part of the same call stack.
The solution, thanks to https://github.com/scp93ch, seems to be to quickly play a volume=0 speech before the music or other audio, and this seems to "prime the pump" so later the speech will work. Maybe something to implement directly in your library, but either way I figured I'd let you know.
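
The priming trick described above might look like the following sketch. It is not taken from the library or from the linked author's code: `primeSpeech` is a hypothetical helper, the utterance text is arbitrary, and whether this unlocks later speech in a given Safari version is an assumption based on this report.

```javascript
// Hypothetical helper (not an easy-speech API): speak a silent utterance.
// It should be called synchronously inside the click or tap handler, before
// any other audio starts.
function primeSpeech (synth, Utterance) {
  const silent = new Utterance('ok') // arbitrary short text
  silent.volume = 0 // inaudible; only meant to unlock later speech
  synth.speak(silent)
  return silent
}

// Possible usage in a browser:
// button.addEventListener('click', () => {
//   primeSpeech(window.speechSynthesis, window.SpeechSynthesisUtterance)
//   // ...start music or other audio, speak the real text later
// })
```

Passing the synthesizer and the utterance constructor as arguments keeps the helper free of global lookups, which also makes it easy to exercise outside a browser.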

Register / integrate with multiple CDN

Is your feature request related to a problem? Please describe.
I'd like this library to be usable with as many CDN services as possible

Describe the solution you'd like
Prepare to use with at least the following CDN providers:

unpkg
jsdelivr
cdnjs
esm.sh
jspm

This may include additional (non-breaking!) changes in the package.json

On top of that, the docs should extend the installation guide with the CDN options.

TypeScript types don't work out of the box

After installing and importing the package, I get:

There are types at '.../node_modules/easy-speech/dist/index.d.ts', but this result could not be resolved when respecting package.json "exports". The 'easy-speech' library may need to update its package.json or typings.

I'm not entirely sure, but I think changing the exports field to this should work:

"exports": {
  ".": {
    "import": {
      "types": "./dist/index.d.ts",
      "default": "./dist/EasySpeech.js"
    }
  }
},

Chrome on Ubuntu loads voices only after page refresh when TTS is used on the very first page

Close all Chrome windows, then open the page and try to get voices: no voices are loaded and onvoiceschanged never dispatches.
Refresh the page: the voices load and onvoiceschanged is dispatched.

EasySpeech.speak stops currently playing speech

Hello.

The speak method of the W3C SpeechSynthesis API is responsible for queuing a SpeechSynthesisUtterance for text-to-speech (TTS). Therefore, window.speechSynthesis.speak(utterance) allows new text to be played without interrupting the currently playing speech.

However, when the EasySpeech.speak method is called, the current speech stops and the new text is played.

It seems that EasySpeech.speak should operate in a way similar to window.speechSynthesis.speak, adding the new SpeechSynthesisUtterance to the queue without interrupting the currently playing speech.

Or, is there a way for EasySpeech.speak to not interrupt the currently playing speech?
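
One way to get queue-like behaviour on the caller's side is to chain the calls. The sketch below assumes that the speak function returns a promise that settles when the utterance has finished; `createSpeechQueue` is a hypothetical helper and not part of easy-speech.

```javascript
// Hypothetical helper (not an easy-speech API): wrap a promise-returning
// speak function so that calls run one after another instead of
// interrupting each other.
function createSpeechQueue (speak) {
  let tail = Promise.resolve()
  return function enqueue (options) {
    // start only after the previous utterance has settled,
    // whether it ended normally or failed
    const result = tail.then(() => speak(options))
    tail = result.catch(() => {})
    return result
  }
}

// Possible usage, assuming EasySpeech.speak returns such a promise:
// const enqueue = createSpeechQueue(options => EasySpeech.speak(options))
// enqueue({ text: 'First sentence.' })
// enqueue({ text: 'Second sentence.' })
```

Each call still returns its own promise, so callers can react to the end or failure of an individual utterance without affecting the rest of the queue.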

Microsoft voices only loading on browser refresh in Chrome on Windows

[X] I have read the FAQ section and have checked
that none of it describes the issue I'm about to report.

Describe the bug

On the initial load of the page in the latest Chrome on Windows 10 (not mobile), all the available Google voices load, but not the 3 Microsoft voices. But if users refresh the page, the 3 Microsoft voices appear.

Additional context

(https://longislandcw.github.io/morsebrowser/easysp23/).
Click/expand the More Settings accordion -> check expert settings, check voice, and then see the list in "choose speaker..." (note that all non-English Google voices are hidden from the user).

Brave and Edge seem OK, i.e. the Microsoft voices load without the need to refresh.

Voice codes changed

Describe the bug
Voice codes changed. My language mapping is no longer working. It's not really an issue as I can edit my mapping, but if this keeps changing, then it might be an issue in the future.

To Reproduce
...

Expected behavior
...

Screenshots
This is my language mapping before.

Desktop (please complete the following information):

OS: macOS
Browser: Chrome
Version: 117

Additional context
Not sure why the codes would change, as they're standardized by the Web Speech API, aren't they?
This is the correct mapping as of today.

The voice has been updated so this is no longer an issue. Just wondering how the voice code mapping changed. Was this changed on the Web API side or by this library?

Playback speed (normal rate=1.0) faster on M1 Mac than on other laptops

Something we've noticed while testing EasySpeech is that the same content plays back at a faster rate on an M1 Mac than in the exact same test on other laptops (given the exact same TTS settings). Has anyone else seen this?

Build for multiple targets

EasySpeech should be available for multiple targets (umd, amd, esm etc.).

Add timeout in case onvoiceschanged never dispatches

Currently on FF100 64bit Ubuntu the onvoiceschanged event never gets dispatched. In such a case we need a configurable timeout that allows execution to continue or loading to be cancelled.
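
Such a timeout could be sketched as a polling loop that does not depend on the event at all. This is an illustration under assumptions, not the library's implementation: `loadVoices`, `timeout` and `interval` are hypothetical names, and `synth` is assumed to behave like `window.speechSynthesis`.

```javascript
// Hypothetical helper (not an easy-speech API): poll getVoices() until voices
// are available or the timeout has elapsed, without waiting for
// onvoiceschanged to fire.
function loadVoices (synth, { timeout = 5000, interval = 250 } = {}) {
  return new Promise((resolve, reject) => {
    const started = Date.now()
    const check = () => {
      const voices = synth.getVoices()
      if (voices.length > 0) return resolve(voices)
      if (Date.now() - started >= timeout) {
        return reject(new Error('No voices loaded within ' + timeout + ' ms'))
      }
      setTimeout(check, interval)
    }
    check()
  })
}

// Possible usage in a browser:
// loadVoices(window.speechSynthesis, { timeout: 3000 })
//   .then(voices => console.log(voices.length + ' voices loaded'))
//   .catch(error => console.warn(error.message))
```

Rejecting on timeout lets the caller decide whether to continue without voices or to cancel loading.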

EasySpeech returns different voices set for different users

First of all, this is not an EasySpeech-specific issue. I have the same issue even when I use "window.speechSynthesis.getVoices()" directly.

On the same PC with the same Edge browser, when different users get the voices list, one gets 326 while another gets 48. The 48 are a subset of the 326.

It puzzled me. For some reason, some online voices seem only available to some users. But I still couldn't figure out the reason. Of course, I have no solution yet.

Any ideas?

Speak when voices won't load but speech synth and utterance are available (Brave shields up issue)

When Brave shields are up, the list of voices is never seen.
Nonetheless, an utterance can speak.
.speak may need some option like "do it anyway!" that applies even when the voices array is empty.
