talk-to-chatgpt's Issues

the prompt is not sent automatically

The checkbox to send the message automatically is checked, but the extension doesn't manage to send the message. The extension probably needs an update.

'Respond' word

Hello, great extension.

Just like you have a 'Stop' word and a 'Pause' word, could you also add an optional 'Respond' word?
The way it is now, you always feel pressure to say the whole thing without pausing. That is uncomfortable and sometimes impossible, because you may need to pause mid-sentence to think about what you want to say, and the extension then sends the incomplete sentence to ChatGPT.

But with a 'Respond' word, you can pause as long and as often as you want. When you are done, you say your 'Respond' word and ChatGPT responds. It is like people talking over walkie-talkies, who say "Over" to mean "I'm done with my transmission, ready for your reply".
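A minimal sketch of how such a 'Respond' word could behave, assuming the extension receives one final transcript per recognized phrase. `createRespondBuffer` and its `push` method are hypothetical names for illustration, not part of the extension:

```javascript
// Hypothetical helper: collects recognized phrases until the respond word is heard.
function createRespondBuffer(respondWord) {
  const wanted = respondWord.trim().toLowerCase();
  let parts = [];

  return {
    // Feed one final transcript. Returns the full message once the respond
    // word ends the transcript, otherwise null (keep listening).
    push(transcript) {
      // Drop surrounding whitespace and trailing punctuation such as "Over."
      const cleaned = transcript.trim().replace(/[.!?,]+$/, "");
      const lower = cleaned.toLowerCase();

      if (lower === wanted || lower.endsWith(" " + wanted)) {
        const before = cleaned.slice(0, cleaned.length - wanted.length).trim();
        if (before) parts.push(before);
        const message = parts.join(" ");
        parts = [];
        return message;
      }

      if (cleaned) parts.push(cleaned);
      return null;
    },
  };
}

const buffer = createRespondBuffer("over");
buffer.push("I was wondering");            // null, still listening
buffer.push("what the weather is like");   // null, still listening
console.log(buffer.push("in Paris over"));
// prints: I was wondering what the weather is like in Paris
```

With this shape, pauses of any length are harmless: nothing is sent until a phrase ends with the respond word.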

Unable to hear the voice from ChatGPT and other issues

At first, I was able to use the tool freely and explore all kinds of voices in the settings. But once I tried "Stop" with my voice, it stopped and could no longer generate a voice for ChatGPT (it was not muted). Voice input still works, but I have to press ⏩ every time to input my voice. I tried uninstalling and reinstalling the tool, but the issue stays the same. PLEASE HELP!! This is an awesome tool for me and other language learners and I want to keep using it! Thanks

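The ChatGPT web page is hard to use; I would like to use my own API

Because so many people are using it, the ChatGPT web page works poorly: after only two or three exchanges, the page must be refreshed before the conversation can continue. I hope voice conversations with ChatGPT can be made to work locally through the API. This could be a script, an app, or something else.
Thank you for developing this browser extension.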

Really appreciate your Chrome Extension

It's running smoothly. First of all, thanks.

I was wondering if you could add support for a personal GCP/Azure API key, to get more natural-sounding voices.

One example is AwesomeTTS, an Anki plugin that does exactly what I am proposing.
https://github.com/AwesomeTTS/awesometts-anki-addon

Maybe you could just copy AwesomeTTS's business model and charge a fee to people who don't want to manage a cloud API but still want more natural-sounding voices for ChatGPT.

doesn't work with chinese.

It doesn't read Chinese out loud if the AI responds in Chinese, which is kind of sad because I like Chinese :).

for example:

image

will not be read out loud. I have the settings like that:

image

Option for Amazon Polly

I have pulled down the project and attempted to add this myself. Amazon's Polly API generates very pleasant text-to-speech.

I wish there were an option for adding an AWS API key and selecting an Amazon Polly voice.

Polly Voices

Hello,

I use the voices from Amazon Polly (AWS) because the quality is very good, but I can't use them with the talk-to-chatgpt extension because only Google and Microsoft voices are listed.

Do you think it would be possible to add Amazon Polly to the voice list in the settings in the near future?

Thank you.

VOICES

Pop-up

Hello, I'm using this extension, but I have a problem with a pop-up that asks me to confirm sending the message. I tried to disable it in Chrome, but nothing changed: it still asks me to confirm before sending.

How to modify the code to run TTS on the entire paragraph directly, without splitting the text into sentences?

Really appreciate your Chrome Extension, I LOVE IT!

There is one question that has been bugging me. I know that, in order for the speech synthesis to start speaking as soon as possible, you split the text into sentences and read them aloud one by one.

But this leads to huge pauses between sentences.

How can we modify the code so that it runs TTS on the entire paragraph directly, without splitting the text into sentences?

THANKS!
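One possible middle ground, sketched under the assumption that the reply is already available as a list of sentences: merge neighbouring sentences into larger chunks so that fewer utterances (and fewer gaps) are needed. `groupSentences` is a hypothetical helper, not a function from the extension. Passing `Infinity` as the limit yields a single chunk, which corresponds to reading the whole paragraph at once, at the cost of waiting until the full text is available.

```javascript
// Hypothetical helper: merges short sentences into chunks of at most
// `maxChars` characters. A single sentence longer than the limit is kept whole.
function groupSentences(sentences, maxChars) {
  const chunks = [];
  let current = "";

  for (const sentence of sentences) {
    const s = sentence.trim();
    if (!s) continue;
    const candidate = current ? current + " " + s : s;
    if (candidate.length <= maxChars || !current) {
      current = candidate; // still fits, or this is the first sentence of a chunk
    } else {
      chunks.push(current);
      current = s;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

const sentences = ["Hello there.", "How are you?", "This is a longer sentence."];
console.log(groupSentences(sentences, 30));
// two chunks: "Hello there. How are you?" and "This is a longer sentence."
console.log(groupSentences(sentences, Infinity).length); // prints: 1
```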

Additional voices

Hello, and a big bravo for your work.
Is there a way to get other, more natural-sounding voice styles in French?
Thanks in advance.

No Permission Pop-Up in Brave Browser

I have installed the extension in Brave and set extension permissions to allow microphone use. When I am in ChatGPT and click 'START' for Talk-to-ChatGPT, the extension says 'OK', but there is no pop-up for microphone permission. Speech is not registered. Brave settings are set to allow 'Sites can ask to use your microphone', and Talk-to-ChatGPT is listed in allowed extensions.

talk to chatgpt and AIPRM gets a little messy, PLEEEZ help, its so important to me <3 (im a girl btw)

AIPRM is an amazing extension for ChatGPT. I use it to control ChatGPT's output language, tone, and writing style. But it also adds a "Continue" button, which breaks Talk-to-ChatGPT's message sending, both manual (by voice) and automatic: for some reason the extension now triggers the Continue button instead of just sending the message, so a Chrome prompt pops up asking whether I'm sure I want to discard the prompt. Please help; it's a small thing, but it causes a huge problem for me.

Is it possible to add Google Cloud TTS voices?

I'm interested in adding another voice for Japanese. There is currently one available Japanese voice, and it has an unnatural, stereotypical announcer-type pronunciation. It doesn't sound anything like a real person would in a one-on-one conversation. It would be nice to have a more natural-sounding voice. Is there any way to add additional text-to-speech voice options?

I see that Google Cloud offers 11 options:
https://cloud.google.com/text-to-speech/docs/voices

Of the voices listed there, I think this is clearly the best choice, most natural sounding for sure:
ja-JP-Neural2-B

Second place
ja-JP-Wavenet-B

The rest sound fake/announcer-like, similar to the current option.

I'm not sure what the source of the available TTS languages is, or whether it is easy to plug in or access additional voice options.

BTW - THANKS!! For this killer app. Really love it.

WG

error in talk to chatgpt it is not responding automatically

Please help: Talk-to-ChatGPT is not working correctly. It reads some numbers at the beginning of the response, and it does not respond automatically either. The problem began a few hours ago. I already restarted the PC. I have two Plus accounts and it works on neither of them. I switched to a different PC and it doesn't work there either. What is happening? Please help.

Activation phrase.

It's really amazing, thank you so much! I would just like to be able to activate it with a keyword like Alexa and it would be perfect.

Key commands for vision impaired

Thank you for creating this amazing tool. It's a game changer for those of us who use voice to interact with our PCs. Would you please consider adding key commands that trigger the start and skip functionality? This would give us a screen-free experience. Thanks again for all your work!
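A minimal sketch of what such key commands might look like. The shortcut choices (Alt+Shift+S and Alt+Shift+K) and the names `SHORTCUTS` and `actionForKeyEvent` are assumptions for illustration only; how the start and skip actions are actually triggered inside the extension is not shown here.

```javascript
// Hypothetical shortcut table: Alt+Shift+S starts listening, Alt+Shift+K skips.
const SHORTCUTS = { KeyS: "start", KeyK: "skip" };

// Returns the action name for a keyboard event, or null if it is not a shortcut.
// Only the standard KeyboardEvent fields `altKey`, `shiftKey` and `code` are read.
function actionForKeyEvent(event) {
  if (!event.altKey || !event.shiftKey) return null;
  return Object.hasOwn(SHORTCUTS, event.code) ? SHORTCUTS[event.code] : null;
}

// Browser-only wiring, shown as comments because it needs a page to run in:
// document.addEventListener("keydown", (e) => {
//   const action = actionForKeyEvent(e);
//   if (action === "start") { /* trigger the widget's start control */ }
//   if (action === "skip")  { /* trigger the widget's skip control */ }
// });

console.log(actionForKeyEvent({ altKey: true, shiftKey: true, code: "KeyS" })); // prints: start
```

Requiring two modifier keys keeps the shortcuts from firing while the user types a normal prompt.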

speech recognition stops after a short pause

Would it be possible to have a setting to increase the time you have between pauses in speaking to ChatGPT? I've used the extension to create a bot for a social audio app and when people talk to the bot account if they stop speaking for a moment, it assumes they're done and starts to process what has already been said (but they aren't finished with what they're actually saying, they just paused to think).
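One way such a setting could work, sketched under the assumption that the extension can hold back recognized text instead of sending it immediately: collect each recognized phrase and send only after a configurable period of silence. `createPauseGate` is a hypothetical helper, and timestamps are passed in explicitly so the logic can be followed without a microphone.

```javascript
// Hypothetical helper: holds recognized text until the speaker has been quiet
// for `graceMs` milliseconds.
function createPauseGate(graceMs) {
  let parts = [];
  let lastSpeechAt = null;

  return {
    // Call whenever speech recognition delivers a final transcript.
    onTranscript(text, now) {
      parts.push(text.trim());
      lastSpeechAt = now;
    },
    // Call periodically (for example from setInterval). Returns the combined
    // message once the grace period has elapsed, otherwise null.
    poll(now) {
      if (lastSpeechAt === null || now - lastSpeechAt < graceMs) return null;
      const message = parts.join(" ");
      parts = [];
      lastSpeechAt = null;
      return message;
    },
  };
}

const gate = createPauseGate(3000);
gate.onTranscript("So I was thinking", 0);
gate.poll(2000);                          // null: only 2 s of silence so far
gate.onTranscript("about dinner", 2500);  // speaker resumed, timer restarts
console.log(gate.poll(6000));             // prints: So I was thinking about dinner
```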

Any insights on this? Thanks!

'Stop Phrase' feature with a unique phrase like "Stop Text To Speech" so that it never falsely triggers

It would be very helpful if you could set a 'Stop Phrase' and 'Pause Phrase' that could stop/pause the bot while it is speaking.

The purpose of this feature is to make sure the bot's own speech does not trigger these commands. With a unique 'Stop Phrase', the bot is very unlikely to say the phrase itself, so users will not run into false triggers that look like a bug.

For example, the default Pause Phrase could be set to something like "Pause Text To Speech", which the bot would rarely say.

Will respond to 1 initial prompt, then ceases to work.

I've tried logging out and in, reinstalling the extension, and checking the mic settings. At best it accepts only an initial prompt. Usually I test it with a "hello" to gauge server stability, and that works OK, but after that there is no more STT function. Standard keyboard prompts continue to work fine.

Sometimes I see a red ribbon, followed by a green ribbon, attached to the bottom of the extension widget but it goes away. Sometimes I hear an "ok" from the extension, and I have brief functionality. What are these ribbons telling me?

reading code

Would it be possible to detect code blocks and skip reading them?
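A rough sketch of the idea, assuming the reply text is available as Markdown with fenced code blocks. If the extension reads the rendered page instead, the equivalent step would probably be to skip the elements that hold code. `stripCodeBlocks` is a hypothetical helper, not part of the extension.

```javascript
// Three backticks, built with repeat() so the fence marker never appears
// literally inside this example.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "[\\s\\S]*?" + FENCE, "g");

// Hypothetical helper: removes fenced code blocks before text-to-speech.
function stripCodeBlocks(markdown) {
  return markdown
    .replace(FENCED_BLOCK, "")
    .replace(/\n{3,}/g, "\n\n") // collapse the gap a removed block leaves behind
    .trim();
}

const reply = [
  "Here is a loop:",
  FENCE + "js",
  "for (let i = 0; i < 3; i++) {}",
  FENCE,
  "It runs three times.",
].join("\n");

console.log(stripCodeBlocks(reply));
// prints "Here is a loop:", a blank line, then "It runs three times."
```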

UI Updates, ChatGPT UX.

Hello,

I am a user of your talk to ChatGPT project and I really appreciate your work. However, I have a suggestion for improving the user interface of the widget.

Currently, the UI is very simple and does not resemble ChatGPT's UI.

I have created some Tampermonkey code that modifies the UI of the widget to make it look better.

// ==UserScript==
// @name         New Userscript
// @namespace    http://tampermonkey.net/
// @version      0.1
// @description  try to take over the world!
// @author       You
// @match        https://chat.openai.com/chat
// @icon         https://www.google.com/s2/favicons?sz=64&domain=openai.com
// @grant        none
// ==/UserScript==

(function () {
  "use strict";
  setTimeout(() => {
    var ttgpt = document.querySelector("#TTGPTSettings");
    var startBtn = ttgpt.querySelector("#CNStartButton");
    var heading = ttgpt.querySelector("a");
    var br = ttgpt.querySelector("br");
    br.remove();

    ttgpt.style.padding = "0px 8px 0px 8px";
    heading.style.color = "transparent";
    heading.style.height = "1px";
    heading.style.pointerEvents = "none";
    heading.style.display = "block";

    heading.href = "javascript:void(0)";
    heading.onclick = () => {}; // DOM handler properties are lowercase; "onClick" would never be called

    ttgpt.style.background = "#3e3f4b";
    startBtn.style.cssText = "";
    startBtn.classList.value =
      "flex py-3 px-3 items-center justify-center w-full gap-3 rounded-md hover:bg-gray-500/10 transition-colors duration-200 text-white cursor-pointer text-sm mb-2 flex-shrink-0 border border-white/20 mt-3";
    startBtn.textContent = "🎙️Start Talking";

    setInterval(() => {
      try {
        var ttgptModal = document.querySelector("#TTGPTSettingsArea");

        ttgptModal.querySelector("table").style.width = "88%";

        ttgptModal.querySelectorAll("button").forEach((btn) => {
          btn.classList.value =
            " py-3 px-3 gap-3 rounded-md hover:bg-gray-500/10 transition-colors duration-200 text-white cursor-pointer text-sm mb-2 flex-shrink-0 border border-white/20 bg-gray-900";
        });

        ttgptModal.querySelectorAll('input[type="text"]').forEach((input) => {
          input.classList.value =
            " flex flex-col w-full py-2 flex-grow md:py-3 md:pl-4 relative border border-black/10 bg-white dark:border-gray-900/50 dark:text-white dark:bg-gray-700 rounded-md shadow-[0_0_10px_rgba(0,0,0,0.10)] dark:shadow-[0_0_15px_rgba(0,0,0,0.10)]";
          input.style.color = "white";
          input.style.width = "100%";
          input.style.height = "37px";
        });

        ttgptModal.querySelectorAll('input[type="number"]').forEach((input) => {
          input.classList.value =
            " flex flex-col w-full py-2 flex-grow md:py-3 md:pl-4 relative border border-black/10 bg-white dark:border-gray-900/50 dark:text-white dark:bg-gray-700 rounded-md shadow-[0_0_10px_rgba(0,0,0,0.10)] dark:shadow-[0_0_15px_rgba(0,0,0,0.10)]";
          input.style.color = "white";
          input.style.width = "100%";
          input.style.height = "37px";
        });

        ttgptModal.querySelectorAll("select").forEach((select) => {
          select.classList.value =
            " flex flex-col w-full py-2 flex-grow md:py-3 md:pl-4 relative border border-black/10 bg-white dark:border-gray-900/50 dark:text-white dark:bg-gray-700 rounded-md shadow-[0_0_10px_rgba(0,0,0,0.10)] dark:shadow-[0_0_15px_rgba(0,0,0,0.10)]";
          select.style.color = "white";
          select.style.width = "100%";
          //   select.style.height = "50px";
        });
      } catch (e) {}
    }, 1000);

    // py-3 px-3 gap-3 rounded-md hover:bg-gray-500/10 transition-colors duration-200 text-white cursor-pointer text-sm mb-2 flex-shrink-0 border border-white/20 bg-gray-900
    // flex flex-col w-full py-2 flex-grow md:py-3 md:pl-4 relative border border-black/10 bg-white dark:border-gray-900/50 dark:text-white dark:bg-gray-700 rounded-md shadow-[0_0_10px_rgba(0,0,0,0.10)] dark:shadow-[0_0_15px_rgba(0,0,0,0.10)]
  }, 1500);
})();

Speech recognition doesn't work on microsoft edge

I get these two errors in the console repeated infinitely

Error while listening
I've stopped listening

I want to use Microsoft Edge because it has better TTS voices, which support high playback rates. The Google voices sound garbled at high speeds.

Ukrainian speech synthesis

I have only found instructions on how to add a speech recognition language in the extension code, but I couldn't find where it gets the speech synthesis voices, apart from the voice.lang checks. Can you add Ukrainian as a TTS language? ChatGPT can speak it, and Google already has TTS for Ukrainian...

Selected text-to-speech voice is saved but not loaded correctly (tested in Microsoft Edge)

If I select a voice from the menu, it works temporarily, but then resets to the default voice when the page is reloaded.

I looked further into it and I noticed that after the page is reloaded the console does show the log:

Reloading settings from cookie: en-US-Microsoft Zira - English (United States),1,1,,stop,pause

This means that the voice name is saved and then loaded from the cookie, and that this conditional body is reached:

if (voice.lang + "-" + voice.name == CN_WANTED_VOICE_NAME) {
	CN_WANTED_VOICE = voice;
	console.log("I will read using voice " + voice.name + " (" + voice.lang + ")");
	return false;
}

The CN_WANTED_VOICE variable should be assigned with the correct voice, but it doesn't seem to actually use it?

The voice falls back to the default one (Microsoft David) and no related message is shown in the console.

I'm not sure why this is happening. It's strange, since the log message shows that it is loaded correctly.
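One possible explanation, offered as a guess rather than a diagnosis: the matching voice object is found but never assigned to the utterance that is actually spoken, or the lookup runs before the browser has finished loading its voice list. The sketch below shows the lookup as a plain function; `findWantedVoice` and `speak` are hypothetical names, while the "lang-name" key format follows the comparison quoted above.

```javascript
// Hypothetical helper: finds a saved voice in a list such as the one returned
// by speechSynthesis.getVoices(). Returns null when nothing matches.
function findWantedVoice(voices, wantedVoiceName) {
  return voices.find((v) => v.lang + "-" + v.name === wantedVoiceName) ?? null;
}

// Browser-only wiring, shown as comments because it needs the Web Speech API:
// function speak(text, wantedVoiceName) {
//   const utterance = new SpeechSynthesisUtterance(text);
//   const voice = findWantedVoice(speechSynthesis.getVoices(), wantedVoiceName);
//   if (voice) utterance.voice = voice; // without this, the default voice is used
//   speechSynthesis.speak(utterance);
// }
// getVoices() can return an empty list until the "voiceschanged" event fires,
// so the lookup may need to be repeated after that event.

const sampleVoices = [
  { lang: "en-US", name: "Microsoft David - English (United States)" },
  { lang: "en-US", name: "Microsoft Zira - English (United States)" },
];
const found = findWantedVoice(sampleVoices, "en-US-Microsoft Zira - English (United States)");
console.log(found.name); // prints: Microsoft Zira - English (United States)
```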

voice

Hi, I'm not a programmer. Can you add Vietnamese language for AI's voice?

Android

On the first try, it didn't work in Chrome/Edge on Android.
Is it supposed to work on Android?

Mobile support?

Thank you for the wonderful contribution!

I want to talk to ChatGPT while on a walk. Is this use case supported? If not, what are the steps involved in implementing it?

Thanks again! 🙏

Edit: for reference there is the SiriGPT Siri shortcut, but it appears to be closed source, and requires configuring an OpenAI API key, and prefixing everything with "Hey Siri, SiriGPT..." -- it is also responding with an empty string for me on iOS 16.3.1.

Skip the code

I don't want to hear the code. I want to skip it and just listen to the description.

Kudos but disable the automatic Submit

Close to perfection, kudos.
Still a small snag - it assumes perfection in speech recognition.

I'm no coder ;), so after conversing yet again with lab.openai.com (it accepts longer code, but still not the full file, so I had to trim some) and ChatGPT at the same time, I changed one bit, because the speech-to-text output needs to be fixed manually each time to correct its misinterpretations:

// Version		: 1.4.1 - Automatic "Submit" disabled by Manamama
[...]
function CN_SendMessage(text) {
	// Send the message
	jQuery("textarea").val(text);
	//jQuery("textarea").closest("div").find("button").click();

To make it a tad more elegant, I asked AI to modify the other code once again ;)

> AI change the code so that the send button click is NOT simulated. The user has to press Enter or Send instead.
Show me the original CN_SendMessage function and the code that you provided that modifies the CN_SendMessage function. Explain the changes:

// If false, wait for the user to press Save & Submit; if true, submit automatically:
var CN_SUBMIT_ON_ENTER = false; // or true
[...]
function CN_SendMessage(text) {
  // Stop speech recognition until the answer is received
  if (CN_SPEECHREC) {
    clearTimeout(CN_TIMEOUT_KEEP_SPEECHREC_WORKING);
    CN_SPEECHREC.stop();
  }

  // Set the value of the textarea element to the given text
  jQuery("textarea").val(text);

  // Submit the message only if CN_SUBMIT_ON_ENTER is set to true
  if (CN_SUBMIT_ON_ENTER) {
    jQuery("textarea").closest("div").find("button").click();
  }  
}

This modified version of the CN_SendMessage function makes the simulated button click conditional on CN_SUBMIT_ON_ENTER. When the flag is false, it only stops the speech recognition process (if it is currently running) and sets the value of the textarea element to the given text.

I have added more and tried to fork it in full, now here, with another small suggested fix to the Readme as well, but due to my (or GPT's ;) meagre coding skills it still seems to stop listening upon the manual Submit.

Voice playback is not smooth

The audio playback of the text-to-speech voice is not smooth, and has unnatural pauses while speaking.

Is this solvable, or a limitation of the way the ChatGPT response is served back to the client?

'Stop' word is not working.

I want to stop TTS playback by saying 'Stop' but it's not working. I tried various times to make sure I'm saying 'Stop' loudly and clearly, but it does not stop the speech.

Problem with Brave .

The extension does not work well in the Brave web browser; it has problems with the microphone.

Stop and Pause voice commands sometimes fail.

Sometimes the speech-to-text outputs "Pause." and "Stop." instead of "Pause" and "Stop". The periods at the end of the words cause the plugin not to recognize them. This can be corrected with the following changes:

// if (final_transcript.toLowerCase() == CN_SAY_THIS_WORD_TO_STOP) {
if (final_transcript.toLowerCase() == CN_SAY_THIS_WORD_TO_STOP || final_transcript.toLowerCase().slice(0, -1) == CN_SAY_THIS_WORD_TO_STOP) {
// if (final_transcript.toLowerCase() == CN_SAY_THIS_WORD_TO_PAUSE) {
if (final_transcript.toLowerCase() == CN_SAY_THIS_WORD_TO_PAUSE || final_transcript.toLowerCase().slice(0, -1) == CN_SAY_THIS_WORD_TO_PAUSE) {
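A slightly stricter variant of the same idea, as a sketch: instead of dropping the last character unconditionally (which would also turn "stops" into "stop"), strip only trailing punctuation before comparing. `normalizeCommand` and `isCommand` are hypothetical helper names, not functions from the extension.

```javascript
// Hypothetical helper: lowercases a transcript and strips surrounding
// whitespace and trailing punctuation, so "Stop." and " stop " both become "stop".
function normalizeCommand(transcript) {
  return transcript.trim().toLowerCase().replace(/[.,!?;:]+$/, "").trim();
}

// True when the whole transcript is the command word, ignoring case and
// trailing punctuation.
function isCommand(transcript, commandWord) {
  return normalizeCommand(transcript) === normalizeCommand(commandWord);
}

console.log(isCommand("Stop.", "stop"));   // prints: true
console.log(isCommand("Pause!", "pause")); // prints: true
console.log(isCommand("stops", "stop"));   // prints: false
```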

Microphone access on phone

I don't know if it's possible, but if your extension could request access to the phone's microphone when running in a Chromium-based browser that supports extensions, such as Kiwi, then it would work on Android phones as well. I have not been able to find a way to turn on microphone access for the site or your extension without a request being sent from the site or extension. If someone knows a way, that would also be helpful.

thanks again... awesome extension.
