Démo Speech Text Chat

This project is a Flask web application that captures audio from the user's microphone and uses OpenAI's Whisper model for speech-to-text transcription.

This software is provided as is, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose, and non-infringement. In no event shall the authors or copyright holders be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the software or the use or other dealings in the software.

Features

Record audio directly in the web browser.
Send the audio data to a Flask server.
Use Whisper to transcribe the audio to text.
Display the transcription result on the web page.
Real-time transcription
Working chatbot
Streamed response
Text-to-speech responses
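The record, send, transcribe, display flow listed above could be sketched on the server side roughly as follows. This is a minimal illustration and not the project's actual code: the `/transcribe` route, the `audio` form field, and the `create_app` and `whisper_transcriber` helpers are hypothetical names chosen for the example. The transcriber is passed in as a plain callable so the route can be exercised without loading a Whisper model.

```python
import os
import tempfile

from flask import Flask, jsonify, request


def create_app(transcribe):
    """Build a Flask app around a `transcribe(path) -> str` callable."""
    app = Flask(__name__)

    @app.route("/transcribe", methods=["POST"])  # hypothetical route name
    def transcribe_audio():
        upload = request.files.get("audio")  # hypothetical form field name
        if upload is None:
            return jsonify(error="no audio file provided"), 400

        # Whisper reads from a file path, so write the upload to disk first.
        fd, path = tempfile.mkstemp(suffix=".webm")
        os.close(fd)
        try:
            upload.save(path)
            text = transcribe(path)
        finally:
            os.remove(path)
        return jsonify(text=text)

    return app


def whisper_transcriber(model_name="base"):
    """Return a transcriber backed by openai-whisper (imported lazily)."""
    import whisper  # needs the openai-whisper package and ffmpeg

    model = whisper.load_model(model_name)
    return lambda path: model.transcribe(path)["text"]
```

With the dependencies installed, `create_app(whisper_transcriber()).run()` would serve the route locally; the browser side would then POST the recorded audio as multipart form data and display the returned `text`.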

Todo

Add feedback loops to improve the quality of the transcript and of the questions/answers
Log conversations to central contexts in case further processing is needed
Improve real-time transcription
Improve the chatbot (handling of the user's initial prompt)
Any other business

Also:

Fix unsafe Werkzeug usage for production

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

You need the following installed:

python3
pip

Also, for optional dependencies:

virtualenv
ffmpeg (for audio processing, used by both this project and openai-whisper)
Docker (optional for containerization)
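Since openai-whisper calls the ffmpeg executable to decode audio, it can help to confirm that the tool is on the PATH before starting the app. The helper below is a small sketch using only the standard library; the function name `missing_tools` is made up for this example and is not part of the project.

```python
import shutil


def missing_tools(tools=("ffmpeg",), which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if which(name) is None]
```

Calling `missing_tools()` returns an empty list when ffmpeg is available, and `["ffmpeg"]` otherwise. The `which` parameter exists only so the lookup can be replaced in tests.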

For audio output (pyttsx3 dependencies, see Synthesizer support):

sapi5 (Windows)
nsss (macOS)
espeak (Linux)
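The platform-to-driver mapping above can be expressed in a few lines. This is an illustrative sketch, assuming the driver is selected explicitly; pyttsx3 normally picks a suitable driver on its own when `pyttsx3.init()` is called without arguments, and the project may simply rely on that. The `tts_driver` and `speak` helpers are hypothetical names.

```python
import platform

# Driver names as listed above, keyed by the value of platform.system().
DRIVERS = {"Windows": "sapi5", "Darwin": "nsss", "Linux": "espeak"}


def tts_driver(system=None):
    """Return the pyttsx3 driver name for the given (or current) platform."""
    system = system or platform.system()
    try:
        return DRIVERS[system]
    except KeyError:
        raise RuntimeError(f"no known speech driver for {system!r}") from None


def speak(text):
    """Speak `text` aloud; needs pyttsx3 and the platform synthesizer."""
    import pyttsx3  # imported lazily so tts_driver() works without it

    engine = pyttsx3.init(driverName=tts_driver())
    engine.say(text)
    engine.runAndWait()
```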

Installing

To get the app running:

Clone the repository:

git clone https://github.com/spiderweak/demo-speech-text-chat

Navigate to the project directory:

cd demo-speech-text-chat

Create a virtual environment (alternatively, use conda; installing the dependencies directly into your system Python is not good practice):

python -m venv .venv
source .venv/bin/activate

Install the required dependencies:

pip install -r requirements.txt

Running the Application

Run the application with:

python run.py

Access the web application at http://127.0.0.1:5000/.

If necessary, you can run the application in headless mode and plug in your own frontend. Your frontend must still send its data to the correct backend routes.

python run.py --headless

This last feature has not been extensively tested.
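For readers wondering how such a switch is usually wired, here is a minimal sketch of parsing a `--headless` flag with `argparse`. Only the flag name comes from this README; the `parse_args` and `describe_mode` functions are hypothetical, and the real `run.py` may handle the option differently.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; only --headless is taken from the README."""
    parser = argparse.ArgumentParser(description="Speech-to-text chat demo")
    parser.add_argument(
        "--headless",
        action="store_true",
        help="serve only the backend routes, without the bundled web page",
    )
    return parser.parse_args(argv)


def describe_mode(args):
    """Summarize which parts of the app would be started."""
    return "backend only" if args.headless else "backend and web frontend"
```

Passing `argv=None` makes `argparse` read the real command line, while passing a list such as `["--headless"]` allows the parsing to be checked in isolation.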

Docker Support

If you wish to use Docker for deployment:

Build the Docker image:

docker build -t demo-speech-text-chat .

Run the Docker container:

docker run -p 5000:5000 demo-speech-text-chat

As with the non-dockerized version, you can run the container in headless mode:

docker run -p 5000:5000 demo-speech-text-chat --headless

The image does not bundle chat models, but you can mount a folder containing your own model. Remember to update the environment variables defined in the .env file accordingly:

docker run -p 5000:5000 -v $(pwd)/app/models:/app/models demo-speech-text-chat
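As an illustration of how a mounted model folder could be picked up from configuration, the sketch below resolves a model directory from the process environment, then from a `.env` file, then from a default matching the mount point above. The variable name `MODEL_DIR` and both helper functions are assumptions made for this example; check the project's `.env` file for the variable names it actually uses. The project may also load `.env` through a dedicated library instead of parsing it by hand.

```python
import os
from pathlib import Path


def load_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("\"'")
    return values


def model_dir(env_file=".env", environ=os.environ, default="/app/models"):
    """Resolve the model folder: environment first, then .env, then default."""
    # MODEL_DIR is a hypothetical variable name used for illustration only.
    if "MODEL_DIR" in environ:
        return Path(environ["MODEL_DIR"])
    file_values = load_env_file(env_file) if Path(env_file).is_file() else {}
    return Path(file_values.get("MODEL_DIR", default))
```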

Interfaces diagram

"Chatbot integration diagram"

Contributing

This project does not accept external contributions for now.

Authors

Antoine "Spiderweak" BERNARD

License

Unless part of the system is incompatible with it, this project is released under CC BY-NC-SA and is mostly intended for research and teaching.

Acknowledgments

Thanks to these projects, which make most of this project run:

OpenAI for the Whisper model and for the disclaimer in the opening statement of this README.
Flask
