AuraVision (AV)

Version 0.2.0

AuraVision (AV) is a real-time speech-to-text demo tool designed to assist the deaf and hard-of-hearing community. It converts spoken words into text and displays them in real time. This demo is intended for use on Windows, Linux, and Mac PCs and supports English, Spanish, and Farsi. The final version will be developed as a hardware product using Vosk and Raspberry Pi.

Features

  • Real-Time Speech Recognition: Converts spoken words to text instantly.
  • Multi-Language Support: English, Spanish, and Farsi.
  • Text Display Options: Customizable text display, including reversing text for Farsi.
  • Cross-Platform: Runs on Windows, Linux, and Mac PCs.
  • Flask Web Interface: Displays real-time text updates in a web browser with a simple fade-in animation.
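
The Farsi reversal listed under Text Display Options can be sketched as follows. This is a minimal illustration, assuming the reversal is a plain character-order flip for consoles that print right-to-left script left-to-right; `format_for_console` is a hypothetical helper name, not a function taken from this repository.

```python
def format_for_console(text: str, lang: str) -> str:
    """Return text prepared for a left-to-right console.

    Assumption: only Farsi ('fa') is reversed; every other
    language code passes through unchanged.
    """
    if lang == "fa":
        # Reverse the character order so the line reads correctly
        # on a console without right-to-left support.
        return text[::-1]
    return text


if __name__ == "__main__":
    print(format_for_console("hello", "en"))
    print(format_for_console("hello", "fa"))
```

Plain reversal only changes character order; it does not reshape joined letters, so the result still depends on how the terminal renders the script.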

Installation

  1. Clone the Repository:

    git clone https://github.com/Rfannn/AuraVision.git
    cd AuraVision
  2. Install Dependencies: Ensure Python 3.x is installed, then install the required packages:

    pip install -r requirements.txt
  3. Download Language Models: Download the Vosk language models for the languages you need (see the Vosk Models page) and place them in a directory of your choice.

  4. Set Up Paths: Make sure to set the correct paths for the Vosk models in the code:

    if lang == 'en':
        model_path = "your-path-to-english-model"
    elif lang == 'es':
        model_path = "your-path-to-spanish-model"
    elif lang == 'fa':
        model_path = "your-path-to-farsi-model"
  5. Alternative Installation (Windows):

    • Run init.bat to automatically set up the environment and install dependencies.
    • Note: Make sure you have administrative privileges to execute batch files.
  6. Alternative Installation (Linux/Mac):

    • Run init.sh to set up the environment and install dependencies.
    • Remember to give execute permissions to the shell script:
      chmod +x init.sh

Usage

  1. Run the Flask Server:

    python app.py
  2. Run the Speech Recognition Script:

    python main.py
  3. Choose Language: Enter the language code when prompted (en for English, es for Spanish, fa for Farsi). The script will load the appropriate model and start processing audio.

  4. View Text: Recognized text appears in the console; Farsi text is shown reversed. Alternatively, open http://localhost:5000 in a web browser to see the text update in real time with a simple fade-in animation.
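
Vosk recognizers return their output as JSON strings: final results carry the recognized words under the "text" key, and partial results carry them under "partial". A minimal sketch of pulling the display text out of such a string is shown here; `extract_text` is a hypothetical helper for illustration and may differ from how main.py handles results.

```python
import json


def extract_text(result_json: str) -> str:
    """Extract recognized text from a Vosk result JSON string.

    Final results use the "text" key and partial results use
    "partial"; an empty string is returned when neither has content.
    """
    data = json.loads(result_json)
    return (data.get("text") or data.get("partial") or "").strip()


if __name__ == "__main__":
    print(extract_text('{"text": "hello world"}'))
    print(extract_text('{"partial": "hel"}'))
```

Returning an empty string for silent audio lets the caller skip the display update instead of printing blank lines.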

Configuration

Customize the model paths and other settings in the script as needed. Ensure the paths to language models are correct.
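
One way to keep the model paths in a single place and catch a wrong path early is sketched below. It assumes the language codes en, es, and fa from the installation step; `MODEL_PATHS` and `resolve_model_path` are hypothetical names, and the path strings are the same placeholders used above, to be replaced with real directories.

```python
from pathlib import Path

# Placeholder paths (assumption): replace with the real model directories.
MODEL_PATHS = {
    "en": "your-path-to-english-model",
    "es": "your-path-to-spanish-model",
    "fa": "your-path-to-farsi-model",
}


def resolve_model_path(lang: str) -> Path:
    """Return the model directory for a language code, or raise early."""
    try:
        path = Path(MODEL_PATHS[lang])
    except KeyError:
        raise ValueError(f"Unsupported language code: {lang!r}") from None
    # Fail with a clear message before the recognizer tries to load the model.
    if not path.is_dir():
        raise FileNotFoundError(f"Vosk model directory not found: {path}")
    return path
```

The returned path can then be handed to Vosk, for example `vosk.Model(str(resolve_model_path("en")))`.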

Future Development

This demo is designed for PC platforms (Windows, Linux, Mac) and is a precursor to a hardware product that will use Vosk and Raspberry Pi. Stay tuned for updates on the hardware version of AuraVision!

Contributing

Contributions are welcome! Please fork the repository and submit a pull request for any improvements or fixes. Open an issue on GitHub for questions or feature requests.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgements

  • Vosk for speech recognition.
  • Deep Translator for language translation.
  • Colorama for colored text output.
  • Flask for displaying text output on external devices.

Contact

Feel free to reach out to me via any of the following channels:

Looking forward to connecting with you!
