Project A.L.I.C.E.: Adaptive Linguistic Interpreter and Command Executor

This documentation is still in progress. Expect things to be missing.

This project aims to create an adaptive voice assistant that enables users to execute commands. Unlike current offerings by big tech, this software is meant to be run locally.

There are multiple parts to this project:

  1. Mic Array Recorder and Streamer
  2. STT Converter
  3. Text Analyser and Command Executor
  4. TTS Converter
  5. Playback Engine
  6. Payload Library

Mic Array Recorder and Streamer

Currently, this part runs on a Raspberry Pi 3B+ with a MATRIX Creator board. The board has an 8-microphone array that records audio, and WebRTC's voice activity detection (VAD) checks the recording for speech. Once speech is detected, a beamformed audio stream is created and streamed to a local server for further analysis.
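The gating step described above can be sketched as follows. This is only an illustration, not the project's recorder: it uses a simple RMS energy threshold as a stand-in for WebRTC's VAD, and the sample rate, frame length, threshold, and function names are all assumptions.

```python
import math
import struct
from typing import Callable, Iterator

SAMPLE_RATE = 16000   # assumed sample rate in Hz
FRAME_MS = 30         # assumed frame length in milliseconds
FRAME_BYTES = SAMPLE_RATE * FRAME_MS // 1000 * 2   # 16-bit mono samples


def rms_is_speech(frame: bytes, threshold: float = 500.0) -> bool:
    """Stand-in detector (not WebRTC VAD): True when the frame's RMS exceeds threshold."""
    count = len(frame) // 2
    if count == 0:
        return False
    samples = struct.unpack(f"<{count}h", frame[:count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count)
    return rms > threshold


def speech_frames(
    pcm: bytes,
    is_speech: Callable[[bytes], bool] = rms_is_speech,
) -> Iterator[bytes]:
    """Yield only the complete fixed-size frames that the detector flags as speech."""
    for start in range(0, len(pcm) - FRAME_BYTES + 1, FRAME_BYTES):
        frame = pcm[start:start + FRAME_BYTES]
        if is_speech(frame):
            yield frame
```

Because the detector is passed in as a callable, a real VAD could be swapped in without changing the framing loop; only frames that pass the check would need to be forwarded to the server.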

STT Converter

This portion of the project runs in a GPU-backed VM. It receives audio over WebRTC and converts the speech to text, currently using OpenAI's Whisper. Once the speech is converted, the text is sent over a socket to the text analyser.

Text Analyser and Command Executor

This portion of the project receives text over WebSockets. It then uses a backend to analyse the text for intent. For this we currently use OpenAI's GPT-3.5 model. The model is asked to return a payload that can be easily parsed. After receiving the payload, the executor invokes the requested commands. For now, these commands must be programmed manually; plugin support is planned for the future.
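As a rough sketch of the "payload in, command out" step, the snippet below parses a JSON reply and dispatches it to a manually registered handler. The payload shape (`command` and `args` keys), the `set_timer` command, and every function name here are hypothetical; the project's actual payload format is defined elsewhere.

```python
import json
from typing import Callable, Dict

# Hypothetical registry of manually programmed commands.
COMMANDS: Dict[str, Callable[..., str]] = {}


def command(name: str):
    """Register a handler under the command name the model is asked to emit."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        COMMANDS[name] = func
        return func
    return decorator


@command("set_timer")
def set_timer(minutes: int) -> str:
    # Example handler; a real one would start an actual timer.
    return f"Timer set for {minutes} minutes."


def execute(raw_payload: str) -> str:
    """Parse the model's JSON reply and invoke the requested command."""
    try:
        payload = json.loads(raw_payload)
    except json.JSONDecodeError:
        return "Sorry, I could not understand the reply."
    if not isinstance(payload, dict):
        return "Sorry, I could not understand the reply."
    name = payload.get("command")
    handler = COMMANDS.get(name) if isinstance(name, str) else None
    if handler is None:
        return "Sorry, I do not know that command."
    args = payload.get("args", {})
    return handler(**args) if isinstance(args, dict) else handler()
```

For example, `execute('{"command": "set_timer", "args": {"minutes": 5}}')` calls `set_timer(minutes=5)`. Keeping the handlers in a registry means a future plugin could add commands just by registering new functions.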

TTS Converter

Uses a neural TTS engine to convert text replies into natural-sounding speech, then streams the audio to a device with a speaker.

Playback Engine

Plays received audio back to the user.

Payload Library

Defines all the communication interfaces used by the project.
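To give an idea of what a shared interface definition might look like, here is a minimal sketch of one message type with JSON serialization. The class name, its fields, and the JSON encoding are assumptions made for illustration and are not taken from the actual library.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class TextMessage:
    """Hypothetical message sent from the STT converter to the text analyser."""
    source: str          # identifier of the device that recorded the audio
    text: str            # transcribed speech
    kind: str = "text"   # message type tag, so receivers can tell payloads apart

    def to_json(self) -> str:
        """Serialize the message for sending over a socket."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "TextMessage":
        """Rebuild a message from its JSON form; raises KeyError if a field is missing."""
        data = json.loads(raw)
        return cls(
            source=data["source"],
            text=data["text"],
            kind=data.get("kind", "text"),
        )
```

Defining messages once in a shared module like this lets every component agree on field names, since the sender and the receiver both import the same class.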

NOTES:

There is a bug in av (a dependency of aiortc) that prevents it from installing because of an issue with Cython. Use the following to bypass the issue, where c.txt is a pip constraints file:

PIP_CONSTRAINT=c.txt pip install av==10.0.0

PyPI does not have the latest version of PyOgg. Install it from GitHub instead:

pip install git+https://github.com/TeamPyOgg/PyOgg

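To install the Python bindings for llama.cpp (llama-cpp-python) with cuBLAS enabled, use the first command below. The second command forces a clean rebuild with an explicit CUDA compiler path: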

CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
CUDACXX=/usr/local/cuda-12/bin/nvcc CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_CUDA_ARCHITECTURES=native" FORCE_CMAKE=1 pip install llama-cpp-python --no-cache-dir --force-reinstall --upgrade
