aida's Introduction

AiDA: The YouTube Chat Assistant

Welcome to AiDA, a prototype built in one day for the 42 London Hackathon, which focused on Redefining Education with AI. AiDA lets you chat with your YouTube videos as they play, in real time. It also tries to classify your intent: if you are looking for a summary, it sketches a quick chart that explains the main points of the video. The goal is to bring videos to life and improve children's learning experience on YouTube.

Watch the video

P.S.: AiDA won 1st place at the 42 Hackathon!

Requirements

AiDA is a web-based app built with Python and Streamlit. It uses LangChain for the Retrieval Augmented Generation (RAG) chatbot, vectorized document storage with OpenAI and Chroma, GPT/Claude models for analysing, querying and chatting, GPT to generate the Mermaid chart, and the Streamlit Mermaid library to sketch the chart.

Running AiDA

To get started with AiDA, ensure you have Python 3.6+ installed. Clone this repository and install the required dependencies:

git clone https://github.com/yourgithub/AiDA.git
cd AiDA
streamlit run app.py

Contributing

Any contributions you make are greatly appreciated. If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue and give it the correct labels.

aida's People

Contributors

tahatobaili

Watchers

Kostas Georgiou

aida's Issues

Mermaid Chart: Inconsistent Generation

Following on from the previous issue, we now need to generate a Mermaid chart that supplements the output summary and improves the user's understanding of it.

We are building the chart in two steps:

  1. Generating it.
  2. Displaying it.

The problem is in generating it: see generate_chart(transcript), which calls ChatGPT to produce the chart. The result is not always accurate (tested on GPT-4); sometimes it is a correct chart, sometimes it is not.
To test a chart, paste the output into Mermaid's live editor.

Observation: GPT seems to work better with very short videos.

What needs to be done:

  1. Prompting: Play with the prompt a bit to see if you can reach consistently working charts.
  2. Different models: Test against different LLMs.
  3. Finetuning: Can you get your hands on a dataset that consists of blocks of text and their equivalent Mermaid charts? Then either provide us with this dataset, or fine-tune any LLM and share your code if the results are better than GPT.

The chart is then displayed via st_mermaid(mermaid_code); there are no issues with that step.
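Before a generated chart reaches st_mermaid, a cheap pre-check can catch the most obvious failures, such as a reply wrapped in explanatory prose or one that contains no chart at all. The following is a minimal sketch under stated assumptions: extract_mermaid and MERMAID_HEADERS are hypothetical names that are not part of the repository, the keyword list is partial, and the check only looks at the first line, so it does not replace testing the output in Mermaid's live editor.

```python
import re
from typing import Optional

# Partial list of keywords that can open a Mermaid diagram (assumption: extend as needed).
MERMAID_HEADERS = (
    "graph", "flowchart", "sequenceDiagram", "classDiagram",
    "stateDiagram", "erDiagram", "gantt", "pie", "mindmap", "journey",
)

# Matches a fenced block, with or without the "mermaid" language tag.
FENCE_RE = re.compile(r"```(?:mermaid)?\s*\n(.*?)```", re.DOTALL)


def extract_mermaid(llm_output: str) -> Optional[str]:
    """Return the Mermaid source found in an LLM reply, or None if nothing plausible is found.

    Hypothetical helper: it only strips a surrounding code fence and checks
    that the first line starts with a known diagram keyword.
    """
    match = FENCE_RE.search(llm_output)
    code = (match.group(1) if match else llm_output).strip()
    first_line = code.splitlines()[0].strip() if code else ""
    if first_line.startswith(MERMAID_HEADERS):
        return code
    return None


reply = "Here is the chart:\n```mermaid\ngraph TD\n  A --> B\n```\nHope it helps!"
print(extract_mermaid(reply))
print(extract_mermaid("Sorry, I cannot draw that."))
```

When the helper returns None, the caller could retry the generation or skip the chart instead of passing broken text to the display step.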

Intent Classification: Inconsistent Responses

User Experience:

As the user watches the video, they can chat with AiDA about it. Their questions might require an in-depth response aided by a chart, or just a simple response. Hence, a chart is not always necessary.

What needs to be done:

To avoid generating charts that are not needed, a good approach is to classify the user's intent. As a start, if they are looking for a summary, then generate a chart and an audio file (text-to-speech) as well.

For that, see the function classify_intent(user_input), which attempts to classify, via a ChatGPT call, whether the user is requesting a summary or something else. The responses are inconsistent and not always accurate.
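One likely source of the inconsistency is how the model's reply is turned into a boolean. The sketch below contrasts strict parsing with ast.literal_eval against a more tolerant normalization. Both parse_strict and parse_lenient are hypothetical helpers, not functions from the repository, and the accepted words ("true", "yes", "false", "no") are an assumption.

```python
import ast
from typing import Optional


def parse_strict(reply: str) -> bool:
    """Evaluate the reply as a Python literal and insist on a real bool."""
    value = ast.literal_eval(reply)
    if not isinstance(value, bool):
        raise ValueError(f"expected True or False, got {value!r}")
    return value


def parse_lenient(reply: str) -> Optional[bool]:
    """Normalize whitespace, case and surrounding punctuation before matching.

    Returns None when the reply is not a recognisable yes/no answer, so the
    caller can retry or fall back instead of guessing.
    """
    cleaned = reply.strip().strip(".!\"'").strip().lower()
    if cleaned in {"true", "yes"}:
        return True
    if cleaned in {"false", "no"}:
        return False
    return None


for reply in ["True", "true", "true..", "yes, it is a summary"]:
    try:
        strict = parse_strict(reply)
    except (ValueError, SyntaxError) as exc:
        # literal_eval rejects anything that is not a valid Python literal.
        strict = type(exc).__name__
    print(f"{reply!r:25} strict={strict!s:12} lenient={parse_lenient(reply)}")
```

Only the exact strings "True" and "False" survive the strict path; a lowercase "true" is a bare name, not a Python literal, so ast.literal_eval raises an exception. The lenient path accepts a few common variants and signals everything else with None.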

To address this, the following steps need to be implemented:

  1. Prompt: We need to make sure the ChatGPT response is precisely "True" or "False", so that the output can be treated as a boolean with ast.literal_eval(response.content). GPT may be responding with something like "true", "true..", "yes, it is a summary", or the like. Keep playing with the prompt until the output is consistent.

  2. Function Calling: Use a JSON schema and function calling to return the response under a specific JSON key, read this.

    Bonus Points: Figure out Function Calling in LangChain!

  3. Testing Robustness: The classifier needs to be tested sufficiently to confirm that the intent is classified as "True" if the request relates to a summary and "False" for anything else.
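Steps 2 and 3 can be sketched without calling any model. The tool definition below follows the shape accepted by the `tools` parameter of OpenAI's Chat Completions API; the function name report_intent and the key is_summary are assumptions chosen for illustration. The helpers parse_intent_arguments and accuracy, the stand-in keyword_classifier, and the labelled cases are likewise hypothetical: in practice the real classify_intent would be passed to accuracy and the test set would be much larger.

```python
import json
from typing import Callable, Iterable, Tuple

# Tool definition with a single boolean key, so the answer arrives as JSON
# instead of free text. Name and key are hypothetical.
INTENT_TOOL = {
    "type": "function",
    "function": {
        "name": "report_intent",
        "description": "Report whether the user is asking for a summary of the video.",
        "parameters": {
            "type": "object",
            "properties": {
                "is_summary": {
                    "type": "boolean",
                    "description": "True if the user requests a summary, else False.",
                }
            },
            "required": ["is_summary"],
        },
    },
}


def parse_intent_arguments(arguments_json: str) -> bool:
    """Read the boolean from the JSON argument string of a tool call."""
    arguments = json.loads(arguments_json)
    value = arguments.get("is_summary") if isinstance(arguments, dict) else None
    if not isinstance(value, bool):
        raise ValueError(f"'is_summary' must be a JSON boolean, got {value!r}")
    return value


def accuracy(classify: Callable[[str], bool],
             cases: Iterable[Tuple[str, bool]]) -> float:
    """Fraction of labelled inputs for which classify returns the expected label."""
    cases = list(cases)
    correct = sum(1 for text, expected in cases if classify(text) == expected)
    return correct / len(cases)


def keyword_classifier(user_input: str) -> bool:
    """Stand-in for the real classifier; used here only to exercise the harness."""
    return "summar" in user_input.lower()


LABELLED_CASES = [
    ("Can you summarise this video?", True),
    ("Give me a summary please", True),
    ("Who is the speaker?", False),
    ("What does the second example mean?", False),
]

print(parse_intent_arguments('{"is_summary": true}'))
print(accuracy(keyword_classifier, LABELLED_CASES))
```

Because the schema types is_summary as a boolean, the parsing step no longer depends on the exact wording of the reply, and the accuracy helper gives a single number to compare prompts or models against.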
