Chat-any (chat with any website)

1. Introduction

In today's information-driven world, accessing relevant and accurate data quickly is paramount. Traditional search methods can be time-consuming and often yield irrelevant results, creating a demand for more efficient information retrieval systems. My web application addresses this need by allowing users to input website URLs, which are then crawled to build a comprehensive knowledge base. This knowledge base is leveraged by a Retrieval-Augmented Generation (RAG) system, enabling users to interact with the website content through intuitive, conversational AI.

2. Overview

  • System architecture

    system-architecture.drawio.png

The process begins when the user enters a website URL. The website is then crawled and its content is converted into text. This text is split into chunks and embedded using the embedding model. When the user enters a prompt, the system performs a similarity search in the embedding space to find relevant chunks, which are then added to the original prompt. This augmented prompt is sent to the large language model (LLM), which generates a detailed and contextually appropriate response that is returned to the user. Because retrieval is limited to the crawled pages, answers are grounded in the content of that website.
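The split, embed, search, and augment steps above can be sketched in a few lines. This is a minimal illustration only, not the code in app.py: the word-count `embed` function is a toy stand-in for BAAI/bge-small-en, the call to the LLM is left out, and all function names and the prompt wording are assumptions made for the example.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into word-based chunks that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user's question with the retrieved chunks."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


if __name__ == "__main__":
    page_text = (
        "Streamlit is a Python framework for data apps. "
        "The bakery opens at nine in the morning. "
        "Embeddings map text to vectors for similarity search."
    )
    chunks = split_text(page_text, chunk_size=8, overlap=0)
    question = "When does the bakery open?"
    # The resulting string is what would be sent to the LLM.
    print(build_prompt(question, retrieve(question, chunks, k=1)))
```

In the real system the embedding model captures meaning rather than exact word overlap, so a question can match a chunk even when the two share few words.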

3. Installation

  • Install dependencies

    pip install -r requirements.txt 
  • Because we use Gemini-pro as the LLM, you will need a Gemini API key. Once you have your Google API key, add it to the .env file
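To check that the key in .env is picked up, a small standard-library helper like the one below can be used. It is only a sketch: the variable name GOOGLE_API_KEY is an assumption (this README does not state which name app.py reads), and both helper functions are hypothetical names introduced for this example.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(env: dict[str, str], name: str = "GOOGLE_API_KEY") -> str:
    """Return the key from the parsed file or the process environment."""
    # GOOGLE_API_KEY is an assumed name; check app.py for the one it reads.
    key = env.get(name) or os.environ.get(name, "")
    if not key:
        raise RuntimeError(f"{name} is not set; add it to the .env file")
    return key
```

Calling `require_api_key(load_env_file())` at startup fails fast with a clear message instead of an authentication error later in the request.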

Optional
  • Caching the embedding model
    • Create a weights/ directory, move into it, and run:
      git lfs install
      git clone https://huggingface.co/BAAI/bge-small-en
    • Run cd .. to move back to the previous directory
    • Now uncomment the # os.environ["HF_HOME"] = "/workspaces/chat-any/weights" line in app.py
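The position of that line in app.py matters, because Hugging Face libraries typically read HF_HOME when they are imported. A minimal sketch of the ordering, assuming a weights/ directory under the current working directory rather than the /workspaces/chat-any/weights path used above:

```python
import os
from pathlib import Path

# Assumed layout: a weights/ directory under the current working directory.
# The README's own setting is /workspaces/chat-any/weights.
WEIGHTS_DIR = Path.cwd() / "weights"

# Set HF_HOME before importing any Hugging Face library, so the cache
# location is already in place when those libraries are loaded.
# setdefault keeps a value that was already exported in the shell.
os.environ.setdefault("HF_HOME", str(WEIGHTS_DIR))

# Imports of the embedding library would go below this point.
print("Hugging Face cache directory:", os.environ["HF_HOME"])
```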

4. Usage

  • Run the demo app with Streamlit:

    streamlit run app.py

Limitations

  • Only English is supported (because I use the Hugging Face BAAI/bge-small-en embedding model)
  • I don't focus on optimizing inference, so creating embeddings or other processes may take a while 🐒. Please take a deep breath and be patient, my friend! 🙏

Contributors

  • lamld203844
