pingcap / tidb.ai

https://TiDB.AI is a Graph RAG-based conversational knowledge base tool built with TiDB Serverless Vector Storage and LlamaIndex. Open source and free to use.

Home Page: https://tidb.ai

License: Apache License 2.0

TypeScript 62.56% JavaScript 0.34% MDX 1.89% CSS 0.40% HTML 0.04% Dockerfile 0.26% Makefile 0.03% Python 33.76% Mako 0.05% SCSS 0.47% Shell 0.21%
mysql rag serverless vector-database chatbot graphrag knowledge-graph

tidb.ai's Introduction

TiDB.AI

Introduction

An open-source alternative to Kapa.ai: a conversational search tool based on GraphRAG (Knowledge Graph), built on top of TiDB Vector, LlamaIndex, and DSPy.

Features

  1. Perplexity-style conversational search page: a built-in website crawler covers official and documentation sites by scraping the URLs listed in each site's sitemap, so that search spans the whole site.

    You can also edit the Knowledge Graph to add information or correct inaccuracies, which helps keep the answers accurate and up to date.

  2. Embeddable JavaScript snippet: add the conversational search window to your website by copying and embedding a short JavaScript snippet. The widget, typically placed at the bottom-right corner of the page, answers product-related questions instantly.
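
The sitemap-based crawling mentioned in feature 1 can be illustrated with a small sketch. It is not the project's crawler: `extractSitemapUrls` is a hypothetical helper that assumes a plain `sitemap.xml` whose page URLs sit in `<loc>` elements, and it uses a regular expression rather than a full XML parser.

```typescript
// Hypothetical helper (not from the tidb.ai codebase): collect the page URLs
// listed in a sitemap.xml document so a crawler can enqueue them.
function extractSitemapUrls(xml: string): string[] {
  const pattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    const loc = match[1];
    if (loc) {
      // Sitemaps escape "&" as "&amp;" inside URLs; undo that one entity.
      urls.push(loc.replace(/&amp;/g, '&'));
    }
  }
  // Drop duplicates while keeping first-seen order.
  return Array.from(new Set(urls));
}
```

A real crawler would also need to follow sitemap index files and respect robots rules; both are left out here.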


Deploy

Tech Stack

  • TiDB – Database to store chat history, vectors, JSON, and analytics
  • LlamaIndex - RAG framework
  • DSPy - The framework for programming—not prompting—foundation models
  • Next.js – Framework
  • shadcn/ui - Design

Contact Us

You can reach us at @TiDB_Developer on Twitter.

Contributing

We welcome contributions from the community. If you are interested in contributing to the project, please read the Contributing Guidelines.

License

TiDB.AI is open-source under the Apache License, Version 2.0. You can find it here.

tidb.ai's People

Contributors

634750802, eltociear, ianthereal, icemap, mini256, sykp241095, wd0517

tidb.ai's Issues

[Never Close] Image Storage

We paste images into this GitHub Issue, then retrieve the image URL for use in displaying images in the README.

[draft] milestone 2: private beta

Background

For our next milestone, we'll focus on making our app easier to deploy and use:

Product Refinement:

  • Dedicate efforts to polishing the product, ensuring it meets user expectations in usability.

Deployment:

  • Add support for deploying on Vercel (and maybe Cloudflare).
  • Make it possible to deploy locally using tools like npx for quick testing.

Content Updates:

  • Create a welcoming landing page for new visitors.
  • Write usage docs/api docs to help users get started and integrate.

Todo list

  • #80
  • product
    • API & API token mgmt
    • more LLM support (may need to re-split/re-index/re-embed the docs content)
    • support more data sources: Word etc.
    • Role-based access control & mgmt (user, admin)
  • deployment
    • manual deployment (with npx etc.)
    • Docker
    • Vercel
    • Cloudflare (optional)
    • fly.io (optional)
    • LLM/token usage statistics
  • security
    • cross-domain whitelist
  • content
    • landing page
      • /
      • /showcases
      • /docs/get-started
      • /docs/api
    • README.md

Fix /home style on small screen

For the title and subtitle, we can reduce the width and wrap the text.

For the docs links in the footer, we can show one link per line, like this:

Line1: Docs
Line2: Another Link
Line3: Linkeeee4

[Draft] Implement an incremental crawler

src/core/interface.ts

export namespace rag {
  export interface Content<ContentMetadata> {
    content: string[];
    digest: string;
+   lastModifiedAt: Date;
    metadata: ContentMetadata;
  }

  export type ImportSourceTaskResult = {
    enqueue?: Array<{ type: string, url: string }>
    content?: {
      buffer: Buffer
      mime: string
    }
+   incrementalState?: unknown
  }

  export abstract class ImportSourceTaskProcessor<Options> extends Base<Options> {
    abstract support (taskType: string, url: string): boolean;
    abstract process (task: { url: string }): Promise<ImportSourceTaskResult>

+   abstract supportIncremental (taskType: string, url: string): boolean;
+   abstract processIncremental (previousState: unknown): Promise<ImportSourceTaskResult>
  }
}

src/core/db/importSource.ts

export interface ImportSource {
  created_at: Date;
  filter: string | null;
  filter_runtime: string | null;
  id: string;
  type: string;
  url: string;
+ incremental_state: JSON | null;
+ last_scheduled_at: Date | null;
}
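
One way to read the draft above: the value returned as `incrementalState` is persisted in `incremental_state` and handed back as `previousState` on the next scheduled run. The sketch below illustrates that loop under this assumption only. `SitemapState`, `SitemapEntry`, `diffSitemap`, and the `'html'` task type are hypothetical names that do not appear in the draft, and the listing of URLs with last-modified timestamps is assumed to come from the source being crawled.

```typescript
// Hypothetical state shape stored in `incremental_state`: the last-modified
// timestamp (ISO 8601) recorded for each URL on the previous run.
type SitemapState = { lastModified: Record<string, string> };

type SitemapEntry = { url: string; lastmod: string };

type IncrementalResult = {
  enqueue: Array<{ type: string; url: string }>;
  incrementalState: SitemapState;
};

// Narrow the `unknown` previous state; use an empty state on the first run
// or when the stored value does not have the expected shape.
function parseState(previous: unknown): SitemapState {
  if (typeof previous === 'object' && previous !== null && 'lastModified' in previous) {
    const candidate = (previous as { lastModified: unknown }).lastModified;
    if (typeof candidate === 'object' && candidate !== null) {
      return { lastModified: candidate as Record<string, string> };
    }
  }
  return { lastModified: {} };
}

// Compare the current listing with the previous state and enqueue only the
// URLs that are new or have a later last-modified timestamp.
function diffSitemap(previous: unknown, entries: SitemapEntry[]): IncrementalResult {
  const state = parseState(previous);
  const next: SitemapState = { lastModified: {} };
  const enqueue: Array<{ type: string; url: string }> = [];
  for (const { url, lastmod } of entries) {
    next.lastModified[url] = lastmod;
    const seen = state.lastModified[url];
    if (seen === undefined || Date.parse(lastmod) > Date.parse(seen)) {
      enqueue.push({ type: 'html', url }); // 'html' is a placeholder task type
    }
  }
  return { enqueue, incrementalState: next };
}
```

Keeping the state as plain JSON-serializable data is what lets it round-trip through a database column between runs.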

Support Bitdeer as provider

Description

Bitdeer is a cloud computing platform that provides computing power for cryptocurrency mining.

Bitdeer now also provides managed API services for multiple AI models (for example, llama2 and mistral).

Tasks

Support cost statistics

Calculate the cost that RAG incurs when calling third-party services, such as:

  • Serverless Function (Vercel)
  • Storage
  • Embedding
  • Rerank
  • LLM Completion
  • Database Query (TiDB Serverless)
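
A minimal sketch of how such per-service costs could be aggregated is shown below. Everything in it is an assumption for illustration: the `Service` labels, the `UsageRecord` shape, and `summarizeCost` are hypothetical, and unit prices are passed in by the caller because the issue does not specify any pricing.

```typescript
// Hypothetical labels for the services listed above.
type Service = 'function' | 'storage' | 'embedding' | 'rerank' | 'completion' | 'database';

// One metered call or billing line: how many units were used and the price per unit.
interface UsageRecord {
  service: Service;
  units: number;        // e.g. tokens, requests, or GB, depending on the service
  unitPriceUsd: number; // supplied by the caller; no real pricing is assumed here
}

interface CostSummary {
  byService: Partial<Record<Service, number>>;
  total: number;
}

// Sum the cost of each record, grouped by service, plus an overall total.
function summarizeCost(records: UsageRecord[]): CostSummary {
  const byService: Partial<Record<Service, number>> = {};
  let total = 0;
  for (const record of records) {
    const cost = record.units * record.unitPriceUsd;
    byService[record.service] = (byService[record.service] ?? 0) + cost;
    total += cost;
  }
  return { byService, total };
}
```

Recording units and unit price separately, rather than a precomputed cost, would let past usage be re-priced if a provider changes its rates.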

milestone 1: eating our own dog food

Background

As we develop Vector Search in TiDB Serverless, we'll build an app to test how easy it is to use. This app will provide answers to TiDB usage questions on our official websites using our in-progress vector storage in TiDB Serverless.

Todo list

  • core rag logic
    • data source management
      • upload pdf/markdown/csv etc.
      • crawl sitemap.xml of a domain
    • ui/ux
      • conversational search
        • chat history mgmt after login
      • embeddable js
  • system settings
    • overview: statistics of chats & docs
    • basic info settings
      • logo
      • site name
      • search title & subtitle
      • example questions (up to 4)
      • footer links
      • GitHub / Discord etc. social media links top-right
    • oauth configurations
      • GitHub
      • Gmail
    • (pre|post) prompt settings
    • RAG loader & splitter configurations: chunk size, overlap etc.
  • Use llamaindex-ts as RAG engine
  • add a widget at the bottom-right of (www|ask).pingcap.com to answer questions about TiDB usage / use cases
