db-gpt's Introduction

DB-GPT: Revolutionizing Database Interactions with Private LLM Technology

What is DB-GPT?

DB-GPT is an open-source framework for applying large language models (LLMs) to databases. Its primary purpose is to provide infrastructure that simplifies the development of database-related applications. It does so through several technical capabilities, including:

  1. SMMF (Service-oriented Multi-model Management Framework)
  2. Text2SQL fine-tuning
  3. RAG (Retrieval-Augmented Generation) framework and optimization
  4. Data-driven Agents collaboration framework
  5. GBI (Generative Business Intelligence)

DB-GPT simplifies the creation of these applications based on large language models (LLMs) and databases.

In the era of Data 3.0, enterprises and developers can build customized applications with minimal coding by harnessing large language models (LLMs) and databases.

DB-GPT Youtube Video

Demo

Chat Data

Chat Excel

Install

Docker Linux macOS Windows

Usage Tutorial

Features

At present, we have introduced several key features to showcase our current capabilities:

  • Private Domain Q&A & Data Processing

    The DB-GPT project offers a range of functionalities designed to improve knowledge base construction and enable efficient storage and retrieval of both structured and unstructured data. These functionalities include built-in support for uploading multiple file formats, the ability to integrate custom data extraction plug-ins, and unified vector storage and retrieval capabilities for effectively managing large volumes of information.

  • Multi-Data Source & GBI (Generative Business Intelligence)

    The DB-GPT project facilitates seamless natural language interaction with diverse data sources, including Excel, databases, and data warehouses. It simplifies the process of querying and retrieving information from these sources, empowering users to engage in intuitive conversations and gain insights. Moreover, DB-GPT supports the generation of analytical reports, providing users with valuable data summaries and interpretations.

  • Multi-Agents & Plugins

    DB-GPT supports custom plug-ins for performing various tasks and natively integrates the Auto-GPT plug-in model. Its agents adhere to the Agent Protocol standard.

  • Automated Text2SQL Fine-tuning

    We've also developed a lightweight automated fine-tuning framework centred on large language models (LLMs), Text2SQL datasets, and fine-tuning methods such as LoRA, QLoRA, and P-tuning. This framework makes Text-to-SQL fine-tuning as straightforward as an assembly line. See DB-GPT-Hub for details.

  • SMMF (Service-oriented Multi-model Management Framework)

    We offer extensive model support, covering dozens of large language models (LLMs), both open-source models and API-based proxy models, such as LLaMA/LLaMA2, Baichuan, ChatGLM, Wenxin, Tongyi, Zhipu, and many more.

  • Privacy and Security

    We ensure the privacy and security of data through the implementation of various technologies, including privatized large models and proxy desensitization.

  • Supported Datasources

Introduction

The architecture of DB-GPT is shown in the following figure:

The core capabilities primarily consist of the following components:

  1. Multi-Models: We support multiple Large Language Models (LLMs) such as LLaMA/LLaMA2, CodeLLaMA, ChatGLM, Qwen, and Vicuna, as well as proxy models like ChatGPT, Baichuan, Tongyi, Wenxin, and more.
  2. Knowledge-Based QA: Our system enables high-quality intelligent Q&A based on local documents such as PDFs, Word documents, Excel files, and other data sources.
  3. Embedding: We offer unified data vector storage and indexing. Data is embedded as vectors and stored in vector databases, allowing for content similarity search.
  4. Multi-Datasources: This feature connects different modules and data sources, facilitating data flow and interaction.
  5. Multi-Agents: Our platform provides Agent and plugin mechanisms, empowering users to customize and enhance the system's behaviour.
  6. Privacy & Security: With privately deployed models and proxy desensitization, your data stays private and is protected against leakage.
  7. Text2SQL: We enhance Text-to-SQL performance through Supervised Fine-Tuning (SFT) applied to Large Language Models (LLMs).
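
The Embedding capability in item 3 (embed data as vectors, store them, then search by content similarity) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, and `ToyVectorStore` is a hypothetical in-memory class, not DB-GPT's API or any of the vector databases it integrates.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory store: keeps (text, vector) pairs and ranks by similarity."""

    def __init__(self) -> None:
        self._items: list[tuple[str, Counter]] = []

    def add(self, text: str) -> None:
        self._items.append((text, embed(text)))

    def search(self, query: str, k: int = 1) -> list[str]:
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


store = ToyVectorStore()
store.add("orders table stores customer purchases")
store.add("users table stores account profiles")
print(store.search("which table stores purchases", k=1))
```

In a real deployment the toy `embed` function would be replaced by a learned embedding model and the list by a vector database; the add-then-rank flow is the part this sketch is meant to show.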

SubModule

  • DB-GPT-Hub: a high-performance Text-to-SQL workflow that applies Supervised Fine-Tuning (SFT) to Large Language Models (LLMs).
  • DB-GPT-Plugins: plugins for DB-GPT that can run Auto-GPT plugins directly.
  • DB-GPT-Web: the chat UI for DB-GPT.

Image

🌐 AutoDL Image

Language Switching

In the .env configuration file, modify the LANGUAGE parameter to switch to different languages. The default is English (Chinese: zh, English: en, other languages to be added later).
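
As a rough illustration of the setting described above, the sketch below reads a `LANGUAGE` entry from `.env`-style text and falls back to English. The `read_language` helper and the sample contents are hypothetical and written only for this example; they are not DB-GPT's actual configuration loader.

```python
def read_language(env_text: str, default: str = "en") -> str:
    """Return the LANGUAGE value from .env-style text, or the default if absent."""
    for raw in env_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines that are not KEY=VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "LANGUAGE":
            # Drop surrounding whitespace and optional quotes.
            return value.strip().strip("\"'") or default
    return default


# Hypothetical .env contents that switch the language to Chinese.
sample_env = "# language of the UI and prompts\nLANGUAGE=zh\n"
print(read_language(sample_env))
```

Setting `LANGUAGE=en` (or omitting the entry, in this sketch) selects English.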

Contribution

  • Please run black . before submitting the code.
  • For detailed guidelines on new contributions, please refer to How to contribute.

RoadMap

KBQA RAG optimization

  • Multi Documents

    • PDF
    • Excel, CSV
    • Word
    • Text
    • Markdown
    • Code
    • Images
  • RAG

  • Graph Database

    • Neo4j Graph
    • Nebula Graph
  • Multi-Vector Database

    • Chroma
    • Milvus
    • Weaviate
    • PGVector
    • Elasticsearch
    • ClickHouse
    • Faiss
  • Testing and Evaluation Capability Building

    • Knowledge QA datasets
    • Question collection [easy, medium, hard]:
    • Scoring mechanism
    • Testing and evaluation using Excel + DB datasets

Multi Datasource Support

  • Multi Datasource Support
    • MySQL
    • PostgreSQL
    • Spark
    • DuckDB
    • SQLite
    • MSSQL
    • ClickHouse
    • Oracle
    • Redis
    • MongoDB
    • HBase
    • Doris
    • DB2
    • Couchbase
    • Elasticsearch
    • OceanBase
    • TiDB
    • StarRocks

Multi-Models And vLLM

Agents market and Plugins

  • multi-agents framework
  • custom plugin development
  • plugin market
  • Integration with CoT
  • Enrich plugin sample library
  • Support for AutoGPT protocol
  • Integration of multi-agents and visualization capabilities, defining LLM+Vis new standards

Cost and Observability

Text2SQL Finetune

  • Supported LLMs

    • LLaMA
    • LLaMA-2
    • BLOOM
    • BLOOMZ
    • Falcon
    • Baichuan
    • Baichuan2
    • InternLM
    • Qwen
    • XVERSE
    • ChatGLM2
  • SFT Accuracy: As of October 10, 2023, fine-tuning an open-source model with 13 billion parameters using this project achieves execution accuracy on the Spider dataset that surpasses GPT-4.
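
For readers unfamiliar with the metric, execution accuracy counts a predicted query as correct when running it returns the same result as the reference query. The sketch below is a simplified illustration under stated assumptions: the helper names, the tiny `singer` table, and the order-insensitive comparison are all invented for this example, and the official Spider evaluation is more careful than this.

```python
import sqlite3


def execution_match(conn: sqlite3.Connection, predicted_sql: str, gold_sql: str) -> bool:
    """True if both queries run and return the same rows (row order ignored)."""
    try:
        predicted_rows = conn.execute(predicted_sql).fetchall()
    except sqlite3.Error:
        return False  # A prediction that fails to execute counts as wrong.
    gold_rows = conn.execute(gold_sql).fetchall()
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


def execution_accuracy(conn: sqlite3.Connection, pairs: list[tuple[str, str]]) -> float:
    """Fraction of (predicted, gold) query pairs whose results match."""
    if not pairs:
        return 0.0
    return sum(execution_match(conn, p, g) for p, g in pairs) / len(pairs)


# Hypothetical toy database and query pairs, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE singer (id INTEGER, name TEXT, age INTEGER);
    INSERT INTO singer VALUES (1, 'Ann', 30), (2, 'Bo', 25);
    """
)
pairs = [
    ("SELECT name FROM singer WHERE age > 26", "SELECT name FROM singer WHERE age >= 30"),
    ("SELECT count(*) FROM singer", "SELECT count(*) FROM singer WHERE age > 26"),
    ("SELECT nam FROM singer", "SELECT name FROM singer"),
]
score = execution_accuracy(conn, pairs)
print(f"execution accuracy: {score:.2f}")
```

Note that two differently written queries can still match, as in the first pair, because the metric compares results rather than SQL text.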

More Information about Text2SQL finetune

Licence

The MIT License (MIT)

Citation

Please consider citing our project if you find it useful:

@software{db-gpt,
    author = {DB-GPT Team},
    title = {{DB-GPT}},
    url = {https://github.com/eosphoros-ai/DB-GPT},
    year = {2023}
}

Contact Information

We are working on building a community. If you have any ideas for building it, feel free to contact us.
