SUQL (Structured and Unstructured Query Language)

Conversational Search over Structured and Unstructured Data with LLMs

Online demo: https://yelpbot.genie.stanford.edu

What is SUQL

SUQL stands for Structured and Unstructured Query Language. It augments SQL with several free-text primitives to give a precise, succinct, and expressive representation. It can be used to build chatbots for relational data sources that contain both structured and unstructured information. Just as SQL is the target language in text-to-SQL, which has seen great success, SUQL can serve as the semantic parsing target language for hybrid databases such as the following:

An example restaurant relational database

Several important features:

  • SUQL seamlessly integrates retrieval models, LLMs, and traditional SQL to deliver a clean, effective interface for hybrid data access;
    • It uses the technique suited to each kind of data: retrieval models and language models for unstructured data, and relational SQL for structured data;
  • Indexes over free-text fields built with Faiss, natively supporting common dense vector methods, e.g. product quantization and HNSW;
  • A series of important optimizations to minimize expensive LLM calls;
  • Scalability to large databases with PostgreSQL;
  • Support for general SQL, e.g. JOIN and GROUP BY.
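
To make the point about minimizing LLM calls concrete, here is a minimal sketch of one plausible optimization of that kind: evaluate the cheap structured predicate first so the expensive free-text check only runs on rows that survive it. This is an illustration under stated assumptions, not SUQL's actual implementation. The table contents, the `CountingLLM` class, and the keyword check standing in for a model call are all hypothetical.

```python
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]

# Hypothetical toy data; column names are made up for this illustration.
ROWS: List[Row] = [
    {"name": "Luigi", "cuisine": "italian", "rating": 4.5,
     "reviews": "Great pasta. Very family-friendly, kids loved it."},
    {"name": "Trattoria Blu", "cuisine": "italian", "rating": 3.2,
     "reviews": "Loud bar scene, not a place for children."},
    {"name": "Sakura", "cuisine": "japanese", "rating": 4.8,
     "reviews": "Family-friendly sushi spot with a kids menu."},
    {"name": "Nonna", "cuisine": "italian", "rating": 4.1,
     "reviews": "Cozy and romantic, mostly couples."},
]


class CountingLLM:
    """Stand-in for an expensive LLM call; counts how often it is invoked."""

    def __init__(self) -> None:
        self.calls = 0

    def is_family_friendly(self, reviews: str) -> bool:
        self.calls += 1
        # Placeholder for a real model: a plain keyword check.
        return "family-friendly" in reviews.lower()


def is_good_italian(row: Row) -> bool:
    """Cheap structured predicate, the kind plain SQL handles well."""
    return row["cuisine"] == "italian" and row["rating"] >= 4.0


def run(rows: List[Row], structured_first: bool) -> Tuple[List[str], int]:
    """Filter rows and report how many LLM calls the chosen order needed."""
    llm = CountingLLM()
    matches: List[str] = []
    for row in rows:
        if structured_first:
            # Short-circuit: the LLM is only consulted if the row passes SQL filters.
            keep = is_good_italian(row) and llm.is_family_friendly(row["reviews"])
        else:
            keep = llm.is_family_friendly(row["reviews"]) and is_good_italian(row)
        if keep:
            matches.append(row["name"])
    return matches, llm.calls


naive_matches, naive_calls = run(ROWS, structured_first=False)
fast_matches, fast_calls = run(ROWS, structured_first=True)
print(naive_matches, naive_calls)  # same matches, one LLM call per row
print(fast_matches, fast_calls)    # same matches, fewer LLM calls
```

Both orders return the same rows; only the number of model invocations differs, which is why predicate ordering matters when one predicate is far more expensive than the other.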

The answer function

One important component of SUQL is the answer function. It allows constraints drawn from free text to be combined easily with structured constraints. Here is one high-level example:

An example for using SUQL
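
As a rough, runnable illustration of how a free-text constraint can sit next to structured ones in a single query, the sketch below registers a toy `answer` function with SQLite and calls it from a WHERE clause. Several things here are assumptions made for the example: SQLite stands in for PostgreSQL, the keyword lookup stands in for an LLM, and the table schema and query are hypothetical. It shows the shape of the idea, not the SUQL compiler.

```python
import sqlite3


def answer(text: str, question: str) -> str:
    """Toy stand-in for an LLM-backed answer function.

    A real implementation would ask a language model to answer `question`
    from `text`; this placeholder only performs a keyword lookup.
    """
    if "family-friendly" in question.lower():
        return "Yes" if "family-friendly" in text.lower() else "No"
    return "Unknown"


conn = sqlite3.connect(":memory:")
# Expose the Python function to SQL under the name `answer` (2 arguments).
conn.create_function("answer", 2, answer)

# Hypothetical schema and rows, for illustration only.
conn.execute(
    "CREATE TABLE restaurants (name TEXT, cuisine TEXT, rating REAL, reviews TEXT)"
)
conn.executemany(
    "INSERT INTO restaurants VALUES (?, ?, ?, ?)",
    [
        ("Luigi", "italian", 4.5, "Great pasta. Very family-friendly."),
        ("Trattoria Blu", "italian", 3.2, "Loud bar, not a place for children."),
        ("Sakura", "japanese", 4.8, "Family-friendly sushi spot."),
    ],
)

# Structured constraints (cuisine, rating) and a free-text constraint
# (answer over reviews) combined in one WHERE clause.
query = """
SELECT name
FROM restaurants
WHERE cuisine = 'italian'
  AND rating >= 4.0
  AND answer(reviews, 'Is this restaurant family-friendly?') = 'Yes'
"""
matches = [row[0] for row in conn.execute(query)]
print(matches)  # ['Luigi']
```

Treating the free-text check as an ordinary function in the query keeps the rest of the statement plain SQL, so joins, grouping, and ordering compose with it in the usual way.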

For more details, see our paper at https://arxiv.org/abs/2311.09818.

Installation / Usage tutorial

There are two main ways of installing the SUQL library.

Install from pip

Ideal for integrating the SUQL compiler in a larger codebase / system. See install_pip.md for details.

Install from source

Ideal for using this repo to build a SUQL-powered conversational interface to your data out-of-the-box, like the one for https://yelpbot.genie.stanford.edu discussed in the paper. See install_source.md for details.

Agent tutorial

Check out conv_agent.md for more information on best practices for using SUQL to power your conversational agent.

Release notes

See release_notes.md for the latest release notes.

Bugs / Contribution

If you encounter a problem, first check known_issues.md. If it is not listed there, we welcome Issues and/or PRs!

Paper results

To replicate the HybridQA and restaurant results reported in our paper, see paper_results.md.

Citation

If you find this work useful, please consider citing us.

@inproceedings{liu2024suql,
    title={SUQL: Conversational Search over Structured and Unstructured Data with Large Language Models},
    author={Shicheng Liu and Jialiang Xu and Wesley Tjangnaka and Sina J. Semnani and Chen Jie Yu and Monica S. Lam},
    booktitle={Findings of the Association for Computational Linguistics: NAACL 2024},
    year={2024}
}
