raglib

raglib is a Go library for retrieval-augmented generation. It provides a basic set of tools and abstractions for building applications that combine information retrieval with language-model-based text generation. (It's currently a work in progress; thoughts and PRs welcome!)

Features

  • Retrieve relevant documents from various sources, such as web search results and vector databases
  • Generate text based on the retrieved documents and user input
  • Customize and extend the library components to fit specific use cases

Examples

An example HTTP server that uses raglib lives in ./api. It is consumed by a Next.js + TypeScript web UI in ./web-client.

Installation

To use raglib in your Go project, run:

go get github.com/coopslarhette/raglib

Usage

Retrieving Documents

raglib provides the Retriever interface for retrieving relevant documents based on a given query.

import (
    "context"
    "raglib/lib/document"
)

type Retriever interface {
    Query(ctx context.Context, query string, topK int) ([]document.Document, error)
}

The library includes two implementations of this interface:

  1. SERPRetriever: Retrieves documents by scraping Google Search results pages for a given query using the SERP API.
  2. QdrantRetriever: Retrieves documents from a Qdrant vector database, using OpenAI's text-embedding-ada-002 model to embed the query.

An example of how to use the SERPRetriever:

client := retrieval.NewSERPAPIClient("your_api_key")
retriever := retrieval.NewSERPRetriever(client)

query := "your search query"
topK := 10

docs, err := retriever.Query(context.Background(), query, topK)
if err != nil {
    // Handle error
}

for i, d := range docs {
    fmt.Printf("document at position %d has title: %v\n", i, d.Title)
}

Generating Text

raglib also provides the Generator interface for generating text from a set of retrieved documents:

import (
	"context"
	"raglib/lib/document"
)

type Generator interface {
    Generate(ctx context.Context, documents []document.Document, responseChan chan<- string) error
}

The library includes one implementation of the Generator interface: the Answerer struct, which answers the user's input using the retrieved documents. The Answerer currently uses the OpenAI API for text generation. A facade that allows other model providers or local LLMs to be used is forthcoming.

Here's an example of how to use the Answerer:

openaiClient := openai.NewClient("your_api_key")
answerer := generation.NewAnswerer(openaiClient)

ctx := context.Background()
seedInput := "user input for text generation"
var documents []document.Document // Populate with retrieved documents, e.g. from a Retriever

responseChan := make(chan string)
shouldStream := true

go func() {
    if err := answerer.Generate(ctx, seedInput, documents, responseChan, shouldStream); err != nil {
        // Handle error
    }
}()

// Consume the generated text from the responseChan
for response := range responseChan {
    fmt.Print(response)
}

Document Struct

The document package defines the Document struct, which represents a document retrieved by the Retriever. A Document consists of:

  • Passages: A list of relevant passages from the document
  • Title: The title of the document
  • Source: The type of corpus the document came from (e.g., web, personal)
  • WebReference: Information about the document's web source (if applicable)
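
As a rough illustration of that shape, the sketch below declares a struct with the four fields listed above. Only the field names come from the list; the Go types, the URL field on WebReference, and the describe helper are assumptions made for this example and may differ from the actual document package.

```go
package main

import "fmt"

// WebReference is a hypothetical shape for web source metadata.
type WebReference struct {
	URL string
}

// Document mirrors the four fields listed above. The Go types are assumptions.
type Document struct {
	Passages     []string
	Title        string
	Source       string        // e.g. "web" or "personal"
	WebReference *WebReference // nil when the document did not come from the web
}

// describe returns a one-line summary of a Document (hypothetical helper).
func describe(d Document) string {
	s := fmt.Sprintf("%s [%s], %d passage(s)", d.Title, d.Source, len(d.Passages))
	if d.WebReference != nil {
		s += ", " + d.WebReference.URL
	}
	return s
}

func main() {
	webDoc := Document{
		Title:        "Example page",
		Source:       "web",
		Passages:     []string{"A relevant passage."},
		WebReference: &WebReference{URL: "https://example.com"},
	}
	note := Document{
		Title:    "Notes",
		Source:   "personal",
		Passages: []string{"First passage.", "Second passage."},
	}
	fmt.Println(describe(webDoc))
	fmt.Println(describe(note))
}
```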

Contributing

Contributions to raglib are welcome! If you encounter any issues or have suggestions for improvements, please open an issue or submit a pull request on the GitHub repository.
