Sparrow

Data extraction with ML and LLM

The Principle

Sparrow is an open-source solution for extracting and processing data from documents and images, including forms, invoices, receipts, and other unstructured sources. Its architecture is modular: OCR, Donut fine-tuning/inference, and a data labeling UI are offered as independent services. Current development is focused on the LLM pipeline (Lemming). Sparrow is committed to local data processing, so that document data can stay on your own infrastructure. The long-term goal is a data extraction tool that serves diverse business domains.

Services

  • sparrow-data-donut - This service focuses on data preparation specifically for the Donut ML model, including fine-tuning and OCR integration.
  • sparrow-data-ocr - A standalone OCR service, providing robust optical character recognition as part of the Sparrow suite.
  • sparrow-ml-donut - Dedicated to the Donut ML model, this service handles both fine-tuning and inference, streamlining the machine learning workflow.
  • sparrow-ml-lemming - A specialized service for the LLM RAG pipeline, enhancing the capabilities of language model processing.
  • sparrow-ui-donut - A UI for Donut ML model data labeling, together with a dashboard.

Summary:

  • For LLM RAG Enthusiasts - Opt for the lemming service, specifically designed to cater to your needs in LLM RAG applications.
  • For Traditional ML Implementations - The donut service is available for those seeking machine learning solutions independent of LLM.

Current development is focused primarily on extending the lemming service.

Installation

You can install the Lemming or the Donut service independently. Each service runs standalone, with no dependency on the other, so you can pick the one that fits your needs.

Lemming

  1. Install Weaviate local DB with Docker:
docker compose up -d
  2. Install the requirements:
pip install -r requirements.txt
  3. Install Ollama and pull the LLM model specified in config.yml

Donut

Follow the install steps outlined here:

  1. Donut Data install steps

  2. Donut ML install steps

  3. Donut UI install steps

OCR

Follow the install steps outlined here:

  1. Sparrow OCR services install steps

Usage

Lemming

  1. Copy text PDF files to the data folder or use the sample data provided in the data folder.

  2. Run the script to convert the text to vector embeddings and save them in Weaviate. By default, it uses the LlamaIndex plugin:

./sparrow.sh ingest

You can specify the plugin name explicitly, for example:

./sparrow.sh ingest Haystack
./sparrow.sh ingest LlamaIndex
  3. Run the script to process data with LLM RAG and return the answer:
./sparrow.sh "invoice_number, invoice_date, client_name, client_address, client_tax_id, seller_name, seller_address,
seller_tax_id, iban, names_of_invoice_items, gross_worth_of_invoice_items, total_gross_worth" "int, str, str, str, str,
str, str, str, str, List[str], List[float], str"

Answer:

{
    "invoice_number": 61356291,
    "invoice_date": "09/06/2012",
    "client_name": "Rodriguez-Stevens",
    "client_address": "2280 Angela Plain, Hortonshire, MS 93248",
    "client_tax_id": "939-98-8477",
    "seller_name": "Chapman, Kim and Green",
    "seller_address": "64731 James Branch, Smithmouth, NC 26872",
    "seller_tax_id": "949-84-9105",
    "iban": "GB50ACIE59715038217063",
    "names_of_invoice_items": [
        "Wine Glasses Goblets Pair Clear Glass",
        "With Hooks Stemware Storage Multiple Uses Iron Wine Rack Hanging Glass",
        "Replacement Corkscrew Parts Spiral Worm Wine Opener Bottle Houdini",
        "HOME ESSENTIALS GRADIENT STEMLESS WINE GLASSES SET OF 4 20 FL OZ (591 ml) NEW"
    ],
    "gross_worth_of_invoice_items": [
        66.0,
        123.55,
        8.25,
        14.29
    ],
    "total_gross_worth": "$212,09"
}
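
The request above passes two comma-separated strings, one with field names and one with type names, and the answer comes back as JSON. As a rough illustration of how the two lists line up, here is a minimal Python sketch that pairs them and checks an answer against the declared types. It is not Sparrow's own code: the helper names (`split_top_level`, `pair_fields`, `mismatches`) and the `CHECKS` table are hypothetical, and only the four type names used in the example are covered.

```python
from typing import Any, Callable, Dict, List


def split_top_level(spec: str) -> List[str]:
    """Split a string on commas that are not inside square brackets."""
    parts: List[str] = []
    current: List[str] = []
    depth = 0
    for ch in spec:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def pair_fields(fields: str, types: str) -> Dict[str, str]:
    """Pair each field name with the type name at the same position."""
    names = split_top_level(fields)
    kinds = split_top_level(types)
    if len(names) != len(kinds):
        raise ValueError(f"{len(names)} fields but {len(kinds)} types")
    return dict(zip(names, kinds))


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Hypothetical checks for the type names that appear in the example request.
CHECKS: Dict[str, Callable[[Any], bool]] = {
    "int": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "str": lambda v: isinstance(v, str),
    "List[str]": lambda v: isinstance(v, list) and all(isinstance(x, str) for x in v),
    "List[float]": lambda v: isinstance(v, list) and all(_is_number(x) for x in v),
}


def mismatches(answer: Dict[str, Any], schema: Dict[str, str]) -> List[str]:
    """Return fields that are missing, have an unknown type name, or fail the check."""
    bad: List[str] = []
    for name, kind in schema.items():
        check = CHECKS.get(kind)
        if name not in answer or check is None or not check(answer[name]):
            bad.append(name)
    return bad
```

For example, `pair_fields("invoice_number, iban", "int, str")` yields a two-entry mapping, and passing it to `mismatches` together with the parsed answer returns an empty list when every field is present with the declared type.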

FastAPI Endpoint for Local LLM RAG

Sparrow enables you to run a local LLM RAG as an API using FastAPI, providing a convenient and efficient way to interact with our services.

To set this up:

  1. Start the Endpoint

Launch the endpoint by executing the following command in your terminal:

python api.py
  2. Access the Endpoint Documentation

You can view detailed documentation for the API by navigating to:

http://127.0.0.1:8000/api/v1/sparrow-llm/docs

[Screenshot: FastAPI endpoint]

Example API call with curl:

curl -X 'POST' \
  'http://127.0.0.1:8000/api/v1/sparrow-llm/inference' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "fields": "invoice_number",
  "types": "int"
}'
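
The same call can be made from Python. The sketch below uses only the standard library and mirrors the curl example: the URL, headers, and the `fields`/`types` body come from that example, while the function names, the timeout value, and the assumption that the endpoint replies with a JSON body are illustrative rather than taken from Sparrow's documentation.

```python
import json
import urllib.request
from typing import Any

# Base URL taken from the curl example; adjust it if the API runs elsewhere.
BASE_URL = "http://127.0.0.1:8000/api/v1/sparrow-llm"


def build_inference_request(
    fields: str, types: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build the POST request that the curl example sends."""
    payload = json.dumps({"fields": fields, "types": types}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/inference",
        data=payload,
        headers={
            "accept": "application/json",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_inference(fields: str, types: str, timeout: float = 120.0) -> Any:
    """Send the request and decode the reply (assumed here to be JSON)."""
    request = build_inference_request(fields, types)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the endpoint started via `python api.py`, calling `run_inference("invoice_number", "int")` sends the same body as the curl command. The generous timeout is a guess, chosen because local LLM inference can be slow.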

Donut

Follow the steps outlined here:

  1. Donut Data usage steps

  2. Donut ML usage steps

  3. Donut UI usage steps

OCR

Follow the steps outlined here:

  1. Sparrow OCR services usage steps

Examples

Inference with local LLM RAG

Request:

./sparrow.sh "invoice_number, invoice_date, client_name, client_address, client_tax_id, seller_name, seller_address,
seller_tax_id, iban, names_of_invoice_items, gross_worth_of_invoice_items, total_gross_worth" "int, str, str, str, str,
str, str, str, str, List[str], List[float], str"

Response:

[Screenshot: local RAG]

Inference with Donut ML model

Sparrow UI:

[Screenshot: Inference Results]

Commercial usage

Sparrow is free for commercial use by organizations with gross revenue below $5 million USD in the past 12 months.

For businesses exceeding this revenue limit and seeking to bypass GPL license restrictions for inference, please contact me at [email protected] to discuss dual licensing options. The same applies if you are looking for custom workflows to automate business processes, consulting options, and support/maintenance options.

Author

Katana ML, Andrej Baranovskij

License

Licensed under the Apache License, Version 2.0. Copyright 2020-2024 Katana ML, Andrej Baranovskij. Copy of the license.

Contributors

abaranovskis-redsamurai, maxatapplied, maxatmpc, shrey10926
