CASALIOY - Your local langchain toolkit

   Air-gapped LLMs on consumer-grade hardware

LangChain and LlamaCpp (refers to slower imartinez) 👀

Set up your environment

To run the code in this repo, first install the requirements:

pip install -r requirements.txt

Then download the two models and place them in a folder called ./models:

LLM: the default is ggml-gpt4all-j-v1.3-groovy.bin. To use a custom model, run /Demos/customLLM.py (check the paths) instead of startLLM.py.
Embedding: the default is ggml-model-q4_0.bin. To use a custom embeddings model, reference it in /Demos/customLLM.py and ingest.py.

The resulting layout should look like this:

└── repo
      ├── startLLM.py
      ├── ingest.py
      ├── source_documents
      │   └── dsgvo.txt
      ├── models
      │   ├── ggml-gpt4all-j-v1.3-groovy.bin
      │   └── ggml-model-q4_0.bin
      └── Demos/

Test dataset

This repo uses a State of the Union transcript as an example.

Ingesting your own dataset

Get your .txt file ready. (PDF, JSON, and CSV support is in the pipeline.)

To ingest the data, run:

python ingest.py <path_to_your_txt_file>

This spins up a local Qdrant namespace inside the db folder, which holds the local vector store. Ingestion takes time, depending on the size of your document. You can ingest as many documents as you want by running ingest.py repeatedly; they all accumulate in the local embeddings database. To remove the dataset, simply delete the db folder.
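
The accumulate-and-reset behaviour described above can be pictured with a small standard-library sketch. This is not the code in ingest.py: a JSON file stands in for the Qdrant store and no embeddings are computed. The names split_into_chunks, ingest, and chunks.json, as well as the chunk sizes, are hypothetical and chosen only for illustration.

```python
import json
import shutil
import tempfile
from pathlib import Path


def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
        start += step
    return chunks


def ingest(txt_path, db_dir):
    """Append the chunks of one .txt file to a store persisted in db_dir.

    A plain JSON file plays the role of the vector store here (assumption).
    Returns the total number of stored chunks after this run.
    """
    db_dir = Path(db_dir)
    db_dir.mkdir(parents=True, exist_ok=True)
    store_file = db_dir / "chunks.json"  # hypothetical file name
    if store_file.exists():
        records = json.loads(store_file.read_text(encoding="utf-8"))
    else:
        records = []
    text = Path(txt_path).read_text(encoding="utf-8")
    for chunk in split_into_chunks(text):
        records.append({"source": str(txt_path), "text": chunk})
    store_file.write_text(json.dumps(records), encoding="utf-8")
    return len(records)


# Demo in a throwaway directory: two runs accumulate, deleting db resets.
demo_dir = Path(tempfile.mkdtemp())
demo_doc = demo_dir / "example.txt"
demo_doc.write_text("lorem ipsum " * 100, encoding="utf-8")
print(ingest(demo_doc, demo_dir / "db"))  # chunks after the first run
print(ingest(demo_doc, demo_dir / "db"))  # twice as many after the second
shutil.rmtree(demo_dir / "db")            # equivalent to removing the db folder
print(ingest(demo_doc, demo_dir / "db"))  # back to the first-run count
```

Each run appends to whatever the store already holds, which is why deleting the folder is enough to start over.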

Ask questions to your documents, locally!

To ask a question, run:

python startLLM.py

Then wait for the script to prompt for your input.

> Enter a query:

Type your question and hit enter. You'll need to wait 20-30 seconds (depending on your machine) while the LLM consumes the prompt and prepares the answer. Once done, it prints the answer and the 4 sources it used as context from your documents. You can then ask another question without re-running the script; just wait for the prompt again.

Note: you could turn off your internet connection, and the script inference would still work. No data gets out of your local environment.

Type exit to finish the script.
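
The interaction described in this section amounts to a simple prompt loop. The sketch below is an assumption about its shape, not the contents of startLLM.py: run_query_loop and fake_answer are hypothetical names, and the answering step is passed in as a plain callable so the loop can be tried without a model.

```python
def run_query_loop(answer, read_line=input, write=print):
    """Prompt for queries until the user types 'exit'.

    `answer` is any callable mapping a question to (answer_text, sources).
    Returns the number of questions that were answered.
    """
    answered = 0
    while True:
        query = read_line("> Enter a query: ").strip()
        if query == "exit":
            break
        text, sources = answer(query)
        write(text)
        # Show the document excerpts that were used as context.
        for index, source in enumerate(sources, start=1):
            write(f"[source {index}] {source}")
        answered += 1
    return answered


def fake_answer(question):
    """Stand-in for the real LLM call (hypothetical)."""
    return f"(answer to: {question})", ["excerpt one", "excerpt two"]


# Scripted demo: feed two lines instead of reading from the keyboard.
scripted = iter(["What does the document say?", "exit"])
run_query_loop(fake_answer, read_line=lambda prompt: next(scripted))
```

Passing read_line and write as parameters keeps the loop independent of the terminal, which makes it easy to exercise with canned input.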

LLM options

Optional / Custom models outside of the GPT-J ecosystem (NEW)

Model	BoolQ	PIQA	HellaSwag	WinoGrande	ARC-e	ARC-c	OBQA	Avg.
ggml-vic-7b-uncensored	73.4	74.8	63.4	64.7	54.9	36.0	40.2	58.2
gpt4all-13b-snoozy q5	83.3	79.2	75.0	71.3	60.9	44.2	43.4	65.3

Optional / Custom models inside of the GPT-J ecosystem

Model	BoolQ	PIQA	HellaSwag	WinoGrande	ARC-e	ARC-c	OBQA	Avg.
GPT4All-J 6B v1.0	73.4	74.8	63.4	64.7	54.9	36.0	40.2	58.2
GPT4All-J v1.1-breezy	74.0	75.1	63.2	63.6	55.4	34.9	38.4	57.8
GPT4All-J v1.2-jazzy	74.8	74.9	63.6	63.8	56.6	35.3	41.0	58.6
GPT4All-J v1.3-groovy	73.6	74.3	63.8	63.5	57.7	35.0	38.8	58.1
GPT4All-J Lora 6B	68.6	75.8	66.2	63.5	56.4	35.7	40.2	58.1

All the supported models are listed here (custom LLMs are in the pipeline).

How does it work? 👀

By selecting the right local models and using the power of LangChain, you can run the entire pipeline locally, without any data leaving your environment, and with reasonable performance.

ingest.py uses LangChain tools to parse the document and create embeddings locally using LlamaCppEmbeddings. It then stores the result in a local vector database using a Qdrant vector store.
startLLM.py can handle any LLM that is llamacpp compatible (the default is GPT4All-J). The context for the answers is extracted from the local vector store using a similarity search that locates the right pieces of context from the docs.
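
To make the retrieval step concrete, here is a minimal standard-library sketch of similarity search followed by prompt assembly. It makes loud assumptions: toy_embed is a bag-of-words stand-in for LlamaCppEmbeddings, a Python list stands in for the Qdrant store, and the prompt wording in build_prompt is invented. Only the idea (embed the query, rank chunks by cosine similarity, pass the top 4 as context) mirrors the description above.

```python
import math
import re
from collections import Counter


def toy_embed(text):
    """Bag-of-words counts; a toy stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query, chunks, k=4):
    """Return the k chunks most similar to the query, best first."""
    query_vec = toy_embed(query)
    scored = [(cosine_similarity(query_vec, toy_embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def build_prompt(query, context_chunks):
    """Combine retrieved context and the question (wording is hypothetical)."""
    context = "\n".join(context_chunks)
    return f"Use the context to answer.\n\nContext:\n{context}\n\nQuestion: {query}"


chunks = [
    "The cat sat on the mat.",
    "Qdrant stores vectors on disk.",
    "Embeddings map text to vectors.",
    "Bananas are yellow.",
    "Vectors are compared by cosine similarity.",
]
question = "How are vectors compared?"
print(build_prompt(question, top_k_chunks(question, chunks)))
```

A real embedding model captures meaning rather than word overlap, so its rankings will differ, but the ranking and prompt-building steps keep the same shape.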

Pipeline (stuff to do) 🧑‍🎤

⭕ Adding an auto-parser for more data types (e.g. PDF, JSON, MD)
⭕ Adding better documentation
⭕ Adding support for faster and more secure Retrieval with Contextual Compression Retriever
⭕ Custom LLM endpoints via Hugging Face Pipelines see
[done] Custom LLM integration via native llamacpp (see Demos/*)
♾️ README.md updates

💁 Contributing

As an open source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infra, or better documentation.

Disclaimer

The contents of this repository are provided "as is" and without warranties of any kind, whether express or implied. We do not warrant or represent that the information contained in this repository is accurate, complete, or up-to-date. We expressly disclaim any and all liability for any errors or omissions in the content of this repository.

Furthermore, this repository may contain links to other repositories or websites, which are not under our control. We do not endorse any of these repositories or websites and we are not responsible for their content or availability. We do not guarantee that any of the links provided on this repository will be free of viruses or other harmful components. We hereby exclude liability for any losses or damages that may arise from the use of any links on this or any linked repository or website.

In particular, we make no express or implied representations or warranties regarding the accuracy, completeness, suitability, reliability, availability, or timeliness of any information, products, services, or related graphics contained in this repository for any purpose. We hereby exclude all conditions, warranties, representations, or other terms which may apply to this repository or any content in it, whether express or implied.

We also hereby exclude any liability for any damages or losses arising from the use of binaries in this repository. You acknowledge and agree that any use of any binaries in this repository is at your own risk.

By using this repository, you are agreeing to comply with and be bound by the above disclaimer. If you do not agree with any part of this disclaimer, please do not use this repository.

