To set up your environment to run the code here, first install all requirements:
pip install -r requirements.txt
Then download the two models and place them in a folder called ./models:
- LLM: the default is ggml-gpt4all-j-v1.3-groovy.bin. For a custom LLM, run /Demos/customLLM.py (check paths) instead of startLLM.py.
- Embedding: the default is ggml-model-q4_0.bin. For a custom embeddings model, reference it in /Demos/customLLM.py and ingest.py.
The result should look like this:
└── repo
    ├── startLLM.py
    ├── ingest.py
    ├── source_documents
    │   └── dsgvo.txt
    ├── models
    │   ├── ggml-gpt4all-j-v1.3-groovy.bin
    │   └── ggml-model-q4_0.bin
    └── Demos/
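Before running the scripts, it can help to confirm that this layout is in place. The following is a minimal sketch and not part of the repo: missing_files is a hypothetical helper, and the expected file names are simply the defaults listed above.

```python
from pathlib import Path

# Default file names taken from the layout above; adjust them for custom models.
EXPECTED_FILES = [
    "startLLM.py",
    "ingest.py",
    "models/ggml-gpt4all-j-v1.3-groovy.bin",
    "models/ggml-model-q4_0.bin",
]


def missing_files(repo_root, expected=EXPECTED_FILES):
    """Return the expected relative paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).is_file()]
```

Calling missing_files(".") from the repo root returns an empty list when everything is where the scripts expect it.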
This repo uses a State of the Union transcript as an example.
Get your .txt file ready. (PDF, JSON, and CSV support is in the pipeline.)
To ingest the data, run:
python ingest.py <path_to_your_txt_file>
This spins up a local Qdrant namespace inside the db folder, which contains the local vector store. Ingestion takes time, depending on the size of your document.
You can ingest as many documents as you want by running ingest.py again; all of them accumulate in the local embeddings database. To remove the dataset, simply delete the db folder.
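Because each run adds to the same database, several documents can be ingested in a loop. This is a rough sketch under two assumptions: that ingest.py takes a single file path as shown above, and that your .txt files sit in source_documents. The helper names build_ingest_commands and ingest_all are hypothetical and do not exist in the repo.

```python
import subprocess
import sys
from pathlib import Path


def build_ingest_commands(doc_dir, script="ingest.py"):
    """Return one `python ingest.py <file>` command per .txt file in doc_dir."""
    txt_files = sorted(Path(doc_dir).glob("*.txt"))
    return [[sys.executable, script, str(path)] for path in txt_files]


def ingest_all(doc_dir="source_documents"):
    """Run ingest.py once per document; stop at the first failing run."""
    for command in build_ingest_commands(doc_dir):
        subprocess.run(command, check=True)
```

Calling ingest_all() from the repo root would then run the ingest step once per file, in alphabetical order.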
To ask a question, run:
python startLLM.py
Then wait for the script to prompt for your input:
> Enter a query:
Type your question and hit enter. You'll need to wait 20-30 seconds (depending on your machine) while the LLM consumes the prompt and prepares the answer. Once done, it prints the answer and the 4 sources it used as context from your documents. You can then ask another question without re-running the script; just wait for the prompt again.
Note: you could turn off your internet connection, and the script inference would still work. No data gets out of your local environment.
Type exit to finish the script.
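The interaction described above boils down to a simple read-answer loop. The sketch below only illustrates that control flow and is not the code in startLLM.py: query_loop and fake_answer are hypothetical names, the model call is replaced by a placeholder, and skipping empty input is an assumption.

```python
def query_loop(answer_fn, read=input, write=print):
    """Prompt for queries until the user types 'exit'.

    answer_fn stands in for the real LLM call: it takes the query string
    and returns a tuple of (answer, sources).
    """
    while True:
        query = read("> Enter a query: ").strip()
        if query == "exit":
            break
        if not query:
            # Assumption: empty input is ignored and the prompt is shown again.
            continue
        answer, sources = answer_fn(query)
        write(answer)
        for source in sources:
            write(f"source: {source}")


def fake_answer(query):
    """Placeholder that echoes the query instead of running a model."""
    return f"(no model loaded) you asked: {query}", ["source_documents/dsgvo.txt"]
```

Running query_loop(fake_answer) gives an interactive session; passing custom read and write callables makes the loop easy to exercise without a terminal.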
Model | BoolQ | PIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA | Avg. |
---|---|---|---|---|---|---|---|---|
ggml-vic-7b-uncensored | 73.4 | 74.8 | 63.4 | 64.7 | 54.9 | 36.0 | 40.2 | 58.2 |
gpt4all-13b-snoozy q5 | 83.3 | 79.2 | 75.0 | 71.3 | 60.9 | 44.2 | 43.4 | 65.3 |
Model | BoolQ | PIQA | HellaSwag | WinoGrande | ARC-e | ARC-c | OBQA | Avg. |
---|---|---|---|---|---|---|---|---|
GPT4All-J 6B v1.0 | 73.4 | 74.8 | 63.4 | 64.7 | 54.9 | 36.0 | 40.2 | 58.2 |
GPT4All-J v1.1-breezy | 74.0 | 75.1 | 63.2 | 63.6 | 55.4 | 34.9 | 38.4 | 57.8 |
GPT4All-J v1.2-jazzy | 74.8 | 74.9 | 63.6 | 63.8 | 56.6 | 35.3 | 41.0 | 58.6 |
GPT4All-J v1.3-groovy | 73.6 | 74.3 | 63.8 | 63.5 | 57.7 | 35.0 | 38.8 | 58.1 |
GPT4All-J Lora 6B | 68.6 | 75.8 | 66.2 | 63.5 | 56.4 | 35.7 | 40.2 | 58.1 |
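The Avg. column appears to be the arithmetic mean of the seven benchmark scores, rounded to one decimal place. A quick check on three rows, with the numbers copied from the tables above (the average helper is only for illustration):

```python
# Scores copied from the tables above, in column order:
# BoolQ, PIQA, HellaSwag, WinoGrande, ARC-e, ARC-c, OBQA.
SCORES = {
    "gpt4all-13b-snoozy q5": [83.3, 79.2, 75.0, 71.3, 60.9, 44.2, 43.4],
    "GPT4All-J v1.2-jazzy": [74.8, 74.9, 63.6, 63.8, 56.6, 35.3, 41.0],
    "GPT4All-J v1.3-groovy": [73.6, 74.3, 63.8, 63.5, 57.7, 35.0, 38.8],
}


def average(scores):
    """Arithmetic mean of the benchmark scores, rounded to one decimal place."""
    return round(sum(scores) / len(scores), 1)


for model, scores in SCORES.items():
    print(f"{model}: {average(scores)}")
```

For these rows the computed values are 65.3, 58.6, and 58.1, which match the Avg. column.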
All the supported models are from here (custom LLMs are in the pipeline).
By selecting the right local models and using the power of LangChain, you can run the entire pipeline locally, without any data leaving your environment, and with reasonable performance.
- ingest.py uses LangChain tools to parse the document and create embeddings locally using LlamaCppEmbeddings. It then stores the result in a local vector database using the Qdrant vector store.
- startLLM.py can handle every LLM that is llamacpp compatible (default: GPT4All-J). The context for the answers is extracted from the local vector store using a similarity search to locate the right piece of context from the docs.
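To make the similarity search step concrete, here is a toy version in plain Python. It is a sketch of the idea only: the repo delegates this work to Qdrant, the cosine metric is an assumption about how vectors are compared, and the three-dimensional vectors are made-up stand-ins for real embeddings. The names cosine_similarity, top_k, and STORE are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vector, store, k=4):
    """Return the k chunk texts whose vectors are most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy 3-dimensional "embeddings"; real embedding vectors are much longer.
STORE = [
    ("chunk about data protection", [0.9, 0.1, 0.0]),
    ("chunk about consent", [0.7, 0.3, 0.1]),
    ("chunk about cooking", [0.0, 0.1, 0.9]),
]

print(top_k([1.0, 0.0, 0.0], STORE, k=2))
```

The default of k=4 mirrors the 4 sources the script prints with each answer; the retrieved chunks are what gets passed to the LLM as context.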
- ⭕ Adding auto-parser for datatypes (e.g. PDF, JSON, MD)
- ⭕ Adding better documentation
- ⭕ Adding support for faster and more secure retrieval with Contextual Compression Retriever
- ⭕ Custom LLM endpoints via Hugging Face Pipelines see
- [done] Custom LLM integration via native llamacpp see Demos/*
- ♾️ README.md updates
As an open source project in a rapidly developing field, we are extremely open to contributions, whether it be in the form of a new feature, improved infra, or better documentation.
The contents of this repository are provided "as is" and without warranties of any kind, whether express or implied. We do not warrant or represent that the information contained in this repository is accurate, complete, or up-to-date. We expressly disclaim any and all liability for any errors or omissions in the content of this repository.
Furthermore, this repository may contain links to other repositories or websites, which are not under our control. We do not endorse any of these repositories or websites and we are not responsible for their content or availability. We do not guarantee that any of the links provided on this repository will be free of viruses or other harmful components. We hereby exclude liability for any losses or damages that may arise from the use of any links on this or any linked repository or website.
In particular, we make no express or implied representations or warranties regarding the accuracy, completeness, suitability, reliability, availability, or timeliness of any information, products, services, or related graphics contained in this repository for any purpose. We hereby exclude all conditions, warranties, representations, or other terms which may apply to this repository or any content in it, whether express or implied.
We also hereby exclude any liability for any damages or losses arising from the use of binaries in this repository. You acknowledge and agree that any use of any binaries in this repository is at your own risk.
By using this repository, you are agreeing to comply with and be bound by the above disclaimer. If you do not agree with any part of this disclaimer, please do not use this repository.