shiftrdw / retrieval-augmented-generation-engine-with-langchain-and-streamlit

This project forked from mirabdullahyaser/retrieval-augmented-generation-engine-with-langchain-and-streamlit

Powerful web application that combines Streamlit, LangChain, and Pinecone to simplify document analysis. Powered by OpenAI's GPT-3, RAG enables dynamic, interactive document conversations, making it ideal for efficient document retrieval and summarization.

Home Page: https://retrieval-augmented-generation.streamlit.app/

Retrieval Augmented Generation Engine using LangChain, Streamlit, & Pinecone

Access application on Streamlit Cloud Platform

Demo

Overview

The Retrieval Augmented Generation (RAG) Engine is a tool for document retrieval, summarization, and interactive question answering. The project combines LangChain, Streamlit, and Pinecone into a single web application. You can upload multiple PDF documents, generate vector embeddings for their text, and then ask questions about the documents in a chat interface. The chat history is retained, so follow-up questions can build on earlier answers.

Features

  • Streamlit Web App: The project is built using Streamlit, providing an intuitive and interactive web interface for users.
  • Input Fields: Users can input essential credentials like OpenAI API key and Pinecone API key through dedicated input fields.
  • Document Uploader: Users can upload multiple PDF files, which are then processed for further analysis.
  • Document Splitting: The uploaded PDFs are split into smaller text chunks so that each chunk fits within a model's token limit.
  • Vector Embeddings: The text chunks are converted into vector embeddings, making it easier to perform retrieval and question-answering tasks.
  • Flexible Vector Storage: You can choose to store vector embeddings either in Pinecone or a local vector store, providing flexibility and control.
  • Interactive Conversations: Users can engage in interactive conversations with the documents, asking questions and receiving answers. The chat history is preserved for reference.
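
The splitting, embedding, and retrieval features above can be illustrated with a minimal, dependency-free sketch. It is not the project's implementation: the app relies on LangChain, OpenAI embeddings, and Pinecone or a local vector store. Here `split_text`, `embed`, `cosine`, and `top_k` are hypothetical helpers, chunk sizes are measured in characters rather than tokens, and a bag-of-words count stands in for a real embedding model.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (sizes are assumptions)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    document = (
        "Streamlit renders the web interface. "
        "Pinecone stores the vector embeddings. "
        "LangChain connects the retriever to the language model."
    )
    chunks = split_text(document, chunk_size=60, overlap=15)
    for chunk in top_k("Where are embeddings stored?", chunks, k=1):
        print(chunk)
```

The overlap between neighbouring chunks is a common way to avoid cutting a sentence off from its context at a chunk boundary. In a real pipeline, the retrieved chunks would be passed to the language model together with the question and the chat history.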

Prerequisites

Before running the project, make sure you have the following:

  • Python 3.7+
  • LangChain
  • Streamlit
  • Pinecone
  • An OpenAI API key
  • PDF documents to upload
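
A quick way to check these prerequisites is a small script like the one below. It is only a sketch: the import names in `REQUIRED_MODULES` are assumptions about how the packages are imported, and the project's requirements.txt remains the authoritative list. `missing_modules` and `python_is_supported` are hypothetical helper names.

```python
import importlib.util
import sys

# Assumed top-level import names; requirements.txt is authoritative.
REQUIRED_MODULES = ("langchain", "streamlit", "pinecone", "openai")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(minimum=(3, 7)):
    """Check the running interpreter against the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.7 or newer is required.")
    missing = missing_modules()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```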

Usage

  1. Clone the repository to your local machine:

    git clone https://github.com/mirabdullahyaser/Retrieval-Augmented-Generation-Engine-with-LangChain-and-Streamlit.git
    cd Retrieval-Augmented-Generation-Engine-with-LangChain-and-Streamlit
  2. Install the required dependencies by running:

    pip install -r requirements.txt
  3. Run the Streamlit app:

    streamlit run src/rag_engine.py
  4. Access the app by opening a web browser and navigating to the URL that Streamlit prints in the terminal.

  5. Enter your OpenAI API key, Pinecone API key, Pinecone environment, and Pinecone index name in the respective fields. You can provide them either in the sidebar of the application or in the secrets.toml file in the .streamlit directory.

  6. Upload the PDF documents you want to analyze.

  7. Click the "Submit Documents" button to process the documents and generate vector embeddings.

  8. Engage in interactive conversations with the documents by typing your questions in the chat input box.
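
Step 5 allows credentials to come from either the sidebar or secrets.toml. One way to combine the two sources is sketched below. The key names, the helper names `resolve_setting` and `resolve_all`, and the rule that a non-empty sidebar value takes precedence are all assumptions made for illustration, not the project's confirmed behavior. In a Streamlit app, the two mappings would presumably be built from the sidebar inputs and from `st.secrets`.

```python
# Hypothetical key names; the app's actual keys may differ.
SETTING_KEYS = (
    "openai_api_key",
    "pinecone_api_key",
    "pinecone_env",
    "pinecone_index_name",
)


def resolve_setting(key, sidebar_values, secrets):
    """Prefer a non-empty sidebar value, else fall back to the secrets mapping."""
    value = (sidebar_values.get(key) or "").strip()
    if value:
        return value
    return secrets.get(key)  # None when the key is absent from both sources


def resolve_all(sidebar_values, secrets, keys=SETTING_KEYS):
    """Resolve every setting and report which ones are still missing."""
    settings = {key: resolve_setting(key, sidebar_values, secrets) for key in keys}
    missing = [key for key, value in settings.items() if not value]
    return settings, missing


if __name__ == "__main__":
    sidebar = {"openai_api_key": "value-typed-in-sidebar", "pinecone_api_key": ""}
    secrets = {"pinecone_api_key": "value-from-secrets", "pinecone_env": "example-env"}
    settings, missing = resolve_all(sidebar, secrets)
    print("Still missing:", missing)
```

Reporting the missing keys before processing documents lets the app prompt for them up front instead of failing partway through embedding.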

Contributors

Mir Abdullah Yaser

Contact

If you have any questions, suggestions, or would like to discuss this project further, feel free to get in touch with me:

I'm open to collaboration and would be happy to connect!
