
tanvir-ishraq / healifyai--llm-based-healthcare-system


Combines several Machine Learning algorithms with an LLM to provide in-depth answers to medical queries and to predict conditions and diseases from patient symptoms.

Home Page: https://healifyai-llm.onrender.com

License: MIT License

Languages: Python 1.45%, Jupyter Notebook 98.55%
Topics: classification, llm, question-answering, torch, flask-server, huggingface-transformers, api-integration, bootstrap-frontend, selenium-scraper, bert-language-model, nlp-deep-learning, scikit-learn-random-forest, query-processing


HealifyAI - LLM based Healthcare System

This project aims to develop a comprehensive healthcare system that aids healthcare professionals while also providing knowledge to patients. It uses an LLM and traditional Machine Learning (ML) to provide in-depth answers to queries about medical conditions, and it can predict diseases from patient symptoms.
The system consists of two main modules:

  1. Disease Prediction Model
  2. HealifyLLM - Q&A Language Model

Data Collection, Cleaning, Preprocessing

Healify-LLM model:
  • Engineered a new LLM corpus dataset of 6800 samples from scratch, starting with content scraped from healthline.com.
  • To broaden the corpus, additional samples were generated with a Python script, so that the model can give detailed and accurate answers across a wide range of user questioning styles.
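The sample-addition step could look roughly like the sketch below. The README does not show the actual script, so the templates, the function names, and the example topic are all hypothetical; the only idea taken from the text is that one answer is paired with several phrasings of the same question.

```python
# Hypothetical question templates; the project's real script is not shown here.
TEMPLATES = [
    "What is {topic}?",
    "Can you tell me about {topic}?",
    "I want to know about {topic}.",
    "Explain {topic}",
]

def augment_sample(topic, answer):
    """Return one (question, answer) pair per template for a single topic."""
    return [(template.format(topic=topic), answer) for template in TEMPLATES]

def augment_corpus(corpus):
    """Expand a {topic: answer} mapping into a flat list of QA pairs."""
    pairs = []
    for topic, answer in corpus.items():
        pairs.extend(augment_sample(topic, answer))
    return pairs

# Invented example entry, for illustration only.
pairs = augment_corpus({"asthma": "Asthma is a chronic condition of the airways."})
```

Each scraped answer then appears once per phrasing, which is one simple way to expose a model to different questioning styles.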
Disease Model:

The model was trained on a Kaggle dataset derived from the Disease-Symptom Knowledge Database, which was built from hundreds of patient records at the New York Presbyterian Hospital, USA.
It covers 135 categories of diseases and health conditions, both common and rare, described by a total of 400 symptoms.

The disease dataset was processed to clean up noisy symptom strings, UMLS codes, etc.
Processing the LLM dataset required data separation and sample addition.
The scraping code can be found in the scraper folder.
All final datasets are stored in the datasets folder.
All cleaning and processing steps are in the notebooks folder.
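As a minimal sketch of the symptom cleanup, the helper below strips UMLS code prefixes from raw cells. It assumes the raw values look like "UMLS:C0008031_pain chest", with "^" joining several terms in one cell; that format and the function name are assumptions, and the notebooks may handle more cases.

```python
import re

# Assumed raw format: "UMLS:C0008031_pain chest", with "^" joining several terms.
_UMLS_PREFIX = re.compile(r"^UMLS:C\d+_")

def clean_terms(raw):
    """Split a raw cell on '^' and strip the UMLS code prefix from each term."""
    terms = []
    for part in raw.split("^"):
        name = _UMLS_PREFIX.sub("", part.strip())
        # Collapse repeated whitespace and normalise case.
        name = re.sub(r"\s+", " ", name).strip().lower()
        if name:
            terms.append(name)
    return terms
```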

Model Training:

Healify-LLM model:

Hyperparameters: We used a batch size of 8. The learning rate was set dynamically with Fast.ai's learning rate finder at every stage of fine-tuning.

Training Procedures: We used HuggingFace for the model and imported Fast.ai for hyperparameter tuning.

  • A RoBERTa model was used because the QA dataset is complex.
  • Training followed the 3-stage training policy from the ULMFiT research paper.
  • The model was fine-tuned on the 6800 samples and reached around 98% accuracy in 12 epochs. Loss was monitored during training to avoid overfitting. Training ran on an NVIDIA T4 GPU. The sample ratio is good: 6800 samples in total for just 135 diseases.
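The README does not say how the three stages were split. One common reading, assumed here, is gradual unfreezing: train only the top layer group first, then unfreeze one more group, then train everything, picking a fresh learning rate at each stage. The plain-Python sketch below only computes such a schedule, together with per-group learning rates using the division factor of 2.6 suggested in the ULMFiT paper. The function names and the group count are hypothetical.

```python
def unfreezing_schedule(n_groups, n_stages=3):
    """For each stage, list the indices of the trainable layer groups.

    Stage 1 trains only the last group (the task head), each later stage
    unfreezes one more group from the top, and the final stage trains all.
    """
    schedule = []
    for stage in range(1, n_stages + 1):
        if stage == n_stages:
            first_trainable = 0
        else:
            first_trainable = max(n_groups - stage, 0)
        schedule.append(list(range(first_trainable, n_groups)))
    return schedule

def discriminative_lrs(top_lr, n_groups, factor=2.6):
    """Learning rate per group: each lower group gets the rate above it divided by `factor`."""
    return [top_lr / factor ** (n_groups - 1 - i) for i in range(n_groups)]

# Example with 4 hypothetical layer groups (index 3 is the task head).
stages = unfreezing_schedule(n_groups=4)
lrs = discriminative_lrs(top_lr=1e-3, n_groups=4)
```

In Fast.ai, the corresponding steps would typically be `learn.freeze()`, `learn.freeze_to(-2)`, and `learn.unfreeze()`, each preceded by `learn.lr_find()`; whether the project used exactly this sequence is not stated.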
Disease Model:

The model was trained with sklearn's random forest ensemble algorithm, which combines multiple decision trees. It predicts potential diseases from the symptoms given as input.
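A minimal sketch of this setup follows, assuming symptoms are encoded as a multi-hot vector before being passed to sklearn's `RandomForestClassifier`. The symptom vocabulary, the six toy records, and the helper names are invented for illustration; the real model uses the 400-symptom, 135-disease dataset, and its exact encoding and hyperparameters are not given in this README.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy vocabulary and records, invented for illustration only.
SYMPTOMS = ["fever", "cough", "chest pain", "headache", "nausea"]
RECORDS = [
    (["fever", "cough"], "influenza"),
    (["fever", "cough", "headache"], "influenza"),
    (["chest pain"], "angina"),
    (["chest pain", "nausea"], "angina"),
    (["headache", "nausea"], "migraine"),
    (["headache"], "migraine"),
]

def encode(symptoms):
    """Multi-hot vector: 1 where the symptom is present, 0 otherwise."""
    present = set(symptoms)
    return [1 if s in present else 0 for s in SYMPTOMS]

X = np.array([encode(symptoms) for symptoms, _ in RECORDS])
y = np.array([disease for _, disease in RECORDS])

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, y)

def top_diseases(symptoms, k=3):
    """Return the k most probable diseases, using probabilities averaged over the trees."""
    probs = model.predict_proba([encode(symptoms)])[0]
    order = np.argsort(probs)[::-1][:k]
    return [(str(model.classes_[i]), float(probs[i])) for i in order]

ranked = top_diseases(["fever", "cough"], k=2)
```

Returning a ranked list rather than a single label lets the interface show several potential diseases, which matches the "potential diseases" wording above.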

Model Deployment

A Gradio app was written to deploy the LLM model on HuggingFace. This live HuggingFace API is then integrated into the back end. The Gradio implementation can be found in the deployment_hf folder and online here.
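A Gradio wrapper of this kind might be structured as in the sketch below. It is not the code from the deployment_hf folder: the `model` argument stands for any callable that maps a question string to an answer string, and the labels and title are placeholders.

```python
def make_answer_fn(model):
    """Wrap any callable `model(question) -> str` with basic input checks."""
    def answer(question):
        question = (question or "").strip()
        if not question:
            return "Please enter a question."
        return model(question)
    return answer

def build_demo(model):
    """Build the Gradio interface; gradio is imported lazily so the helper above can be used without it."""
    import gradio as gr

    return gr.Interface(
        fn=make_answer_fn(model),
        inputs=gr.Textbox(label="Medical question"),
        outputs=gr.Textbox(label="Answer"),
        title="HealifyLLM",
    )
```

Serving it would then be a matter of calling `build_demo(model).launch()` with the real inference function in place of `model`.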

Live Website Deployment

A Flask app was deployed to provide the user interface. See the flask-deployment GitHub branch. The website is live at https://healifyai-llm.onrender.com.
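The back-end integration could be sketched as follows. Everything specific here is an assumption: the endpoint URL is a placeholder, the `{"data": [...]}` request and response envelope matches the REST convention of some Gradio versions but not all, and the `/ask` route and the function names are hypothetical. The actual implementation lives in the flask-deployment branch.

```python
import json
import urllib.request

# Placeholder URL: the real model endpoint is not given in this README.
SPACE_URL = "https://example.invalid/run/predict"

def http_post_json(url, payload, timeout=30):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

def ask_llm(question, post=http_post_json, url=SPACE_URL):
    """Send one question and return the first output, assuming a {"data": [...]} envelope."""
    reply = post(url, {"data": [question]})
    outputs = reply.get("data") or []
    if not outputs:
        raise ValueError("empty response from the model endpoint")
    return outputs[0]

def create_app(ask=ask_llm):
    """Build the Flask app; Flask is imported lazily so ask_llm can be used without it."""
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.post("/ask")
    def ask_route():
        body = request.get_json(silent=True) or {}
        question = str(body.get("question", "")).strip()
        if not question:
            return jsonify(error="question is required"), 400
        return jsonify(answer=ask(question))

    return app
```

Passing the transport in as `post` keeps the network call replaceable, so the route logic can be exercised without reaching the hosted model.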

Future work and limitations

Together, these two components form a robust, interactive healthcare system that can assist both patients and healthcare professionals in diagnosing diseases, finding relevant medical information, and potentially identifying relations between diseases. The system is designed to be user-friendly, with an intuitive interface that anyone can use. Please note that while this system can provide valuable insights and information, it is not intended to replace professional medical advice. Always consult a healthcare professional for medical concerns.

I plan to incorporate a more powerful model for a more seamless interactive experience. I am currently occupied with a startup and its project, but I hope to work on this in my free time.
