This project is all about monitoring the performance of ML models using the following tools:
- Prometheus: A monitoring system and time-series database that collects metrics from the other services, such as CPU, memory and disk usage, and, most importantly for this project, data-drift metrics for the ML model.
- Grafana: A visualization tool that reads the metrics from Prometheus and displays them as graphs on dashboards.
- Evidently: A tool that captures the full picture of ML model performance, from data drift to concept drift and more.
- MongoDB: A database that captures the predictions in real time and stores them as documents (see the sketch after this list).
- Prefect: A workflow orchestration tool, quite similar to Apache Airflow, used here to build data/analytics pipelines and automate tasks and workflows.
- Docker Compose: A Docker tool that makes it easy to launch multiple containers together.
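As a quick illustration of how a prediction can be captured as a MongoDB document, here is a minimal sketch; the database name, collection name, and feature fields are illustrative assumptions, not taken from the project code:

```python
from pymongo import MongoClient

# Connect to the mongo service started by docker compose
client = MongoClient("mongodb://localhost:27017")
collection = client["prediction_service"]["data"]  # assumed database/collection names

# Each prediction is captured as one document: input features, model output, and an id
record = {
    "id": "885da328-a41b-4c94-8fbd-f54e14f94cab",
    "PULocationID": 43,        # illustrative feature fields
    "DOLocationID": 151,
    "prediction": 20.2,
}
collection.insert_one(record)
```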
To run the project you will need:
- Docker
- Docker-Compose
- MongoDB
- Create a conda environment: `conda create -n monitoring python=3.8 -y`
- Activate it: `conda activate monitoring`
- Install all the dependencies: `pip install -r requirements.txt`
- Run `prepare.py` to download the data; it is hosted on a CloudFront CDN and will be fetched from there and dropped into the root directory.
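For reference, a script like `prepare.py` can be as simple as downloading the parquet files from the CDN into the project root. The sketch below is only an assumption about how it might look; the URL prefix and file names are illustrative, not the actual ones used by the project:

```python
import requests

# Illustrative CDN prefix and file names: replace with the actual ones used by prepare.py
BASE_URL = "https://d37ci6vzurychx.cloudfront.net/trip-data"
FILES = ["green_tripdata_2022-01.parquet", "green_tripdata_2022-02.parquet"]

for name in FILES:
    resp = requests.get(f"{BASE_URL}/{name}", timeout=60)
    resp.raise_for_status()
    with open(name, "wb") as f:  # drop the file into the root directory
        f.write(resp.content)
    print(f"downloaded {name}")
```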
To start all the services, run:
docker compose up
Docker Compose will start the following services:
- prometheus - TSDB for metrics
- grafana - visualization tool for the metrics
- mongo - MongoDB, for storing raw data, predictions, targets and profile reports
- evidently_service - Evidently real-time monitoring service
- prediction_service - main service, which makes predictions
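To make the interaction between these services concrete, here is a minimal sketch of what the endpoint inside prediction_service could look like. The model file name, environment variables, and the `/iterate/taxi` route on the Evidently service are assumptions, not the project's exact code:

```python
import os
import pickle

import requests
from flask import Flask, request, jsonify
from pymongo import MongoClient

app = Flask("prediction_service")

with open("model.pkl", "rb") as f_in:            # assumed model artifact name
    model = pickle.load(f_in)                    # e.g. a sklearn pipeline that accepts feature dicts

mongo = MongoClient(os.getenv("MONGODB_ADDRESS", "mongodb://localhost:27017"))
collection = mongo["prediction_service"]["data"]
EVIDENTLY_URL = os.getenv("EVIDENTLY_SERVICE_ADDRESS", "http://localhost:8085")


@app.route("/predict", methods=["POST"])
def predict():
    record = request.get_json()                  # one row sent by send_data.py, already carrying an id
    record["prediction"] = float(model.predict([record])[0])

    # Store the raw record plus the prediction as a MongoDB document
    collection.insert_one(dict(record))
    # Forward the same record to the Evidently real-time monitoring service (assumed route)
    requests.post(f"{EVIDENTLY_URL}/iterate/taxi", json=[record], timeout=5)
    return jsonify({"prediction": record["prediction"]})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=9696)
```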
Run the following command to send data to the service:
python send_data.py
The script sends each row of the parquet file to the service and requests a prediction; each prediction is stored in target.csv together with a unique id generated by uuid4.
A general format is:
885da328-a41b-4c94-8fbd-f54e14f94cab 20.2
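A stripped-down sketch of what a script like send_data.py does, assuming the service listens on localhost:9696 and the input file is called data.parquet (both names are illustrative, and the columns are assumed to be JSON-serializable):

```python
import uuid

import pandas as pd
import requests

URL = "http://localhost:9696/predict"       # assumed prediction_service endpoint
df = pd.read_parquet("data.parquet")         # illustrative input file name

with open("target.csv", "w") as target_file:
    for row in df.to_dict(orient="records"):
        row["id"] = str(uuid.uuid4())        # unique id generated by uuid4
        resp = requests.post(URL, json=row, timeout=10).json()
        # one line per record: "<id> <prediction>"
        target_file.write(f"{row['id']} {resp['prediction']}\n")
        print(row["id"], resp["prediction"])
```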
While this runs, two services are at work: prediction_service and evidently_service. Both communicate with the mongo service to store the data.
Run the following command to perform batch inference:
python prefect_example.py
To inspect the flow runs in the Prefect UI, start the Orion server:
prefect orion start
The flow performs the following steps:
- Upload `target.csv` to MongoDB
- Load a reference dataset into MongoDB
- Fetch both reference and target datasets from MongoDB
- Perform batch inference
- Generate an Evidently HTML report
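A condensed sketch of how such a flow could be wired up with Prefect and Evidently. The task names, Mongo collection names, output file name, and the Evidently 0.1.x-style Dashboard API usage are assumptions, not the project's exact code:

```python
import pandas as pd
from pymongo import MongoClient
from prefect import flow, task
from evidently.dashboard import Dashboard
from evidently.dashboard.tabs import DataDriftTab

MONGO_URI = "mongodb://localhost:27017"          # assumed connection string


@task
def upload_target(path: str = "target.csv"):
    # Upload the "<id> <target>" pairs produced by send_data.py into MongoDB
    client = MongoClient(MONGO_URI)
    df = pd.read_csv(path, sep=" ", names=["id", "target"])
    client["prediction_service"]["target"].insert_many(df.to_dict(orient="records"))


@task
def load_dataframe(collection_name: str) -> pd.DataFrame:
    # Fetch a collection back as a pandas DataFrame, dropping Mongo's internal _id
    client = MongoClient(MONGO_URI)
    docs = list(client["prediction_service"][collection_name].find())
    return pd.DataFrame(docs).drop(columns=["_id"], errors="ignore")


@task
def build_report(reference: pd.DataFrame, current: pd.DataFrame):
    # Batch inference on `current` with the pickled model would go here (omitted for brevity)
    dashboard = Dashboard(tabs=[DataDriftTab()])
    dashboard.calculate(reference, current)
    dashboard.save("evidently_report.html")      # assumed output file name


@flow
def batch_analyze():
    upload_target("target.csv")
    reference = load_dataframe("reference")      # assumed collection names
    current = load_dataframe("data")
    build_report(reference, current)


if __name__ == "__main__":
    batch_analyze()
```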
This concludes the project.
- Thanks to DataTalks.Club for their help in creating this project.