This repo maps out an MLOps infrastructure using the following technologies:
- Jupyter notebooks for hosting and managing data science and machine learning activities
- MLflow for tracking experiments, runs, run metadata (input parameters and output metrics), artifacts (curated datasets, etc.), and trained models
- Azure Blob Storage as the artifact backing store
- SQL Server as the experiment-metadata backing store
- All services run as Docker containers on your local workstation
- Docker Desktop
- Bash or other Unix-style shell (WSL2 on Windows)
- clone this repo
- open a shell in the repo folder root
- run `start.local.sh`
> ./start.local.sh
- wait for all containers to build and start (may take a few minutes)
- follow the remaining common instructions
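Once the script finishes, you can sanity-check the stack from the shell. This is only a sketch: the container names are whatever your compose setup assigns, and the ports come from the URLs used later in this guide.

```shell
# List running containers and their status (names depend on your setup)
docker ps --format '{{.Names}}\t{{.Status}}'

# The MLflow UI should answer on port 5000 and Jupyter on port 8888
curl -sf http://localhost:5000 > /dev/null && echo "MLflow is up"
curl -sf http://localhost:8888 > /dev/null && echo "Jupyter is up"
```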
- Jupyter and MLflow servers run as Docker containers on your local workstation
- Cloud-based Azure storage
- Cloud-based SQL storage
- Docker Desktop
- Bash or other Unix-style shell (WSL2 on Windows)
- A new or existing Azure storage account (how to create a new one)
- Add a new blob container called `artifacts` to the storage account
- A new Azure SQL Database called `mlflowmetadata`
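If you prefer to script these prerequisites, they can be sketched with the Azure CLI. The resource group, storage account, and server names below are placeholders, not names this repo requires; substitute your own.

```shell
# Placeholder names -- replace my-rg, mystorageacct, and my-sql-server with yours
az storage account create --name mystorageacct --resource-group my-rg --location eastus
az storage container create --name artifacts --account-name mystorageacct
az sql db create --name mlflowmetadata --server my-sql-server --resource-group my-rg
```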
- clone this repo
- open a shell in the repo folder root
- edit `start.cloud.sh` and ensure that the `AZURE_STORAGE_CONNECTION_STRING` and `MLFLOW_DEFAULT_ARTIFACT_ROOT` variables point to your storage account and container, and that the `MLFLOW_BACKEND_STORE_URI` variable points to your SQL database
- run `start.cloud.sh`
> ./start.cloud.sh
- wait for all containers to build and start (may take a few minutes)
- follow the remaining common instructions
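As a concrete illustration of the variables that the edit step touches, here is a hedged sketch with made-up account and server names. The `wasbs://` artifact-root scheme and the `mssql+pyodbc` backend URI are standard MLflow conventions, but every value below is a placeholder to replace with your own.

```shell
# Illustrative values only -- substitute your real account name, key, and server
export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=mystorageacct;AccountKey=<key>;EndpointSuffix=core.windows.net"
export MLFLOW_DEFAULT_ARTIFACT_ROOT="wasbs://artifacts@mystorageacct.blob.core.windows.net"
export MLFLOW_BACKEND_STORE_URI="mssql+pyodbc://<user>:<password>@my-sql-server.database.windows.net:1433/mlflowmetadata?driver=ODBC+Driver+17+for+SQL+Server"
```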
- All services run in Azure cloud
- Enables multi-user collaboration and durable storage of metadata and artifacts
- TBD
- open JupyterLab by navigating to http://localhost:8888/?token=dtic
- open the `train_and_serve_model.ipynb` notebook and run all cells in order; this trains a sample ML model and deploys it to a local REST endpoint
- open another browser tab and navigate to the MLflow UI at http://localhost:5000
- view the details of `run1` under `experiment1`
- go back to Jupyter and open `invoke_model_endpoint.ipynb` to invoke the model REST endpoint
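The request the invoke notebook sends can be sketched with only the Python standard library. The port and column names below are placeholders, and the `dataframe_split` payload shape assumes an MLflow 2.x scoring server (older MLflow versions accepted a plain pandas `split`-orient payload instead).

```python
import json
import urllib.request


def build_payload(columns, rows):
    """Build an MLflow 2.x 'dataframe_split' scoring payload."""
    return json.dumps({"dataframe_split": {"columns": columns, "data": rows}})


# Placeholder feature names and values -- use your model's real inputs
payload = build_payload(["x1", "x2"], [[1.0, 2.0]])

# Placeholder port -- use whatever port the notebook served the model on
req = urllib.request.Request(
    "http://localhost:5001/invocations",
    data=payload.encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req)  # uncomment once the endpoint is running
```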