ML pipeline to predict optimal pricing for Airbnb rentals
Based heavily on the work of Raki Rahman: https://www.linkedin.com/in/mdrakiburrahman
We're going to demo the three core components that make up MLflow:
Note: you don't have to use all three; each feature can be used independently.
This lets us log every aspect of the ML process: the different hyperparameters we tried, the evaluation metrics, and the code we ran, alongside other arbitrary artifacts such as test data.
This also provides a leaderboard-style UI that makes it easy to see which model performed the best.
These are all about reproducibility and sharing. A Project combines the code (via Git), the environment (Conda or Docker), and a specification that makes the code re-runnable.
An abstraction that lets us create/export models from any open-source framework via the Tracking and Projects abstractions, then export them in a standard format that can be deployed to any number of systems. Since most deployment systems use some sort of container-based solution (e.g., Azure ML or SageMaker), Models make deployment to these systems easy; we can also deploy directly to Kubernetes or Azure Container Instances.
In this notebook we will demonstrate the following topics:
In this case, we'll be using the "Inside Airbnb" dataset, and loading it from a csv from an Azure Storage Container.
Like plotting on a heatmap to get a better sense of the data.
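One way to sketch that kind of EDA plot, assuming pandas and matplotlib; the tiny DataFrame here is a stand-in for the real listings data:

```python
# Correlation heatmap sketch for exploratory analysis.
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "latitude": [47.60, 47.67, 47.55],
    "longitude": [-122.33, -122.38, -122.30],
    "price": [150, 95, 120],
})
corr = df.corr()  # pairwise correlations between numeric columns

fig, ax = plt.subplots()
im = ax.imshow(corr, cmap="viridis")
ax.set_xticks(range(len(corr.columns)), labels=corr.columns, rotation=45)
ax.set_yticks(range(len(corr.columns)), labels=corr.columns)
fig.colorbar(im)
fig.savefig("corr_heatmap.png")
```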
We perform multiple experiments using scikit-learn's Random Forest Regressor and log the models on MLflow to demonstrate the tracking capabilities.
We will define the components that make up an MLflow Project:
- MLproject file
- Conda environment file
- Run script
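As an illustration, a minimal MLproject file and Conda environment might look like this; the entry-point name, parameters, and pinned versions are assumptions, not the notebook's actual files:

```yaml
# MLproject (illustrative)
name: airbnb-pricing

conda_env: conda.yaml

entry_points:
  main:
    parameters:
      n_estimators: {type: int, default: 100}
    command: "python train.py --n-estimators {n_estimators}"
```

```yaml
# conda.yaml (illustrative)
name: airbnb-pricing-env
channels:
  - conda-forge
dependencies:
  - python=3.10
  - scikit-learn
  - pip
  - pip:
      - mlflow
```

Running `mlflow run .` locally (or pointing `mlflow run` at a Git URL) resolves the entry point, builds the Conda environment, and executes the command.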
We will also load and run a Project straight from Git to demonstrate MLflow's Git integration.
We explore the power of model flavors and the framework-abstraction capabilities of MLflow Models.
Step 6: Production Serving
Demo: Containerize the trained model and deploy to Azure Container Instances
We will build a Docker container image for the trained model and deploy it to Azure Container Instances (this can easily be swapped for Kubernetes as well).
We then make an HTTP call from Postman to get a prediction for a test payload.
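A sketch of the request body such a call sends; the endpoint hostname and feature columns are hypothetical, and MLflow's scoring server accepts JSON in the `dataframe_split` orientation:

```python
# Build a test payload for the model's /invocations endpoint.
import json

payload = {
    "dataframe_split": {
        "columns": ["latitude", "longitude", "bedrooms"],  # illustrative features
        "data": [[47.60, -122.33, 2]],                     # one test row
    }
}
body = json.dumps(payload)

# The same payload sent from Python instead of Postman would look like
# (not executed here -- the ACI hostname is hypothetical):
# requests.post("http://<aci-host>:5001/invocations",
#               data=body, headers={"Content-Type": "application/json"})
```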