This repository highlights the workflow for training machine learning and deep learning models on Azure Databricks, and how easy that workflow is to use.
Azure Machine Learning and Azure Databricks are two of the most popular data science tools my clients use on the Azure platform. It is easy to take advantage of both: models can be deployed as a batch or streaming service on Databricks and as a RESTful API in the Azure ML Service. I therefore wrote a blog post with deployment tips and the reasons a developer would choose to deploy with the AML Service and Databricks.
Azure Machine Learning and MLFlow are two very similar tools. In a blog post I describe the advantages and disadvantages of both tools and how they can be used together with Azure Databricks. Check out the post for a high-level discussion of the tools.
Azure Machine Learning with Azure Databricks
We will use the popular MNIST dataset and closely follow this Azure Machine Learning Service example of training a model. The walkthrough also teaches users how to deploy models using the Azure Machine Learning service. The linked example is run in Azure Notebooks, which is an excellent way to use cloud compute resources while staying in a Jupyter environment. The key difference here is that we will use Azure Databricks to train and deploy our model.
Please complete the following, in order, for an end-to-end implementation:
Azure Machine Learning vs MLFlow
Please complete the following, in order, for an end-to-end implementation:
- Set up your Azure environment
- Train a machine learning model with MLFlow
- Train a machine learning model with AzureML
- Cross Validation with MLFlow
- Cross Validation with MLFlow and Azure ML
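To give a feel for the cross validation steps above, here is a minimal sketch of k-fold cross validation that reports each fold's score through a pluggable logging callable. It is not the code from the notebooks: the helper names (`k_fold_indices`, `cross_validate`, `fit_majority`, `accuracy`, `print_metric`) and the toy majority-class model are assumptions made for illustration, and the sketch uses only the Python standard library so it runs outside Databricks.

```python
import random
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and return k (train, validation) index pairs."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, val in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, val))
    return splits


def print_metric(key, value, step=None):
    """Default logger (hypothetical): print the metric instead of sending it to a tracker."""
    print(f"{key}={value} step={step}")


def cross_validate(fit, score, X, y, k=5, log_metric=print_metric):
    """Run k-fold cross validation and report every fold's score through log_metric.

    fit(X_train, y_train) returns a model; score(model, X_val, y_val) returns a float.
    """
    scores = []
    for fold, (train, val) in enumerate(k_fold_indices(len(X), k)):
        model = fit([X[i] for i in train], [y[i] for i in train])
        fold_score = score(model, [X[i] for i in val], [y[i] for i in val])
        log_metric("fold_accuracy", fold_score, step=fold)
        scores.append(fold_score)
    log_metric("mean_accuracy", mean(scores))
    return scores


# Toy stand-ins for a real estimator (assumptions, not the MNIST model from the walkthrough).
def fit_majority(X, y):
    """'Train' by remembering the most common label."""
    return Counter(y).most_common(1)[0][0]


def accuracy(model, X, y):
    """Fraction of validation labels equal to the remembered label."""
    return sum(1 for label in y if label == model) / len(y)
```

On a Databricks cluster with MLFlow installed, calling `cross_validate(..., log_metric=mlflow.log_metric)` inside a `with mlflow.start_run():` block should record each fold's score against that run, since `mlflow.log_metric` accepts a key, a value, and an optional `step`.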
The walkthrough shows how to deploy models manually using the AML Service. Keep in mind, however, that these deployments would typically be automated with build and release pipelines (Azure DevOps). If a data scientist wishes to deploy the model to a web service environment, as shown in the walkthrough, I would recommend that the pipelines access the model directly from model management in the Azure Machine Learning workspace. If a streaming solution is desired, I would access the model directly from a Databricks File System (DBFS) mount.
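The recommendation above can be expressed as a small lookup that a release pipeline might call to decide where to fetch the model from. This is only a sketch: `ModelSource`, `resolve_model_source`, the target names `"web_service"` and `"streaming"`, and the mount point `/dbfs/mnt/models` are all hypothetical and do not come from the walkthrough or from any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSource:
    kind: str      # "aml_workspace" or "dbfs_mount"
    location: str  # registered model name, or a file path under the mount


def resolve_model_source(target, model_name, mount_point="/dbfs/mnt/models"):
    """Pick where a deployment should load the model from, based on its target."""
    if target == "web_service":
        # Web service deployments pull the registered model from the AML workspace.
        return ModelSource("aml_workspace", model_name)
    if target == "streaming":
        # Streaming jobs on Databricks read the model files from a DBFS mount.
        return ModelSource("dbfs_mount", f"{mount_point}/{model_name}")
    raise ValueError(f"unknown deployment target: {target!r}")
```

Keeping the decision in one function means the build and release pipelines can share it, and adding a new deployment target only touches one place.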
This repo aims to provide an overview of both Azure Databricks and the Azure Machine Learning Service. If any steps are confusing or contain errors, please let me know. For any other comments or questions, you can contact me at [email protected].