The goal is to see how the model behaves in production, find problems, and fix them. Three things matter most:
1. Plan for periodic retraining. We need an optimal retraining strategy.
2. Monitor actual performance. We should keep an eye on how the production model is performing.
3. Analyse. Detailed visibility into the model helps us understand and guide its performance.
In this section we will explore ways to diagnose problems in code deployed to production and fix them.
1. We will use the evidently and mlflow libraries together.
2. We will deploy ML models on Heroku.
First we will calculate data drift for the model.
To diagnose and fix problems, the training experiments that compute data drift are tracked with MLflow Tracking.
To fix the problems, we will explore the results using the MLflow UI.
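To make "data drift" concrete, here is a minimal sketch of one possible drift score: the two-sample Kolmogorov-Smirnov statistic, written in plain Python. It is a swapped-in illustration, not the evidently library's implementation; the function name `ks_statistic` and the sample values are made up for this example.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of two samples (0.0 to 1.0)."""
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in sorted(set(ref) | set(cur)):
        ref_cdf = bisect_right(ref, x) / len(ref)
        cur_cdf = bisect_right(cur, x) / len(cur)
        max_gap = max(max_gap, abs(ref_cdf - cur_cdf))
    return max_gap


# Hypothetical feature values, not taken from the bike sharing dataset.
reference_values = [0.20, 0.22, 0.25, 0.27, 0.30]
shifted_values = [0.60, 0.62, 0.65, 0.70, 0.72]

print(ks_statistic(reference_values, reference_values))  # 0.0, no drift
print(ks_statistic(reference_values, shifted_values))    # 1.0, fully shifted
```

A score near 0 means the current batch looks like the reference data, while a score near 1 means the feature's distribution has moved almost completely.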
In this exercise we will use the bike sharing dataset. We will load the data first.
import io
import zipfile

import pandas as pd
import requests

content = requests.get("https://archive.ics.uci.edu/ml/machine-learning-databases/00275/Bike-Sharing-Dataset.zip").content
with zipfile.ZipFile(io.BytesIO(content)) as arc:
    raw_data = pd.read_csv(arc.open("day.csv"), header=0, sep=',', parse_dates=['dteday'], index_col='dteday')
#observe data structure
raw_data.head()
The same loading code appears in the train.py file.
Make sure you have a Heroku account. You also need an active GitHub account.
The GitHub repo has the following files:
1) requirements.txt
2) train.py
3) Procfile (for Heroku)
4) runtime.txt
The steps required to deploy the repo properly are as follows:
1) Make sure all requirements for the mlflow and evidently libraries are met.
2) Load the data in the train.py script.
3) Define the column mapping.
4) Add the logic to log the metrics.
5) Define the comparison windows:
experiment_batches = [
('2011-02-01 00:00:00','2011-02-28 23:00:00'),
('2011-03-01 00:00:00','2011-03-31 23:00:00'),
('2011-04-01 00:00:00','2011-04-30 23:00:00'),
('2011-05-01 00:00:00','2011-05-31 23:00:00'),
('2011-06-01 00:00:00','2011-06-30 23:00:00'),
('2011-07-01 00:00:00','2011-07-31 23:00:00'),
]
Also set the reference dates for the experiments:
reference_dates = ('2011-01-01 00:00:00','2011-01-28 23:00:00')
6) Run and log the experiments in MLflow.
7) View the results in the MLflow web UI.
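Step 5 above can be sketched as follows: each window is an inclusive (start, end) pair, and every experiment batch is compared against the rows from the reference window. This is a minimal standard-library sketch. The helper `select_window` and the generated `rows` are hypothetical stand-ins for the real daily data, and only the first two batches are shown.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

reference_dates = ("2011-01-01 00:00:00", "2011-01-28 23:00:00")
experiment_batches = [
    ("2011-02-01 00:00:00", "2011-02-28 23:00:00"),
    ("2011-03-01 00:00:00", "2011-03-31 23:00:00"),
]


def select_window(rows, window):
    """Return the rows whose timestamp lies inside the inclusive window."""
    start, end = (datetime.strptime(bound, FMT) for bound in window)
    return [row for row in rows if start <= row[0] <= end]


# Hypothetical stand-in for the daily data: (date, rental count) pairs
# covering 2011-01-01 through 2011-03-31.
rows = [(datetime(2011, 1, 1) + timedelta(days=i), 100 + i) for i in range(90)]

reference = select_window(rows, reference_dates)
batches = [select_window(rows, window) for window in experiment_batches]

print(len(reference), [len(batch) for batch in batches])
```

With the date-indexed DataFrame from the loading step, `raw_data.loc[start:end]` gives the same inclusive selection for each window.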
Q1) Which command must the Procfile contain to start the Python training script in this exercise?
a) python main.py
b) python train.py
c) python ab.py
d) None of these
Answer b)
Q2) Which Procfile line starts the MLflow UI on Heroku together with the training script?
a) python main.py --host 0.0.0.0 --port ${PORT}
b) python main.py & mlflow --host 0.0.0.0
c) web: python train.py & mlflow ui --host 0.0.0.0 --port ${PORT}
d) None of these
Answer c)
Update the Procfile with this answer where ##inserthere## appears, and save the file.
Q3) What additional entry should be added to requirements.txt so that the MLflow UI is available?
a) mlflow ui
b) ui
c) mlflow
d) None of these
Answer c) Update requirements.txt in the GitHub repo accordingly and save it.
Q4) Let's focus on the train.py file. It is missing an essential import for MLflow. Which one is it?
a) import mlflow
b) import ui
c) None of these
Answer a) Add it at the ##insert here## marker in the import section and save the file.
Q5) In the train.py script, how do we create the MLflow client?
a) c = mlclient()
b) client = MlflowClient()
c) None of these
Answer b) Add the code in train.py at the ##insertHere## marker, where logging in to the MLflow client is mentioned, and save the file.
Q6) In the train.py script, which code sets the active MLflow experiment?
a) mlflow.experiment
b) mlflow.start
c) mlflow.set_experiment('Experiment Name')
d) None of these
Answer c) Update train.py in the set experiment section, where ##inserthere## appears, and save the file.
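Taken together, the answers to Q4 through Q6 could be combined roughly as in the sketch below. It is an illustration under stated assumptions, not the contents of the repo's train.py: the helper names `run_name` and `log_drift_runs`, the experiment name "Data Drift Evaluation", and the shape of `drift_by_window` (a dict mapping each (start, end) window to per-feature drift scores) are all hypothetical. Only standard MLflow calls are used: `set_experiment`, `start_run`, `log_param`, `log_metric`, and `MlflowClient.get_experiment_by_name`.

```python
def run_name(window):
    """Name a run after the dates of its (start, end) comparison window."""
    start, end = window
    return f"{start[:10]}_{end[:10]}"


def log_drift_runs(drift_by_window, experiment_name="Data Drift Evaluation"):
    """Log one MLflow run per comparison window and return the experiment id."""
    # Imported here so run_name() stays usable without mlflow installed.
    import mlflow
    from mlflow.tracking import MlflowClient

    client = MlflowClient()
    mlflow.set_experiment(experiment_name)  # creates the experiment if missing
    for window, scores in drift_by_window.items():
        with mlflow.start_run(run_name=run_name(window)):
            mlflow.log_param("begin", window[0])
            mlflow.log_param("end", window[1])
            # One metric per feature, keyed by the feature name.
            for feature, score in scores.items():
                mlflow.log_metric(feature, score)
    return client.get_experiment_by_name(experiment_name).experiment_id
```

Calling `log_drift_runs` with one entry per comparison window should produce one run per window, which can then be compared side by side in the MLflow UI.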
## Solution
The solution file is hosted on GitHub: [mlflowsolution](https://github.com/AbhiLegend/mlflowsolution)
Log in to Heroku using the free account. As shown before, create a new app and choose GitHub as the deployment method. Make sure the repo [mlflowsolution](https://github.com/AbhiLegend/mlflowsolution) is forked.
Search for the forked repo and add it. Uploading and hosting the files will take some time. You will then see the updated link. Go to the link and you will see the MLflow UI.