Diagonsenfix

How to diagnose and fix problems in production-deployed code

The important points

To understand how a deployed model behaves, find its problems, and fix them, three things matter most:
1. Plan for periodic retraining. We have to settle on an optimal retraining strategy.
2. Monitor actual performance. We should keep an eye on how the production model is performing.
3. Analyse. Detailed visibility into the model helps explain its performance and guides the fixes.

Summary

In this section we will explore ways to diagnose problems in production-deployed code and fix them.
1. We will use the evidently and mlflow libraries together.
2. We will deploy ML models on Heroku.
First we will calculate data drift for the model.
To diagnose the problems, the ML training experiments that compute data drift are tracked with MLflow Tracking.
To fix the problems, we will explore the results using the MLflow UI.
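
Data drift means that a feature's distribution in a recent window differs from its distribution in a reference window. The sketch below illustrates the idea with a hand-written two-sample Kolmogorov-Smirnov statistic using only the standard library. It is not Evidently's API, and the function name and sample values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical CDFs of the two samples (0 = identical, 1 = fully separated)."""
    ref = sorted(reference)
    cur = sorted(current)
    if not ref or not cur:
        raise ValueError("both samples must be non-empty")
    # The gap between two step functions can only peak at an observed value.
    points = sorted(set(ref) | set(cur))
    return max(
        abs(bisect_right(ref, x) / len(ref) - bisect_right(cur, x) / len(cur))
        for x in points
    )


# Hypothetical temperature readings: a reference window and two later windows.
reference_temp = [0.20, 0.22, 0.25, 0.27, 0.30]
similar_temp = [0.21, 0.23, 0.24, 0.28, 0.29]
shifted_temp = [0.60, 0.62, 0.65, 0.70, 0.72]

print(ks_statistic(reference_temp, similar_temp))  # small gap: little drift
print(ks_statistic(reference_temp, shifted_temp))  # 1.0: the samples do not overlap
```

A value close to 0 suggests the feature looks the same in both windows, while a value close to 1 suggests it has drifted. A drift library would normally turn such a statistic into a p-value per feature.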

Exercise: Data Drift/Mlflow

In this exercise we will use the bike sharing dataset. We will load the data first.

import io
import zipfile
import pandas as pd
import requests

content = requests.get("https://archive.ics.uci.edu/ml/machine-learning-databases/00275/Bike-Sharing-Dataset.zip").content
with zipfile.ZipFile(io.BytesIO(content)) as arc:
    raw_data = pd.read_csv(arc.open("day.csv"), header=0, sep=',', parse_dates=['dteday'], index_col='dteday')
# Observe the data structure
raw_data.head()

The same data-loading code appears in the train.py file.

Environment setup

Make sure you have a Heroku account. You also need an active GitHub account.
The GitHub repo has the following files:
1) requirements.txt
2) train.py
3) Procfile (for Heroku)
4) runtime.txt

Steps

The steps required to deploy the repo properly are as follows:
1) Make sure all requirements for the mlflow and evidently libraries are met.
2) Load the data in the train.py script.
3) Define the column mapping.
4) Add the logic to log the metrics.
5) Define the comparison windows.

experiment_batches = [
    ('2011-02-01 00:00:00','2011-02-28 23:00:00'),
    ('2011-03-01 00:00:00','2011-03-31 23:00:00'),
    ('2011-04-01 00:00:00','2011-04-30 23:00:00'),
    ('2011-05-01 00:00:00','2011-05-31 23:00:00'),  
    ('2011-06-01 00:00:00','2011-06-30 23:00:00'), 
    ('2011-07-01 00:00:00','2011-07-31 23:00:00'), 
]

Also define the reference dates for the experiments:

reference_dates = ('2011-01-01 00:00:00','2011-01-28 23:00:00')
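
As a minimal sketch of how such windows can be cut out of a date-indexed DataFrame, the example below slices a synthetic frame by label. The synthetic `data` frame and its random `temp` column are stand-ins for `raw_data`, introduced only so the example runs without a download.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for raw_data: one row per day, indexed by date.
index = pd.date_range("2011-01-01", "2011-07-31", freq="D", name="dteday")
rng = np.random.default_rng(0)
data = pd.DataFrame({"temp": rng.random(len(index))}, index=index)

reference_dates = ('2011-01-01 00:00:00', '2011-01-28 23:00:00')
experiment_batches = [
    ('2011-02-01 00:00:00', '2011-02-28 23:00:00'),
    ('2011-03-01 00:00:00', '2011-03-31 23:00:00'),
]

# Label-based slicing on a sorted DatetimeIndex includes both end points.
reference = data.loc[reference_dates[0]:reference_dates[1]]
batches = [data.loc[begin:end] for begin, end in experiment_batches]

print(len(reference), [len(batch) for batch in batches])
```

Each batch can then be compared against the same reference window, so drift is always measured relative to one fixed baseline.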

6) Run and log the experiments in MLflow.
7) View the results in the MLflow web UI.

Exercise questions

Q1) To start the training on Heroku, which command must the Procfile provide to run the Python script in this exercise?
a) python main.py
b) python train.py
c) python ab.py
d) None of these
Answer: b)

Q2) To start MLflow on Heroku, which command is passed together with the training script?
a) python main.py --host 0.0.0.0 --port ${PORT}
b) python main.py & mlflow --host 0.0.0.0
c) web: python train.py & mlflow ui --host 0.0.0.0 --port ${PORT}
d) None of these
Answer: c)
Update the answer in the Procfile where ##inserthere## is mentioned and save the file.

Q3) What additional entry should be added to the requirements.txt file so that the MLflow UI is available?
a) mlflow ui
b) ui
c) mlflow
d) None of these
Answer: c) Update the requirements.txt file in the GitHub repo accordingly and save it.

Q4) Let's focus on the train.py file. It is missing an essential import for MLflow. Which one is it?
a) import mlflow
b) import ui
c) None of these
Answer: a) Add it at the ##insert here## marker in the import section and save the file.

Q5) In the train.py script, how do we create the MLflow client?
a) c = mlclient()
b) client = MlflowClient()
c) None of these
Answer: b) Add the code to train.py at the ##insertHere## marker where logging in to the MLflow client is mentioned, and save it.

Q6) In the train.py script, what is the code to set the MLflow experiment?
a) mlflow.experiment
b) mlflow.start
c) mlflow.set_experiment('Experiment Name')
d) None of these
Answer: c) Update the code in train.py under the set experiment section where ##inserthere## is mentioned and save the file.

Solution

The solution file is hosted on GitHub: https://github.com/AbhiLegend/mlflowsolution
Log in to Heroku using the free account. As shown before, create a new app and choose GitHub as the deployment method. Make sure the repo https://github.com/AbhiLegend/mlflowsolution is forked.
Search for the forked repo and add it. Uploading and hosting the files will take some time. You will then see the updated link. Go to the link and you will see the MLflow UI.
