lopezco / ml-flask-api

A simple template of a Python API (web-service) for real-time Machine Learning predictions, using scikit-learn-like models, Flask and Docker.

Home Page: https://lopezco.github.io/ml-flask-api

License: MIT License

Topics: flask-api, docker, interpretable-machine-learning, prediction-api, sklearn-compatible

ml-flask-api's Introduction

Flask template for Machine Learning model deployment

A simple example of a Python web service for real-time machine learning model deployment. It is based on this post.

This includes Docker integration and SHAP explanations for the deployed model.

Website | Source | Issues

Installation

Requirements

Before using

Make sure that you have a model file in the main directory. To create a quick classification model for the example, run:

$ python ./example/build_linear_binary.py

or run one of the other scripts in the ./example folder.

Configuration

  • variables.env: Controls API parameters via environment variables

  • requirements.txt: Controls Python packages installed inside the container

  • model.joblib: Model saved inside a dictionary with the following format:

    {
        "model": trained_model,
        "metadata": {"features": [
            {"name": "feature1", "type": "numeric", "accepts_missing": True},
            {"name": "feature2", "type": "numeric", "default": -1, "accepts_missing": False},
            {"name": "feature3", "type": "category", "categories": ["A", "B"], "accepts_missing": True}]}
    }
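
For reference, here is a minimal sketch of building and saving a compatible model.joblib. The data and the classifier are illustrative; see the scripts in the ./example folder for the builders actually shipped with the template.

import joblib
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Illustrative training data: three numeric features, binary target
X = pd.DataFrame({"feature1": [1., 2., 5., 3.],
                  "feature2": [2., 2., 5., 1.],
                  "feature3": [3., 1., 4., 2.]})
y = [0, 0, 1, 1]

model = RandomForestClassifier(n_estimators=10).fit(X, y)

# Wrap the trained model and its feature metadata in the expected format
to_save = {
    "model": model,
    "metadata": {"features": [
        {"name": "feature1", "type": "numeric", "default": -1},
        {"name": "feature2", "type": "numeric", "default": -1},
        {"name": "feature3", "type": "numeric", "default": -1}]},
}
joblib.dump(to_save, "./model.joblib")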

Run the service

On Docker

Build the image (this has to be done every time the code or the model changes)

$ docker-compose build

Create and run the container

$ docker-compose up

On local Python environment

Create the environment

$ conda create -n flask_ml python=3
$ conda activate flask_ml

Install requirements

$ pip install -r ./requirements-service.txt  
$ pip install -r ./requirements.txt  

Run the API service

$ python service.py  

Usage of the API

This example assumes that the API was launched locally, without Docker, with the default parameters (localhost, port 5000), and that it is serving the example model.

For the /predict endpoint, the JSON payload of the request can take two forms:

  1. In the first, the payload is a record or a list of records with one value per feature. This is interpreted directly as the input for the model.

  2. In the second, the payload is a dictionary with one or two elements. The key _data is mandatory: it holds the input for the model, again as a record or a list of records. The optional key _samples holds background records used to obtain different explanations.

If _samples is not given, the explanations returned are the raw output of the trees, which varies by model (for binary classification in XGBoost this is the log-odds ratio). If _samples is given, the explanations are the output of the model transformed into probability space (note that this means the SHAP values then sum to the probability output of the model). See the SHAP documentation for details.
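
For illustration, here is a sketch of the second payload form using Python's requests library (the records and background samples are made up):

import requests

# `_data` holds the records to score; `_samples` holds background records
# used by SHAP to produce probability-space explanations.
payload = {
    "_data": [{"feature1": 1, "feature2": 1, "feature3": 2}],
    "_samples": [{"feature1": 0, "feature2": 2, "feature3": 1},
                 {"feature1": 5, "feature2": 5, "feature3": 4}],
}
response = requests.post("http://localhost:5000/predict",
                         params={"proba": 1, "explain": 1},
                         json=payload)
print(response.json())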

Check the API's health status

Endpoint: /health

$ curl -X GET http://localhost:5000/health
up

Is the model ready?

Endpoint: /ready

$ curl -X GET http://localhost:5000/ready
ready

Get information about the service

Endpoint: /service-info

$ curl -X GET http://localhost:5000/service-info
{
  "debug": true,
  "running-since": 1563355369.6482198,
  "serving-model-name": "model.joblib",
  "serving-model-type": "SKLEARN_MODEL",
  "version-template": "2.2.0"
}

Get information about the model

Endpoint: /info

$ curl -X GET http://localhost:5000/info
{
  "metadata": {
    "features": [
      {
        "default": -1,
        "importance": 0.2,
        "name": "feature1",
        "type": "numeric"
      },
      {
        "default": -1,
        "importance": 0.1,
        "name": "feature2",
        "type": "numeric"
      },
      {
        "default": -1,
        "importance": 0.3,
        "name": "feature3",
        "type": "numeric"
      }
    ]
  },
  "model": {
    "type": "<class 'sklearn.ensemble.forest.RandomForestClassifier'>",
    "predictor_type": "<class 'sklearn.ensemble.forest.RandomForestClassifier'>",
    "is_explainable": false,
    "task": "BINARY_CLASSIFICATION",
    "class_names": ["0", "1"]
  }
}

Compute predictions

Endpoint: /predict

$ curl -d '[{"feature1": 1, "feature2": 1, "feature3": 2}, {"feature1": 1, "feature2": 1, "feature3": 2}]' -H "Content-Type: application/json" -X POST http://localhost:5000/predict
{
  "prediction": [0, 0]
}

Predict probabilities

Endpoint: /predict?proba=1

$ curl -d '{"feature1": 1, "feature2": 1, "feature3": 2}' -H "Content-Type: application/json" -X POST "http://localhost:5000/predict?proba=1"
{
  "prediction": [{
    "0": 0.8,
    "1": 0.2
  }]
}

Get the model's features with their importances

Endpoint: /features

$ curl -X GET "http://localhost:5000/features"
[
  {
    "default": -1,
    "importance": 0.2,
    "name": "feature1",
    "type": "numeric"
  },
  {
    "default": -1,
    "importance": 0.1,
    "name": "feature2",
    "type": "numeric"
  },
  {
    "default": -1,
    "importance": 0.3,
    "name": "feature3",
    "type": "numeric"
  }
]

Get SHAP explanations

Endpoint: /predict?proba=1&explain=1

$ curl -d '{"feature1": 1, "feature2": 1, "feature3": 2}' -H "Content-Type: application/json" -X POST "http://localhost:5000/predict?proba=1&explain=1"
{
  "explanation": {
    "feature1": 0.10000000149011613,
    "feature2": 0.03333333383003871,
    "feature3": -0.1666666691501935
  },
  "prediction": [{
    "0": 0.7,
    "1": 0.3
  }]
}

ml-flask-api's People

Contributors: amirpupko, lopezco

ml-flask-api's Issues

Add a categorical variable example model and document it in the README

Add a LightGBM model example with categorical variables.
Ex:

from lightgbm import LGBMClassifier
import pandas as pd
import joblib

# Toy training frame with a categorical feature
df = pd.DataFrame({"feature1": [1., 1., 5.], "feature2": [2., 2., 5.], "feature3": ["B", "B", "A"]})
df['feature3'] = df['feature3'].astype('category')

model_path = './model.joblib'

model = LGBMClassifier()
model.fit(df, [0, 0, 1])

# Save the model together with its feature metadata
to_save = dict(model=model,
               metadata={"features": [
                   {"name": "feature1", "type": "numeric"},
                   {"name": "feature2", "type": "numeric", "default": -1},
                   {"name": "feature3", "type": "category", "categories": ["A", "B"]}]})

with open(model_path, 'wb') as fo:
    joblib.dump(to_save, fo)

Add something like the following to the README:

{
    "model": trained_model,
    "metadata": {"features": [
        {"name": "feature1", "type": "numeric"},
        {"name": "feature2", "type": "numeric", "default": -1},
        {"name": "feature3", "type": "category", "categories": ["A", "B"]}]}
}

Add support for LightGBM's native Data Structure API

Training LightGBM models on weighted datasets requires the native Data Structure API. The template, however, implements only the scikit-learn API (LGBMClassifier and LGBMRegressor), not the native API (lightgbm.Dataset and lightgbm.Booster). Maybe there is an analogous solution for weighting pandas DataFrames. If not, support for LightGBM's Data Structure API would make it possible to weight datasets prior to model training.
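
For context, here is a sketch contrasting the two APIs on a weighted dataset (note that the scikit-learn wrapper also accepts per-row weights via sample_weight, which may already cover this use case):

import lightgbm as lgb
import numpy as np
import pandas as pd

X = pd.DataFrame({"feature1": [1., 1., 5.], "feature2": [2., 2., 5.]})
y = np.array([0, 0, 1])
w = np.array([1.0, 0.5, 2.0])  # per-row weights

# Native Data Structure API: weights attach to the Dataset itself
dtrain = lgb.Dataset(X, label=y, weight=w)
booster = lgb.train({"objective": "binary", "min_data_in_leaf": 1},
                    dtrain, num_boost_round=10)

# Scikit-learn API: the same weighting through fit(sample_weight=...)
clf = lgb.LGBMClassifier(n_estimators=10, min_child_samples=1)
clf.fit(X, y, sample_weight=w)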

No error message if unknown categories are fed for predictions

If unknown categories are fed for predictions, the API will treat them as missing values.
This may be unwanted behavior.

Ex:

import pandas as pd
from pandas.api.types import CategoricalDtype
var_type = CategoricalDtype(categories=[1, 2], ordered=True)
df = pd.DataFrame({'A': [3, 2, 3], 'B': [2, 2, 1]})
df
Out:
   A  B
0  3  2
1  2  2
2  3  1
df['A'].astype(var_type)
Out:
0    NaN
1      2
2    NaN
Name: A, dtype: category
Categories (2, int64): [1 < 2]

Two things could be done:

  • Leave it as-is but warn the user
  • Throw an error (see the sketch below)
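
A minimal sketch of the second option (check_categories is a hypothetical helper, not part of the template), to be run against the declared feature metadata before casting:

import pandas as pd

def check_categories(df, features):
    # Reject any value that is not among the declared categories.
    for feature in features:
        if feature.get('type') != 'category':
            continue
        allowed = set(feature['categories'])
        unknown = set(df[feature['name']].dropna()) - allowed
        if unknown:
            raise ValueError('Unknown categories {} for feature {!r}; '
                             'expected one of {}'.format(
                                 sorted(unknown), feature['name'], sorted(allowed)))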

Helper function to extract a basic version of the metadata from a pandas DataFrame.

Add a helper function to extract a basic version of the metadata from a pandas DataFrame.
Ex:

from pandas.api.types import is_string_dtype, is_numeric_dtype, is_categorical_dtype

def metadata_from_dataframe(df):
    """Build a basic feature-metadata list from a DataFrame's dtypes."""
    metadata = []
    for c in df.columns:
        # Categorical is checked first so category columns are not mis-typed
        if is_categorical_dtype(df[c]):
            metadata.append({
                'name': c,
                'type': 'category',
                'categories': sorted(df[c].dtype.categories.values.tolist())})
        elif is_numeric_dtype(df[c]):
            metadata.append({
                'name': c,
                'type': 'numeric'})
        elif is_string_dtype(df[c]):
            metadata.append({
                'name': c,
                'type': 'string'})
        else:
            raise ValueError('Unknown type for {}'.format(c))
    return metadata
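
For instance, checking the helper on the DataFrame from the LightGBM example above:

import pandas as pd

df = pd.DataFrame({"feature1": [1., 1., 5.],
                   "feature2": [2., 2., 5.],
                   "feature3": pd.Categorical(["B", "B", "A"])})
metadata_from_dataframe(df)
Out:
[{'name': 'feature1', 'type': 'numeric'},
 {'name': 'feature2', 'type': 'numeric'},
 {'name': 'feature3', 'type': 'category', 'categories': ['A', 'B']}]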
