
⚡ MLOps.NET

Join the chat at https://gitter.im/aslotte/mlops.net

MLOps.NET is a data science tool to track and manage the lifecycle of a ML.NET machine learning model.

  • Experiment tracking (SQLite, SQLServer, CosmosDb)
    • Experiments
    • Runs
    • Training time
    • Evaluation metrics
    • Hyperparameters
  • Data tracking
    • Data schema
    • Data quantity
    • Data hash
    • Data distribution
  • Model repository (Azure Blob Storage, AWS S3, local)
    • Run artifacts
    • Versioned registered models
  • Model deployment (Azure Blob Storage, AWS S3, local)
    • URI based deployment
    • Containerized deployments to a Kubernetes cluster
    • Manual deployment (in roadmap)

A client application to visualize and manage the ML lifecycle is on the roadmap.

Getting started

MLOps.NET revolves around an MLOpsContext. The MLOpsContext contains catalogs, e.g. LifeCycle, Data, Training, Evaluation and Deployment, that expose the operations you need to manage your model's lifecycle.

To create an MLOpsContext, use the MLOpsBuilder with your desired configuration. You can mix and match the location of your model repository and metadata store as you please.

Azure with CosmosDb

  IMLOpsContext mlOpsContext = new MLOpsBuilder()
    .UseCosmosDb("accountEndPoint", "accountKey")
    .UseAzureBlobModelRepository("connectionString")
    .Build();

SQL Server with Local model repository

  IMLOpsContext mlOpsContext = new MLOpsBuilder()
    .UseSQLServer("connectionString")
    .UseLocalFileModelRepository()
    .Build();

AWS with SQLite

  IMLOpsContext mlOpsContext = new MLOpsBuilder()
    .UseSQLite()
    .UseAWSS3ModelRepository("awsAccessKey", "awsSecretAccessKey", "regionName")
    .Build();

With a Container Registry and a Kubernetes Cluster

  IMLOpsContext mlOpsContext = new MLOpsBuilder()
    .UseLocalFileModelRepository()
    .UseSQLite()
    .UseContainerRegistry("RegistryName", "UserName", "Password")
    .UseKubernetes("kubeconfigPathOrContent")
    .Build();
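
If you host your application in ASP.NET Core, one option is to build the MLOpsContext once and register it with the dependency injection container so the same instance is shared. This is only a sketch, assuming an IServiceCollection named services; the builder configuration reuses the SQLite/local example from above.

  // Hypothetical DI registration (services is an IServiceCollection)
  services.AddSingleton<IMLOpsContext>(_ => new MLOpsBuilder()
    .UseSQLite()
    .UseLocalFileModelRepository()
    .Build());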

Experiment tracking

To manage the lifecycle of a model, we'll need to track things such as the model's evaluation metrics, hyper-parameters used during training and so forth. We organize this under the concept of experiments and runs. An experiment is the logical grouping of a model we are trying to develop, e.g. a fraud classifier or recommendation engine. For a given experiment, we can create a number of runs. Each run represents one attempt to train a given model, which is associated with the run conditions and evaluation metrics achieved.

To create an Experiment and a Run, access the LifeCycle catalog on the MLOpsContext

  var experimentId = await mlOpsContext.LifeCycle.CreateExperimentAsync();

  var run = await mlOpsContext.LifeCycle.CreateRunAsync(experimentId, "{optional Git SHA}");

For simplicity, you can also create an experiment (if it does not yet exist) and a run in one line

  var run = await mlOpsContext.LifeCycle.CreateRunAsync(experimentName: "FraudClassifier", "{optional Git SHA}");

With an Experiment and a Run created, we can track the model training process.

Hyperparameters

You can access the operations necessary to track hyperparameters on the Training catalog. You can either track individual hyperparameters, such as the number of epochs, as follows:

  await mlOpsContext.Training.LogHyperParameterAsync(runId, "NumberOfEpochs", epochs);

Alternatively, you can pass in the entire appended trainer, and MLOps.NET will automatically log all of the trainer's hyperparameters for you:

  await mlOpsContext.Training.LogHyperParameterAsync<SdcaLogisticRegressionBinaryTrainer>(runId, trainer);
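
For context, the trainer instance passed in above is just the ML.NET trainer you append to your training pipeline. A minimal sketch, assuming an ML.NET MLContext and placeholder column names (Microsoft.ML and Microsoft.ML.Trainers namespaces):

  var mlContext = new MLContext(seed: 0);

  // The appended trainer; its hyperparameters are logged in one call below
  var trainer = mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(
    labelColumnName: "Label",
    featureColumnName: "Features");

  await mlOpsContext.Training.LogHyperParameterAsync<SdcaLogisticRegressionBinaryTrainer>(run.RunId, trainer);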

Evaluation metrics

You can access the operations necessary to track evaluation metrics on the Evaluation catalog. Similarly to tracking hyperparameters, you can either log individual evaluation metrics as follows:

  await mlOpsContext.Evaluation.LogMetricAsync(runId, "F1Score", 0.99d);

Alternatively, you can pass the entire ML.NET evaluation metric result and MLOps.NET will log all related evaluation metrics for you automatically.

  await mlOpsContext.Evaluation.LogMetricsAsync<CalibratedBinaryClassificationMetrics>(runId, metric);
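
The metric object above is simply the result of ML.NET's own evaluation call. A minimal sketch, assuming a trained model (trainedModel), a test IDataView (testDataView) and the MLContext from your training code:

  // Score the test set and evaluate it with ML.NET
  IDataView predictions = trainedModel.Transform(testDataView);

  CalibratedBinaryClassificationMetrics metric =
    mlContext.BinaryClassification.Evaluate(predictions, labelColumnName: "Label");

  // Logs Accuracy, F1Score, AreaUnderRocCurve, LogLoss, etc. in one call
  await mlOpsContext.Evaluation.LogMetricsAsync<CalibratedBinaryClassificationMetrics>(run.RunId, metric);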

Data tracking

There are a number of useful methods on the Data catalog to track the data used for training. This gives you an audit trail to understand what data was used to train a specific model, what the data looked like and whether it changed between models.

To log the data schema and the data hash (to be used to compare data for two different models), you can use the LogDataAsync method

  await mlOpsContext.Data.LogDataAsync(runId, dataView);

To log the distribution of a given column, e.g. how many rows in a given dataset are positive and how many are negative, use the LogDataDistribution method

  await mlOpsContext.Data.LogDataDistribution<bool>(run.RunId, dataView, nameof(Review.Sentiment));
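
The dataView in the calls above is a regular ML.NET IDataView, and Review stands for whatever input class your data maps to. A hypothetical example of what that could look like (the class and file name are placeholders, not part of MLOps.NET):

  // Hypothetical input class matching the nameof(Review.Sentiment) call above
  public class Review
  {
    [LoadColumn(0)]
    public bool Sentiment { get; set; }

    [LoadColumn(1)]
    public string Text { get; set; }
  }

  // Load the training data as an IDataView and hand it to the Data catalog
  IDataView dataView = mlContext.Data.LoadFromTextFile<Review>("reviews.csv", hasHeader: true, separatorChar: ',');

  await mlOpsContext.Data.LogDataAsync(run.RunId, dataView);
  await mlOpsContext.Data.LogDataDistribution<bool>(run.RunId, dataView, nameof(Review.Sentiment));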

Model repository

The end product of any model development effort is the actual model itself. MLOps.NET offers the ability to store your model either in a storage account in Azure, an S3 bucket in AWS or locally on a fileshare of your choosing.

To upload a model from a run

  var runArtifact = await mlOpsContext.Model.UploadAsync(runId, "pathToModel");

To register a model for deployment

  var registeredModel = await mlOpsContext.Model.RegisterModel(experimentId, runArtifact.RunArtifactId, registeredBy: "John Doe", description: "Altered weights");

Model deployment

Once a model has been registered, it's possible to deploy it to a given deployment target. A deployment target can be thought of as a specific environment from which you can serve your model, e.g. Test, Stage and Production. MLOps.NET currently supports deploying the model to a URI so that an ASP.NET Core application can consume it, or to a Kubernetes cluster so that the model can be consumed through a RESTful endpoint.

Methods to deploy a model can be found on the Deployment catalog. To deploy a model, start by creating a deployment target:

  var deploymentTarget = await mlOpsContext.Deployment.CreateDeploymentTargetAsync(deploymentTargetName: "Test", isProduction: false);

Deploy a model to a URI

Given a deployment target and a registered model, you can then deploy the model to a URI

  var deployment = await mlOpsContext.Deployment.DeployModelToUriAsync(deploymentTarget, registeredModel, deployedBy: "John Doe");

The model is deployed to deployment.DeploymentUri, which can be used by a consuming application. It's also possible to get the URI/path to the deployed model as follows:

  var deployment = mlOpsContext.Deployment.GetDeployments()
    .FirstOrDefault(x => x.DeploymentTarget.Name == "Test");

  var deploymentUri = await mlOpsContext.Deployment.GetDeploymentUri(deployment);

Deploying a model for an experiment to a given deployment target, e.g. Test, automatically overwrites the existing model, so the consuming application does not need to update its URI/path to the model it's consuming. ML.NET can automatically poll for changes to the file, making the hand-off seamless and allowing the consuming application and the ML.NET model to have different release cycles.
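
As an illustration of the consuming side, an ASP.NET Core application can load the deployed model with ML.NET's PredictionEnginePool (Microsoft.Extensions.ML package) and let it watch the file for changes. This is only a sketch, assuming a file-based model repository, your own ModelInput/ModelOutput classes and the deploymentUri obtained above; the model name is a hypothetical label:

  // Reloads the model automatically whenever MLOps.NET overwrites the deployed file
  services.AddPredictionEnginePool<ModelInput, ModelOutput>()
    .FromFile(modelName: "FraudClassifier", filePath: deploymentUri, watchForChanges: true);

If the model is exposed over HTTP rather than a file share, the FromUri overloads of the same builder can poll the URI on an interval instead.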

Deploy a model to Kubernetes

To deploy a model to Kubernetes, you'll need to configure a Container Registry and a Kubernetes cluster via the MLOpsBuilder. MLOps.NET is agnostic of cloud provider, so your container registry can live locally or in the cloud (private or public). You are free to host your Kubernetes cluster in Azure, AWS or elsewhere; the tool simply finds it using the provided kubeconfig. Note that the UseKubernetes method takes either the absolute path to the kubeconfig or the content of the kubeconfig itself, which can be useful if you are configuring it via a CI pipeline.

  IMLOpsContext mlOpsContext = new MLOpsBuilder()
    .UseLocalFileModelRepository()
    .UseSQLite()
    .UseContainerRegistry("RegistryName", "UserName", "Password")
    .UseKubernetes("kubeconfigPathOrContent")
    .Build();

We can then deploy the model to the Kubernetes cluster

  var deployment = await mlOpsContext.Deployment.DeployModelToKubernetesAsync<ModelInput, ModelOutput>(deploymentTarget, registeredModel, "deployedBy");
  
  Console.WriteLine(deployment.DeploymentUri);
  // e.g. http://52.146.48.228/api/Prediction

If you don't know the ModelInput and ModelOutput at deployment time, you can register the model schema during the run

  await mlOpsContext.LifeCycle.RegisterModelSchema<ModelInput, ModelOutput>(run.RunId);

This simplifies the call at deployment time

  var deployment = await mlOpsContext.Deployment.DeployModelToKubernetesAsync(deploymentTarget, registeredModel, "deployedBy");
  
  Console.WriteLine(deployment.DeploymentUri);
  // e.g. http://52.146.48.228/api/Prediction
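
A consuming application can then call the containerized model over HTTP. The exact request contract depends on the generated prediction API, so the sketch below simply assumes the endpoint accepts a JSON-serialized ModelInput and returns a ModelOutput (System.Net.Http.Json):

  // Hypothetical client call against the deployed prediction endpoint
  using var client = new HttpClient();

  var response = await client.PostAsJsonAsync(deployment.DeploymentUri, new ModelInput { /* feature values */ });
  response.EnsureSuccessStatusCode();

  var prediction = await response.Content.ReadFromJsonAsync<ModelOutput>();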

Contribute

We welcome contributors! Before getting started, take a moment to read our contributing guidelines as well as the docs for new developers on how to set up your local environment.

Code of Conduct

Please take a moment to read our code of conduct.

Contributors

anoojnair, aslotte, brett-parker, coffeexcoin, dcostea, dependabot-preview[bot], dependabot[bot], gitter-badger, lqdev, memsranga, seankilleen, walternative, willvelida

mlops.net's Issues

Add PR Template

Should contain

  • Which issue it fixes (e.g. Fixes #1234)
  • Description of change

Tracking hyperparameters associated with a run.

Is your feature request related to a problem? Please describe.
Along with capturing the run and the corresponding metrics, we should capture run duration as well.
Also, is there a way to log the hyperparameters used to train a model the same way we log the metrics, using a generic method? Is there a method or property in ML.NET which returns a list of hyperparameters?

Describe the solution you'd like
Add two new properties on the Run class to capture run duration and hyperparameters

Describe alternatives you've considered
N/A

Additional context
N/A

Add support for SQLite

Registering models and metadata in the cloud is not always feasible.
To that end, we want to add a provider to store these things on-premises.

Add support to get best run for metrics that are zero-optimized

Is your feature request related to a problem? Please describe.
Our current IsBestRun method tries to find the best run for a given metric by finding the largest value. This works well for 99% or so of cases, but a metric such as log loss entropy should be closer to zero to be considered good.

Describe the solution you'd like
Add ability to define what a good run looks like for a given metric

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Add support to track the training time (run duration) for a model.

Is your feature request related to a problem? Please describe.
Along with capturing the run and the corresponding metrics, we should capture training time as well. Also refer to #76.

Describe the solution you'd like
Add a new property to capture training time.

Describe alternatives you've considered
N/A

Additional context
N/A

Add ability to associate run with git commit or comment

Is your feature request related to a problem? Please describe.
While out running (that's when my ideas come) I thought of Sammy's question about how we would know which code was used to train a specific model. Given that a training run would currently be triggered by e.g. a GitHub Action, or have no GitHub commit at all when run locally, we could easily lose track of which run/artifact belonged to which code commit, e.g. which model pipeline.

Describe the solution you'd like
A simple solution for us to start with is to add two columns on the run entity, both optional.

  • GitHub commit hash
  • Comments field

This would allow us to add the GitHub commit hash to the run, and, if running locally, a comment on e.g. what changes one is trying. Ideally we would then be able to link the web client with the repo in which the model was trained, so that you could click on a run and it would automatically bring up the Git commit (e.g. the PR) from which the model was trained.

Describe alternatives you've considered
Open for suggestions.

Add CI/CD pipeline for NuGet packaging and deployment

Background

We need to automate the build and deployment of our NuGet packages.
Each package needs to have a consistent build number.

We should probably run the dotnet pack command with versioning after each CI build (we can have different versions for PR builds vs master builds).

The deployment to nuget.org should most likely be a separate workflow file as it should be manually triggered once we have a release.

Add multiclass classification example

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
Add an example of multi-class classification

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Add workflow to publish packages to nuget.org

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
Add GitHub workflow to publish packages from a release branch to nuget.org

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Rename Methods Returning Task<T>

Some method names don't make it clear that they're async. For example, the RetrieveEntity method returns Task<TEntity>. To make it clearer that the method is async, consider renaming it to RetrieveEntityAsync. Do the same for any other methods that return Task<T>.

Add support to log an artifact during run

During a run we want to upload a model to a container in Azure.

I'm envisioning the structure to be a container named ModelRepository, with each model named after the unique GUID of the run.

Add regression example

We should add an example solution on how this SDK can be used. It should probably go hand-in-hand with a page for documentation as well.

Add ability to create experiment and run in one call

Is your feature request related to a problem? Please describe.
Currently, we need two calls to create a run: one to create an experiment and then one to create the run.

Describe the solution you'd like
It would be nice to have a method like:

public Task<Guid> CreateRunAsync(string experimentName) 
{
}

The method would create the experiment (if needed), then a run, and return the run id.

Describe alternatives you've considered
N/A

Add ability to get the best run in experiment based on a given metric

We need the ability to get the best run in an experiment based on a specific metric.
For example, let's say we value the F1 score for a given machine learning model we are training (contained in an experiment). Each time we train the model (run), we want to know if this model is better than a previously recorded model or not.

The reason we want this is so that we don't need to upload models we don't care about.

I'm thinking something like this

public void IsBestRun(Guid runId, string metricName) 
{
    // 1. Fetch the given metric value for that run
    // 2. Given the experiment that the run is part of, fetch the best run for that metric
    // 3. Compare if this is the best run or not
}

Add summary comments on public variables and methods for intellisense

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
In order to provide good IntelliSense for our NuGet package, we need to add good XML documentation for our public properties and methods.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Add support to log artifact on a file share or local path

Similarly to #3, we want the ability to upload models/artifacts to a local file share.

We can either make this the default when using SQLite, or we can create separate extension methods so a user can decide whether to store models locally (e.g. you may want to mix Azure for metadata and local storage for models)

Add created date to IExperiment

Is your feature request related to a problem? Please describe.
I think it would be useful to know when one first started running an experiment.

Describe the solution you'd like
Add a new property for CreatedDate on IExperiment and its associated implementations.
Note that this property should be immutable, e.g. when a new run is created for an experiment, this property should not be updated.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Add xmldoc inheritdoc to interface implementations

We should add /// <inheritdoc/> to all interface implementation methods/properties so that they pick up the base documentation in XML doc comments when they are referenced directly instead of through the interface.

CreateExperiment creates duplicates in Azure Storage

Describe the bug
When calling CreateExperiment, duplicate experiments are created if one with the same name already exists in Azure Table Storage

To Reproduce
Steps to reproduce the behavior:

  1. Call CreateExperiment two times in a row using Azure Storage

Expected behavior
I would expect that if an experiment with the same name already exists, no new experiment is created and the existing one's id is returned instead

Add an MLOpsBuilder to create a configured instance of MLLifeCycleManager

Is your feature request related to a problem? Please describe.
The initial setup of MLLifeCycleManager for small test scenarios seems problematic in that you are required to set up a backing store before use, or it throws an exception. It seems to me there should be a default implementation that requires zero configuration, perhaps an in-memory dictionary-based store.

Describe the solution you'd like
Add in-memory store implementations of IMetaDataStore and IModelRepository - these will be useful for testing various scenarios in which the user does not need to persist the data in a permanent storage mechanism. Make these the default implementation instead of throwing an exception via EnsureStorageProviderConfigured on every call to MLLifeCycleManager

Describe alternatives you've considered
Alternatively, the SQLite implementation could be used as the default, as it does not require any configuration; however, it persists to disk, which might be undesirable for some test scenarios.

Additional context
N/A

Sketch simple design for client UI

The purpose of this ticket is to draw a very simple sketch of how the UI for the web client could look.

I'm thinking the simplest possible.

We have the following layers:

  • Experiments, Runs, Metrics

In the future we want to register models and deploy them, but we can think of that UI later.

Add setup instructions on how to get started with the repo

Is your feature request related to a problem? Please describe.
To get started with the repo, we should provide instructions or scripts to install all dependencies needed to run unit/integration tests and build the project.

E.g.

  • Setup SQLite (script or instructions)
  • Setup Azure Storage Account (ARM Template)

Rename MLLifecycleManager to MLOpsContext

Is your feature request related to a problem? Please describe.
MLLifecycleManager is a mouthful. That's a problem.

Describe the solution you'd like
An easier name to understand and enunciate is ModelContext (and IModelContext).
As part of this issue, we would rename MLLifecycleManager to ModelContext

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

Add ability to run unit tests in CI pipeline

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
Run all unit tests in the solution.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.
