Awesome Software Engineering for Machine Learning

Software Engineering for Machine Learning comprises techniques and guidelines for building ML applications that do not concern the core ML problem (e.g., the development of new algorithms) but rather the surrounding activities, such as data ingestion, coding, testing, versioning, deployment, quality control, and team collaboration. Good software engineering practices enhance the development, deployment, and maintenance of production-level applications that use machine learning components.

⭐ Must-read

🎓 Scientific publication


Based on this literature, we compiled a survey on the adoption of software engineering practices for applications with machine learning components.

Feel free to take and share the survey and to read more!

Contents

Broad Overviews

These resources cover all aspects of software engineering for machine learning.

Data Management

How to manage the data sets you use in machine learning.

Model Training

How to organize your model training experiments.

Deployment and Operation

How to deploy and operate your models in a production environment.

Social Aspects

How to organize teams and projects to ensure effective collaboration and accountability.

Governance

Tooling

Tooling can make your life easier.

We only share open source tools, or commercial platforms that offer substantial free packages for research.

  • Airflow - Programmatically author, schedule and monitor workflows.
  • Archai - Neural architecture search.
  • Data Version Control (DVC) - DVC is a data and ML experiments management tool.
  • Facets Overview / Facets Dive - Robust visualizations to aid in understanding machine learning datasets.
  • FairLearn - A toolkit to assess and improve the fairness of machine learning models.
  • Git Large File System (LFS) - Replaces large files such as datasets with text pointers inside Git.
  • Great Expectations - Data validation and testing with integration in pipelines.
  • HParams - A thoughtful approach to configuration management for machine learning projects.
  • Kubeflow - A platform for data scientists who want to build and experiment with ML pipelines.
  • Label Studio - A multi-type data labeling and annotation tool with standardized output format.
  • LiFT - LinkedIn Fairness Toolkit.
  • MLflow - Manage the ML lifecycle, including experimentation, deployment, and a central model registry.
  • Model Card Toolkit - Streamlines and automates the generation of model cards; for model documentation.
  • Neptune.ai - Experiment tracking tool bringing organization and collaboration to data science projects.
  • Neuraxle - Sklearn-like framework for hyperparameter tuning and AutoML in deep learning projects.
  • OpenML - An inclusive movement to build an open, organized, online ecosystem for machine learning.
  • Robustness Metrics - Lightweight modules to evaluate the robustness of classification models.
  • Spark Machine Learning - Spark's ML library consisting of common learning algorithms and utilities.
  • TensorBoard - TensorFlow's Visualization Toolkit.
  • TensorFlow Extended (TFX) - An end-to-end platform for deploying production ML pipelines.
  • Weights & Biases - Experiment tracking, model optimization, and dataset versioning.
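
To make the "text pointers" idea behind Git LFS concrete, here is a minimal sketch of what such a pointer looks like. It builds the three-line pointer text (spec version, SHA-256 object ID, size in bytes) for a blob of data using only the Python standard library. The function name `lfs_pointer` and the sample data are hypothetical and used for illustration only; this is not the Git LFS client, which also handles uploading and fetching the real file.

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Build the small text pointer that stands in for `data` in a Git repository.

    Hypothetical helper for illustration; the real Git LFS client writes this
    pointer for you when a file matches a tracked pattern.
    """
    oid = hashlib.sha256(data).hexdigest()  # content hash identifies the object
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )


if __name__ == "__main__":
    # Hypothetical tiny dataset; a real one would be read from disk.
    dataset = b"feature,label\n0.1,0\n0.9,1\n"
    print(lfs_pointer(dataset), end="")
```

Git stores only this short pointer, so the repository stays small no matter how large the dataset grows.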

Contribute

Contributions welcomed! Read the contribution guidelines first.

Contributors

xserban, jstvssr, magielbruntink, guillaume-chevalier, nikenano, kvdblom, joaquinvanschoren, markhaakman, machawk1, illihcs
