ankursoni / kubernetes-operator-roiergasias

Roiergasias is a Go-based workflow engine that runs on the command line or on Kubernetes, with the help of a custom operator, to provide a quick and automated data pipeline for machine learning projects (a flavor of MLOps).

License: GNU General Public License v3.0

roí ergasías, as pronounced in Greek, means workflow.

This Kubernetes operator addresses a fundamental requirement of any data science or machine learning project that runs its pipelines on Kubernetes: quickly provisioning a declarative data pipeline on demand, using simple kubectl commands. In short, it implements the concept of No Ops.
The fundamental principle is to use the best of Docker, Kubernetes and programming language features to run a workflow with minimal workflow definition syntax.

Run "Hello world" workflow locally

# clone to a local git directory, if not already done so
git clone https://github.com/ankursoni/kubernetes-operator-roiergasias.git

# change to the local git directory
cd kubernetes-operator-roiergasias

# set execute permissions to roiergasias cli
chmod +x cmd/linux/roiergasias cmd/osx/roiergasias

# run the hello world workflow
./cmd/linux/roiergasias run -f ./examples/hello-world/hello-world.yaml
# or, for mac osx
./cmd/osx/roiergasias run -f ./examples/hello-world/hello-world.yaml

Notice that environment variables set globally, and those set in earlier steps, are available to subsequent steps:

[Figure: hello-world]
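
The propagation described above can be sketched in Go as follows. This is a minimal illustration, not the project's actual engine: the `Step` type, the `runSteps` function and the variable names `GREETING` and `NAME` are all hypothetical, and the sketch expands message templates instead of running real commands.

```go
package main

import (
	"fmt"
	"os"
)

// Step is a hypothetical stand-in for one workflow step: it may set
// environment variables and carries a message template to expand.
type Step struct {
	Name    string
	Env     map[string]string
	Message string
}

// runSteps expands each step's message against the variables accumulated so
// far: the global ones first, then those set by this and every earlier step.
func runSteps(global map[string]string, steps []Step) []string {
	env := make(map[string]string, len(global))
	for k, v := range global {
		env[k] = v
	}
	out := make([]string, 0, len(steps))
	for _, s := range steps {
		// Variables set by this step stay visible to all later steps.
		for k, v := range s.Env {
			env[k] = v
		}
		lookup := func(key string) string { return env[key] }
		out = append(out, s.Name+": "+os.Expand(s.Message, lookup))
	}
	return out
}

func main() {
	global := map[string]string{"GREETING": "Hello"}
	steps := []Step{
		{Name: "step-1", Env: map[string]string{"NAME": "world"}, Message: "$GREETING"},
		{Name: "step-2", Message: "$GREETING $NAME"},
	}
	for _, line := range runSteps(global, steps) {
		fmt.Println(line)
	}
	// step-2 prints "step-2: Hello world" although it sets no variables itself.
}
```

Keeping one accumulated map, rather than a map per step, is what makes a value set in an early step visible later without any extra wiring.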

Run "Hello world" workflow via operator in Kubernetes

- Install Helm

- For a local cluster, install Kubernetes using Docker Desktop or Minikube

# install roiergasias operator
helm install --repo https://github.com/ankursoni/kubernetes-operator-roiergasias/raw/main/operator/helm/ \
  --version v0.1.2 \
  roiergasias-operator roiergasias-operator

# explore the contents of hello-world-kubernetes.yaml file
cat examples/hello-world/hello-world-kubernetes.yaml

# apply the manifest
kubectl apply -f examples/hello-world/hello-world-kubernetes.yaml

# browse workflow created by the manifest
kubectl get workflow
# should display "roiergasias-demo"

# browse configmap created by the workflow
kubectl get configmap
# should display "roiergasias-demo-hello-world"

# browse job created by the workflow
kubectl get job
# should display "roiergasias-demo"

# browse pod created by the job
kubectl get pod
# should display "roiergasias-demo-<STRING>"

# check pod logs for the output and wait till it is completed
kubectl logs roiergasias-demo-<STRING_FROM_PREVIOUS_STEP>

# delete the manifest
kubectl delete -f examples/hello-world/hello-world-kubernetes.yaml

# uninstall the operator (optional)
helm uninstall roiergasias-operator

Notice that the workflow YAML file is provided to the pod as a volume named 'yaml', which the operator creates automatically from a generated config map:

[Figure: hello-world-kubernetes]
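
As a rough sketch of that relationship, the Go snippet below models the volume and its mount with simplified local types. Several things here are assumptions rather than the operator's real code: the `Volume` and `VolumeMount` types only mimic the Kubernetes pod spec fields of the same names, the naming rule (workflow name plus file stem) is inferred from the "roiergasias-demo-hello-world" config map shown earlier, and the mount path `/etc/roiergasias` is purely hypothetical.

```go
package main

import "fmt"

// Volume and VolumeMount are simplified, hypothetical stand-ins for the
// Kubernetes pod spec fields of the same names.
type Volume struct {
	Name          string // volume name inside the pod spec
	ConfigMapName string // config map that backs the volume
}

type VolumeMount struct {
	Name      string // must match the volume name
	MountPath string // where the workflow YAML appears in the container
}

// yamlVolumeFor pairs a config-map-backed volume named "yaml" with its mount.
// The config map naming rule is an assumption inferred from the example.
func yamlVolumeFor(workflowName, fileStem, mountPath string) (Volume, VolumeMount) {
	vol := Volume{Name: "yaml", ConfigMapName: workflowName + "-" + fileStem}
	return vol, VolumeMount{Name: vol.Name, MountPath: mountPath}
}

func main() {
	// "/etc/roiergasias" is a hypothetical mount path used for illustration.
	vol, mount := yamlVolumeFor("roiergasias-demo", "hello-world", "/etc/roiergasias")
	fmt.Printf("volume %q <- config map %q\n", vol.Name, vol.ConfigMapName)
	fmt.Printf("mount  %q at %s\n", mount.Name, mount.MountPath)
}
```

To see what the operator actually generated, inspect the pod from the steps above with `kubectl get pod roiergasias-demo-<STRING> -o yaml`.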

Why use Roiergasias?

The unique selling point (USP) of running a Roiergasias workflow in Kubernetes is its ability to split a workflow so that its parts run on multiple worker nodes, as depicted briefly below:

[Figure: hello-world-multi-node]

[Figure: hello-world-multi-node-kubernetes]
Notice the sequence of actions:

1. Create config map 1 + job 1 for split workflow 1 on "node1"
2. Wait for job 1 to complete
3. Create config map 2 + job 2 for split workflow 2 on "node2"
4. Wait for job 2 to complete
5. Create config map 3 + job 3 for split workflow 3 on "node2"
6. Wait for job 3 to complete  
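
The sequence above amounts to a strictly ordered loop: create the resources for one split workflow, wait for its job, then move on. The Go sketch below illustrates that control flow under stated assumptions. The `Split` type, the `Cluster` interface and the in-memory `fakeCluster` are hypothetical and do not come from the operator's code, and stopping at the first error is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Split is one split workflow pinned to a worker node.
type Split struct {
	Name string
	Node string
}

// Cluster is a hypothetical abstraction over the two operations the
// sequence needs; a real implementation would talk to Kubernetes.
type Cluster interface {
	CreateConfigMapAndJob(s Split) error
	WaitForJob(s Split) error
}

// runSplits handles the splits strictly in order and stops at the first error.
func runSplits(c Cluster, splits []Split) error {
	for i, s := range splits {
		if err := c.CreateConfigMapAndJob(s); err != nil {
			return fmt.Errorf("split %d (%s): create: %w", i+1, s.Name, err)
		}
		if err := c.WaitForJob(s); err != nil {
			return fmt.Errorf("split %d (%s): wait: %w", i+1, s.Name, err)
		}
	}
	return nil
}

// fakeCluster records calls in memory instead of contacting a cluster.
type fakeCluster struct {
	log    []string
	failOn string // name of a split whose job should fail, if any
}

func (f *fakeCluster) CreateConfigMapAndJob(s Split) error {
	f.log = append(f.log, "create "+s.Name+" on "+s.Node)
	return nil
}

func (f *fakeCluster) WaitForJob(s Split) error {
	if s.Name == f.failOn {
		return errors.New("job failed")
	}
	f.log = append(f.log, "done "+s.Name)
	return nil
}

func main() {
	splits := []Split{
		{Name: "split-1", Node: "node1"},
		{Name: "split-2", Node: "node2"},
		{Name: "split-3", Node: "node2"},
	}
	c := &fakeCluster{}
	if err := runSplits(c, splits); err != nil {
		fmt.Println("error:", err)
	}
	for _, entry := range c.log {
		fmt.Println(entry)
	}
}
```

Because each job is awaited before the next config map and job are created, a later split can rely on whatever the earlier one produced.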

For more details, follow this README


Run "Machine learning" workflow locally

[Figure: machine-learning-overview-local]
[Figure: machine-learning-workflow-kubernetes]
For more details, follow this README


Run "Machine learning" workflow in AWS

[Figure: aws-topology]
[Figure: topology]
Notice the sequence of actions:

1. Create config map 1 + job 1 for split workflow - "process data" on "node1"
2. Wait for job 1 to complete
3. Create config map 2 + job 2 for split workflow - "train model" on "node2"
4. Wait for job 2 to complete
5. Create config map 3 + job 3 for split workflow - "evaluate model" on "node2"
6. Wait for job 3 to complete  

For more details, follow this README


Getting started with Roiergasias workflow

Core features of Roiergasias workflow:

  1. It is cloud agnostic: it can run on any Kubernetes cluster, in the cloud or locally.
  2. It is language agnostic: it relies on the capabilities of the system where it runs, whether that is a container or a virtual machine.
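
One way to picture the second point is a runner that simply hands a script to whatever interpreter the host provides. The Go sketch below is an illustration under assumptions, not the Roiergasias implementation: `runWith` is a hypothetical helper, and the `sh` and `python3` candidates are examples that may or may not be installed on a given system.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// runWith runs the given arguments with the named interpreter if the host
// provides it, and returns the trimmed combined output.
func runWith(interpreter string, args ...string) (string, error) {
	path, err := exec.LookPath(interpreter)
	if err != nil {
		return "", fmt.Errorf("%s is not available on this system: %w", interpreter, err)
	}
	out, err := exec.Command(path, args...).CombinedOutput()
	return strings.TrimSpace(string(out)), err
}

func main() {
	// Example candidates only; what works depends on the container or VM.
	candidates := [][]string{
		{"sh", "-c", "echo hello from sh"},
		{"python3", "-c", "print('hello from python3')"},
	}
	for _, c := range candidates {
		out, err := runWith(c[0], c[1:]...)
		if err != nil {
			fmt.Println("skipped:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

Under this model, supporting another language is a matter of choosing a container image or machine that ships the interpreter, not of changing the runner.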

For workflow yaml file syntax and command syntax, follow this README


Repository map

 📌 -----------------------> you are here
┬
├── cmd    ----------------> contains go main starting point for roiergasias workflow cli
│   ├── linux   -----------> contains linux amd64 executable for roiergasias workflow cli
│   └── osx   -------------> contains mac-osx amd64 executable for roiergasias workflow cli
├── docs   ----------------> contains documentation / images
├── examples
│   ├── hello-world   -----> contains both single node and multi node split workflow example
│   ├── machine-learning
│   │   ├── aws   ---------> contains multi node split workflow in 2 node groups example
│   │   └── local   -------> contains single node workflow example
├── infra   ---------------> contains terraform scripts for infrastructure as code
│   └── aws
├── operator   ------------> contains kubernetes operator code for roiergasias workflow
│   ├── api
│   ├── config
│   ├── controllers
│   ├── hack
│   └── helm   ------------> contains kubernetes operator helm chart repository
└── pkg   -----------------> contains go packages for roiergasias workflow engine
    ├── lib
    ├── mocks
    ├── steps
    ├── tasks
    └── workflow
