
devops-for-ai-apps's Introduction

This repository contains samples showing how to build an AI application with DevOps in mind. An AI application always has two streams of work: data scientists building machine learning models, and app developers building the application and exposing it to end users.

In this tutorial we demonstrate how you can build a continuous integration pipeline for an AI application. The pipeline kicks off for each new commit and runs the test suite; if the tests pass, it takes the latest build and packages it in a Docker container. The container is then deployed using Azure Container Service (ACS), and images are securely stored in Azure Container Registry (ACR). ACS runs Kubernetes to manage the container cluster, but you can choose Docker Swarm or Mesos instead.

The application securely pulls the latest model from an Azure Storage account and packages it as part of the application. The deployed application has the app code and the ML model packaged as a single container.

This decouples the app developers from the data scientists, while making sure that the production app is always running the latest code with the latest ML model.
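The gating behaviour described above (later stages run only if the tests pass) can be sketched in a few lines of Python. This is only an illustration: the stage names, their order, and the `make_stages` helper are assumptions made for the example, not the build and release definitions used in the tutorial.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, action in stages:
        if not action():
            break
        completed.append(name)
    return completed


def make_stages(tests_pass: bool) -> List[Stage]:
    """Hypothetical stand-ins for the real pipeline tasks."""
    return [
        ("run tests", lambda: tests_pass),
        ("download model", lambda: True),
        ("build image", lambda: True),
        ("push to ACR", lambda: True),
        ("deploy to Kubernetes", lambda: True),
    ]


print(run_pipeline(make_stages(tests_pass=True)))   # all five stage names
print(run_pipeline(make_stages(tests_pass=False)))  # []
```

A failing test suite leaves the list of completed stages empty, so nothing is packaged or deployed for that commit.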

A variation on this tutorial would be to consume the ML application as an endpoint instead of packaging it in the app. The goal is to show how easy it is to do DevOps for an AI application.

For detailed instructions please refer to the tutorial

Details about the code repository

  • flaskwebapp - contains the application code.
  • images - contains images used in tutorial.
  • test - contains integration test
  • deploy.yaml - used while deploying on Kubernetes ACS cluster.
  • downloadblob.sh - script to download pretrained model and supporting files.
  • tutorial.md - starting point and step-by-step instructions on creating build and release definitions.

devops-for-ai-apps's People

Contributors

chrispat, jainr, microsoft-github-policy-service[bot], microsoftopensource, msalvaris, msftgits


devops-for-ai-apps's Issues

Cannot access application deployed in Azure ACS Kubernetes Cluster using Azure CICD Pipeline

I am following your document code -
https://github.com/Azure/DevOps-For-AI-Apps/blob/master/Tutorial.md

The CICD pipeline works fine. But I want to validate the application that is deployed to the Kubernetes cluster by using its external IP.

Deploy.yaml [file contents]

apiVersion: v1
kind: Pod
metadata:
  name: imageclassificationapp
spec:
  containers:
  - name: model-api
    image: crrq51278013.azurecr.io/model-api:156
    ports:
    - containerPort: 88
  imagePullSecrets:
  - name: imageclassificationappdemosecret

Pod details

C:\Users\nareshkumar_h>kubectl describe pod imageclassificationapp
Name: imageclassificationapp
Namespace: default
Node: aks-nodepool1-97378755-2/10.240.0.5
Start Time: Mon, 05 Nov 2018 17:10:34 +0530
Labels: new-label=imageclassification-label
Annotations: kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"v1","kind":"Pod","metadata":{"annotations":{},"name":"imageclassificationapp","namespace":"default"},"spec":{"containers":[{"image":"crr...
Status: Running
IP: 10.244.1.87
Containers:
model-api:
Container ID: docker://db8687866d25eb4311175c5ccb5a7205379168c64cdfe716b09557fc98e2bd6a
Image: crrq51278013.azurecr.io/model-api:156
Image ID: docker-pullable://crrq51278013.azurecr.io/model-api@sha256:766673989a59fe0b1e849469f38acda96853a1d84e4b4d64ffe07810dd5d04e9
Port: 88/TCP
Host Port: 0/TCP
State: Running
Started: Mon, 05 Nov 2018 17:12:49 +0530
Ready: True
Restart Count: 0
Environment:
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-qhdjr (ro)
Conditions:
Type Status
Initialized True
Ready True
PodScheduled True
Volumes:
default-token-qhdjr:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-qhdjr
Optional: false
QoS Class: BestEffort
Node-Selectors:
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:

Service details:

C:\Users\nareshkumar_h>kubectl describe service imageclassification-service
Name: imageclassification-service
Namespace: default
Labels: run=load-balancer-example
Annotations:
Selector: run=load-balancer-example
Type: LoadBalancer
IP: 10.0.24.9
LoadBalancer Ingress: 52.163.191.28
Port: 88/TCP
TargetPort: 88/TCP
NodePort: 32672/TCP
Endpoints: 10.244.1.65:88,10.244.1.88:88,10.244.2.119:88
Session Affinity: None
External Traffic Policy: Cluster
Events:

I am hitting the below url but the request times out. http://52.163.191.28:88/
Can you please help?
Please let me know if you need any further details from my side.

/cc @MicahMcKittrick-MSFT
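One detail worth checking in the output above: the Service selects `run=load-balancer-example`, while the pod carries the label `new-label=imageclassification-label`, and the pod IP 10.244.1.87 is not among the listed endpoints. The sketch below mimics, in plain Python, how an equality-based selector is matched against pod labels. It is an illustration rather than the Kubernetes implementation, and whether this mismatch is what causes the timeout is an assumption.

```python
def selector_matches(selector: dict, labels: dict) -> bool:
    """Return True if every key/value pair in the selector appears in the labels.

    A Service without a selector gets no automatically managed endpoints,
    so an empty selector is treated here as matching nothing.
    """
    if not selector:
        return False
    return all(labels.get(key) == value for key, value in selector.items())


# Values copied from the `kubectl describe` output in this issue.
pod_labels = {"new-label": "imageclassification-label"}
service_selector = {"run": "load-balancer-example"}

print(selector_matches(service_selector, pod_labels))  # False
```

If that is the problem, either labelling the pod with `run=load-balancer-example` or pointing the Service selector at the pod's existing label should make the pod show up under `Endpoints`.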

Wrong index in driver.py file for model loading

driver.py uses index 3 when loading the model, but it should be 2. With index 3 it raises this error:

    trainedModel = combine([trainedModel.outputs[3].owner])
    IndexError: tuple index out of range

I used a loop to determine the correct index and updated it in driver.py:

    for index in range(len(trainedModel.outputs)):
        print("Index {} for output: {}.".format(index, trainedModel.outputs[index].name))

Output:

Index 0 for output: CE.
Index 1 for output: Err.
Index 2 for output: OutputNodes.z.

I will create a PR with the changes I made.
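An alternative to hard-coding the index is to look the output up by name, so the code keeps working if the number of outputs changes. The sketch below is a minimal illustration: `find_output_index` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the entries of `trainedModel.outputs`, using the names printed above.

```python
from types import SimpleNamespace


def find_output_index(outputs, name):
    """Return the position of the output called `name`, or raise ValueError."""
    for index, output in enumerate(outputs):
        if output.name == name:
            return index
    available = ", ".join(output.name for output in outputs)
    raise ValueError("No output named {!r}; available: {}".format(name, available))


# Stand-ins for trainedModel.outputs, with the names listed in this issue.
outputs = (
    SimpleNamespace(name="CE"),
    SimpleNamespace(name="Err"),
    SimpleNamespace(name="OutputNodes.z"),
)

print(find_output_index(outputs, "OutputNodes.z"))  # 2
```

A missing name then fails with a message listing the available outputs instead of an `IndexError`.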

Building the following dockerfile on Hosted VS2017 gives error

FROM ubuntu:latest
MAINTAINER Richin Jain <rijaimicrosoft.com>
RUN apt-get update -y
RUN apt-get install -y python-pip python-dev build-essential
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
ENTRYPOINT ["python"]
CMD ["app.py"]

Error in build task
2017-09-19T19:47:17.1261149Z ##[section]Starting: Build an image
2017-09-19T19:47:17.1261149Z ==============================================================================
2017-09-19T19:47:17.1261149Z Task : Docker
2017-09-19T19:47:17.1261149Z Description : Build, push or run Docker images, or run a Docker command. Task can be used with Docker or Azure Container registry.
2017-09-19T19:47:17.1261149Z Version : 0.2.6
2017-09-19T19:47:17.1261149Z Author : Microsoft Corporation
2017-09-19T19:47:17.1261149Z Help : More Information
2017-09-19T19:47:17.1261149Z ==============================================================================
2017-09-19T19:47:18.7161220Z [command]"C:\Program Files\Docker\docker.exe" build -f d:\a\1\s\web\Dockerfile -t blogacr.azurecr.io/helloworld-api:47 -t blogacr.azurecr.io/helloworld-api d:\a\1\s\web
2017-09-19T19:47:20.0830265Z Sending build context to Docker daemon 4.096 kB
2017-09-19T19:47:20.0830265Z
2017-09-19T19:47:20.0840268Z Step 1/9 : FROM ubuntu:16.04
2017-09-19T19:47:20.9341075Z 16.04: Pulling from library/ubuntu
2017-09-19T19:47:20.9361041Z no supported platform found in manifest list
2017-09-19T19:47:21.0177555Z ##[error]C:\Program Files\Docker\docker.exe failed with return code: 1
2017-09-19T19:47:21.0557087Z ##[section]Finishing: Build an image

Azure CICD Build Pipeline failing after agent Hosted Linux Preview Pool deprecation

Hi,
I am following below document for CICD pipeline using Azure DevOps.

https://github.com/Azure/DevOps-For-AI-Apps/blob/master/Tutorial.md

The CICD pipeline works fine. But as of 01 Dec, the 'Hosted Linux Preview Pool' agent suggested in the tutorial has been deprecated.
As per the recommendation from Microsoft below, we replaced 'Hosted Linux Preview Pool' with 'Microsoft-hosted agent - Ubuntu 16.04'.

https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/hosted?view=vsts&tabs=yaml#hosted-linux-preview-pool-deprecation

After doing this, the 'Starting Model Container' step is failing.

Original Code -
BUILD_CONTAINER_ID=$(docker ps --format "{{.ID}}")
docker run -d --network container:$BUILD_CONTAINER_ID acrforblog.azurecr.io/model-api:$(Build.BuildId)

Replaced with -
docker run -d crrq51278013.azurecr.io/model-api:$(Build.BuildId)

[Above changes were done as per recommendation on Microsoft site - that Ubuntu 16.04 agent does not run as container]

After this change, the 'Simple API test' step is failing and the API service is not returning anything.
Code in this Simple API test -

echo "Waiting...."
sleep 10

# Simple API Test
echo "Testing API"
reply=$(curl -s $(MODEL_API_URL)/)
echo "reply value"
echo $reply

expected="Healthy"
echo $expected

if [[ $reply == $expected ]]; then
    echo -e "Successfully validated version API call"
else
    echo "Basic API call Fail"
    exit 1
fi

Please suggest how code in step 'Starting Model Container' should be changed in case of 'Microsoft-hosted agent - Ubuntu 16.04', so container can be started and API service can be tested.

Regards,
Parag Gurjar

/cc @MicahMcKittrick-MSFT
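Two things may be worth checking here, both offered as assumptions rather than a confirmed diagnosis. First, the replacement `docker run` command no longer shares a network with the build container and publishes no port, so the API may not be reachable from the agent unless a port is published with `-p`. Second, a fixed `sleep 10` can be too short if the container takes longer to start. The sketch below replaces the fixed wait with a retry loop. `wait_for_reply` and its parameters are hypothetical names chosen for this example, and unlike the original script it strips surrounding whitespace before comparing the reply.

```python
import time
import urllib.request


def wait_for_reply(url, expected="Healthy", attempts=10, delay=3.0, timeout=5.0):
    """Poll `url` until its response body equals `expected`.

    Returns True as soon as the expected reply is seen, or False once
    all attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                reply = response.read().decode("utf-8").strip()
            if reply == expected:
                return True
            print("Attempt {}: unexpected reply {!r}".format(attempt, reply))
        except OSError as exc:
            # urllib.error.URLError, connection resets, and socket
            # timeouts are all subclasses of OSError.
            print("Attempt {}: not reachable yet ({})".format(attempt, exc))
        if attempt < attempts:
            time.sleep(delay)
    return False
```

In a pipeline step this could be called as `wait_for_reply("http://localhost:88/")`, where the URL is a placeholder for the value of `MODEL_API_URL`, and the step would exit with a non-zero code when the function returns False.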
