ai-on-z-triton-is-examples's Introduction

Demonstrating inference pipeline creation using triton inference server on IBM zSystems and LinuxONE.

1. Scope

The purpose of this project is to provide to sample artifacts to create inference pipeline using triton inference server on IBM zSystems and LinuxONE.

All the materials are provided as examples. Provided dockerfiles build open-source based (not proprietary) images.

The maintainers of this repository do not assert to be experts in containers or container security. Resources include:

2. Usage

These build files commonly rely on base images from the IBM Z and LinuxONE Container Image Registry (ICR) . This will require free basic authentication. Details can be found at the ICR link above.

Steps to build and run docker image.

Run docker build using the provided dockerfile i.e docker build -f ./Dockerfile .. Using a Ubuntu base image, this will create an environment with both specific Triton Server, Python Backend release with SnapML version.
Create and run a docker container using the image created on the prior step. As part of this step, you should map the required triton server ports to a port on your local system. An example follows: docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v//<path>/models:/models <imageid> tritonserver --model-repository=/models. This states the image in interactive mode, tells docker to delete the container upon exit, and publishes container HTTP service port at 8000, GRPC Inference Service at 8001, Metrics Service at 8002.

Note: Please do refer each topic(e.g snapml-examples ) inside ai-on-z-triton-is-examples to create model reposity for the specific use cases.

3. Content

Folder(topic)	Description
snapml-examples	Serving SnapML models with Triton Inference Server on Linux on IBM zSystems

Note: We also support onnxmlir-triton-backend on Linux on IBM zSystems. This backend allows the usage of ONNX MLIR compiled models (model.so) with the Triton Inference Server.

4. Additional resources

Find out additional resources about Triton Inference Server in the below links.

5. License

If you would like to see the detailed LICENSE click here.

ai-on-z-triton-is-examples's People

Contributors

Stargazers

Watchers

ai-on-z-triton-is-examples's Issues

Installation is failing on s390x due to new version of scipy

../../scipy/meson.build:84:15: ERROR: Dependency "pybind11" not found, tried pkgconfig, config-tool and cmake

This line https://github.com/IBM/ai-on-z-triton-is-examples/blob/main/Dockerfile#L111 should use a constraints.txt file to pin the version numbers, otherwise the installation of scikit-learn will try to pull in the latest version of scipy and not 1.10.0

Recommend Projects

ibm / ai-on-z-triton-is-examples Goto Github PK

ai-on-z-triton-is-examples's Introduction

Demonstrating inference pipeline creation using triton inference server on IBM zSystems and LinuxONE.

1. Scope

2. Usage

Steps to build and run docker image.

3. Content

4. Additional resources

5. License

ai-on-z-triton-is-examples's People

Contributors

Stargazers

Watchers

Forkers

ai-on-z-triton-is-examples's Issues

Installation is failing on s390x due to new version of scipy

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent