
Introduction

KServe


KServe provides a Kubernetes Custom Resource Definition for serving predictive and generative machine learning (ML) models. It aims to solve production model serving use cases by providing high-level abstraction interfaces for TensorFlow, XGBoost, Scikit-learn, PyTorch, and Hugging Face Transformer/LLM models using standardized data plane protocols.

It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting-edge serving features like GPU Autoscaling, Scale to Zero, and Canary Rollouts to your ML deployments. It enables a simple, pluggable, and complete story for production ML serving, including prediction, pre-processing, post-processing, and explainability. KServe is being used across various organizations.
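
For example, deploying a model typically only requires declaring an InferenceService custom resource. Below is a minimal hedged sketch; the service name, model format, and storage location are illustrative assumptions rather than values taken from this page.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                              # hypothetical service name
spec:
  predictor:
    sklearn:                                      # one of the supported framework predictors
      storageUri: gs://my-bucket/sklearn/iris     # illustrative model location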

For more details, visit the KServe website.


KFServing has been rebranded to KServe since v0.7.

Why KServe?

  • KServe is a standard, cloud agnostic Model Inference Platform for serving predictive and generative AI models on Kubernetes, built for highly scalable use cases.
  • Provides a performant, standardized inference protocol across ML frameworks, including the OpenAI specification for generative models.
  • Supports modern serverless inference workloads with request-based autoscaling, including scale-to-zero on CPU and GPU (see the sketch after this list).
  • Provides high scalability, density packing, and intelligent routing using ModelMesh.
  • Offers simple and pluggable production serving for inference, pre/post-processing, monitoring, and explainability.
  • Supports advanced deployments such as canary rollouts, pipelines, and ensembles with InferenceGraph.
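
As referenced in the autoscaling bullet above, here is a hedged sketch of request-based autoscaling with scale-to-zero on an InferenceService; the concurrency target, model format, and storage URI are illustrative assumptions.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: flowers-sample                     # hypothetical name
  annotations:
    autoscaling.knative.dev/target: "5"    # assumed per-replica concurrency target (Knative annotation)
spec:
  predictor:
    minReplicas: 0                         # lets the predictor scale to zero when idle
    tensorflow:
      storageUri: gs://my-bucket/tensorflow/flowers   # illustrative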

Learn More

To learn more about KServe, how to use the various supported features, and how to participate in the KServe community, please follow the KServe website documentation. Additionally, we have compiled a list of presentations and demos to dive into the details.

🛠️ Installation

Standalone Installation

  • Serverless Installation: KServe installs Knative by default for serverless deployment of InferenceServices.
  • Raw Deployment Installation: Compared to the Serverless Installation, this is a more lightweight installation. However, this option does not support canary deployment or request-based autoscaling with scale-to-zero.
  • ModelMesh Installation: You can optionally install ModelMesh to enable high-scale, high-density and frequently-changing model serving use cases.
  • Quick Installation: Install KServe on your local machine.

Kubeflow Installation

KServe is an important add-on component of Kubeflow; please learn more from the Kubeflow KServe documentation. Check out the following guides for running on AWS or on OpenShift Container Platform.

💡 Roadmap

🧰 Developer Guide

✍️ Contributor Guide

🤝 Adopters

Contributors

alexagriffith, alexandrebrown, andyi2it, chinhuang007, ckadner, dependabot[bot], elukey, gavrissh, greenmoon55, iamlovingit, jooho, js-ts, juhyung-son, lampajr, markwinter, pvaneck, rachitchauhan43, rafvasq, retocode, sivanantha321, suresh-nakkeran, taneem-ibrahim, terrytangyuan, tessapham, theofpa, tomcli, varunsh-xilinx, xfu83, yuzisun, zoramt


Issues

Add a sample which explains how to deploy a model on an EKS cluster

/kind bug

What steps did you take and what happened:
I have installed Kubeflow 1.3. I could not figure out which version of KFServing it installed by default.
The following namespaces were created, which I believe are for KFServing:

knative-eventing            Active   151m
knative-serving             Active   152m

What did you expect to happen:
I tried to run the XGBoost sample. Everything worked fine except that my SERVICE_HOSTNAME=$(kubectl get inferenceservice xgboost-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3) became xgboost-iris.default.example.com, which I believe is expected, but I did not create any domain for this service.

Can we add a sample which can run on a vanilla EKS cluster without worrying about the domain, or am I missing something?

Environment:
EKS cluster
K8s version-1.9

  • Istio Version:
  • Knative Version:
  • KFServing Version:
  • Kubeflow version: 1.3
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

Update the doc for Kubernetes 1.25

Describe the change you'd like to see
Hi,

I have been using Kubernetes v1.25 (which is not in the Recommended Version Matrix), and tried to install the KServe "Quickstart" environment (for v0.9). It didn't work since it requires an Istio version > 1.14.0. I am not sure if it works for Istio v1.15.0, but I just set ISTIO_VERSION=1.16.0 in quick_install.sh, and everything worked with no issue. FYI.


Add swagger-ui to dataplane ui

Describe the change you'd like to see

Add swagger-ui for the dataplane API within the docs website.

This would allow users to preview the endpoints and request/response objects in a familiar user interface.

Document how to use STORAGE_URI, especially in the case of a custom framework

/kind feature

Describe the solution you'd like
Describe how to use the STORAGE_URI environment variable in the InferenceService schema with a custom framework.
It could be added under kfserving/docs/samples/custom/.
This feature is not documented. I would like to know if this feature is official and whether it is expected to remain available in the future.

Anything else you would like to add:
Important information about the feature:

  1. The container schema can have a STORAGE_URI environment variable; when used in an InferenceService, it provides the standard storage-initializer functionality (see the sketch after this list).
  2. The container name has to be kfserving-container.
  3. The storage mounts to /mnt/models.
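
A hedged sketch of what such a custom-framework InferenceService could look like, based on the three points above; the image, bucket path, and service name are illustrative assumptions, and the API group may be serving.kubeflow.org on older KFServing releases.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: custom-model                                    # hypothetical
spec:
  predictor:
    containers:
      - name: kfserving-container                       # required container name (see point 2)
        image: docker.io/myuser/custom-server:latest    # illustrative custom image
        env:
          - name: STORAGE_URI                           # triggers the storage initializer (point 1)
            value: gs://my-bucket/my-model              # model is downloaded to /mnt/models (point 3)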

This feature can be very beneficial if someone needs to create a custom docker image with specific dependencies. There was a case in my organization where the serialized model had such dependencies that it was impossible to extract them into a transformer.

STORAGE_URI feature basic locations:
https://github.com/kubeflow/kfserving/blob/master/pkg/apis/serving/v1alpha2/framework_custom.go#L30
https://github.com/kubeflow/kfserving/blob/master/pkg/apis/serving/v1beta1/predictor_custom.go#L57

other examples where STORAGE_URI is used:
https://github.com/kubeflow/kfserving/blob/master/docs/samples/custom/torchserve/torchserve-custom-pv.yaml
https://github.com/kubeflow/kfserving/blob/master/docs/samples/custom/torchserve/bert-sample/bert.yaml
https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1alpha2/triton/bert

I'd like to offer my help, but first I need some initial answers and maybe some instructions.

Docs about using nodeSelector

/kind feature

I am wondering if it is possible to set the nodeSelector option for TensorFlow Serving (or any other serving runtime different from the custom one).
I found some information about this in some issues, but they were not very specific.
If it is possible, it would be great to include an example in the docs (a sketch follows below).
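
For reference, a hedged sketch of what this could look like, assuming the v1beta1 predictor spec exposes pod-level fields such as nodeSelector; the node label and model location are illustrative assumptions.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: tensorflow-gpu-sample                       # hypothetical
spec:
  predictor:
    nodeSelector:
      node.example.com/pool: gpu-pool               # illustrative node label
    tensorflow:
      storageUri: gs://my-bucket/tensorflow/model   # illustrative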

AWS IAM Role for Service Account

What is changing? (Please include as many details as possible.)

Adds explicit support for the AWS IRSA credential method for downloading models.

How will this impact our users?

More secure AWS credential management (no static/long-lived credentials needed in Secrets); a hedged sketch follows.
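
A hedged sketch of how IRSA-based model download could look: a ServiceAccount annotated with an IAM role is referenced from the InferenceService, so the storage initializer can assume that role instead of reading static credentials from a Secret. The role ARN, bucket, and names are illustrative assumptions, and the exact annotations KServe honors may differ per release.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: model-reader-sa                                                        # hypothetical
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/model-reader   # illustrative IRSA role
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-s3                                  # hypothetical
spec:
  predictor:
    serviceAccountName: model-reader-sa             # model download runs with this identity
    sklearn:
      storageUri: s3://my-bucket/sklearn/model      # illustrative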

In what release will this happen (to the best of your knowledge)?

v0.10.0

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#2113
  2. kserve/kserve#2373


Administration - Serverless installation guide includes two Istio installations

Knative should be installed before Istio.
Knative itself requires a networking layer; a better way is to tell users to choose Istio as the networking layer for Knative instead of treating Knative and Istio as two separate components.

Problem seen
If I install Istio before Knative, KServe simply doesn't work. It could be due to unsupported versions or the networking being configured in the wrong order.

Run mkdocs gh-deploy before mike deploy on the Github workflow

Expected Behavior

In #70, 404.html was updated, so we should now be able to see the error page properly.

Actual Behavior

The error page is still broken.

Steps to Reproduce the Problem

Please see a wrong URL like https://kserve.github.io/website/0.7/admin/.


Proposal

Run the mkdocs gh-deploy command before we run mike deploy in the GitHub workflow, so we can sync the contents at the root with the latest contents (a hedged workflow sketch follows).
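
A hedged sketch of the proposed ordering in a GitHub Actions workflow; the workflow name, Python setup, and the version/alias arguments passed to mike are illustrative assumptions.

name: publish-docs                          # hypothetical workflow
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0                    # mike needs the gh-pages history
      - uses: actions/setup-python@v4
        with:
          python-version: "3.9"
      - run: pip install mkdocs mike
      # Proposal: publish the root site first so root-level assets such as 404.html stay in sync...
      - run: mkdocs gh-deploy
      # ...then publish the versioned docs with mike.
      - run: mike deploy --push 0.7 latest  # version and alias are illustrative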

Thanks!

TorchServe doc updates for 0.8

What is changing? (Please include as many details as possible.)

How will this impact our users?

TorchServe is now updated to 0.5.2

  • KServe migration: deprecate service_envelope in config.properties and use enable_envvars_config: true to enable the service envelope at runtime.
  • KServe v2 REST protocol support

In what release will this happen (to the best of your knowledge)?

Ex. v0.8

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#1944
  2. kserve/kserve#1870


No license statement

Expected Behavior

The website and the contained documentation should contain a clear license statement.

Actual Behavior

There is no license statement present.

Recommendation

A common choice for this type of content is CC-BY (Creative Commons 'Attribution'), although CC-BY-SA ('Attribution' plus 'ShareAlike') can also be used if the authors want to ensure that the content remains under the same license when it is reused or derivative works are produced.

Document kserve dependency version support

Describe the change you'd like to see
Document the dependency version matrix for Kubernetes/Knative/Istio. We won't be able to test every version combination, but we would like to give recommendations for the version sets which have been tested and certified.


Versioning the website docs


Document user migration steps

/kind feature

Describe the solution you'd like
Document user migration steps from KFServing to KServe; this migration guide will be referenced on the blog. A sketch of the main manifest change is below.
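
One step such a guide would presumably cover is the API group rename; a hedged before/after sketch on an InferenceService manifest, with the model details being illustrative assumptions:

# Before (KFServing)
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                          # illustrative
spec:
  predictor:
    sklearn:
      storageUri: gs://my-bucket/sklearn/iris
---
# After (KServe, since v0.7)
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    sklearn:
      storageUri: gs://my-bucket/sklearn/iris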


KServe "Quickstart" environment script file is wrong

Expected Behavior

By executing the quick-install script referenced on the Getting Started page, users should be able to install KServe in their local environment.

The script referenced in the quick start page should be https://raw.githubusercontent.com/kserve/kserve/master/hack/quick_install.sh

Actual Behavior

The current link is https://raw.githubusercontent.com/kserve/kserve/release-0.7/hack/quick_install.sh, which refers to a local file for installing KServe. (https://github.com/kserve/kserve/blob/9112a5d81395cf4b32cd9c9b35f10e030c23b77e/hack/quick_install.sh#L106)

It will only raise an error.

Steps to Reproduce the Problem

  1. Visit the links

Additional Info


Install information:

  • Platform (GKE, IKS, AKS, OnPrem, etc.):
  • KServe Version: 0.7.0

Add a provision to make changes across multiple versions of the document

Provision to add common changes across versions, such as banners

Banners, popups, etc. should usually appear across multiple live versions of the documentation.
Currently this requires redeploying or updating old docs, which makes things more difficult, as a new banner or other change has to be applied to each version of the doc manually.

Proposing a new solution to update common things like banners and popups across multiple document versions:

  1. Deploy a common JS file in the gh-pages repo.
  2. Add the common JS file to the docs.

Any future common doc changes would then only require changes in common.js.

Documentation Improvements for v0.10.0

What is changing? (Please include as many details as possible.)

We are working on the KServe 2023 Roadmap kserve/kserve#2526 and the v0.10.0 release, as well as preparing for our eventual v1.0 😄. For all of this and to keep up with our changes, we need to improve our documentation and website!

Here are the current objectives to improve the KServe website documentation:

  • Unify the data plane v1 and v2 page formats (one has a table, the other has long-form text).
  • Improve the v2 page to be more succinct and clearly show what can be used, by splitting it into two pages: (1) one that tells the story of why and what changed, and (2) one that explains how to use v2 and what it provides.
  • Document FastAPI updates for the model server.
  • Document serving runtime spec changes.
  • Clean up the examples in the kserve repo and unify them with the website's by creating one source of truth for example documentation.
  • Update any out-of-date documentation and make sure the website as a whole is consistent and cohesive.
  • Add monitoring setup documentation and documentation for the Knative serverless install #24.
  • Implement spellcheck in the repo #82.
  • Move the queue proxy extension documentation to the website.

In what release will this happen (to the best of your knowledge)?

v0.10

Update logger/batcher/autoscaler/canary examples

/kind feature

Describe the solution you'd like
Update the examples for API groups and KFServing name references for:
https://github.com/kubeflow/kfserving/tree/master/docs/samples/batcher
https://github.com/kubeflow/kfserving/tree/master/docs/samples/autoscaling
https://github.com/kubeflow/kfserving/tree/master/docs/samples/kafka
https://github.com/kubeflow/kfserving/tree/master/docs/samples/logger


kfctl v1.3.0 link broken for kubeflow deployment on Azure

/kind bug

What steps did you take and what happened:

  1. Kubeflow setup on Azure using the link https://www.kubeflow.org/docs/distributions/azure/deploy/install-kubeflow/
  2. The link is broken for v1.3: "Download the kfctl v1.3.0 release from the Kubeflow releases page."
    So I have installed what is available at https://github.com/kubeflow/kfctl/releases

What did you expect to happen:

I need to use KFServing with TorchServe on Azure. What should I do for this to work?

LightGBM example fails

/kind bug

What steps did you take and what happened:

When I followed https://kserve.github.io/website/0.8/modelserving/v1beta1/lightgbm/, the request fails with {"error": "Input data must be 2 dimensional and non empty."}. I think I narrowed it down to lightgbm.Dataset failing to autodetect the feature names; the model appears to just have generic "Column_0", "Column_1", etc. feature names.

I got the example to work by changing this part:

iris = load_iris()
y = iris['target']
X = iris['data']
dtrain = lgb.Dataset(X, label=y)

...to this:

iris = load_iris()
y = iris['target']
X = iris['data']
feature_name = iris['feature_names']
dtrain = lgb.Dataset(X, label=y, feature_name=feature_name)

Here's a reproduction:

import lightgbm as lgb
import pandas as pd
from sklearn.datasets import load_iris


def main():
    explicit_model = train(make_explicit_dataset())
    auto_model = train(make_auto_dataset())

    print("Trying explicit model")
    print(predict(explicit_model))
    print("\n")
    print("Trying auto model")
    print(predict(auto_model))


def train(dataset):
    params = {
        'objective':'multiclass',
        'metric':'softmax',
        'num_class': 3
    }
    return lgb.train(params=params, train_set=dataset)


def make_auto_dataset():
    iris = load_iris()
    y = iris['target']
    X = iris['data']
    return lgb.Dataset(X, label=y)


def make_explicit_dataset():
    iris = load_iris()
    y = iris['target']
    X = iris['data']
    feature_name = iris['feature_names']
    return lgb.Dataset(X, label=y, feature_name=feature_name)


def predict(model):
    request = {'sepal_width_(cm)': {0: 3.5}, 'petal_length_(cm)': {0: 1.4}, 'petal_width_(cm)': {0: 0.2},'sepal_length_(cm)': {0: 5.1} }

    # Simulate kserve's lgbserver behavior:
    df = pd.DataFrame(request, columns=model.feature_name())
    inputs = pd.concat([df], axis=0)
    return model.predict(inputs)


if __name__ == "__main__":
    main()

The output I got, with Python 3.7, scikit-learn == 1.0.1, lightgbm == 3.3.2:

Trying explicit model
[[9.99985204e-01 1.38238969e-05 9.72063744e-07]]


Trying auto model
Traceback (most recent call last):
  File "train.py", line 52, in <module>
    main()
  File "train.py", line 14, in main
    print(predict(auto_model))
  File "train.py", line 47, in predict
    return model.predict(inputs)
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 3540, in predict
    data_has_header, is_reshape)
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 820, in predict
    data = _data_from_pandas(data, None, None, self.pandas_categorical)[0]
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 566, in _data_from_pandas
    raise ValueError('Input data must be 2 dimensional and non empty.')
ValueError: Input data must be 2 dimensional and non empty.

Add ServingRuntime docs

What is changing? (Please include as many details as possible.)

Introduce the ServingRuntime custom resource and a new model spec; this eliminates the need to change the KServe controller code every time a new serving runtime is added.

How will this impact our users?

Users can still use the existing way to specify model frameworks; the new model spec is introduced to support existing serving runtimes as well as user-defined serving runtimes (a hedged sketch follows).
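
A hedged sketch of a user-defined ServingRuntime together with the new model spec; the runtime name, image, and model format are illustrative assumptions.

apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: my-sklearn-runtime                          # hypothetical user-defined runtime
spec:
  supportedModelFormats:
    - name: sklearn
      version: "1"
  containers:
    - name: kserve-container
      image: docker.io/myuser/sklearnserver:latest  # illustrative runtime image
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                                # hypothetical
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn                               # new model spec
      runtime: my-sklearn-runtime                   # optional; omit to let KServe auto-select a runtime
      storageUri: gs://my-bucket/sklearn/iris       # illustrative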

In what release will this happen (to the best of your knowledge)?

v0.8

Context

Link to associated PRs or issues from other repos here.

  1. Introduce serving runtime and new model spec
  2. Default serving runtime installations
  3. Auto serving runtime selection


Pod termination takes a very long time

/kind feature

Describe the solution you'd like
I have the following InferenceService:

# Imports assumed from the kserve SDK and the Kubernetes Python client
from kubernetes import client
from kubernetes.client import V1ResourceRequirements
from kserve import (constants, V1beta1InferenceService, V1beta1InferenceServiceSpec,
                    V1beta1PredictorSpec, V1beta1TorchServeSpec)

port_inference = client.V1ContainerPort(7070, protocol='TCP', name='h2c')
port_management = client.V1ContainerPort(7071, protocol='TCP', name='h2c')
pytorch_predictor=V1beta1PredictorSpec(
    pytorch=V1beta1TorchServeSpec(
        runtime_version='0.5.3-gpu',
        storage_uri='pvc://torchserve-claim/models',
        resources=V1ResourceRequirements(
            requests={'cpu':'4000m', 'memory':'8Gi', 'nvidia.com/gpu': '1'},
            limits={'cpu':'4000m', 'memory':'16Gi', 'nvidia.com/gpu': '1'}
        ),
         ports=[port_inference]
    )
)
isvc = V1beta1InferenceService(api_version=constants.KSERVE_V1BETA1,
                               kind=constants.KSERVE_KIND,
                               metadata=client.V1ObjectMeta(name=service_name, namespace=namespace),
                               spec=V1beta1InferenceServiceSpec(predictor=pytorch_predictor))

The service deploys quickly and the POD is ready to go in a few seconds.

kubectl get pods -n kubeflow-user-example-com
POD_BERT=$(kubectl get pods -n kubeflow-user-example-com | grep -Eo "(model-[_A-Za-z0-9-]+)")

When I delete the InferenceService, the pod takes a very long time to terminate - about five minutes!
What is the reason, and how can termination be accelerated?

Update the REST/gRPC V2 API with load/unload endpoints

/kind feature

Describe the solution you'd like
There are new load/unload endpoints supported in MLServer and Triton. It would be good to have these formally documented in the API docs in the repo.


Remove obsolete website document

Expected Behavior

The old links should no longer work and should ideally redirect to the new website URL.
E.g., the following link https://kserve.github.io/website/get_started/ should redirect to https://kserve.github.io/website/latest/get_started/

Actual Behavior

It is pointing to the old doc, which is applicable only to version 0.7 of KServe.

Additional Info


Install information:

  • Platform (GKE, IKS, AKS, OnPrem, etc.):
  • KServe Version:

onnx example not working with triton

/kind bug

What steps did you take and what happened:
I closely followed https://github.com/kserve/kserve/blob/release-0.8/docs/samples/v1beta1/onnx/README.md
to test KServe with the ONNX model provided (storageUri: "gs://kfserving-examples/onnx/style"), as I want to use KServe with .onnx models.

The problem here is that the style model provided is a single .onnx file, which was what the ONNX runtime needed, I suppose.
Now, with the ONNX runtime replaced by Triton, this example does not work anymore, as Triton will not load from a single ONNX file (/mnt/models/model.onnx).

Log from the triton server in kserve-container:


+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0629 14:04:33.356075 1 server.cc:546]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0629 14:04:33.356082 1 server.cc:589]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I0629 14:04:33.356195 1 tritonserver.cc:1836]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.14.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /mnt/models                                                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| rate_limit                       | OFF                                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Triton requires the ONNX file to be arranged something like this:

my_model/
  -- my_model.pbtxt
  -- 1/
       -- my_model.onnx

After creating my own Triton-compatible model, Triton loaded it correctly:

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "my_model"
spec:
  predictor:
    minReplicas: 0
    onnx:
      storageUri: "http://192.168.4.252:8000/my_model.tar.gz"

+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0629 13:48:31.691292 1 server.cc:546]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0629 13:48:31.691309 1 server.cc:589]
+------------+---------+--------+
| Model      | Version | Status |
+------------+---------+--------+
| my_model | 1       | READY  |
+------------+---------+--------+

I0629 13:48:31.691780 1 tritonserver.cc:1836]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.14.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /mnt/models                                                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| rate_limit                       | OFF                                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

So I assume that the example needs to be updated.

Environment:

  • Istio Version:
  • Knative Version:
  • KFServing Version: 0.8
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Kubernetes version: (use kubectl version): 1.22 (microk8s)
  • OS (e.g. from /etc/os-release):

Consolidate CIFAR-10 Outlier Sample

We should keep one copy of the CIFAR-10 outlier detection sample:

    Hi @rafvasq, thanks for the update! We also have the same website example [here](https://github.com/kserve/website/blob/main/docs/modelserving/detect/alibi_detect/cifar10_outlier.ipynb), should we keep one place instead of two copies?

Originally posted by @yuzisun in kserve/kserve#2472 (comment)

"Build the custom image with Buildpacks" section needs more info.

Describe the change you'd like to see
The "Build the custom image with Buildpacks" section might need an update. Currently, pack build --builder=heroku/buildpacks:20 ${DOCKER_USER}/custom-model:v1 won't work without specifying the correct Python version in runtime.txt. heroku/buildpacks:20 comes with python-3.10, but ray[serve]==1.10.0 and kserve==0.9.0 require python-3.9.

Additional context
An additional runtime.txt file containing python-3.9.13 is required to successfully build a Docker image for custom model serving.

Cert manager version should be restricted

Expected Behavior

Following https://kserve.github.io/website/admin/serverless/ should result in a successful install

Actual Behavior

Following https://kserve.github.io/website/admin/serverless/, I installed cert-manager at the latest 1.6 release (https://kserve.github.io/website/admin/serverless/#3-install-cert-manager). Then, when trying to install KServe (https://kserve.github.io/website/admin/serverless/#4-install-kserve), it failed with:

unable to recognize "https://github.com/kserve/kserve/releases/download/v0.7.0/kserve.yaml": no matches for kind "Certificate" in version "cert-manager.io/v1alpha2"
unable to recognize "https://github.com/kserve/kserve/releases/download/v0.7.0/kserve.yaml": no matches for kind "Issuer" in version "cert-manager.io/v1alpha2"

I had to reduce the cert-manager version to 1.3, and then everything worked.

Install information:

  • Platform (GKE, IKS, AKS, OnPrem, etc.): k3d 1.20.1
  • KServe Version: 0.7.0

404-page is broken

Hello, KServe team!
I found a bug with the 404 page!

Expected Behavior

When we try to access a wrong URL path, the 404 page should show properly.

Actual Behavior

The 404 page is broken; it does not load its assets (JS, CSS, images).

(screenshot of the broken 404 page)

Steps to Reproduce the Problem

Thanks!

Add ModelMesh documentation to website

Some information about ModelMesh should be added to the KServe website. We can probably make a modelmesh folder here for now with a few Markdown files.

We need to determine later whether we can use the docs located at https://github.com/kserve/modelmesh-serving/tree/main/docs as the main source of the ModelMesh docs without having to duplicate a lot of them in both places, or whether MkDocs even supports that. Otherwise, we can just keep a subset of the docs on the website.

FYI @animeshsingh

Model framework version updates in 0.8

What is changing? (Please include as many details as possible.)

scikit-learn is updated to 1.0
xgboost is updated to 1.5

How will this impact our users?

This only affects users if runtimeVersion is not pinned in the InferenceService YAML; otherwise, a model trained with scikit-learn == 0.23.0 may not work with sklearnserver 0.8, which is now updated to scikit-learn == 1.0 (a hedged pinning sketch follows).
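
A hedged sketch of pinning runtimeVersion so the predictor keeps an older sklearnserver image; the tag and model location are illustrative assumptions.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                              # illustrative
spec:
  predictor:
    sklearn:
      runtimeVersion: v0.7.0                      # pins the server image tag instead of the release default
      storageUri: gs://my-bucket/sklearn/iris     # illustrative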

In what release will this happen (to the best of your knowledge)?

Ex. v0.8.0

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#1954


Broken links in the kserve documentation

/kind bug

What steps did you take and what happened:
If you click on the link for "production installation... Administrator's Guide", the link sends you to a broken page.
(screenshots of the broken link and the resulting error page)

What did you expect to happen:
A link to a page for the production installation.

Anything else you would like to add:
There are several broken or outdated links in the documentation.

Environment:

  • Istio Version:
  • Knative Version:
  • KFServing Version:
  • Kubeflow version:
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

Screen alignment issue in 0.8 and master versions of the doc

An extra empty sidebar appears on the right side of the home page. It reduces the available space for main content and messes up the alignment.

Expected Behavior

(screenshot of the expected layout)

Actual Behavior

(screenshot showing the extra empty sidebar)

Steps to Reproduce the Problem

Load the 0.8 or the master version of the document and navigate to the home page.

Add a provision to show a version for master branch

Add master as a dropdown option for versions on the website.

While everyone adds new features and fixes bugs in the master branch of KServe, they would ideally like to update the documentation for it as well. However, since 0.7 is the default deployment on the website, every time a doc is updated, the change lands in 0.7. Yet the 0.7 version of KServe is already released, and some of these documentation changes may not be applicable to it, but rather to a newer version or to master.

Add a master version to the dropdown and make that the target for new doc changes.
Retain the latest release version (0.7) as the default option in the dropdown.

Write an AWS Cognito Guide

/kind feature

Describe the solution you'd like
There is no clear write-up on how to configure AWS Cognito for KFServing. The current
Kubeflow e2e guide does not cover KFServing configuration.

There is a good write-up for GCloud IAP.
That guide could be used as a model for the AWS Cognito one. This would go a long way toward avoiding having people struggle with the setup of KFServing on AWS.

Recent issues on this:
kserve/kserve#1154
kubeflow/website#2378
