
Introduction

KServe


KServe provides a Kubernetes Custom Resource Definition for serving predictive and generative machine learning (ML) models. It aims to solve production model serving use cases by providing high-level abstraction interfaces for TensorFlow, XGBoost, Scikit-learn, PyTorch, and Hugging Face Transformer/LLM models using standardized data plane protocols.

It encapsulates the complexity of autoscaling, networking, health checking, and server configuration to bring cutting-edge serving features like GPU Autoscaling, Scale to Zero, and Canary Rollouts to your ML deployments. It enables a simple, pluggable, and complete story for production ML serving, including prediction, pre-processing, post-processing, and explainability. KServe is being used across various organizations.
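
For example, deploying a model typically only requires declaring an InferenceService custom resource. Below is a minimal hedged sketch; the service name, model format, and storage location are illustrative assumptions rather than values taken from this page.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                              # hypothetical service name
spec:
  predictor:
    sklearn:                                      # one of the supported framework predictors
      storageUri: gs://my-bucket/sklearn/iris     # illustrative model location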

For more details, visit the KServe website.


KFServing has been rebranded to KServe since v0.7.

Why KServe?

  • KServe is a standard, cloud agnostic Model Inference Platform for serving predictive and generative AI models on Kubernetes, built for highly scalable use cases.
  • Provides a performant, standardized inference protocol across ML frameworks, including the OpenAI specification for generative models.
  • Supports modern serverless inference workloads with request-based autoscaling, including scale-to-zero on CPU and GPU (see the sketch after this list).
  • Provides high scalability, density packing, and intelligent routing using ModelMesh.
  • Offers simple and pluggable production serving for inference, pre/post-processing, monitoring, and explainability.
  • Supports advanced deployments such as canary rollouts, pipelines, and ensembles with InferenceGraph.
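
As referenced in the autoscaling bullet above, here is a hedged sketch of request-based autoscaling with scale-to-zero on an InferenceService; the concurrency target, model format, and storage URI are illustrative assumptions.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: flowers-sample                     # hypothetical name
  annotations:
    autoscaling.knative.dev/target: "5"    # assumed per-replica concurrency target (Knative annotation)
spec:
  predictor:
    minReplicas: 0                         # lets the predictor scale to zero when idle
    tensorflow:
      storageUri: gs://my-bucket/tensorflow/flowers   # illustrative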

Learn More

To learn more about KServe, how to use the various supported features, and how to participate in the KServe community, please follow the KServe website documentation. Additionally, we have compiled a list of presentations and demos to dive into the details.

🛠️ Installation

Standalone Installation

  • Serverless Installation: KServe installs Knative by default for serverless deployment of InferenceServices.
  • Raw Deployment Installation: Compared to the Serverless Installation, this is a more lightweight installation. However, this option does not support canary deployment or request-based autoscaling with scale-to-zero.
  • ModelMesh Installation: You can optionally install ModelMesh to enable high-scale, high-density and frequently-changing model serving use cases.
  • Quick Installation: Install KServe on your local machine.

Kubeflow Installation

KServe is an important add-on component of Kubeflow; please learn more from the Kubeflow KServe documentation. Check out the following guides for running on AWS or on OpenShift Container Platform.

💡 Roadmap

🧰 Developer Guide

✍️ Contributor Guide

🤝 Adopters

Contributors

alexagriffith, alexandrebrown, andyi2it, chinhuang007, ckadner, dependabot[bot], elukey, gavrissh, greenmoon55, iamlovingit, jooho, js-ts, juhyung-son, lampajr, markwinter, pvaneck, rachitchauhan43, rafvasq, retocode, sivanantha321, suresh-nakkeran, taneem-ibrahim, terrytangyuan, tessapham, theofpa, tomcli, varunsh-xilinx, xfu83, yuzisun, zoramt


Issues

Add a sample which explains how to deploy a model on an EKS cluster

/kind bug

What steps did you take and what happened:
I have installed Kubeflow 1.3. I could not figure out which version of KFServing it installed by default.
The following namespaces were created, which I believe are for KFServing:

knative-eventing            Active   151m
knative-serving             Active   152m

What did you expect to happen:
I tried to run the XGBoost sample. Everything worked fine except that my SERVICE_HOSTNAME=$(kubectl get inferenceservice xgboost-iris -o jsonpath='{.status.url}' | cut -d "/" -f 3) became xgboost-iris.default.example.com, which I believe is expected, but I did not create any domain for this service.

Can we add a sample which can run on a vanilla EKS cluster without worrying about the domain, or am I missing something?

Environment:
EKS cluster
K8s version-1.9

  • Istio Version:
  • Knative Version:
  • KFServing Version:
  • Kubeflow version: 1.3
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

Update the doc for Kubernetes 1.25

Describe the change you'd like to see
Hi,

I have been using Kubernetes v1.25 (which is not in the Recommended Version Matrix), and tried to install the KServe "Quickstart" environment (for v0.9). It didn't work since it requires an Istio version > 1.14.0. I am not sure if it works for Istio v1.15.0, but I just set ISTIO_VERSION=1.16.0 in quick_install.sh, and everything worked with no issue. FYI.


Add swagger-ui to dataplane ui

Describe the change you'd like to see

Add swagger-ui for the dataplane API within the docs website.

This would allow users to preview the endpoints and request/response objects in a familiar user interface.

Document how to use STORAGE_URI, especially in the case of a custom framework

/kind feature

Describe the solution you'd like
Describe how to use the STORAGE_URI environment variable in the InferenceService schema with a custom framework.
It could be added under kfserving/docs/samples/custom/.
This feature is not documented. I would like to know if this feature is official and whether it is expected to remain available in the future.

Anything else you would like to add:
Important information about the feature:

  1. The container schema can have a STORAGE_URI environment variable; when used in an InferenceService, it provides the standard storage-initializer functionality (see the sketch after this list).
  2. The container name has to be kfserving-container.
  3. The storage mounts to /mnt/models.
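
A hedged sketch of what such a custom-framework InferenceService could look like, based on the three points above; the image, bucket path, and service name are illustrative assumptions, and the API group may be serving.kubeflow.org on older KFServing releases.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: custom-model                                    # hypothetical
spec:
  predictor:
    containers:
      - name: kfserving-container                       # required container name (see point 2)
        image: docker.io/myuser/custom-server:latest    # illustrative custom image
        env:
          - name: STORAGE_URI                           # triggers the storage initializer (point 1)
            value: gs://my-bucket/my-model              # model is downloaded to /mnt/models (point 3)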

This feature can be very beneficial if someone needs to create a custom docker image with specific dependencies. There was a case in my organization where the serialized model had such dependencies that it was impossible to extract them into a transformer.

STORAGE_URI feature basic locations:
https://github.com/kubeflow/kfserving/blob/master/pkg/apis/serving/v1alpha2/framework_custom.go#L30
https://github.com/kubeflow/kfserving/blob/master/pkg/apis/serving/v1beta1/predictor_custom.go#L57

other examples where STORAGE_URI is used:
https://github.com/kubeflow/kfserving/blob/master/docs/samples/custom/torchserve/torchserve-custom-pv.yaml
https://github.com/kubeflow/kfserving/blob/master/docs/samples/custom/torchserve/bert-sample/bert.yaml
https://github.com/kubeflow/kfserving/tree/master/docs/samples/v1alpha2/triton/bert

I'd like to offer my help, but first I need some initial answers and maybe some instructions.

Docs about using nodeSelector

/kind feature

I am wondering if it is possible to set the nodeSelector option for TensorFlow Serving (or any other serving runtime different from the custom one).
I found some information about this in some issues, but they were not very specific.
If it is possible, it would be great to include an example in the docs (a sketch follows below).
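
For reference, a hedged sketch of what this could look like, assuming the v1beta1 predictor spec exposes pod-level fields such as nodeSelector; the node label and model location are illustrative assumptions.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: tensorflow-gpu-sample                       # hypothetical
spec:
  predictor:
    nodeSelector:
      node.example.com/pool: gpu-pool               # illustrative node label
    tensorflow:
      storageUri: gs://my-bucket/tensorflow/model   # illustrative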

AWS IAM Role for Service Account

What is changing? (Please include as many details as possible.)

Adds explicit support for the AWS IRSA credential method for downloading models.

How will this impact our users?

More secure AWS credential management (no static/long-lived credentials needed in Secrets); a hedged sketch follows.
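
A hedged sketch of how IRSA-based model download could look: a ServiceAccount annotated with an IAM role is referenced from the InferenceService, so the storage initializer can assume that role instead of reading static credentials from a Secret. The role ARN, bucket, and names are illustrative assumptions, and the exact annotations KServe honors may differ per release.

apiVersion: v1
kind: ServiceAccount
metadata:
  name: model-reader-sa                                                        # hypothetical
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/model-reader   # illustrative IRSA role
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-s3                                  # hypothetical
spec:
  predictor:
    serviceAccountName: model-reader-sa             # model download runs with this identity
    sklearn:
      storageUri: s3://my-bucket/sklearn/model      # illustrative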

In what release will this happen (to the best of your knowledge)?

v0.10.0

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#2113
  2. kserve/kserve#2373


Administration - Serverless installation guide includes two Istio installations

Knative should be installed before Istio.
Knative itself requires a networking layer; a better way is to tell users to choose Istio as the networking layer for Knative instead of treating Knative and Istio as two separate components.

Problem seen
If I install Istio before Knative, KServe simply doesn't work. It could be due to unsupported versions or the networking being configured in the wrong order.

Run mkdocs gh-deploy before mike deploy on the Github workflow

Expected Behavior

In #70, 404.html was updated, so we should now be able to see the error page properly.

Actual Behavior

The error page is still broken.

Steps to Reproduce the Problem

Please see a wrong URL like https://kserve.github.io/website/0.7/admin/.


Proposal

Run the mkdocs gh-deploy command before we run mike deploy in the GitHub workflow, so we can sync the contents at the root with the latest contents (a hedged workflow sketch follows).
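
A hedged sketch of the proposed ordering in a GitHub Actions workflow; the workflow name, Python setup, and the version/alias arguments passed to mike are illustrative assumptions.

name: publish-docs                          # hypothetical workflow
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0                    # mike needs the gh-pages history
      - uses: actions/setup-python@v4
        with:
          python-version: "3.9"
      - run: pip install mkdocs mike
      # Proposal: publish the root site first so root-level assets such as 404.html stay in sync...
      - run: mkdocs gh-deploy
      # ...then publish the versioned docs with mike.
      - run: mike deploy --push 0.7 latest  # version and alias are illustrative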

Thanks!

TorchServe doc updates for 0.8

What is changing? (Please include as many details as possible.)

How will this impact our users?

TorchServe is now updated to 0.5.2

  • KServe migration: deprecate service_envelope in config.properties and use enable_envvars_config: true to enable the service envelope at runtime.
  • KServe v2 REST protocol support

In what release will this happen (to the best of your knowledge)?

Ex. v0.8

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#1944
  2. kserve/kserve#1870


No license statement

Expected Behavior

The website and the contained documentation should contain a clear license statement.

Actual Behavior

There is no license statement present.

Recommendation

A common choice for this type of content is CC-BY (Creative Commons 'Attribution'), although CC-BY-SA ('Attribution' plus 'ShareAlike') can also be used if the authors want to ensure that the content remains under the same license when it is reused or derivative works are produced.

Document kserve dependency version support

Describe the change you'd like to see
Document the dependency version matrix for Kubernetes/Knative/Istio. We won't be able to test every version combination, but we would like to give recommendations for the version sets which have been tested and certified.


Versioning the website docs


Document user migration steps

/kind feature

Describe the solution you'd like
Document user migration steps from KFServing to KServe; this migration guide will be referenced on the blog. A sketch of the main manifest change is below.
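
One step such a guide would presumably cover is the API group rename; a hedged before/after sketch on an InferenceService manifest, with the model details being illustrative assumptions:

# Before (KFServing)
apiVersion: serving.kubeflow.org/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                          # illustrative
spec:
  predictor:
    sklearn:
      storageUri: gs://my-bucket/sklearn/iris
---
# After (KServe, since v0.7)
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
spec:
  predictor:
    sklearn:
      storageUri: gs://my-bucket/sklearn/iris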


KServe "Quickstart" environment script file is wrong

Expected Behavior

By executing the quick-install script referenced on the Getting Started page, users should be able to install KServe in their local environment.

The script referenced in the quick start page should be https://raw.githubusercontent.com/kserve/kserve/master/hack/quick_install.sh

Actual Behavior

The current link is https://raw.githubusercontent.com/kserve/kserve/release-0.7/hack/quick_install.sh, which refers to a local file for installing KServe. (https://github.com/kserve/kserve/blob/9112a5d81395cf4b32cd9c9b35f10e030c23b77e/hack/quick_install.sh#L106)

It will only raise an error.

Steps to Reproduce the Problem

  1. Visit the links

Additional Info


Install information:

  • Platform (GKE, IKS, AKS, OnPrem, etc.):
  • KServe Version: 0.7.0

Add a provision to make changes across multiple versions of the document

Provision to add common changes across versions, such as banners

Banners, popups, etc. should usually appear across multiple live versions of the documentation.
Currently this requires redeploying or updating old docs, which makes things more difficult, as a new banner or other change has to be applied to each version of the doc manually.

Proposing a new solution to update common things like banners and popups across multiple document versions:

  1. Deploy a common JS file in the gh-pages repo.
  2. Add the common JS file to the docs.

Any future common doc changes would then only require changes in common.js.

Documentation Improvements for v0.10.0

What is changing? (Please include as many details as possible.)

We are working on the KServe 2023 Roadmap kserve/kserve#2526 and the v0.10.0 release, as well as preparing for our eventual v1.0 😄. For all of this and to keep up with our changes, we need to improve our documentation and website!

Here are the current objectives to improve the KServe website documentation:

  • Unify the data plane v1 and v2 page formats (one has a table, the other has long-form text).
  • Improve the v2 page to be more succinct and clearly show what can be used, by splitting it into two pages: (1) one that tells the story of why and what changed, and (2) one that explains how to use v2 and what it provides.
  • Document FastAPI updates for the model server.
  • Document serving runtime spec changes.
  • Clean up the examples in the kserve repo and unify them with the website's by creating one source of truth for example documentation.
  • Update any out-of-date documentation and make sure the website as a whole is consistent and cohesive.
  • Add monitoring setup documentation and documentation for the Knative serverless install #24.
  • Implement spellcheck in the repo #82.
  • Move the queue proxy extension documentation to the website.

In what release will this happen (to the best of your knowledge)?

v0.10

Update logger/batcher/autoscaler/canary examples

/kind feature

Describe the solution you'd like
Update the examples for API groups and KFServing name references for:
https://github.com/kubeflow/kfserving/tree/master/docs/samples/batcher
https://github.com/kubeflow/kfserving/tree/master/docs/samples/autoscaling
https://github.com/kubeflow/kfserving/tree/master/docs/samples/kafka
https://github.com/kubeflow/kfserving/tree/master/docs/samples/logger


kfctl v1.3.0 link broken for kubeflow deployment on Azure

/kind bug

What steps did you take and what happened:

  1. Kubeflow setup on Azure using the link https://www.kubeflow.org/docs/distributions/azure/deploy/install-kubeflow/
  2. The link is broken for v1.3: "Download the kfctl v1.3.0 release from the Kubeflow releases page."
    So I have installed what is available at https://github.com/kubeflow/kfctl/releases

What did you expect to happen:

I need to use KFServing with TorchServe on Azure. What should I do for this to work?

LightGBM example fails

/kind bug

What steps did you take and what happened:

When I followed https://kserve.github.io/website/0.8/modelserving/v1beta1/lightgbm/, the request fails with {"error": "Input data must be 2 dimensional and non empty."}. I think I narrowed it down to lightgbm.Dataset failing to autodetect the feature names; the model appears to just have generic "Column_0", "Column_1", etc. feature names.

I got the example to work by changing this part:

iris = load_iris()
y = iris['target']
X = iris['data']
dtrain = lgb.Dataset(X, label=y)

...to this:

iris = load_iris()
y = iris['target']
X = iris['data']
feature_name = iris['feature_names']
dtrain = lgb.Dataset(X, label=y, feature_name=feature_name)

Here's a reproduction:

import lightgbm as lgb
import pandas as pd
from sklearn.datasets import load_iris


def main():
    explicit_model = train(make_explicit_dataset())
    auto_model = train(make_auto_dataset())

    print("Trying explicit model")
    print(predict(explicit_model))
    print("\n")
    print("Trying auto model")
    print(predict(auto_model))


def train(dataset):
    params = {
        'objective':'multiclass',
        'metric':'softmax',
        'num_class': 3
    }
    return lgb.train(params=params, train_set=dataset)


def make_auto_dataset():
    iris = load_iris()
    y = iris['target']
    X = iris['data']
    return lgb.Dataset(X, label=y)


def make_explicit_dataset():
    iris = load_iris()
    y = iris['target']
    X = iris['data']
    feature_name = iris['feature_names']
    return lgb.Dataset(X, label=y, feature_name=feature_name)


def predict(model):
    request = {'sepal_width_(cm)': {0: 3.5}, 'petal_length_(cm)': {0: 1.4}, 'petal_width_(cm)': {0: 0.2},'sepal_length_(cm)': {0: 5.1} }

    # Simulate kserve's lgbserver behavior:
    df = pd.DataFrame(request, columns=model.feature_name())
    inputs = pd.concat([df], axis=0)
    return model.predict(inputs)


if __name__ == "__main__":
    main()

The output I got, with Python 3.7, scikit-learn == 1.0.1, lightgbm == 3.3.2:

Trying explicit model
[[9.99985204e-01 1.38238969e-05 9.72063744e-07]]


Trying auto model
Traceback (most recent call last):
  File "train.py", line 52, in <module>
    main()
  File "train.py", line 14, in main
    print(predict(auto_model))
  File "train.py", line 47, in predict
    return model.predict(inputs)
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 3540, in predict
    data_has_header, is_reshape)
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 820, in predict
    data = _data_from_pandas(data, None, None, self.pandas_categorical)[0]
  File "/usr/local/lib/python3.7/site-packages/lightgbm/basic.py", line 566, in _data_from_pandas
    raise ValueError('Input data must be 2 dimensional and non empty.')
ValueError: Input data must be 2 dimensional and non empty.

Add ServingRuntime docs

What is changing? (Please include as many details as possible.)

Introduce the ServingRuntime custom resource and a new model spec; this eliminates the need to change the KServe controller code every time a new serving runtime is added.

How will this impact our users?

Users can still use the existing way to specify model frameworks; the new model spec is introduced to support existing serving runtimes as well as user-defined serving runtimes (a hedged sketch follows).
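
A hedged sketch of a user-defined ServingRuntime together with the new model spec; the runtime name, image, and model format are illustrative assumptions.

apiVersion: serving.kserve.io/v1alpha1
kind: ServingRuntime
metadata:
  name: my-sklearn-runtime                          # hypothetical user-defined runtime
spec:
  supportedModelFormats:
    - name: sklearn
      version: "1"
  containers:
    - name: kserve-container
      image: docker.io/myuser/sklearnserver:latest  # illustrative runtime image
---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                                # hypothetical
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn                               # new model spec
      runtime: my-sklearn-runtime                   # optional; omit to let KServe auto-select a runtime
      storageUri: gs://my-bucket/sklearn/iris       # illustrative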

In what release will this happen (to the best of your knowledge)?

v0.8

Context

Link to associated PRs or issues from other repos here.

  1. Introduce serving runtime and new model spec
  2. Default serving runtime installations
  3. Auto serving runtime selection


Pod termination takes a very long time

/kind feature

Describe the solution you'd like
I have the following InferenceService:

# Imports assumed from the kserve SDK and the Kubernetes Python client
from kubernetes import client
from kubernetes.client import V1ResourceRequirements
from kserve import (constants, V1beta1InferenceService, V1beta1InferenceServiceSpec,
                    V1beta1PredictorSpec, V1beta1TorchServeSpec)

port_inference = client.V1ContainerPort(7070, protocol='TCP', name='h2c')
port_management = client.V1ContainerPort(7071, protocol='TCP', name='h2c')
pytorch_predictor=V1beta1PredictorSpec(
    pytorch=V1beta1TorchServeSpec(
        runtime_version='0.5.3-gpu',
        storage_uri='pvc://torchserve-claim/models',
        resources=V1ResourceRequirements(
            requests={'cpu':'4000m', 'memory':'8Gi', 'nvidia.com/gpu': '1'},
            limits={'cpu':'4000m', 'memory':'16Gi', 'nvidia.com/gpu': '1'}
        ),
         ports=[port_inference]
    )
)
isvc = V1beta1InferenceService(api_version=constants.KSERVE_V1BETA1,
                               kind=constants.KSERVE_KIND,
                               metadata=client.V1ObjectMeta(name=service_name, namespace=namespace),
                               spec=V1beta1InferenceServiceSpec(predictor=pytorch_predictor))

The service deploys quickly and the POD is ready to go in a few seconds.

kubectl get pods -n kubeflow-user-example-com
POD_BERT=$(kubectl get pods -n kubeflow-user-example-com | grep -Eo "(model-[_A-Za-z0-9-]+)")

When I delete the InferenceService, the pod takes a very long time to terminate - about five minutes!
What is the reason, and how can termination be accelerated?

Update the REST/gRPC V2 API with load/unload endpoints

/kind feature

Describe the solution you'd like
There are new load/unload endpoints supported in MLServer and Triton. It would be good to have these formally documented in the API docs in the repo.


Remove obsolete website document

Expected Behavior

The old links should no longer work and should ideally redirect to the new website URL.
E.g., the following link https://kserve.github.io/website/get_started/ should redirect to https://kserve.github.io/website/latest/get_started/

Actual Behavior

It is pointing to the old doc, which is applicable only to version 0.7 of KServe.

Additional Info


Install information:

  • Platform (GKE, IKS, AKS, OnPrem, etc.):
  • KServe Version:

onnx example not working with triton

/kind bug

What steps did you take and what happened:
I closely followed https://github.com/kserve/kserve/blob/release-0.8/docs/samples/v1beta1/onnx/README.md
to test KServe with the ONNX model provided (storageUri: "gs://kfserving-examples/onnx/style"), as I want to use KServe with .onnx models.

The problem here is that the style model provided is a single .onnx file, which was what the ONNX runtime needed, I suppose.
Now, with the ONNX runtime replaced by Triton, this example does not work anymore, as Triton will not load from a single ONNX file (/mnt/models/model.onnx).

Log from the triton server in kserve-container:


+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0629 14:04:33.356075 1 server.cc:546]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0629 14:04:33.356082 1 server.cc:589]
+-------+---------+--------+
| Model | Version | Status |
+-------+---------+--------+
+-------+---------+--------+

I0629 14:04:33.356195 1 tritonserver.cc:1836]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.14.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /mnt/models                                                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| rate_limit                       | OFF                                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Triton requires the ONNX file to be arranged something like this:

my_model/
  -- my_model.pbtxt
  -- 1/
       -- my_model.onnx

After creating my own Triton-compatible model, Triton loaded it correctly:

apiVersion: "serving.kserve.io/v1beta1"
kind: "InferenceService"
metadata:
  name: "my_model"
spec:
  predictor:
    minReplicas: 0
    onnx:
      storageUri: "http://192.168.4.252:8000/my_model.tar.gz"

+------------------+------+
| Repository Agent | Path |
+------------------+------+
+------------------+------+

I0629 13:48:31.691292 1 server.cc:546]
+-------------+-----------------------------------------------------------------+--------+
| Backend     | Path                                                            | Config |
+-------------+-----------------------------------------------------------------+--------+
| pytorch     | /opt/tritonserver/backends/pytorch/libtriton_pytorch.so         | {}     |
| tensorflow  | /opt/tritonserver/backends/tensorflow1/libtriton_tensorflow1.so | {}     |
| onnxruntime | /opt/tritonserver/backends/onnxruntime/libtriton_onnxruntime.so | {}     |
| openvino    | /opt/tritonserver/backends/openvino/libtriton_openvino.so       | {}     |
+-------------+-----------------------------------------------------------------+--------+

I0629 13:48:31.691309 1 server.cc:589]
+------------+---------+--------+
| Model      | Version | Status |
+------------+---------+--------+
| my_model | 1       | READY  |
+------------+---------+--------+

I0629 13:48:31.691780 1 tritonserver.cc:1836]
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Option                           | Value                                                                                                                                                                                  |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| server_id                        | triton                                                                                                                                                                                 |
| server_version                   | 2.14.0                                                                                                                                                                                 |
| server_extensions                | classification sequence model_repository model_repository(unload_dependents) schedule_policy model_configuration system_shared_memory cuda_shared_memory binary_tensor_data statistics |
| model_repository_path[0]         | /mnt/models                                                                                                                                                                            |
| model_control_mode               | MODE_NONE                                                                                                                                                                              |
| strict_model_config              | 1                                                                                                                                                                                      |
| rate_limit                       | OFF                                                                                                                                                                                    |
| pinned_memory_pool_byte_size     | 268435456                                                                                                                                                                              |
| cuda_memory_pool_byte_size{0}    | 67108864                                                                                                                                                                               |
| min_supported_compute_capability | 6.0                                                                                                                                                                                    |
| strict_readiness                 | 1                                                                                                                                                                                      |
| exit_timeout                     | 30                                                                                                                                                                                     |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

So I assume that the example needs to be updated.

Environment:

  • Istio Version:
  • Knative Version:
  • KFServing Version: 0.8
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Kubernetes version: (use kubectl version): 1.22 (microk8s)
  • OS (e.g. from /etc/os-release):

Consolidate CIFAR-10 Outlier Sample

We should keep one copy of the CIFAR-10 outlier detection sample:

    Hi @rafvasq, thanks for the update! We also have the same website example [here](https://github.com/kserve/website/blob/main/docs/modelserving/detect/alibi_detect/cifar10_outlier.ipynb), should we keep one place instead of two copies?

Originally posted by @yuzisun in kserve/kserve#2472 (comment)

"Build the custom image with Buildpacks" section needs more info.

Describe the change you'd like to see
The "Build the custom image with Buildpacks" section might need an update. Currently, pack build --builder=heroku/buildpacks:20 ${DOCKER_USER}/custom-model:v1 won't work without specifying the correct Python version in runtime.txt. heroku/buildpacks:20 comes with python-3.10, but ray[serve]==1.10.0 and kserve==0.9.0 require python-3.9.

Additional context
An additional runtime.txt file containing python-3.9.13 is required to successfully build a Docker image for custom model serving.

Cert manager version should be restricted

Expected Behavior

Following https://kserve.github.io/website/admin/serverless/ should result in a successful install

Actual Behavior

Following https://kserve.github.io/website/admin/serverless/, I installed cert-manager at the latest 1.6 release (https://kserve.github.io/website/admin/serverless/#3-install-cert-manager). Then, when trying to install KServe (https://kserve.github.io/website/admin/serverless/#4-install-kserve), it failed with:

unable to recognize "https://github.com/kserve/kserve/releases/download/v0.7.0/kserve.yaml": no matches for kind "Certificate" in version "cert-manager.io/v1alpha2"
unable to recognize "https://github.com/kserve/kserve/releases/download/v0.7.0/kserve.yaml": no matches for kind "Issuer" in version "cert-manager.io/v1alpha2"

I had to reduce the cert-manager version to 1.3, and then everything worked.

Install information:

  • Platform (GKE, IKS, AKS, OnPrem, etc.): k3d 1.20.1
  • KServe Version: 0.7.0

404-page is broken

Hello, KServe team!
I found a bug with the 404 page!

Expected Behavior

When we try to access a wrong URL path, the 404 page should show properly.

Actual Behavior

The 404 page is broken; it does not load its assets (JS, CSS, images).

(screenshot of the broken 404 page)

Steps to Reproduce the Problem

Thanks!

Add ModelMesh documentation to website

Some information about ModelMesh should be added to the KServe website. We can probably make a modelmesh folder here for now with a few Markdown files.

We need to determine later whether we can use the docs located at https://github.com/kserve/modelmesh-serving/tree/main/docs as the main source of the ModelMesh docs without having to duplicate a lot of them in both places, or whether MkDocs even supports that. Otherwise, we can just keep a subset of the docs on the website.

FYI @animeshsingh

Model framework version updates in 0.8

What is changing? (Please include as many details as possible.)

scikit-learn is updated to 1.0
xgboost is updated to 1.5

How will this impact our users?

This only affects users if runtimeVersion is not pinned in the InferenceService YAML; otherwise, a model trained with scikit-learn == 0.23.0 may not work with sklearnserver 0.8, which is now updated to scikit-learn == 1.0 (a hedged pinning sketch follows).
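
A hedged sketch of pinning runtimeVersion so the predictor keeps an older sklearnserver image; the tag and model location are illustrative assumptions.

apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris                              # illustrative
spec:
  predictor:
    sklearn:
      runtimeVersion: v0.7.0                      # pins the server image tag instead of the release default
      storageUri: gs://my-bucket/sklearn/iris     # illustrative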

In what release will this happen (to the best of your knowledge)?

Ex. v0.8.0

Context

Link to associated PRs or issues from other repos here.

  1. kserve/kserve#1954


Broken links in the kserve documentation

/kind bug

What steps did you take and what happened:
If you click on the link for "production installation... Administrator's Guide", the link sends you to a broken page.
(screenshots of the broken link and the resulting error page)

What did you expect to happen:
A link to a page for the production installation.

Anything else you would like to add:
There are several broken or outdated links in the documentation.

Environment:

  • Istio Version:
  • Knative Version:
  • KFServing Version:
  • Kubeflow version:
  • Kfdef:[k8s_istio/istio_dex/gcp_basic_auth/gcp_iap/aws/aws_cognito/ibm]
  • Minikube version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):

Screen alignment issue in 0.8 and master versions of the doc

An extra empty sidebar appears on the right side of the home page. It reduces the available space for main content and messes up the alignment.

Expected Behavior

(screenshot of the expected layout)

Actual Behavior

(screenshot showing the extra empty sidebar)

Steps to Reproduce the Problem

Load the 0.8 or the master version of the document and navigate to the home page.

Add a provision to show a version for master branch

Add master as a dropdown option for versions on the website.

While everyone adds new features and fixes bugs in the master branch of KServe, they would ideally like to update the documentation for it as well. However, since 0.7 is the default deployment on the website, every time a doc is updated, the change lands in 0.7. Yet the 0.7 version of KServe is already released, and some of these documentation changes may not be applicable to it, but rather to a newer version or to master.

Add a master version to the dropdown and make that the target for new doc changes.
Retain the latest release version (0.7) as the default option in the dropdown.

Write an AWS Cognito Guide

/kind feature

Describe the solution you'd like
There is no clear write-up on how to configure AWS Cognito for KFServing. The current
Kubeflow e2e guide does not cover KFServing configuration.

There is a good write-up for GCloud IAP.
That guide could be used as a model for the AWS Cognito one. This would go a long way toward avoiding having people struggle with the setup of KFServing on AWS.

Recent issues on this:
kserve/kserve#1154
kubeflow/website#2378
